# Pydantic AI > GenAI Agent Framework, the Pydantic way

Pydantic AI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI.

---

# [Image, Audio, Video & Document Input](https://pydantic.dev/docs/ai/advanced-features/input/)

## Image Input

Note

Some models do not support image input. Please check the model's documentation to confirm whether it supports image input.

If you have a direct URL for the image, you can use [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl):

image_input.py

```python
from pydantic_ai import Agent, ImageUrl

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
    [
        'What company is this logo from?',
        ImageUrl(url='https://iili.io/3Hs4FMg.png'),
    ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

If you have the image locally, you can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent):

local_image_input.py

```python
import httpx

from pydantic_ai import Agent, BinaryContent

image_response = httpx.get('https://iili.io/3Hs4FMg.png')  # Pydantic logo

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
    [
        'What company is this logo from?',
        BinaryContent(data=image_response.content, media_type='image/png'),
    ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents.

## Audio Input

Note

Some models do not support audio input. Please check the model's documentation to confirm whether it supports audio input.

You can provide audio input using either [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Video Input

Note

Some models do not support video input. Please check the model's documentation to confirm whether it supports video input.

You can provide video input using either [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Document Input

Note

Some models do not support document input. Please check the model's documentation to confirm whether it supports document input.

You can provide document input using either [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is similar to the examples above.

If you have a direct URL for the document, you can use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl):

document_input.py

```python
from pydantic_ai import Agent, DocumentUrl

agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'),
    ]
)
print(result.output)
#> This document is the technical report introducing Gemini 1.5, Google's latest large language model...
```

The supported document formats vary by model.
You can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) to pass document data directly:

binary_content_input.py

```python
from pathlib import Path

from pydantic_ai import Agent, BinaryContent

pdf_path = Path('document.pdf')
agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'),
    ]
)
print(result.output)
#> The document discusses...
```

Tip

If neither `DocumentUrl` nor `BinaryContent` is suitable for your use case (e.g., the model doesn't support `DocumentUrl`, or you want to provide a document in a non-binary format), you can still provide document content as text input by extracting the text yourself and passing it as a string or [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent).

## Text Input

You can use [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent) to provide text input with additional metadata:

text_content_input.py

```python
from pydantic_ai import Agent, TextContent

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync([
    'Summarize the key points from this text.',
    TextContent(
        content=(
            'Pydantic AI is a Python agent framework. '
            'It supports text, image, audio, video, and document input.'
        ),
        metadata={'source': 'pydantic_ai_inputs.txt'},
    ),
])
```

This is equivalent to passing the text as a `str`, but allows you to include additional `metadata` that can be accessed programmatically in your agent logic.

Note

The `content` field is treated as input to the model, but the `metadata` is **not sent to the model**. It is preserved in messages for programmatic access.

## User-side download vs. direct file URL

When using one of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI will default to sending the URL to the model provider, so the file is downloaded on their side. Support for file URLs varies depending on type and provider:

| Model | Send URL directly | Download and send bytes | Unsupported |
|---|---|---|---|
| [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel) | `ImageUrl` | `AudioUrl`, `DocumentUrl` | `VideoUrl`. `DocumentUrl` [not supported with `AzureProvider`](/docs/ai/models/openai#using-azure-with-the-responses-api) |
| [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel) | `ImageUrl`, `AudioUrl`, `DocumentUrl` | -- | `VideoUrl` |
| [`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel) | `ImageUrl`, `DocumentUrl` (PDF) | `DocumentUrl` (`text/plain`) | `AudioUrl`, `VideoUrl` |
| [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (Vertex) | All URL types | -- | -- |
| [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (GLA) | [YouTube](/docs/ai/models/google#document-image-audio-and-video-input), [Files API](/docs/ai/models/google#document-image-audio-and-video-input) | All other URLs | -- |
| [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) | `ImageUrl` | `DocumentUrl` | `AudioUrl`, `VideoUrl` |
| [`MistralModel`](/docs/ai/api/models/mistral/#pydantic_ai.models.mistral.MistralModel) | `ImageUrl`, `DocumentUrl` (PDF) | -- | `AudioUrl`, `VideoUrl`, `DocumentUrl` (non-PDF) |
| [`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel) | S3 URLs (`s3://`) | `ImageUrl`, `DocumentUrl`, `VideoUrl` | `AudioUrl` |
| [`OpenRouterModel`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModel) | `ImageUrl`, `DocumentUrl`, `VideoUrl` | `AudioUrl` | -- |

A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) on Vertex AI limits YouTube video URLs to one URL per request.

In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object:

force_download.py

```py
from pydantic_ai import ImageUrl, AudioUrl, VideoUrl, DocumentUrl

ImageUrl(url='https://example.com/image.png', force_download=True)
AudioUrl(url='https://example.com/audio.mp3', force_download=True)
VideoUrl(url='https://example.com/video.mp4', force_download=True)
DocumentUrl(url='https://example.com/doc.pdf', force_download=True)
```

## Uploaded Files

Some model providers have their own file storage APIs where you can upload files and reference them by ID or URL. Use [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) to reference files that have been uploaded to a provider's file storage API.

Tip

For providers that return a file URL (like Google Files API or S3 URLs for Bedrock), you can also use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl), [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), or [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) directly. However, we recommend using `UploadedFile` for a unified API across providers and consistent provider name validation.
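For illustration, here is a minimal sketch of what referencing an already-uploaded file looks like; the file ID is a placeholder, and the provider-specific upload calls are shown in the sections below:

uploaded_file_sketch.py

```python
from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.openai import OpenAIChatModel

model = OpenAIChatModel('gpt-5')
agent = Agent(model)

result = agent.run_sync([
    'Summarize this document',
    # 'file-abc123' stands in for an ID returned by the provider's file storage API
    UploadedFile(file_id='file-abc123', provider_name=model.system),
])
print(result.output)
```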
### Supported Models

| Model | Support |
|---|---|
| [`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel) | ✅ via [Anthropic Files API](https://docs.anthropic.com/en/docs/build-with-claude/files) |
| [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel) | ✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files) |
| [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel) | ✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files) |
| [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) | ✅ via [Google Files API](https://ai.google.dev/gemini-api/docs/files) |
| [`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel) | ✅ via S3 URLs (`s3://bucket/key`) |
| [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) | ✅ via [xAI Files API](https://docs.x.ai/docs/guides/files) |
| Other models | ❌ Not supported |

### Provider Name Requirement

When using [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) you must set the `provider_name`. Uploaded files are specific to the system they are uploaded to and are not transferable across providers. Trying to use a message that contains an `UploadedFile` with a different provider will result in an error.

Getting the provider name

Use [`model.system`](/docs/ai/api/models/base/#pydantic_ai.models.Model.system) to get the correct provider name dynamically. This ensures your code works correctly even if the provider name changes. All examples below demonstrate this pattern.

If you want your agent logic to be portable, so that the same prompt history works with different provider backends, you can use a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) to remove or rewrite `UploadedFile` parts from messages before sending them to a provider that does not support them. Be aware that stripping out `UploadedFile` instances might confuse the model, especially if references to those files remain in the text.

### Media Type Inference

The `media_type` parameter is optional for [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile). If not specified, Pydantic AI will attempt to infer it from the `file_id`:

1. If `file_id` is a URL or path with a recognizable file extension (e.g., `.pdf`, `.png`), the media type is inferred automatically
2. For opaque file IDs (e.g., `'file-abc123'`), the media type defaults to `'application/octet-stream'`

Tip

While `media_type` is optional, we recommend explicitly setting it when known to ensure correct handling by the model provider.
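As a small sketch of the inference rules above (the file IDs are placeholders, and the provider names are illustrative; prefer `model.system` as shown in the examples below):

media_type_inference_sketch.py

```python
from pydantic_ai import UploadedFile

# 1. The extension is recognizable, so the media type is inferred (application/pdf)
s3_file = UploadedFile(file_id='s3://my-bucket/report.pdf', provider_name='bedrock')

# 2. The file ID is opaque, so the media type would default to
#    'application/octet-stream'; set it explicitly as recommended above
opaque_file = UploadedFile(
    file_id='file-abc123',
    media_type='application/pdf',
    provider_name='openai',
)
```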
### Anthropic

Follow the [Anthropic Files API docs](https://docs.anthropic.com/en/docs/build-with-claude/files) to upload files. You can access the underlying Anthropic client via `provider.client`.

Beta Feature

The Anthropic Files API is currently in beta. You need to include the beta header `anthropic-beta: files-api-2025-04-14` when making requests.

uploaded_file_anthropic.py

```py
import asyncio

from pydantic_ai import Agent, ModelSettings, UploadedFile
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider


async def main():
    provider = AnthropicProvider()
    model = AnthropicModel('claude-sonnet-4-5', provider=provider)

    # Upload a file using the provider's client (Anthropic client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.beta.files.upload(file=f)

    # Reference the uploaded file, including the required beta header
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ],
        model_settings=ModelSettings(extra_headers={'anthropic-beta': 'files-api-2025-04-14'}),
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### OpenAI

Follow the [OpenAI Files API docs](https://platform.openai.com/docs/api-reference/files/create) to upload files. You can access the underlying OpenAI client via `provider.client`.

uploaded_file_openai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider()
    model = OpenAIChatModel('gpt-5', provider=provider)

    # Upload a file using the provider's client (OpenAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.create(file=f, purpose='user_data')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Google

Follow the [Google Files API docs](https://ai.google.dev/gemini-api/docs/files) to upload files. You can access the underlying Google GenAI client via `provider.client`.

uploaded_file_google.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider


async def main():
    provider = GoogleProvider()
    model = GoogleModel('gemini-2.5-flash', provider=provider)

    # Upload a file using the provider's client (Google GenAI client)
    with open('document.pdf', 'rb') as f:
        file = await provider.client.aio.files.upload(file=f)
    assert file.uri is not None

    # Reference the uploaded file by URI (media_type is optional for Google)
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=file.uri, media_type=file.mime_type, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Bedrock (S3)

For Bedrock, files must be uploaded to S3 separately (e.g., using [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object.html)). The assumed role must have `s3:GetObject` permission on the bucket.

`media_type` may be required

Bedrock requires `media_type` when the file extension is ambiguous or missing. For S3 URLs with clear extensions like `.pdf`, `.png`, etc., it can be inferred automatically.
uploaded_file_bedrock.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.bedrock import BedrockConverseModel


async def main():
    model = BedrockConverseModel('us.anthropic.claude-sonnet-4-20250514-v1:0')
    agent = Agent(model)

    result = await agent.run([
        'Summarize this document',
        UploadedFile(
            file_id='s3://my-bucket/document.pdf',
            provider_name=model.system,  # 'bedrock'
            media_type='application/pdf',  # Optional for .pdf, but recommended
        ),
    ])
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

Note

You can optionally specify a `bucketOwner` query parameter if the bucket is not owned by the account making the request: `s3://my-bucket/document.pdf?bucketOwner=123456789012`

### xAI

Follow the [xAI Files API docs](https://docs.x.ai/docs/guides/files) to upload files. You can access the underlying xAI client via `provider.client`.

uploaded_file_xai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.xai import XaiModel
from pydantic_ai.providers.xai import XaiProvider


async def main():
    provider = XaiProvider()
    model = XaiModel('grok-4-fast', provider=provider)

    # Upload a file using the provider's client (xAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.upload(f, filename='document.pdf')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

---

# [HTTP Request Retries](https://pydantic.dev/docs/ai/advanced-features/retries/)

Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports. This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors.

The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.
## Installation

To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group:

pip:

```bash
pip install 'pydantic-ai-slim[retries]'
```

uv:

```bash
uv add 'pydantic-ai-slim[retries]'
```

## Usage Example

Here's an example of adding retry functionality with smart retry handling:

smart_retry_example.py

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after


def create_retrying_client():
    """Create a client with smart retry handling for multiple error types."""

    def should_retry_status(response):
        """Raise exceptions for retryable HTTP status codes."""
        if response.status_code in (429, 502, 503, 504):
            response.raise_for_status()  # This will raise HTTPStatusError

    transport = AsyncTenacityTransport(
        config=RetryConfig(
            # Retry on HTTP errors and connection issues
            retry=retry_if_exception_type((HTTPStatusError, ConnectionError)),
            # Smart waiting: respects Retry-After headers, falls back to exponential backoff
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300
            ),
            # Stop after 5 attempts
            stop=stop_after_attempt(5),
            # Re-raise the last exception if all retries fail
            reraise=True
        ),
        validate_response=should_retry_status
    )
    return AsyncClient(transport=transport)


# Use the retrying client with a model
client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

## Wait Strategies

### wait_retry_after

The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers:

wait_strategy_example.py

```python
from tenacity import wait_exponential

from pydantic_ai.retries import wait_retry_after

# Basic usage - respects Retry-After headers, falls back to exponential backoff
wait_strategy_1 = wait_retry_after()

# Custom configuration
wait_strategy_2 = wait_retry_after(
    fallback_strategy=wait_exponential(multiplier=2, max=120),
    max_wait=600  # Never wait more than 10 minutes
)
```

This wait strategy:

- Automatically parses `Retry-After` headers from HTTP 429 responses
- Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`)
- Falls back to your chosen strategy when no header is present
- Respects the `max_wait` limit to prevent excessive delays

## Transport Classes

### AsyncTenacityTransport

For asynchronous HTTP clients (recommended for most use cases):

async_transport_example.py

```python
from httpx import AsyncClient
from tenacity import stop_after_attempt

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig


def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.

    Without a response validator, only network errors and timeouts will result in a retry.
""" response.raise_for_status() # Create the transport transport = AsyncTenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport: client = AsyncClient(transport=transport) ``` ### TenacityTransport For synchronous HTTP clients: sync\_transport\_example.py ```python from httpx import Client from tenacity import stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport def validator(response): """Treat responses with HTTP status 4xx/5xx as failures that need to be retried. Without a response validator, only network errors and timeouts will result in a retry. """ response.raise_for_status() # Create the transport transport = TenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport client = Client(transport=transport) ``` ## Common Retry Patterns ### Rate Limit Handling with Retry-After Support rate\_limit\_handling.py ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_rate_limit_client(): """Create a client that respects Retry-After headers from rate limiting responses.""" transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=1, max=60), max_wait=300 # Don't wait more than 5 minutes ), stop=stop_after_attempt(10), reraise=True ), validate_response=lambda r: r.raise_for_status() # Raises HTTPStatusError for 4xx/5xx ) return AsyncClient(transport=transport) # Example usage client = create_rate_limit_client() # Client is now ready to use with any HTTP requests and will respect Retry-After headers ``` The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff. 
### Network Error Handling

network_error_handling.py

```python
import httpx
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig


def create_network_resilient_client():
    """Create a client that handles network errors with retries."""
    transport = AsyncTenacityTransport(
        config=RetryConfig(
            retry=retry_if_exception_type((
                httpx.TimeoutException,
                httpx.ConnectError,
                httpx.ReadError
            )),
            wait=wait_exponential(multiplier=1, max=10),
            stop=stop_after_attempt(3),
            reraise=True
        )
    )
    return httpx.AsyncClient(transport=transport)


# Example usage
client = create_network_resilient_client()
# Client will now retry on timeout, connection, and read errors
```

### Custom Retry Logic

custom_retry_logic.py

```python
import httpx
from tenacity import retry_if_exception, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after


def create_custom_retry_client():
    """Create a client with custom retry logic."""

    def custom_retry_condition(exception):
        """Custom logic to determine if we should retry."""
        if isinstance(exception, httpx.HTTPStatusError):
            # Retry on server errors but not client errors
            return 500 <= exception.response.status_code < 600
        return isinstance(exception, httpx.TimeoutException | httpx.ConnectError)

    transport = AsyncTenacityTransport(
        config=RetryConfig(
            retry=retry_if_exception(custom_retry_condition),
            # Use wait_retry_after for smart waiting on rate limits,
            # with custom exponential backoff as fallback
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=2, max=30),
                max_wait=120
            ),
            stop=stop_after_attempt(5),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()
    )
    return httpx.AsyncClient(transport=transport)


client = create_custom_retry_client()
# Client will retry server errors (5xx) and network errors, but not client errors (4xx)
```

## Using with Different Providers

The retry transports work with any provider that accepts a custom HTTP client:

### OpenAI

openai_with_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

### Anthropic

anthropic_with_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = AnthropicModel('claude-sonnet-4-5-20250929', provider=AnthropicProvider(http_client=client))
agent = Agent(model)
```

### Any OpenAI-Compatible Provider

openai_compatible_with_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel(
    'your-model-name',  # Replace with actual model name
    provider=OpenAIProvider(
        base_url='https://api.example.com/v1',  # Replace with actual API URL
        api_key='your-api-key',  # Replace with actual API key
        http_client=client
    )
)
agent = Agent(model)
```

## Best Practices
1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times.
2. **Use Exponential Backoff**: This helps avoid overwhelming servers during outages.
3. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times.
4. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible.
5. **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.)
6. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns.

Monitoring Retries in Production

Excessive retries can indicate underlying issues and increase costs. [Logfire](/docs/ai/integrations/logfire) helps you track retry patterns:

- See which requests triggered retries
- Understand retry causes (rate limits, server errors, timeouts)
- Monitor retry frequency over time
- Identify opportunities to reduce retries

With [HTTPX instrumentation](/docs/ai/integrations/logfire#monitoring-http-requests) enabled, retry attempts are automatically captured in your traces.

## Error Handling

The retry transports will re-raise the last exception if all retry attempts fail. Make sure to handle these errors appropriately in your application:

error_handling_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)

try:
    result = agent.run_sync('Hello!')
    print(result.output)
except Exception as e:
    # With `reraise=True`, the last underlying exception (e.g. an HTTP status or
    # connection error) is re-raised once all retries are exhausted; the exact
    # exception type that surfaces here depends on the provider SDK in use.
    print(f'Request failed after all retries: {e}')
```

## Performance Considerations

- Retries add latency to requests, especially with exponential backoff
- Consider the total timeout for your application when configuring retry behavior
- Monitor retry rates to detect systemic issues
- Use async transports for better concurrency when handling multiple requests

For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/).

## Provider-Specific Retry Behavior

### AWS Bedrock

The AWS Bedrock provider uses boto3's built-in retry mechanisms instead of httpx. To configure retries for Bedrock, use boto3's `Config`:

```python
from botocore.config import Config

config = Config(retries={'max_attempts': 5, 'mode': 'adaptive'})
```

See [Bedrock: Configuring Retries](/docs/ai/models/bedrock#configuring-retries) for complete examples.

---

# [Thinking](https://pydantic.dev/docs/ai/advanced-features/thinking/)

Thinking (or reasoning) is the process by which a model works through a problem step-by-step before providing its final answer. This capability is typically disabled by default and depends on the specific model being used. See the sections below for how to enable thinking for each provider.
## Unified thinking settings

The simplest way to enable thinking across any supported provider is the `thinking` field in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings):

unified_thinking.py

```python
from pydantic_ai import Agent

agent = Agent('anthropic:claude-opus-4-6', model_settings={'thinking': 'high'})
```

Or using the [`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) capability:

thinking_capability.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking

agent = Agent('anthropic:claude-opus-4-6', capabilities=[Thinking(effort='high')])
```

The `thinking` setting accepts:

- `True`: enable thinking with the provider's default effort level
- `False`: disable thinking (silently ignored on always-on models)
- `'minimal'` / `'low'` / `'medium'` / `'high'` / `'xhigh'`: enable thinking at a specific effort level (unsupported levels map to the closest available value)

When omitted, the model uses its default behavior. Provider-specific settings (documented in the sections below) take precedence when both are set.

### Provider translation

The unified `thinking` setting maps to each provider's native format:

| Provider | `thinking=True` | `thinking='high'` | Notes |
|---|---|---|---|
| Anthropic (Opus 4.6+) | `anthropic_thinking={'type': 'adaptive'}` | `{type: 'adaptive'}` + `effort='high'` | All truthy values → adaptive; effort via output_config |
| Anthropic (older) | `anthropic_thinking={'type': 'enabled', 'budget_tokens': 10000}` | `budget_tokens=16384` | Budget-based; `'low'` → 2048 tokens |
| OpenAI | `reasoning_effort='medium'` | `reasoning_effort='high'` | |
| Google (Gemini 3+) | `include_thoughts=True` | `thinking_level='HIGH'` | |
| Google (Gemini 2.5) | `include_thoughts=True` | `thinking_budget=24576` | |
| Groq | `reasoning_format='parsed'` | `reasoning_format='parsed'` | `thinking=False` → `'hidden'` (no true disable) |
| OpenRouter | `reasoning.effort='medium'` | `reasoning.effort='high'` | Via `extra_body` |
| Cerebras | `disable_reasoning=False` | `disable_reasoning=False` | `thinking=False` → `disable_reasoning=True` |
| xAI | `reasoning_effort='high'` | `reasoning_effort='high'` | Only `'low'` and `'high'` |
| Bedrock (Claude) | `thinking.type='enabled'` | `budget_tokens=16384` | No adaptive support |
| Bedrock (OpenAI) | `reasoning_effort='medium'` | `reasoning_effort='high'` | |

## OpenAI

When using the [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), text output inside `<think>` tags is converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects. You can customize the tags using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field on the [model profile](/docs/ai/models/openai#model-profile).

Some [OpenAI-compatible model providers](/docs/ai/models/openai#openai-compatible-models) might also support native thinking parts that are not delimited by tags. Instead, they are sent and received as separate, custom fields in the API. Typically, if you are calling the model via the `<provider>:<model>` shorthand, Pydantic AI handles this for you. Nonetheless, you can still configure the fields with [`openai_chat_thinking_field`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile.openai_chat_thinking_field).
If your provider recommends sending these custom fields back unchanged, for caching or interleaved thinking benefits, you can achieve this with [`openai_chat_send_back_thinking_parts`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile.openai_chat_send_back_thinking_parts).

### OpenAI Responses

The [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel) can generate native thinking parts. To enable this functionality, you need to set the `OpenAIResponsesModelSettings.openai_reasoning_effort` and [`OpenAIResponsesModelSettings.openai_reasoning_summary`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_reasoning_summary) [model settings](/docs/ai/core-concepts/agent#model-run-settings).

By default, the unique IDs of reasoning, text, and function call parts from the message history are sent to the model. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](/docs/ai/core-concepts/message-history#processing-message-history). To avoid this, you can turn off the [`OpenAIResponsesModelSettings.openai_send_reasoning_ids`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_send_reasoning_ids) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

openai_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
settings = OpenAIResponsesModelSettings(
    openai_reasoning_effort='low',
    openai_reasoning_summary='detailed',
)
agent = Agent(model, model_settings=settings)
...
```

Raw reasoning without summaries

Some OpenAI-compatible APIs (such as LM Studio, vLLM, or OpenRouter with gpt-oss models) may return raw reasoning content without reasoning summaries. In this case, [`ThinkingPart.content`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart.content) will be empty, but the raw reasoning is available in `provider_details['raw_content']`. Following [OpenAI's guidance](https://cookbook.openai.com/examples/responses_api/reasoning_items) that raw reasoning should not be shown directly to users, we store it in `provider_details` rather than in the main `content` field.

## Anthropic

To enable thinking, use the [`AnthropicModelSettings.anthropic_thinking`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_thinking) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

Note

Extended thinking (`type: 'enabled'` with `budget_tokens`) is deprecated on `claude-opus-4-6`+. For those models, use [adaptive thinking](#adaptive-thinking--effort) instead.

anthropic_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-sonnet-4-5')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'enabled', 'budget_tokens': 1024},
)
agent = Agent(model, model_settings=settings)
...
```

### Interleaved Thinking

To enable [interleaved thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#interleaved-thinking), you need to include the beta header in your model settings:

anthropic_interleaved_thinking.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-sonnet-4-5')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'enabled', 'budget_tokens': 10000},
    extra_headers={'anthropic-beta': 'interleaved-thinking-2025-05-14'},
)
agent = Agent(model, model_settings=settings)
...
```

### Adaptive Thinking & Effort

Starting with `claude-opus-4-6`, Anthropic supports [adaptive thinking](https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking), where the model dynamically decides when and how much to think based on the complexity of each request. This replaces extended thinking (`type: 'enabled'` with `budget_tokens`), which is deprecated on Opus 4.6. Adaptive thinking also automatically enables interleaved thinking.

anthropic_adaptive_thinking.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-opus-4-6')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'adaptive'},
    anthropic_effort='high',
)
agent = Agent(model, model_settings=settings)
...
```

The [`anthropic_effort`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_effort) setting controls how much effort the model puts into its response (independent of thinking). See the [Anthropic effort docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for details.

Note

Older models (`claude-sonnet-4-5`, `claude-opus-4-5`, etc.) do not support adaptive thinking and require `{'type': 'enabled', 'budget_tokens': N}` as shown [above](#anthropic).

## Google

To enable thinking, use the [`GoogleModelSettings.google_thinking_config`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_thinking_config) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

google_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

model = GoogleModel('gemini-3-pro-preview')
settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True})
agent = Agent(model, model_settings=settings)
...
```

## xAI

xAI reasoning models (Grok) support native thinking. To preserve the thinking content for multi-turn conversations, enable [`XaiModelSettings.xai_include_encrypted_content`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_encrypted_content).

xai_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel, XaiModelSettings

model = XaiModel('grok-4-fast-reasoning')
settings = XaiModelSettings(xai_include_encrypted_content=True)
agent = Agent(model, model_settings=settings)
...
```

## Bedrock

Although Bedrock Converse doesn't provide a unified API to enable thinking, you can still use the [`BedrockModelSettings.bedrock_additional_model_requests_fields`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_additional_model_requests_fields) [model setting](/docs/ai/core-concepts/agent#model-run-settings) to pass provider-specific configuration:

bedrock_claude_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={
        'thinking': {'type': 'enabled', 'budget_tokens': 1024}
    }
)
agent = Agent(model=model, model_settings=model_settings)
```

bedrock_openai_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('openai.gpt-oss-120b-1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={'reasoning_effort': 'low'}
)
agent = Agent(model=model, model_settings=model_settings)
```

bedrock_qwen_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('qwen.qwen3-32b-v1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={'reasoning_config': 'high'}
)
agent = Agent(model=model, model_settings=model_settings)
```

Reasoning is [always enabled](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-reasoning.html) for DeepSeek models:

bedrock_deepseek_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel

model = BedrockConverseModel('us.deepseek.r1-v1:0')
agent = Agent(model=model)
```

## Groq

Groq supports different formats to receive thinking parts:

- `"raw"`: The thinking part is included in the text content inside `<think>` tags, which are automatically converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects.
- `"hidden"`: The thinking part is not included in the text content.
- `"parsed"`: The thinking part has its own structured part in the response, which is converted into a [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) object.

To enable thinking, use the [`GroqModelSettings.groq_reasoning_format`](/docs/ai/api/models/groq/#pydantic_ai.models.groq.GroqModelSettings.groq_reasoning_format) [model setting](/docs/ai/core-concepts/agent#model-run-settings):

groq_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel, GroqModelSettings

model = GroqModel('qwen/qwen3-32b')
settings = GroqModelSettings(groq_reasoning_format='parsed')
agent = Agent(model, model_settings=settings)
...
```

Note

Groq does not support truly disabling thinking. When `thinking=False` is set via the unified setting, Pydantic AI sends `reasoning_format='hidden'`, which suppresses reasoning output, but the model may still reason internally.
## OpenRouter

To enable thinking, use the [`OpenRouterModelSettings.openrouter_reasoning`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModelSettings.openrouter_reasoning) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

openrouter_thinking_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('openai/gpt-5.2')
settings = OpenRouterModelSettings(openrouter_reasoning={'effort': 'high'})
agent = Agent(model, model_settings=settings)
...
```

## Mistral

Thinking is supported by the `magistral` family of models. It does not need to be specifically enabled.

## Cohere

Thinking is supported by the `command-a-reasoning-08-2025` model. It does not need to be specifically enabled.

## Hugging Face

Text output inside `<think>` tags is automatically converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects. You can customize the tags using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field on the [model profile](/docs/ai/models/openai#model-profile).

## Outlines

Some local models run through Outlines include a thinking part delimited by tags in their text output. In that case, Pydantic AI will separate the thinking part from the final answer without you needing to specifically enable anything. The thinking tags used by default are `"<think>"` and `"</think>"`. If your model uses different tags, you can specify them in the [model profile](/docs/ai/models/openai#model-profile) using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field.

Outlines currently does not support thinking along with structured output. If you provide an `output_type`, the model's text output will not contain a thinking part with the associated tags, and you may experience degraded performance.

---

# [fasta2a](https://pydantic.dev/docs/ai/api/fasta2a/)

### Storage

**Bases:** `ABC`, `Generic[ContextT]`

A storage to retrieve and save tasks, as well as retrieve and save context.

The storage serves two purposes:

1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history
2. Context storage: Stores conversation context in a format optimized for the specific agent implementation

#### Methods

##### load_task

`@abstractmethod` `@async`

```python
def load_task(task_id: str, history_length: int | None = None) -> Task | None
```

Load a task from storage. If the task is not found, return `None`.

###### Returns

`Task` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### submit_task

`@abstractmethod` `@async`

```python
def submit_task(context_id: str, message: Message) -> Task
```

Submit a task to storage.

###### Returns

`Task`

##### update_task

`@abstractmethod` `@async`

```python
def update_task(
    task_id: str,
    state: TaskState,
    new_artifacts: list[Artifact] | None = None,
    new_messages: list[Message] | None = None,
) -> Task
```

Update the state of a task. Appends artifacts and messages, if specified.

###### Returns

`Task`

##### load_context

`@abstractmethod` `@async`

```python
def load_context(context_id: str) -> ContextT | None
```

Retrieve the stored context given the `context_id`.
###### Returns

`ContextT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### update_context

`@abstractmethod` `@async`

```python
def update_context(context_id: str, context: ContextT) -> None
```

Updates the context for a `context_id`. The implementing agent can decide what to store in the context.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### Broker

**Bases:** `ABC`

The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks.

The simplest implementation is the `InMemoryBroker`, which runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers.

#### Methods

##### run_task

`@abstractmethod` `@async`

```python
def run_task(params: TaskSendParams) -> None
```

Send a task to be executed by the worker.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### cancel_task

`@abstractmethod` `@async`

```python
def cancel_task(params: TaskIdParams) -> None
```

Cancel a task.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### receive_task_operations

`@abstractmethod`

```python
def receive_task_operations() -> AsyncIterator[TaskOperation]
```

Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`TaskOperation`\]

### Worker

**Bases:** `ABC`, `Generic[ContextT]`

A worker is responsible for executing tasks.

#### Methods

##### run

`@async`

```python
def run() -> AsyncIterator[None]
```

Run the worker. It connects to the broker and makes itself available to receive commands.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

### FastA2A

**Bases:** `Starlette`

The main class for the FastA2A library.

### Skill

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Skills are a unit of capability that an agent can perform.

#### Attributes

##### id

A unique identifier for the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### name

Human readable name of the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### tags

Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing".

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### examples

The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread")

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

##### input_modes

Supported mime types for input data.
**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### output_modes

Supported mime types for output data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

This module contains the schema for the agent card.

### AgentCard

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The card that describes an agent.

#### Attributes

##### name

Human readable name of the agent, e.g. "Recipe Agent".

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.")

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

A URL to the address the agent is hosted at.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### version

The version of the agent - format is up to the provider. (e.g. "1.0.0")

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### protocol_version

The version of the A2A protocol this agent supports.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider

The service provider of the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`AgentProvider`\]

##### documentation_url

A URL to documentation for the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### icon_url

A URL to an icon for the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### preferred_transport

The transport of the preferred endpoint. If empty, defaults to JSONRPC.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### additional_interfaces

Announcement of additional supported transports.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`AgentInterface`\]\]

##### capabilities

The capabilities of the agent.

**Type:** `AgentCapabilities`

##### security

Security requirements for contacting the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]\]\]

##### security_schemes

Security scheme definitions.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `SecurityScheme`\]\]

##### default_input_modes

Supported mime types for input data.
**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### default_output_modes

Supported mime types for output data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### skills

The set of skills, or distinct capabilities, that the agent can perform.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Skill`\]

### AgentProvider

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The service provider of the agent.

#### Attributes

##### organization

The name of the agent provider's organization.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

A URL for the agent provider's website or relevant documentation.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### AgentCapabilities

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The capabilities of the agent.

#### Attributes

##### streaming

Whether the agent supports streaming.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### push_notifications

Whether the agent can notify the client of updates.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### state_transition_history

Whether the agent exposes status change history for tasks.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

### HttpSecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

HTTP security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'http'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['http'\]

##### scheme

The name of the HTTP Authorization scheme.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### bearer_format

A hint to the client to identify how the bearer token is formatted.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### ApiKeySecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

API Key security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'apiKey'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['apiKey'\]

##### name

The name of the header, query or cookie parameter to be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### in_

The location of the API key.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['query', 'header', 'cookie'\]

##### description

Description of this security scheme.
**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### OAuth2SecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

OAuth2 security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'oauth2'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['oauth2'\]

##### flows

An object containing configuration information for the flow types supported.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### OpenIdConnectSecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

OpenID Connect security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'openIdConnect'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openIdConnect'\]

##### open_id_connect_url

OpenId Connect URL to discover OAuth2 configuration values.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### AgentInterface

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

An interface that the agent supports.

#### Attributes

##### transport

The transport protocol (e.g., 'jsonrpc', 'websocket').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

The URL endpoint for this transport.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

Description of this interface.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### AgentExtension

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A declaration of an extension supported by an Agent.

#### Attributes

##### uri

The URI of the extension.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A description of how this agent uses this extension.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### required

Whether the client must follow specific requirements of the extension.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### params

Optional configuration for the extension.
**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### Skill **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A Skill is a unit of capability that an agent can perform. #### Attributes ##### id A unique identifier for the skill. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### name Human-readable name of the skill. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### description A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### tags Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### examples The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\] ##### input\_modes Supported mime types for input data. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### output\_modes Supported mime types for output data. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]
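Because these are plain `TypedDict`s rather than Pydantic models, a skill is just a dict with the right keys. A minimal sketch (the `fasta2a.schema` import path and all field values are illustrative assumptions):

```python
from fasta2a.schema import Skill  # import path assumed

recipe_skill: Skill = {
    'id': 'recipe-finder',  # hypothetical skill id
    'name': 'Recipe Finder',
    'description': 'Finds recipes matching the ingredients a user has on hand.',
    'tags': ['cooking'],
    'examples': ['I need a recipe for bread'],
    'input_modes': ['text/plain'],
    'output_modes': ['application/json'],
}
```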
### Artifact **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. #### Attributes ##### artifact\_id Unique identifier for the artifact. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### name The name of the artifact. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### description A description of the artifact. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### parts The parts that make up the artifact. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Part`\] ##### metadata Metadata about the artifact. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ##### extensions Array of extensions. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\] ##### append Whether to append this artifact to an existing one. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\] ##### last\_chunk Whether this is the last chunk of the artifact. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\] ### PushNotificationConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc.). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). #### Attributes ##### id Server-assigned identifier. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### url The URL to send push notifications to. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### token Token unique to this task/session. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### authentication Authentication details for push notifications. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`SecurityScheme`\]
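As a sketch, a client that exposes its own notification endpoint could register a config like the following and attach it to a task via `TaskPushNotificationConfig` (documented next); the URL, token, and ids are placeholders, and the `fasta2a.schema` import path is assumed:

```python
from fasta2a.schema import PushNotificationConfig, TaskPushNotificationConfig  # paths assumed

notification_config: PushNotificationConfig = {
    'url': 'https://client.example.com/a2a/notifications',  # placeholder endpoint
    'token': 'task-scoped-token',  # placeholder token tying notifications to the task
}

task_notification_config: TaskPushNotificationConfig = {
    'id': 'task-123',  # hypothetical task id
    'push_notification_config': notification_config,
}
```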
### TaskPushNotificationConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for task push notifications. #### Attributes ##### id The task id. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### push\_notification\_config The push notification configuration. **Type:** `PushNotificationConfig` ### Message **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. #### Attributes ##### role The role of the message. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['user', 'agent'\] ##### parts The parts of the message. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Part`\] ##### kind Event type. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['message'\] ##### metadata Metadata about the message. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ##### message\_id Identifier created by the message creator. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### context\_id The context the message is associated with. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### task\_id Identifier of the task the message is related to. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### reference\_task\_ids Array of task IDs this message references. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\] ##### extensions Array of extensions. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]
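Putting the part types together (the `TextPart` and `FilePart` shapes are documented just below), a user message with a text part and a file reference might look like this sketch; the ids and URL are illustrative, and the import path is assumed:

```python
from uuid import uuid4

from fasta2a.schema import Message  # import path assumed

message: Message = {
    'role': 'user',
    'kind': 'message',
    'message_id': str(uuid4()),
    'parts': [
        {'kind': 'text', 'text': 'Summarize the attached report.'},
        {
            'kind': 'file',
            'file': {'uri': 'https://example.com/report.pdf', 'mime_type': 'application/pdf'},
        },
    ],
}
```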
### TextPart **Bases:** `_BasePart` A part that contains text. #### Attributes ##### kind The kind of the part. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\] ##### text The text of the part. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### FileWithBytes **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) File with base64 encoded data. #### Attributes ##### bytes The base64 encoded content of the file. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### mime\_type Optional mime type for the file. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ### FileWithUri **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) File with URI reference. #### Attributes ##### uri The URI of the file. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### mime\_type The mime type of the file. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ### FilePart **Bases:** `_BasePart` A part that contains a file. #### Attributes ##### kind The kind of the part. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['file'\] ##### file The file content - either bytes or URI. **Type:** `FileWithBytes` | `FileWithUri` ### DataPart **Bases:** `_BasePart` A part that contains structured data. #### Attributes ##### kind The kind of the part. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['data'\] ##### data The data of the part. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ### TaskStatus **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Status and accompanying message for a task. #### Attributes ##### state The current state of the task. **Type:** `TaskState` ##### message Additional status updates for the client. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`Message`\] ##### timestamp ISO datetime value of when the status was updated. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ### Task **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. #### Attributes ##### id Unique identifier for the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### context\_id The context the task is associated with. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Event type. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['task'\] ##### status Current status of the task. **Type:** `TaskStatus` ##### history Optional history of messages. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Message`\]\] ##### artifacts Collection of artifacts created by the agent. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Artifact`\]\] ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]
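For example, a freshly submitted task as a server might represent it, sketched with illustrative ids and timestamp (import path assumed):

```python
from fasta2a.schema import Task  # import path assumed

task: Task = {
    'id': 'task-123',         # hypothetical task id
    'context_id': 'ctx-456',  # hypothetical context id
    'kind': 'task',
    'status': {
        'state': 'submitted',
        'timestamp': '2025-01-01T00:00:00Z',
    },
}
```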
### TaskStatusUpdateEvent **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Sent by server during message/stream requests. #### Attributes ##### task\_id The id of the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### context\_id The context the task is associated with. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Event type. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['status-update'\] ##### status The status of the task. **Type:** `TaskStatus` ##### final Indicates the end of the event stream. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### TaskArtifactUpdateEvent **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Sent by server during message/stream requests. #### Attributes ##### task\_id The id of the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### context\_id The context the task is associated with. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Event type identification. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['artifact-update'\] ##### artifact The artifact that was updated. **Type:** `Artifact` ##### append Whether to append to the existing artifact (true) or replace it (false). **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\] ##### last\_chunk Indicates this is the final chunk of the artifact. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\] ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### TaskIdParams **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Parameters for a task id. #### Attributes ##### id The unique identifier for the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### metadata Optional metadata associated with the request. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### TaskQueryParams **Bases:** `TaskIdParams` Query parameters for a task. #### Attributes ##### history\_length Number of recent messages to be retrieved. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\] ### MessageSendConfiguration **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for the send message request. #### Attributes ##### accepted\_output\_modes Output modalities accepted by the client. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### blocking Whether the server should treat this as a blocking request. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\] ##### history\_length Number of recent messages to be retrieved.
**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\] ##### push\_notification\_config Where the server should send notifications when disconnected. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`PushNotificationConfig`\] ### MessageSendParams **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Parameters for the message/send method. #### Attributes ##### configuration Send message configuration. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`MessageSendConfiguration`\] ##### message The message being sent to the server. **Type:** `Message` ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### TaskSendParams **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. #### Attributes ##### id The id of the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### context\_id The context id for the task. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### message The message to process. **Type:** `Message` ##### history\_length Number of recent messages to be retrieved. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\] ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### ListTaskPushNotificationConfigParams **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Parameters for getting the list of pushNotificationConfigurations associated with a Task. #### Attributes ##### id Task id. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### metadata Extension metadata. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### DeleteTaskPushNotificationConfigParams **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Parameters for removing a pushNotificationConfiguration associated with a Task. #### Attributes ##### id Task id. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### push\_notification\_config\_id The push notification config id to delete. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### metadata Extension metadata.
**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### JSONRPCMessage **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A JSON RPC message. #### Attributes ##### jsonrpc The JSON RPC version. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['2.0'\] ##### id The request id. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ### JSONRPCRequest **Bases:** `JSONRPCMessage`, `Generic[Method, Params]` A JSON RPC request. #### Attributes ##### method The method to call. **Type:** `Method` ##### params The parameters to pass to the method. **Type:** `Params` ### JSONRPCError **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict), `Generic[CodeT, MessageT]` A JSON RPC error. #### Attributes ##### code A number that indicates the error type that occurred. **Type:** `CodeT` ##### message A string providing a short description of the error. **Type:** `MessageT` ##### data A primitive or structured value containing additional information about the error. **Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ### JSONRPCResponse **Bases:** `JSONRPCMessage`, `Generic[ResultT, ErrorT]` A JSON RPC response. ### SecurityScheme A security scheme for authentication. **Default:** `Annotated[Union[HttpSecurityScheme, ApiKeySecurityScheme, OAuth2SecurityScheme, OpenIdConnectSecurityScheme], pydantic.Field(discriminator='type')]` ### Part A fully formed piece of content exchanged between a client and a remote agent as part of a Message or an Artifact. Each Part has its own content type and metadata. **Default:** `Annotated[Union[TextPart, FilePart, DataPart], pydantic.Field(discriminator='kind')]` ### TaskState The possible states of a task. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['submitted', 'working', 'input-required', 'completed', 'canceled', 'failed', 'rejected', 'auth-required', 'unknown']` ### JSONParseError A JSON RPC error for a parse error. **Default:** `JSONRPCError[Literal[-32700], Literal['Invalid JSON payload']]` ### InvalidRequestError A JSON RPC error for an invalid request. **Default:** `JSONRPCError[Literal[-32600], Literal['Request payload validation error']]` ### MethodNotFoundError A JSON RPC error for a method not found. **Default:** `JSONRPCError[Literal[-32601], Literal['Method not found']]` ### InvalidParamsError A JSON RPC error for invalid parameters. **Default:** `JSONRPCError[Literal[-32602], Literal['Invalid parameters']]` ### InternalError A JSON RPC error for an internal error. **Default:** `JSONRPCError[Literal[-32603], Literal['Internal error']]` ### TaskNotFoundError A JSON RPC error for a task not found. **Default:** `JSONRPCError[Literal[-32001], Literal['Task not found']]` ### TaskNotCancelableError A JSON RPC error for a task not cancelable. **Default:** `JSONRPCError[Literal[-32002], Literal['Task not cancelable']]` ### PushNotificationNotSupportedError A JSON RPC error for a push notification not supported. 
**Default:** `JSONRPCError[Literal[-32003], Literal['Push notification not supported']]` ### UnsupportedOperationError A JSON RPC error for an unsupported operation. **Default:** `JSONRPCError[Literal[-32004], Literal['This operation is not supported']]` ### ContentTypeNotSupportedError A JSON RPC error for incompatible content types. **Default:** `JSONRPCError[Literal[-32005], Literal['Incompatible content types']]` ### InvalidAgentResponseError A JSON RPC error for invalid agent response. **Default:** `JSONRPCError[Literal[-32006], Literal['Invalid agent response']]` ### SendMessageRequest A JSON RPC request to send a message. **Default:** `JSONRPCRequest[Literal['message/send'], MessageSendParams]` ### SendMessageResponse A JSON RPC response to send a message. **Default:** `JSONRPCResponse[Union[Task, Message], JSONRPCError[Any, Any]]` ### StreamMessageRequest A JSON RPC request to stream a message. **Default:** `JSONRPCRequest[Literal['message/stream'], MessageSendParams]` ### StreamMessageResponse A JSON RPC response to a StreamMessageRequest. **Default:** `JSONRPCResponse[Union[Task, Message, TaskStatusUpdateEvent, TaskArtifactUpdateEvent], JSONRPCError[Any, Any]]` ### GetTaskRequest A JSON RPC request to get a task. **Default:** `JSONRPCRequest[Literal['tasks/get'], TaskQueryParams]` ### GetTaskResponse A JSON RPC response to get a task. **Default:** `JSONRPCResponse[Task, TaskNotFoundError]` ### CancelTaskRequest A JSON RPC request to cancel a task. **Default:** `JSONRPCRequest[Literal['tasks/cancel'], TaskIdParams]` ### CancelTaskResponse A JSON RPC response to cancel a task. **Default:** `JSONRPCResponse[Task, Union[TaskNotCancelableError, TaskNotFoundError]]` ### SetTaskPushNotificationRequest A JSON RPC request to set a task push notification. **Default:** `JSONRPCRequest[Literal['tasks/pushNotification/set'], TaskPushNotificationConfig]` ### SetTaskPushNotificationResponse A JSON RPC response to set a task push notification. **Default:** `JSONRPCResponse[TaskPushNotificationConfig, PushNotificationNotSupportedError]` ### GetTaskPushNotificationRequest A JSON RPC request to get a task push notification. **Default:** `JSONRPCRequest[Literal['tasks/pushNotification/get'], TaskIdParams]` ### GetTaskPushNotificationResponse A JSON RPC response to get a task push notification. **Default:** `JSONRPCResponse[TaskPushNotificationConfig, PushNotificationNotSupportedError]` ### ResubscribeTaskRequest A JSON RPC request to resubscribe to a task. **Default:** `JSONRPCRequest[Literal['tasks/resubscribe'], TaskIdParams]` ### ListTaskPushNotificationConfigRequest A JSON RPC request to list task push notification configs. **Default:** `JSONRPCRequest[Literal['tasks/pushNotificationConfig/list'], ListTaskPushNotificationConfigParams]` ### DeleteTaskPushNotificationConfigRequest A JSON RPC request to delete a task push notification config. **Default:** `JSONRPCRequest[Literal['tasks/pushNotificationConfig/delete'], DeleteTaskPushNotificationConfigParams]` ### A2ARequest A JSON RPC request to the A2A server. **Default:** `Annotated[Union[SendMessageRequest, StreamMessageRequest, GetTaskRequest, CancelTaskRequest, SetTaskPushNotificationRequest, GetTaskPushNotificationRequest, ResubscribeTaskRequest, ListTaskPushNotificationConfigRequest, DeleteTaskPushNotificationConfigRequest], Discriminator('method')]` ### A2AResponse A JSON RPC response from the A2A server. 
**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Union[SendMessageResponse, StreamMessageResponse, GetTaskResponse, CancelTaskResponse, SetTaskPushNotificationResponse, GetTaskPushNotificationResponse]` ### A2AClient A client for the A2A protocol. #### Methods ##### send\_message `@async` ```python def send_message( message: Message, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse ``` Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. ###### Returns `SendMessageResponse`
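A minimal usage sketch, assuming an A2A server running locally, the `fasta2a` import paths shown, and a `base_url` constructor parameter (all assumptions, not confirmed by this page):

```python
import asyncio
from uuid import uuid4

from fasta2a.client import A2AClient  # import path assumed
from fasta2a.schema import Message

async def main() -> None:
    client = A2AClient(base_url='http://localhost:8000')  # constructor parameter assumed
    message: Message = {
        'role': 'user',
        'kind': 'message',
        'message_id': str(uuid4()),
        'parts': [{'kind': 'text', 'text': 'What is the weather like today?'}],
    }
    response = await client.send_message(message=message)
    print(response)  # JSON-RPC response with either a 'result' (Task) or an 'error'

asyncio.run(main())
```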
### UnexpectedResponseError **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) An error raised when an unexpected response is received from the server. --- # [pydantic_ai.models.anthropic](https://pydantic.dev/docs/ai/api/models/anthropic/) # pydantic\_ai.models.anthropic ## Setup For details on how to set up authentication with this model, see [model configuration for Anthropic](/docs/ai/models/anthropic). ### AnthropicModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for an Anthropic model request. #### Attributes ##### anthropic\_metadata An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. **Type:** `BetaMetadataParam` ##### anthropic\_thinking Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. **Type:** `BetaThinkingConfigParam` ##### anthropic\_cache\_tool\_definitions Whether to add `cache_control` to the last tool definition. When enabled, the last tool in the `tools` array will have `cache_control` set, allowing Anthropic to cache tool definitions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\] ##### anthropic\_cache\_instructions Whether to add `cache_control` to the last system prompt block. When enabled, the last system prompt will have `cache_control` set, allowing Anthropic to cache system instructions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\] ##### anthropic\_cache\_messages Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. Note: Uses 1 of Anthropic's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\] ##### anthropic\_effort The effort level for the model to use when generating a response. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for more information. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'max'\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### anthropic\_container Container configuration for multi-turn conversations. By default, if previous messages contain a container\_id (from a prior response), it will be reused automatically. Set to `False` to force a fresh container (ignore any `container_id` from history). Set to a dict (e.g. `{'id': 'container_xxx'}`) to explicitly specify a container. **Type:** `BetaContainerParams` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] ##### anthropic\_eager\_input\_streaming Whether to enable eager input streaming on tool definitions. When enabled, all tool definitions will have `eager_input_streaming` set to `True`, allowing Anthropic to stream tool call arguments incrementally instead of buffering the entire JSON before streaming. This reduces latency for tool calls with large inputs. See [https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming](https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### anthropic\_betas List of Anthropic beta features to enable for API requests. Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. structured-outputs, builtin tools) and any betas from extra\_headers\['anthropic-beta'\]. See the Anthropic docs for available beta features. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AnthropicBetaParam`\] ##### anthropic\_context\_management Context management configuration for automatic compaction. When configured, Anthropic will automatically compact older context when the input token count exceeds the configured threshold. The compaction produces a summary that replaces the compacted messages. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/compaction) for more details. **Type:** `BetaContextManagementConfigParam`
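For example, enabling extended thinking and instruction caching might look like this sketch (it assumes the `anthropic` optional dependency is installed; the thinking budget is illustrative):

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    model_settings=AnthropicModelSettings(
        anthropic_thinking={'type': 'enabled', 'budget_tokens': 2048},  # illustrative budget
        anthropic_cache_instructions=True,  # cache the system prompt with the default 5m TTL
    ),
)
```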
### AnthropicModel **Bases:** `Model[AsyncAnthropicClient]` A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** `AnthropicModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: AnthropicModelName, provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize an Anthropic model. ###### Parameters **`model_name`** : `AnthropicModelName` The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['anthropic', 'gateway'\] | `Provider`\[`AsyncAnthropicClient`\] _Default:_ `'anthropic'` The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. Defaults to 'anthropic'. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. The default 'anthropic' provider will use the default `pydantic_ai.profiles.anthropic.anthropic_model_profile`. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` The set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### AnthropicCompaction **Bases:** `AbstractCapability[AgentDepsT]` Compaction capability for Anthropic models. Configures automatic context management via Anthropic's `context_management` API parameter. Compaction triggers server-side when input tokens exceed the configured threshold. Example usage:

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicCompaction

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[AnthropicCompaction(token_threshold=100_000)],
)
```

#### Methods ##### \_\_init\_\_ ```python def __init__( token_threshold: int = 150000, instructions: str | None = None, pause_after_compaction: bool = False, ) -> None ``` Initialize the Anthropic compaction capability. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`token_threshold`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `150000` Compact when input tokens exceed this threshold. Minimum 50,000. **`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom instructions for the compaction summarization. **`pause_after_compaction`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If `True`, the response will stop after the compaction block with `stop_reason='compaction'`, allowing explicit handling. ### AnthropicStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Anthropic models. #### Attributes ##### model\_name Get the model name of the response. **Type:** `AnthropicModelName` ##### provider\_name Get the provider name.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### LatestAnthropicModelNames Latest Anthropic models. **Default:** `ModelParam` ### AnthropicModelName Possible Anthropic model names. Since Anthropic supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list. **Default:** `str | LatestAnthropicModelNames` --- # [pydantic_ai.models](https://pydantic.dev/docs/ai/api/models/base/) # pydantic\_ai.models Logic related to making requests to an LLM. The aim here is to make a common interface for different LLMs, so that the rest of the code can be agnostic to the specific LLM being used. ### ModelRequestParameters Configuration for an agent's request to a model, specifically related to tools and output handling. #### Attributes ##### instruction\_parts Structured instruction parts with metadata about their origin (static vs dynamic). Static instructions (`dynamic=False`) come from literal strings passed to `Agent(instructions=...)`. Dynamic instructions (`dynamic=True`) come from `@agent.instructions` functions, `TemplateStr`, or toolset `get_instructions()` methods. Models that support granular caching (e.g. Anthropic, Bedrock) use this to place cache boundaries at the static/dynamic instruction boundary. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### thinking Resolved thinking/reasoning configuration for this request. `None` means the model should use its default behavior. Set by the base `Model.prepare_request()` from the unified `thinking` field in `ModelSettings`, after checking that the model's profile supports thinking. **Type:** `ThinkingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### with\_default\_output\_mode ```python def with_default_output_mode( output_mode: StructuredOutputMode, ) -> ModelRequestParameters ``` Set the default output mode if the current mode is 'auto', atomically updating allow\_text\_output. No-op if the current output\_mode is not 'auto'. This ensures the two fields stay in sync -- output\_mode='tool' implies allow\_text\_output=False, while 'native' and 'prompted' imply allow\_text\_output=True. ###### Returns `ModelRequestParameters` ### Model **Bases:** `ABC`, `Generic[InterfaceClient]` Abstract class for a model. #### Attributes ##### provider The provider for this model, if any. **Type:** `Provider`\[`InterfaceClient`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### settings Get the model settings. **Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### model\_name The model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_id The fully qualified model name in `'provider:model_name'` format. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### label Human-friendly display label for the model. Handles common patterns:

- gpt-5 -> GPT 5
- claude-sonnet-4-5 -> Claude Sonnet 4.5
- gemini-2.5-pro -> Gemini 2.5 Pro
- meta-llama/llama-3-70b -> Llama 3 70b (OpenRouter style)

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### profile The model profile. We use this to compute the intersection of the profile's supported\_builtin\_tools and the model's implemented tools, ensuring model.profile.supported\_builtin\_tools is the single source of truth for what builtin tools are actually usable. **Type:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) ##### system The model provider, e.g. openai. Used to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so it should use well-known values listed in [https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system](https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system) when applicable. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### base\_url The base URL for the provider API, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_init\_\_ ```python def __init__( settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None ``` Initialize the model with optional settings and profile. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. ##### \_\_aenter\_\_ `@async` ```python def __aenter__() -> Self ``` Enter the model context, delegating to the provider to manage its HTTP client lifecycle. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ##### \_\_aexit\_\_ `@async` ```python def __aexit__( exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None, ) -> bool | None ``` Exit the model context, closing the provider's HTTP client if it owns one. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### request `@abstractmethod` `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model. This is ultimately called by `pydantic_ai._agent_graph.ModelRequestNode._make_request(...)`. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### count\_tokens `@async` ```python def count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Make a request to the model for counting tokens.
###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ##### compact\_messages `@async` ```python def compact_messages( request_context: ModelRequestContext, instructions: str | None = None, ) -> ModelResponse ``` Compact messages to reduce conversation context size. This method is optional and only supported by specific providers (e.g. OpenAI Responses API). Providers that support compaction override this method with their implementation. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a request to the model and return a streaming response. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ##### customize\_request\_parameters ```python def customize_request_parameters( model_request_parameters: ModelRequestParameters, ) -> ModelRequestParameters ``` Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. ###### Returns `ModelRequestParameters` ##### prepare\_request ```python def prepare_request( model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters] ``` Prepare request inputs before they are passed to the provider. This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved [`ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters). Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods. ###### Returns [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None), `ModelRequestParameters`\] ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model class can handle. Subclasses should override this to reflect their actual capabilities. The default is an empty set; subclasses must explicitly declare support. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]
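As a rough sketch of the subclassing contract (a toy model, not a real provider integration): a custom model implements `request`, calls `check_allow_model_requests` (documented below) before doing anything costly, and runs `prepare_request` to merge settings first:

```python
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models import Model, ModelRequestParameters, check_allow_model_requests
from pydantic_ai.settings import ModelSettings

class EchoModel(Model):
    """Toy model that answers every request with a fixed text part; illustrative only."""

    @property
    def model_name(self) -> str:
        return 'echo'

    @property
    def system(self) -> str:
        return 'echo'  # hypothetical provider name

    async def request(
        self,
        messages: list[ModelMessage],
        model_settings: ModelSettings | None,
        model_request_parameters: ModelRequestParameters,
    ) -> ModelResponse:
        check_allow_model_requests()  # respect ALLOW_MODEL_REQUESTS
        model_settings, model_request_parameters = self.prepare_request(
            model_settings, model_request_parameters
        )
        return ModelResponse(parts=[TextPart(content='echo')])
```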
### StreamedResponse **Bases:** `ABC` Streamed response from an LLM when calling a tool. #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) #### Methods ##### \_\_aiter\_\_ ```python def __aiter__() -> AsyncIterator[ModelResponseStreamEvent] ``` Stream the response as an async iterable of [`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent) if/when the first match is found. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)\] ##### get ```python def get() -> ModelResponse ``` Build a [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) from the data received from the stream so far. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### usage ```python def usage() -> RequestUsage ``` Get the usage of the response so far. This will not be the final usage until the stream is exhausted. ###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ### check\_allow\_model\_requests ```python def check_allow_model_requests() -> None ``` Check if model requests are allowed. If you're defining your own models that have costs or latency associated with their use, you should call this in [`Model.request`](/docs/ai/api/models/base/#pydantic_ai.models.Model.request) and [`Model.request_stream`](/docs/ai/api/models/base/#pydantic_ai.models.Model.request_stream). #### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Raises - `RuntimeError` -- If model requests are not allowed. ### override\_allow\_model\_requests ```python def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None] ``` Context manager to temporarily override [`ALLOW_MODEL_REQUESTS`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS). #### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] #### Parameters **`allow_model_requests`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) Whether to allow model requests within the context. ### KnownModelName Known model names that can be used with the `model` parameter of [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent). `KnownModelName` is provided as a concise way to specify a model.
**Default:** `TypeAliasType('KnownModelName', Literal['anthropic:claude-3-5-haiku-20241022', 'anthropic:claude-3-5-haiku-latest', 'anthropic:claude-3-7-sonnet-20250219', 'anthropic:claude-3-7-sonnet-latest', 'anthropic:claude-3-haiku-20240307', 'anthropic:claude-3-opus-20240229', 'anthropic:claude-3-opus-latest', 'anthropic:claude-4-opus-20250514', 'anthropic:claude-4-sonnet-20250514', 'anthropic:claude-haiku-4-5-20251001', 'anthropic:claude-haiku-4-5', 'anthropic:claude-opus-4-0', 'anthropic:claude-opus-4-1-20250805', 'anthropic:claude-opus-4-20250514', 'anthropic:claude-opus-4-5-20251101', 'anthropic:claude-opus-4-5', 'anthropic:claude-opus-4-6', 'anthropic:claude-sonnet-4-0', 'anthropic:claude-sonnet-4-20250514', 'anthropic:claude-sonnet-4-5-20250929', 'anthropic:claude-sonnet-4-5', 'anthropic:claude-sonnet-4-6', 'bedrock:amazon.titan-text-express-v1', 'bedrock:amazon.titan-text-lite-v1', 'bedrock:amazon.titan-tg1-large', 'bedrock:anthropic.claude-3-5-haiku-20241022-v1:0', 'bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0', 'bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0', 'bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0', 'bedrock:anthropic.claude-3-haiku-20240307-v1:0', 'bedrock:anthropic.claude-3-opus-20240229-v1:0', 'bedrock:anthropic.claude-3-sonnet-20240229-v1:0', 'bedrock:anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:anthropic.claude-instant-v1', 'bedrock:anthropic.claude-opus-4-20250514-v1:0', 'bedrock:anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:anthropic.claude-sonnet-4-6', 'bedrock:anthropic.claude-v2:1', 'bedrock:anthropic.claude-v2', 'bedrock:cohere.command-light-text-v14', 'bedrock:cohere.command-r-plus-v1:0', 'bedrock:cohere.command-r-v1:0', 'bedrock:cohere.command-text-v14', 'bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-6', 'bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0', 'bedrock:meta.llama3-1-405b-instruct-v1:0', 'bedrock:meta.llama3-1-70b-instruct-v1:0', 'bedrock:meta.llama3-1-8b-instruct-v1:0', 'bedrock:meta.llama3-70b-instruct-v1:0', 'bedrock:meta.llama3-8b-instruct-v1:0', 'bedrock:mistral.mistral-7b-instruct-v0:2', 'bedrock:mistral.mistral-large-2402-v1:0', 'bedrock:mistral.mistral-large-2407-v1:0', 'bedrock:mistral.mixtral-8x7b-instruct-v0:1', 'bedrock:us.amazon.nova-2-lite-v1:0', 'bedrock:us.amazon.nova-lite-v1:0', 'bedrock:us.amazon.nova-micro-v1:0', 'bedrock:us.amazon.nova-pro-v1:0', 'bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0', 'bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0', 'bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0', 'bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0', 'bedrock:us.anthropic.claude-3-haiku-20240307-v1:0', 'bedrock:us.anthropic.claude-3-opus-20240229-v1:0', 'bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0', 'bedrock:us.anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:us.anthropic.claude-opus-4-20250514-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-6', 'bedrock:us.meta.llama3-1-70b-instruct-v1:0', 'bedrock:us.meta.llama3-1-8b-instruct-v1:0', 'bedrock:us.meta.llama3-2-11b-instruct-v1:0', 'bedrock:us.meta.llama3-2-1b-instruct-v1:0', 'bedrock:us.meta.llama3-2-3b-instruct-v1:0', 'bedrock:us.meta.llama3-2-90b-instruct-v1:0', 'bedrock:us.meta.llama3-3-70b-instruct-v1:0', 
'cerebras:gpt-oss-120b', 'cerebras:llama3.1-8b', 'cerebras:qwen-3-235b-a22b-instruct-2507', 'cerebras:zai-glm-4.7', 'cohere:c4ai-aya-expanse-32b', 'cohere:c4ai-aya-expanse-8b', 'cohere:command-nightly', 'cohere:command-r-08-2024', 'cohere:command-r-plus-08-2024', 'cohere:command-r7b-12-2024', 'deepseek:deepseek-chat', 'deepseek:deepseek-reasoner', 'gateway/anthropic:claude-3-haiku-20240307', 'gateway/anthropic:claude-4-opus-20250514', 'gateway/anthropic:claude-4-sonnet-20250514', 'gateway/anthropic:claude-haiku-4-5-20251001', 'gateway/anthropic:claude-haiku-4-5', 'gateway/anthropic:claude-opus-4-0', 'gateway/anthropic:claude-opus-4-1-20250805', 'gateway/anthropic:claude-opus-4-20250514', 'gateway/anthropic:claude-opus-4-5-20251101', 'gateway/anthropic:claude-opus-4-5', 'gateway/anthropic:claude-opus-4-6', 'gateway/anthropic:claude-sonnet-4-0', 'gateway/anthropic:claude-sonnet-4-20250514', 'gateway/anthropic:claude-sonnet-4-5-20250929', 'gateway/anthropic:claude-sonnet-4-5', 'gateway/anthropic:claude-sonnet-4-6', 'gateway/bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0', 'gateway/bedrock:anthropic.claude-3-haiku-20240307-v1:0', 'gateway/bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-6', 'gateway/bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0', 'gateway/google-vertex:gemini-2.5-flash-image', 'gateway/google-vertex:gemini-2.5-flash-lite-preview-09-2025', 'gateway/google-vertex:gemini-2.5-flash-lite', 'gateway/google-vertex:gemini-2.5-flash', 'gateway/google-vertex:gemini-2.5-pro', 'gateway/google-vertex:gemini-3-flash-preview', 'gateway/google-vertex:gemini-3-pro-image-preview', 'gateway/google-vertex:gemini-3.1-flash-image-preview', 'gateway/google-vertex:gemini-3.1-flash-lite-preview', 'gateway/google-vertex:gemini-3.1-pro-preview', 'gateway/groq:llama-3.1-8b-instant', 'gateway/groq:llama-3.3-70b-versatile', 'gateway/groq:meta-llama/llama-4-scout-17b-16e-instruct', 'gateway/groq:moonshotai/kimi-k2-instruct-0905', 'gateway/groq:openai/gpt-oss-120b', 'gateway/groq:openai/gpt-oss-20b', 'gateway/groq:openai/gpt-oss-safeguard-20b', 'gateway/openai:gpt-3.5-turbo-0125', 'gateway/openai:gpt-3.5-turbo-1106', 'gateway/openai:gpt-3.5-turbo-16k', 'gateway/openai:gpt-3.5-turbo', 'gateway/openai:gpt-4-0613', 'gateway/openai:gpt-4-turbo-2024-04-09', 'gateway/openai:gpt-4-turbo', 'gateway/openai:gpt-4.1-2025-04-14', 'gateway/openai:gpt-4.1-mini-2025-04-14', 'gateway/openai:gpt-4.1-mini', 'gateway/openai:gpt-4.1-nano-2025-04-14', 'gateway/openai:gpt-4.1-nano', 'gateway/openai:gpt-4.1', 'gateway/openai:gpt-4', 'gateway/openai:gpt-4o-2024-05-13', 'gateway/openai:gpt-4o-2024-08-06', 'gateway/openai:gpt-4o-2024-11-20', 'gateway/openai:gpt-4o-mini-2024-07-18', 'gateway/openai:gpt-4o-mini-search-preview-2025-03-11', 'gateway/openai:gpt-4o-mini-search-preview', 'gateway/openai:gpt-4o-mini', 'gateway/openai:gpt-4o-search-preview-2025-03-11', 'gateway/openai:gpt-4o-search-preview', 'gateway/openai:gpt-4o', 'gateway/openai:gpt-5-2025-08-07', 'gateway/openai:gpt-5-chat-latest', 'gateway/openai:gpt-5-mini-2025-08-07', 'gateway/openai:gpt-5-mini', 'gateway/openai:gpt-5-nano-2025-08-07', 'gateway/openai:gpt-5-nano', 'gateway/openai:gpt-5.1-2025-11-13', 'gateway/openai:gpt-5.1-chat-latest', 'gateway/openai:gpt-5.1', 'gateway/openai:gpt-5.2-2025-12-11', 'gateway/openai:gpt-5.2-chat-latest', 'gateway/openai:gpt-5.2', 
'gateway/openai:gpt-5.4-mini-2026-03-17', 'gateway/openai:gpt-5.4-mini', 'gateway/openai:gpt-5.4-nano-2026-03-17', 'gateway/openai:gpt-5.4-nano', 'gateway/openai:gpt-5.4', 'gateway/openai:gpt-5', 'gateway/openai:o1-2024-12-17', 'gateway/openai:o1', 'gateway/openai:o3-2025-04-16', 'gateway/openai:o3-mini-2025-01-31', 'gateway/openai:o3-mini', 'gateway/openai:o3', 'gateway/openai:o4-mini-2025-04-16', 'gateway/openai:o4-mini', 'google-gla:gemini-2.0-flash-lite', 'google-gla:gemini-2.0-flash', 'google-gla:gemini-2.5-flash-image', 'google-gla:gemini-2.5-flash-lite-preview-09-2025', 'google-gla:gemini-2.5-flash-lite', 'google-gla:gemini-2.5-flash-preview-09-2025', 'google-gla:gemini-2.5-flash', 'google-gla:gemini-2.5-pro', 'google-gla:gemini-3-flash-preview', 'google-gla:gemini-3-pro-image-preview', 'google-gla:gemini-3-pro-preview', 'google-gla:gemini-3.1-flash-image-preview', 'google-gla:gemini-3.1-flash-lite-preview', 'google-gla:gemini-3.1-pro-preview', 'google-gla:gemini-flash-latest', 'google-gla:gemini-flash-lite-latest', 'google-vertex:gemini-2.0-flash-lite', 'google-vertex:gemini-2.0-flash', 'google-vertex:gemini-2.5-flash-image', 'google-vertex:gemini-2.5-flash-lite-preview-09-2025', 'google-vertex:gemini-2.5-flash-lite', 'google-vertex:gemini-2.5-flash-preview-09-2025', 'google-vertex:gemini-2.5-flash', 'google-vertex:gemini-2.5-pro', 'google-vertex:gemini-3-flash-preview', 'google-vertex:gemini-3-pro-image-preview', 'google-vertex:gemini-3-pro-preview', 'google-vertex:gemini-3.1-flash-image-preview', 'google-vertex:gemini-3.1-flash-lite-preview', 'google-vertex:gemini-3.1-pro-preview', 'google-vertex:gemini-flash-latest', 'google-vertex:gemini-flash-lite-latest', 'grok:grok-2-image-1212', 'grok:grok-2-vision-1212', 'grok:grok-3-fast', 'grok:grok-3-mini-fast', 'grok:grok-3-mini', 'grok:grok-3', 'grok:grok-4-0709', 'grok:grok-4-latest', 'grok:grok-4-1-fast-non-reasoning', 'grok:grok-4-1-fast-reasoning', 'grok:grok-4-1-fast', 'grok:grok-4-fast-non-reasoning', 'grok:grok-4-fast-reasoning', 'grok:grok-4-fast', 'grok:grok-4', 'grok:grok-code-fast-1', 'xai:grok-3', 'xai:grok-3-fast', 'xai:grok-3-fast-latest', 'xai:grok-3-latest', 'xai:grok-3-mini', 'xai:grok-3-mini-fast', 'xai:grok-3-mini-fast-latest', 'xai:grok-4', 'xai:grok-4-0709', 'xai:grok-4-1-fast', 'xai:grok-4-1-fast-non-reasoning', 'xai:grok-4-1-fast-non-reasoning-latest', 'xai:grok-4-1-fast-reasoning', 'xai:grok-4-1-fast-reasoning-latest', 'xai:grok-4-fast', 'xai:grok-4-fast-non-reasoning', 'xai:grok-4-fast-non-reasoning-latest', 'xai:grok-4-fast-reasoning', 'xai:grok-4-fast-reasoning-latest', 'xai:grok-4-latest', 'xai:grok-code-fast-1', 'groq:llama-3.1-8b-instant', 'groq:llama-3.3-70b-versatile', 'groq:meta-llama/llama-guard-4-12b', 'groq:openai/gpt-oss-120b', 'groq:openai/gpt-oss-20b', 'groq:whisper-large-v3', 'groq:whisper-large-v3-turbo', 'groq:meta-llama/llama-4-maverick-17b-128e-instruct', 'groq:meta-llama/llama-4-scout-17b-16e-instruct', 'groq:meta-llama/llama-prompt-guard-2-22m', 'groq:meta-llama/llama-prompt-guard-2-86m', 'groq:moonshotai/kimi-k2-instruct-0905', 'groq:openai/gpt-oss-safeguard-20b', 'groq:playai-tts', 'groq:playai-tts-arabic', 'groq:qwen/qwen-3-32b', 'heroku:claude-3-5-haiku', 'heroku:claude-3-5-sonnet-latest', 'heroku:claude-3-7-sonnet', 'heroku:claude-3-haiku', 'heroku:claude-4-5-haiku', 'heroku:claude-4-5-sonnet', 'heroku:claude-4-6-sonnet', 'heroku:claude-4-sonnet', 'heroku:claude-opus-4-5', 'heroku:claude-opus-4-6', 'heroku:deepseek-v3-2', 'heroku:glm-4-7', 'heroku:glm-4-7-flash', 
'heroku:gpt-oss-120b', 'heroku:kimi-k2-5', 'heroku:kimi-k2-thinking', 'heroku:minimax-m2', 'heroku:minimax-m2-1', 'heroku:qwen3-235b', 'heroku:qwen3-coder-480b', 'heroku:nova-2-lite', 'heroku:nova-lite', 'heroku:nova-pro', 'huggingface:deepseek-ai/DeepSeek-R1', 'huggingface:meta-llama/Llama-3.3-70B-Instruct', 'huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct', 'huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct', 'huggingface:Qwen/Qwen2.5-72B-Instruct', 'huggingface:Qwen/Qwen3-235B-A22B', 'huggingface:Qwen/Qwen3-32B', 'huggingface:Qwen/QwQ-32B', 'mistral:codestral-latest', 'mistral:mistral-large-latest', 'mistral:mistral-moderation-latest', 'mistral:mistral-small-latest', 'moonshotai:kimi-k2-0711-preview', 'moonshotai:kimi-latest', 'moonshotai:kimi-thinking-preview', 'moonshotai:moonshot-v1-128k-vision-preview', 'moonshotai:moonshot-v1-128k', 'moonshotai:moonshot-v1-32k-vision-preview', 'moonshotai:moonshot-v1-32k', 'moonshotai:moonshot-v1-8k-vision-preview', 'moonshotai:moonshot-v1-8k', 'openai:computer-use-preview-2025-03-11', 'openai:computer-use-preview', 'openai:gpt-3.5-turbo-0125', 'openai:gpt-3.5-turbo-0301', 'openai:gpt-3.5-turbo-0613', 'openai:gpt-3.5-turbo-1106', 'openai:gpt-3.5-turbo-16k-0613', 'openai:gpt-3.5-turbo-16k', 'openai:gpt-3.5-turbo', 'openai:gpt-4-0314', 'openai:gpt-4-0613', 'openai:gpt-4-turbo-2024-04-09', 'openai:gpt-4-turbo', 'openai:gpt-4.1-2025-04-14', 'openai:gpt-4.1-mini-2025-04-14', 'openai:gpt-4.1-mini', 'openai:gpt-4.1-nano-2025-04-14', 'openai:gpt-4.1-nano', 'openai:gpt-4.1', 'openai:gpt-4', 'openai:gpt-4o-2024-05-13', 'openai:gpt-4o-2024-08-06', 'openai:gpt-4o-2024-11-20', 'openai:gpt-4o-audio-preview-2024-12-17', 'openai:gpt-4o-audio-preview-2025-06-03', 'openai:gpt-4o-audio-preview', 'openai:gpt-4o-mini-2024-07-18', 'openai:gpt-4o-mini-audio-preview-2024-12-17', 'openai:gpt-4o-mini-audio-preview', 'openai:gpt-4o-mini-search-preview-2025-03-11', 'openai:gpt-4o-mini-search-preview', 'openai:gpt-4o-mini', 'openai:gpt-4o-search-preview-2025-03-11', 'openai:gpt-4o-search-preview', 'openai:gpt-4o', 'openai:gpt-5-2025-08-07', 'openai:gpt-5-chat-latest', 'openai:gpt-5-codex', 'openai:gpt-5-mini-2025-08-07', 'openai:gpt-5-mini', 'openai:gpt-5-nano-2025-08-07', 'openai:gpt-5-nano', 'openai:gpt-5-pro-2025-10-06', 'openai:gpt-5-pro', 'openai:gpt-5.1-2025-11-13', 'openai:gpt-5.1-chat-latest', 'openai:gpt-5.1-codex-max', 'openai:gpt-5.1-codex', 'openai:gpt-5.1', 'openai:gpt-5.2-2025-12-11', 'openai:gpt-5.2-chat-latest', 'openai:gpt-5.2-pro-2025-12-11', 'openai:gpt-5.2-pro', 'openai:gpt-5.2', 'openai:gpt-5.3-chat-latest', 'openai:gpt-5.4-mini-2026-03-17', 'openai:gpt-5.4-mini', 'openai:gpt-5.4-nano-2026-03-17', 'openai:gpt-5.4-nano', 'openai:gpt-5.4', 'openai:gpt-5', 'openai:o1-2024-12-17', 'openai:o1-pro-2025-03-19', 'openai:o1-pro', 'openai:o1', 'openai:o3-2025-04-16', 'openai:o3-deep-research-2025-06-26', 'openai:o3-deep-research', 'openai:o3-mini-2025-01-31', 'openai:o3-mini', 'openai:o3-pro-2025-06-10', 'openai:o3-pro', 'openai:o3', 'openai:o4-mini-2025-04-16', 'openai:o4-mini-deep-research-2025-06-26', 'openai:o4-mini-deep-research', 'openai:o4-mini', 'test'])` ### ALLOW\_MODEL\_REQUESTS Whether to allow requests to models. This global setting allows you to disable requests to most models, e.g. to make sure you don't accidentally make costly requests to a model during tests.
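For example, a test suite can flip this flag globally so that any accidental real request fails fast. A minimal sketch (the agent and prompt are illustrative; `override_allow_model_requests` is the context manager in `pydantic_ai.models` for changing the flag temporarily):

disallow\_model\_requests.py

```python
from pydantic_ai import Agent, models

# Disallow real model requests process-wide, e.g. from a test suite's conftest.py.
models.ALLOW_MODEL_REQUESTS = False

agent = Agent('openai:gpt-5.2')
# agent.run_sync('hello')  # would now raise an error instead of calling the API

# Temporarily allow requests again for a specific block:
with models.override_allow_model_requests(True):
    ...  # real model requests are permitted inside this block
```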
The testing models [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) and [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) are not affected by this setting. **Default:** `True` --- # [pydantic_ai.models.bedrock](https://pydantic.dev/docs/ai/api/models/bedrock/) # pydantic\_ai.models.bedrock ## Setup For details on how to set up authentication with this model, see [model configuration for Bedrock](/docs/ai/models/bedrock). ### BedrockModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. #### Attributes ##### bedrock\_guardrail\_config Content moderation and safety settings for Bedrock API requests. See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_GuardrailConfiguration.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_GuardrailConfiguration.html). **Type:** `GuardrailConfigurationTypeDef` ##### bedrock\_performance\_configuration Performance optimization settings for model inference. See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_PerformanceConfiguration.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_PerformanceConfiguration.html). **Type:** `PerformanceConfigurationTypeDef` ##### bedrock\_request\_metadata Additional metadata to attach to Bedrock API requests. See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_Converse.html#API\_runtime\_Converse\_RequestSyntax](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax). **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### bedrock\_additional\_model\_response\_fields\_paths JSON paths to extract additional fields from model responses. See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html). **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### bedrock\_prompt\_variables Variables for substitution into prompt templates. See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_PromptVariableValues.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_PromptVariableValues.html). **Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `PromptVariableValuesTypeDef`\] ##### bedrock\_additional\_model\_requests\_fields Additional model-specific parameters to include in requests. See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).
**Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### bedrock\_cache\_tool\_definitions Whether to add a cache point after the last tool definition. When enabled, the last tool in the `tools` array will include a `cachePoint`, allowing Bedrock to cache tool definitions and reduce costs for compatible models. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### bedrock\_cache\_instructions Whether to add a cache point after the system prompt blocks. When enabled, an extra `cachePoint` is appended to the system prompt so Bedrock can cache system instructions. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### bedrock\_cache\_messages Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. Note: Uses 1 of Bedrock's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### bedrock\_service\_tier Setting for optimizing performance and cost. See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html](https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html). **Type:** `ServiceTierTypeDef` ##### bedrock\_inference\_profile An [inference profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html) ARN to use as the `modelId` in API requests. When set, this value is used as the `modelId` in `converse` and `converse_stream` API calls instead of the base `model_name`. This allows you to pass the base model name (e.g. `'anthropic.claude-sonnet-4-5-20250929-v1:0'`) as `model_name` for detecting model capabilities and token counting, while routing requests through an inference profile for cost tracking or cross-region inference. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### BedrockConverseModel **Bases:** `Model[BaseClient]` A model that uses the Bedrock Converse API. #### Attributes ##### model\_name The model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: BedrockModelName, provider: Literal['bedrock', 'gateway'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize a Bedrock model. ###### Parameters
**`model_name`** : `BedrockModelName` The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['bedrock', 'gateway'\] | `Provider`\[`BaseClient`\] _Default:_ `'bedrock'` The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` The set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ##### count\_tokens `@async` ```python def count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage ``` Count the number of tokens; this works only with a limited set of models. Check the supported models on [https://docs.aws.amazon.com/bedrock/latest/userguide/count-tokens.html](https://docs.aws.amazon.com/bedrock/latest/userguide/count-tokens.html). ###### Returns [`usage.RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ### BedrockStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Bedrock models. #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### LatestBedrockModelNames Latest Bedrock models.
**Default:** `Literal['amazon.titan-tg1-large', 'amazon.titan-text-lite-v1', 'amazon.titan-text-express-v1', 'us.amazon.nova-2-lite-v1:0', 'us.amazon.nova-pro-v1:0', 'us.amazon.nova-lite-v1:0', 'us.amazon.nova-micro-v1:0', 'anthropic.claude-3-5-sonnet-20241022-v2:0', 'us.anthropic.claude-3-5-sonnet-20241022-v2:0', 'anthropic.claude-3-5-haiku-20241022-v1:0', 'us.anthropic.claude-3-5-haiku-20241022-v1:0', 'anthropic.claude-instant-v1', 'anthropic.claude-v2:1', 'anthropic.claude-v2', 'anthropic.claude-3-sonnet-20240229-v1:0', 'us.anthropic.claude-3-sonnet-20240229-v1:0', 'anthropic.claude-3-haiku-20240307-v1:0', 'us.anthropic.claude-3-haiku-20240307-v1:0', 'anthropic.claude-3-opus-20240229-v1:0', 'us.anthropic.claude-3-opus-20240229-v1:0', 'anthropic.claude-3-5-sonnet-20240620-v1:0', 'us.anthropic.claude-3-5-sonnet-20240620-v1:0', 'anthropic.claude-3-7-sonnet-20250219-v1:0', 'us.anthropic.claude-3-7-sonnet-20250219-v1:0', 'anthropic.claude-opus-4-20250514-v1:0', 'us.anthropic.claude-opus-4-20250514-v1:0', 'global.anthropic.claude-opus-4-5-20251101-v1:0', 'anthropic.claude-sonnet-4-20250514-v1:0', 'us.anthropic.claude-sonnet-4-20250514-v1:0', 'eu.anthropic.claude-sonnet-4-20250514-v1:0', 'anthropic.claude-sonnet-4-5-20250929-v1:0', 'us.anthropic.claude-sonnet-4-5-20250929-v1:0', 'eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'anthropic.claude-sonnet-4-6', 'us.anthropic.claude-sonnet-4-6', 'eu.anthropic.claude-sonnet-4-6', 'anthropic.claude-haiku-4-5-20251001-v1:0', 'us.anthropic.claude-haiku-4-5-20251001-v1:0', 'eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'cohere.command-text-v14', 'cohere.command-r-v1:0', 'cohere.command-r-plus-v1:0', 'cohere.command-light-text-v14', 'meta.llama3-8b-instruct-v1:0', 'meta.llama3-70b-instruct-v1:0', 'meta.llama3-1-8b-instruct-v1:0', 'us.meta.llama3-1-8b-instruct-v1:0', 'meta.llama3-1-70b-instruct-v1:0', 'us.meta.llama3-1-70b-instruct-v1:0', 'meta.llama3-1-405b-instruct-v1:0', 'us.meta.llama3-2-11b-instruct-v1:0', 'us.meta.llama3-2-90b-instruct-v1:0', 'us.meta.llama3-2-1b-instruct-v1:0', 'us.meta.llama3-2-3b-instruct-v1:0', 'us.meta.llama3-3-70b-instruct-v1:0', 'mistral.mistral-7b-instruct-v0:2', 'mistral.mixtral-8x7b-instruct-v0:1', 'mistral.mistral-large-2402-v1:0', 'mistral.mistral-large-2407-v1:0']` ### BedrockModelName Possible Bedrock model names. Since Bedrock supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for a full list. **Default:** `str | LatestBedrockModelNames` --- # [pydantic_ai.models.cerebras](https://pydantic.dev/docs/ai/api/models/cerebras/) # pydantic\_ai.models.cerebras ## Setup For details on how to set up authentication with this model, see [model configuration for Cerebras](/docs/ai/models/cerebras). Cerebras model implementation using OpenAI-compatible API. ### CerebrasModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Cerebras model request. ALL FIELDS MUST BE `cerebras_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### cerebras\_disable\_reasoning Disable reasoning for the model. This setting is only supported on reasoning models: `zai-glm-4.6` and `gpt-oss-120b`. See [the Cerebras docs](https://inference-docs.cerebras.ai/resources/openai#passing-non-standard-parameters) for more details. 
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### CerebrasModel **Bases:** `OpenAIChatModel` A model that uses Cerebras's OpenAI-compatible API. Cerebras provides ultra-fast inference powered by the Wafer-Scale Engine (WSE). Apart from `__init__`, all methods are private or match those of the base class. #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: CerebrasModelName, provider: Literal['cerebras'] | Provider[AsyncOpenAI] = 'cerebras', profile: ModelProfileSpec | None = None, settings: CerebrasModelSettings | None = None, ) ``` Initialize a Cerebras model. ###### Parameters **`model_name`** : `CerebrasModelName` The name of the Cerebras model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cerebras'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'cerebras'` The provider to use. Defaults to 'cerebras'. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile based on the model name. **`settings`** : `CerebrasModelSettings` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ### CerebrasModelName Possible Cerebras model names. Since Cerebras supports a variety of models and the list changes frequently, we explicitly list known models but allow any name in the type hints. See [https://inference-docs.cerebras.ai/models/overview](https://inference-docs.cerebras.ai/models/overview) for an up to date list of models. **Default:** `str | LatestCerebrasModelNames` --- # [pydantic_ai.models.cohere](https://pydantic.dev/docs/ai/api/models/cohere/) # pydantic\_ai.models.cohere ## Setup For details on how to set up authentication with this model, see [model configuration for Cohere](/docs/ai/models/cohere). ### CohereModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Cohere model request. ### CohereModel **Bases:** `Model[AsyncClientV2]` A model that uses the Cohere API. Internally, this uses the [Cohere Python client](https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** `CohereModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: CohereModelName, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize a Cohere model. ###### Parameters **`model_name`** : `CohereModelName` The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cohere'\] | `Provider`\[`AsyncClientV2`\] _Default:_ `'cohere'` The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use.
Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ### LatestCohereModelNames Latest Cohere models. **Default:** `Literal['c4ai-aya-expanse-32b', 'c4ai-aya-expanse-8b', 'command-nightly', 'command-r-08-2024', 'command-r-plus-08-2024', 'command-r7b-12-2024']` ### CohereModelName Possible Cohere model names. Since Cohere supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [Cohere's docs](https://docs.cohere.com/v2/docs/models) for a list of all available models. **Default:** `str | LatestCohereModelNames` --- # [pydantic_ai.models.fallback](https://pydantic.dev/docs/ai/api/models/fallback/) # pydantic\_ai.models.fallback ### ResponseRejected **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Raised within a `FallbackExceptionGroup` when model responses are rejected by a response handler. ### FallbackModel **Bases:** `Model` A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_id The fully qualified model identifier, combining the wrapped models' IDs. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( default_model: Model | KnownModelName | str, fallback_models: Model | KnownModelName | str = (), fallback_on: FallbackOn = (ModelAPIError,), ) ``` Initialize a fallback model instance. ###### Parameters **`default_model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name or instance of the default model to use. **`fallback_models`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `()` The names or instances of the fallback models to use upon failure. **`fallback_on`** : `FallbackOn` _Default:_ `(ModelAPIError,)` Conditions that trigger fallback to the next model. Accepts: - A tuple of exception types: `(ModelAPIError, RateLimitError)` - An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)` - A response handler (sync or async): `def check(r: ModelResponse) -> bool` - A sequence mixing all of the above: `[ModelAPIError, exc_handler, response_handler]` Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as `ModelResponse`, it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler. ##### \_\_aenter\_\_ `@async` ```python def __aenter__() -> FallbackModel ``` Enter all sub-models so their providers can manage HTTP client lifecycle. ###### Returns `FallbackModel` ##### \_\_aexit\_\_ `@async` ```python def __aexit__( exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None, ) -> bool | None ``` Exit all sub-models, closing their providers' HTTP clients. 
###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### request `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Try each model in sequence until one succeeds. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ### ExceptionHandler A sync or async callable that decides whether an exception should trigger fallback. **Default:** `Callable[[Exception], Awaitable[bool]] | Callable[[Exception], bool]` ### ResponseHandler A sync or async callable that decides whether a model response should trigger fallback. **Default:** `Callable[[ModelResponse], Awaitable[bool]] | Callable[[ModelResponse], bool]` ### FallbackOn The type of the `fallback_on` parameter to [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel). **Default:** `type[Exception] | tuple[type[Exception], ...] | ExceptionHandler | ResponseHandler | Sequence[type[Exception] | ExceptionHandler | ResponseHandler]` --- # [pydantic_ai.models.function](https://pydantic.dev/docs/ai/api/models/function/) # pydantic\_ai.models.function A model controlled by a local function. [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) is similar to [`TestModel`](/docs/ai/models/test), but allows greater control over the model's behavior. Its primary use case is for more advanced unit testing than is possible with `TestModel`. Here's a minimal example: function\_model\_usage.py ```python from pydantic_ai import Agent from pydantic_ai import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import FunctionModel, AgentInfo my_agent = Agent('openai:gpt-5.2') async def model_function( messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: print(messages) """ [ ModelRequest( parts=[ UserPromptPart( content='Testing my agent...', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ] """ print(info) """ AgentInfo( function_tools=[], allow_text_output=True, output_tools=[], model_settings=None, model_request_parameters=ModelRequestParameters( function_tools=[], builtin_tools=[], output_tools=[] ), instructions=None, ) """ return ModelResponse(parts=[TextPart('hello world')]) async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" with my_agent.override(model=FunctionModel(model_function)): result = await my_agent.run('Testing my agent...') assert result.output == 'hello world' ``` See [Unit testing with `FunctionModel`](/docs/ai/guides/testing#unit-testing-with-functionmodel) for detailed documentation. ### FunctionModel **Bases:** `Model` A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The system / model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( function: FunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None def __init__( stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None def __init__( function: FunctionDef, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None ``` Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided; providing both is allowed. ###### Parameters **`function`** : `FunctionDef` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The function to call for non-streamed requests. **`stream_function`** : `StreamFunctionDef` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The function to call for streamed requests. **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the model. If not provided, a name is generated from the function names. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` FunctionModel supports all builtin tools for testing flexibility. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### AgentInfo Information about an agent. This is passed as the second argument to functions used within [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel). #### Attributes ##### function\_tools The function tools available on this agent. These are the tools registered via the [`tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) and [`tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorators. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\] ##### allow\_text\_output Whether a plain text output is allowed. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### output\_tools The tools that can be called to produce the final output of the run. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\] ##### model\_settings The model settings passed to the run call.
**Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### model\_request\_parameters The model request parameters passed to the run call. **Type:** `ModelRequestParameters` ##### instructions The instructions passed to the model. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ### DeltaToolCall Incremental change to a tool call. Used to describe a chunk when streaming structured responses. #### Attributes ##### name Incremental change to the name of the tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### json\_args Incremental change to the arguments as JSON. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_call\_id Incremental change to the tool call ID. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### DeltaThinkingPart Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. #### Attributes ##### content Incremental change to the thinking content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### signature Incremental change to the thinking signature. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### FunctionStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for [FunctionModel](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel). #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`None`](https://docs.python.org/3/library/constants.html#None) ##### provider\_url Get the provider base URL. **Type:** [`None`](https://docs.python.org/3/library/constants.html#None) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### DeltaToolCalls A mapping of tool call IDs to incremental changes. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[int, DeltaToolCall]` ### DeltaThinkingCalls A mapping of thinking call IDs to incremental changes. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[int, DeltaThinkingPart]` ### FunctionDef A function used to generate a non-streamed response. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[list[ModelMessage], AgentInfo], ModelResponse | Awaitable[ModelResponse]]` ### StreamFunctionDef A function used to generate a streamed response. While this is defined as having return type of `AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinTools]`, it should really be considered as `AsyncIterator[str] | AsyncIterator[DeltaToolCalls] | AsyncIterator[DeltaThinkingCalls]`, e.g.
you need to yield all text, all `DeltaToolCalls`, all `DeltaThinkingCalls`, or all `BuiltinToolCallsReturns`, not mix them. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[list[ModelMessage], AgentInfo], AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns]]` --- # [pydantic_ai.models.google](https://pydantic.dev/docs/ai/api/models/google/) # pydantic\_ai.models.google Interface that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Setup For details on how to set up authentication with this model, see [model configuration for Google](/docs/ai/models/google). ### GoogleModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Gemini model request. #### Attributes ##### google\_safety\_settings The safety settings to use for the model. See [https://ai.google.dev/gemini-api/docs/safety-settings](https://ai.google.dev/gemini-api/docs/safety-settings) for more information. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SafetySettingDict`\] ##### google\_thinking\_config The thinking configuration to use for the model. See [https://ai.google.dev/gemini-api/docs/thinking](https://ai.google.dev/gemini-api/docs/thinking) for more information. **Type:** `ThinkingConfigDict` ##### google\_labels User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### google\_video\_resolution The video resolution to use for the model. See [https://ai.google.dev/api/generate-content#MediaResolution](https://ai.google.dev/api/generate-content#MediaResolution) for more information. **Type:** `MediaResolution` ##### google\_cached\_content The name of the cached content to use for the model. See [https://ai.google.dev/gemini-api/docs/caching](https://ai.google.dev/gemini-api/docs/caching) for more information. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### google\_logprobs Include log probabilities in the response. See [https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens) for more information. Note: Only supported for Vertex AI and non-streaming requests. These will be included in `ModelResponse.provider_details['logprobs']`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### google\_top\_logprobs Include log probabilities of the top n tokens in the response. See [https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens) for more information. Note: Only supported for Vertex AI and non-streaming requests. 
These will be included in `ModelResponse.provider_details['logprobs']`. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### google\_service\_tier Vertex AI routing for Provisioned Throughput and Flex PayGo. Defaults to `'pt_then_on_demand'`. See [`GoogleServiceTier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleServiceTier) for all values, headers sent, and links to Google docs. **Type:** `GoogleServiceTier` ### GoogleModel **Bases:** `Model[Client]` A model that uses Gemini via the `generativelanguage.googleapis.com` API; good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** `GoogleModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: GoogleModelName, provider: Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize a Gemini model. ###### Parameters **`model_name`** : `GoogleModelName` The name of the model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['google-gla', 'google-vertex', 'gateway'\] | `Provider`\[`Client`\] _Default:_ `'google-gla'` The provider to use for authentication and API access. Can be one of the strings 'google-gla', 'google-vertex' or 'gateway', or an instance of `Provider[google.genai.AsyncClient]`. Defaults to 'google-gla'. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model settings to use. Defaults to None. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### GeminiStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for the Gemini model. #### Attributes ##### model\_name Get the model name of the response. **Type:** `GoogleModelName` ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### LatestGoogleModelNames Latest Gemini models.
**Default:** `Literal['gemini-flash-latest', 'gemini-flash-lite-latest', 'gemini-2.0-flash', 'gemini-2.0-flash-lite', 'gemini-2.5-flash', 'gemini-2.5-flash-preview-09-2025', 'gemini-2.5-flash-image', 'gemini-2.5-flash-lite', 'gemini-2.5-flash-lite-preview-09-2025', 'gemini-2.5-pro', 'gemini-3-flash-preview', 'gemini-3-pro-image-preview', 'gemini-3-pro-preview', 'gemini-3.1-flash-image-preview', 'gemini-3.1-flash-lite-preview', 'gemini-3.1-pro-preview']` ### GoogleModelName Possible Gemini model names. Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Gemini API docs](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations) for a full list. **Default:** `str | LatestGoogleModelNames` ### GoogleServiceTier Values for the `google_service_tier` field on [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings). Controls Vertex AI HTTP headers for [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput/use-provisioned-throughput) (PT) and [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo). Only applies when using the Vertex AI API. **Values:** - `'pt_then_on_demand'` (**default**): PT when quota allows, then standard on-demand spillover. No headers sent. - `'pt_only'`: PT only (`X-Vertex-AI-LLM-Request-Type: dedicated`). No on-demand spillover; returns 429 when over quota. - `'pt_then_flex'`: PT when quota allows, then [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) spillover (`X-Vertex-AI-LLM-Shared-Request-Type: flex`). - `'on_demand'`: Standard on-demand only (`X-Vertex-AI-LLM-Request-Type: shared`). Bypasses PT for this request. - `'flex_only'`: [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) only (`X-Vertex-AI-LLM-Request-Type: shared` and `X-Vertex-AI-LLM-Shared-Request-Type: flex`). Bypasses PT. Not every model or region supports every value; see the linked Google docs. Note: these headers only affect Vertex AI. When using the GLA API they are silently ignored. **Default:** `Literal['pt_then_on_demand', 'pt_only', 'pt_then_flex', 'on_demand', 'flex_only']` --- # [pydantic_ai.models.groq](https://pydantic.dev/docs/ai/api/models/groq/) # pydantic\_ai.models.groq ## Setup For details on how to set up authentication with this model, see [model configuration for Groq](/docs/ai/models/groq). ### GroqModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Groq model request. #### Attributes ##### groq\_reasoning\_format The format of the reasoning output. See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['hidden', 'raw', 'parsed'\] ### GroqModel **Bases:** `Model[AsyncGroq]` A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** `GroqModelName` ##### system The model provider. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: GroqModelName, provider: Literal['groq', 'gateway'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize a Groq model. ###### Parameters **`model_name`** : `GroqModelName` The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['groq', 'gateway'\] | `Provider`\[`AsyncGroq`\] _Default:_ `'groq'` The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### GroqStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Groq models. #### Attributes ##### model\_name Get the model name of the response. **Type:** `GroqModelName` ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### ProductionGroqModelNames Production Groq models from [https://console.groq.com/docs/models#production-models](https://console.groq.com/docs/models#production-models). **Default:** `Literal['llama-3.1-8b-instant', 'llama-3.3-70b-versatile', 'meta-llama/llama-guard-4-12b', 'openai/gpt-oss-120b', 'openai/gpt-oss-20b', 'whisper-large-v3', 'whisper-large-v3-turbo']` ### PreviewGroqModelNames Preview Groq models from [https://console.groq.com/docs/models#preview-models](https://console.groq.com/docs/models#preview-models). **Default:** `Literal['meta-llama/llama-4-maverick-17b-128e-instruct', 'meta-llama/llama-4-scout-17b-16e-instruct', 'meta-llama/llama-prompt-guard-2-22m', 'meta-llama/llama-prompt-guard-2-86m', 'moonshotai/kimi-k2-instruct-0905', 'openai/gpt-oss-safeguard-20b', 'playai-tts', 'playai-tts-arabic', 'qwen/qwen-3-32b']` ### GroqModelName Possible Groq model names. Since Groq supports a variety of models and the list changes frequently, we explicitly list the named models as of 2025-03-31 but allow any name in the type hints. See [https://console.groq.com/docs/models](https://console.groq.com/docs/models) for an up to date list of models and more details. 
**Default:** `str | ProductionGroqModelNames | PreviewGroqModelNames` --- # [pydantic_ai.models.huggingface](https://pydantic.dev/docs/ai/api/models/huggingface/) # pydantic\_ai.models.huggingface ## Setup For details on how to set up authentication with this model, see [model configuration for Hugging Face](/docs/ai/models/huggingface). ### HuggingFaceModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Hugging Face model request. ### HuggingFaceModel **Bases:** `Model[AsyncInferenceClient]` A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### base\_url The base URL of the provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_name The model name. **Type:** `HuggingFaceModelName` ##### system The system / model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: str, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize a Hugging Face model. ###### Parameters **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['huggingface'\] | `Provider`\[`AsyncInferenceClient`\] _Default:_ `'huggingface'` The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, a new provider will be created using the other parameters. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ### HuggingFaceStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Hugging Face models. #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### LatestHuggingFaceModelNames Latest Hugging Face models.
**Default:** `Literal['deepseek-ai/DeepSeek-R1', 'meta-llama/Llama-3.3-70B-Instruct', 'meta-llama/Llama-4-Maverick-17B-128E-Instruct', 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'Qwen/QwQ-32B', 'Qwen/Qwen2.5-72B-Instruct', 'Qwen/Qwen3-235B-A22B', 'Qwen/Qwen3-32B']` ### HuggingFaceModelName Possible Hugging Face model names. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). **Default:** `str | LatestHuggingFaceModelNames` --- # [pydantic_ai.models.instrumented](https://pydantic.dev/docs/ai/api/models/instrumented/) # pydantic\_ai.models.instrumented ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all) - [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel) See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. #### Methods ##### \_\_init\_\_ ```python def __init__( tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3, 4, 5] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, use_aggregated_usage_attribute_names: bool = False, ) ``` Create instrumentation options. ###### Parameters **`tracer_provider`** : `TracerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. **`meter_provider`** : `MeterProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. **`include_binary_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include binary content in the instrumentation events. **`include_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. **`version`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[1, 2, 3, 4, 5\] _Default:_ `DEFAULT_INSTRUMENTATION_VERSION` Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. Version 3 is the same as version 2, with additional support for thinking tokens. 
Version 4 is the same as version 3, with GenAI semantic conventions for multimodal content: URL-based media uses type='uri' with uri and mime\_type fields (and modality for image/audio/video). Inline binary content uses type='blob' with mime\_type and content fields (and modality for image/audio/video). [https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example) Version 5 is the same as version 4, but CallDeferred and ApprovalRequired exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors. **`event_mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['attributes', 'logs'\] _Default:_ `'attributes'` The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. **`logger_provider`** : `LoggerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. **`use_aggregated_usage_attribute_names`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to use `gen_ai.aggregated_usage.*` attribute names for token usage on agent run spans instead of the standard `gen_ai.usage.*` names. Enable this to prevent double-counting in observability backends that aggregate span attributes across parent and child spans. Defaults to False. Note: `gen_ai.aggregated_usage.*` is a custom namespace, not part of the OpenTelemetry Semantic Conventions. It may be updated if OTel introduces an official convention. ##### messages\_to\_otel\_events ```python def messages_to_otel_events( messages: list[ModelMessage], parameters: ModelRequestParameters | None = None, ) -> list[LogRecord] ``` Convert a list of model messages to OpenTelemetry events. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\] -- A list of OpenTelemetry events. ###### Parameters **`messages`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] The messages to convert. **`parameters`** : `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model request parameters. ### InstrumentedModel **Bases:** `WrapperModel` Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. #### Attributes ##### instrumentation\_settings Instrumentation settings for this model. **Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()` ### CostCalculationFailedWarning **Bases:** [`Warning`](https://docs.python.org/3/library/exceptions.html#Warning) Warning raised when cost calculation fails. 
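Taken together, these options are typically constructed once and passed to an agent via the `Agent(instrument=...)` hook listed above. A minimal sketch, assuming `logfire` is installed; the model name and option values are purely illustrative:

```python
import logfire

from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

# logfire.configure() sets the global tracer, meter, and logger providers,
# so InstrumentationSettings picks them up without explicit arguments.
logfire.configure()

# keep traces lean by omitting binary content from instrumentation events
settings = InstrumentationSettings(include_binary_content=False)
agent = Agent('openai:gpt-5.2', instrument=settings)
```

Passing `instrument=True` instead uses default settings, and `Agent.instrument_all()` applies the same settings to every agent in the process.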
### instrument\_model ```python def instrument_model(model: Model, instrument: InstrumentationSettings | bool) -> Model ``` Instrument a model with OpenTelemetry/logfire. #### Returns `Model` --- # [pydantic_ai.models.mcp_sampling](https://pydantic.dev/docs/ai/api/models/mcp-sampling/) # pydantic\_ai.models.mcp\_sampling ### MCPSamplingModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for an MCP Sampling model request. #### Attributes ##### mcp\_model\_preferences Model preferences to use for MCP Sampling. **Type:** `ModelPreferences` ### MCPSamplingModel **Bases:** `Model` A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. #### Attributes ##### session The MCP server session to use for sampling. **Type:** `ServerSession` ##### default\_max\_tokens Default max tokens to use if not set in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.max_tokens). Max tokens is a required parameter for MCP Sampling, but optional on [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings), so this value is used as fallback. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `16384` ##### model\_name The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The system / model provider, returns `'MCP'`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) --- # [pydantic_ai.models.mistral](https://pydantic.dev/docs/ai/api/models/mistral/) # pydantic\_ai.models.mistral ## Setup For details on how to set up authentication with this model, see [model configuration for Mistral](/docs/ai/models/mistral). ### MistralModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for a Mistral model request. ### MistralModel **Bases:** `Model[Mistral]` A model that uses Mistral. Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API. [API Documentation](https://docs.mistral.ai/) #### Attributes ##### model\_name The model name. **Type:** `MistralModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: MistralModelName, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = 'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n', settings: ModelSettings | None = None, ) ``` Initialize a Mistral model. ###### Parameters **`model_name`** : `MistralModelName` The name of the model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['mistral'\] | `Provider`\[`Mistral`\] _Default:_ `'mistral'` The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. 
Defaults to a profile picked by the provider based on the model name. **`json_mode_schema_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n'` The prompt to show when the model expects a JSON object as input. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### request `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a non-streaming request to the model from a Pydantic AI call. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model from a Pydantic AI call. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ### MistralStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Mistral models. #### Attributes ##### model\_name Get the model name of the response. **Type:** `MistralModelName` ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### LatestMistralModelNames Latest Mistral models. **Default:** `Literal['mistral-large-latest', 'mistral-small-latest', 'codestral-latest', 'mistral-moderation-latest']` ### MistralModelName Possible Mistral model names. Since Mistral supports a variety of date-stamped models, we explicitly list the most popular models but allow any name in the type hints. See [the Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list. **Default:** `str | LatestMistralModelNames` --- # [pydantic_ai.models.openai](https://pydantic.dev/docs/ai/api/models/openai/) # pydantic\_ai.models.openai ## Setup For details on how to set up authentication with this model, see [model configuration for OpenAI](/docs/ai/models/openai). ### OpenAIChatModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for an OpenAI model request. #### Attributes ##### openai\_reasoning\_effort Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. **Type:** `ReasoningEffort` ##### openai\_logprobs Include log probabilities in the response. For Chat models, these will be included in `ModelResponse.provider_details['logprobs']`. For Responses models, these will be included in `TextPart.provider_details['logprobs']` on the response output parts.
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### openai\_top\_logprobs Include log probabilities of the top n tokens in the response. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### openai\_store Whether or not to store the output of this request in OpenAI's systems. If `False`, OpenAI will not store the request for its own internal review or training. See [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat-create-store). **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### openai\_user A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### openai\_service\_tier The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'default', 'flex', 'priority'\] ##### openai\_prediction Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. **Type:** `ChatCompletionPredictionContentParam` ##### openai\_prompt\_cache\_key Used by OpenAI to cache responses for similar requests to optimize your cache hit rates. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### openai\_prompt\_cache\_retention The retention policy for the prompt cache. Set to `'24h'` to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['in\_memory', '24h'\] ##### openai\_continuous\_usage\_stats When `True`, enables continuous usage statistics in streaming responses. When enabled, the API returns cumulative usage data with each chunk rather than only at the end. This setting correctly handles the cumulative nature of these stats by using only the final usage values rather than summing all intermediate values. See [OpenAI's streaming documentation](https://platform.openai.com/docs/api-reference/chat/create#stream_options) for more information. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### OpenAIModelSettings **Bases:** `OpenAIChatModelSettings` Deprecated alias for `OpenAIChatModelSettings`. ### OpenAIResponsesModelSettings **Bases:** `OpenAIChatModelSettings` Settings used for an OpenAI Responses model request. All fields must be prefixed with `openai_` so they can be merged with settings for other models. #### Attributes ##### openai\_builtin\_tools The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details.
**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`FileSearchToolParam` | `WebSearchToolParam` | `ComputerToolParam`\] ##### openai\_reasoning\_generate\_summary Deprecated alias for `openai_reasoning_summary`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['detailed', 'concise'\] ##### openai\_reasoning\_summary A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise`, `detailed`, or `auto`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['detailed', 'concise', 'auto'\] ##### openai\_send\_reasoning\_ids Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](/docs/ai/core-concepts/message-history#processing-message-history). In that case, you'll want to disable this. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### openai\_truncation The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['disabled', 'auto'\] ##### openai\_text\_verbosity Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\] ##### openai\_previous\_response\_id The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto'\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### openai\_include\_code\_execution\_outputs Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### openai\_include\_web\_search\_sources Whether to include the web search results in the response. 
Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### openai\_include\_file\_search\_results Whether to include the file search results in the response. Corresponds to the `file_search_call.results` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### openai\_include\_raw\_annotations Whether to include the raw annotations in `TextPart.provider_details`. When enabled, any annotations (e.g., citations from web search) will be available in the `provider_details['annotations']` field of text parts. This is opt-in since there may be overlap with native annotation support once added via [https://github.com/pydantic/pydantic-ai/issues/3126](https://github.com/pydantic/pydantic-ai/issues/3126). **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### OpenAIChatModel **Bases:** `Model[AsyncOpenAI]` A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. #### Attributes ##### model\_name The model name. **Type:** `OpenAIModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### profile The model profile. `WebSearchTool` is only supported if `openai_chat_supports_web_search` is `True`. **Type:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: OpenAIModelName, provider: OpenAIChatCompatibleProvider | Literal['openai', 'openai-chat', 'gateway'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None def __init__( model_name: OpenAIModelName, provider: OpenAIChatCompatibleProvider | Literal['openai', 'openai-chat', 'gateway'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ) -> None ``` Initialize an OpenAI model. ###### Parameters **`model_name`** : `OpenAIModelName` The name of the OpenAI model to use. A list of model names is available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (unfortunately, despite being asked to do so, OpenAI does not provide `.inv` files for their API). **`provider`** : `OpenAIChatCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai', 'openai-chat', 'gateway'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'` The provider to use. Defaults to `'openai'`. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`system_prompt_role`** : `OpenAISystemPromptRole` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name.
**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### OpenAIModel **Bases:** `OpenAIChatModel` Deprecated alias for `OpenAIChatModel`. ### OpenAIResponsesModel **Bases:** `Model[AsyncOpenAI]` A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). #### Attributes ##### model\_name The model name. **Type:** `OpenAIModelName` ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: OpenAIModelName, provider: OpenAIResponsesCompatibleProvider | Literal['openai', 'gateway'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize an OpenAI Responses model. ###### Parameters **`model_name`** : `OpenAIModelName` The name of the OpenAI model to use. **`provider`** : `OpenAIResponsesCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai', 'gateway'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'` The provider to use. Defaults to `'openai'`. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model can handle. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ##### compact\_messages `@async` ```python def compact_messages( request_context: ModelRequestContext, instructions: str | None = None, ) -> ModelResponse ``` Compact messages using the OpenAI Responses compaction endpoint. This calls OpenAI's `responses.compact` API to produce an encrypted compaction that summarizes the conversation history. The returned `ModelResponse` contains a single `CompactionPart` that must be round-tripped in subsequent requests. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- A `ModelResponse` with a single `CompactionPart` containing the encrypted compaction data. 
###### Parameters **`request_context`** : `ModelRequestContext` The model request context containing messages, settings, and parameters. **`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom instructions for the compaction summarization. If provided, these override the agent-level instructions. ### OpenAIStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for OpenAI models. #### Attributes ##### model\_name Get the model name of the response. **Type:** `OpenAIModelName` ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### OpenAIResponsesStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for OpenAI Responses API. #### Attributes ##### model\_name Get the model name of the response. **Type:** `OpenAIModelName` ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### OpenAICompaction **Bases:** `AbstractCapability[AgentDepsT]` Compaction capability for OpenAI Responses API. Calls the `responses.compact` endpoint via a `before_model_request` hook when the trigger condition is met. The compacted history replaces the original messages, keeping conversation context within manageable limits. Example usage:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAICompaction

agent = Agent(
    'openai-responses:gpt-4o',
    capabilities=[OpenAICompaction(message_count_threshold=10)],
)
```

#### Methods ##### \_\_init\_\_ ```python def __init__( message_count_threshold: int | None = None, trigger: Callable[[list[ModelMessage]], bool] | None = None, instructions: str | None = None, ) -> None ``` Initialize the OpenAI compaction capability. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`message_count_threshold`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Compact when the message count exceeds this threshold. **`trigger`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom callable that decides whether to compact based on the current messages. Takes precedence over `message_count_threshold`. **`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom instructions for the compaction summarization.
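As a usage sketch, the Responses-specific settings defined earlier in this module are passed to an agent like any other model settings; the model name and option values here are illustrative:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import (
    OpenAIResponsesModel,
    OpenAIResponsesModelSettings,
)

# illustrative values; see the attribute docs above for their semantics
settings = OpenAIResponsesModelSettings(
    openai_reasoning_summary='detailed',
    openai_text_verbosity='low',
)
model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model, model_settings=settings)
```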
### DEPRECATED\_OPENAI\_MODELS Models that are deprecated or don't exist but are still present in the OpenAI SDK's type definitions. **Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `frozenset({'chatgpt-4o-latest', 'codex-mini-latest', 'gpt-4-0125-preview', 'gpt-4-1106-preview', 'gpt-4-turbo-preview', 'gpt-4-32k', 'gpt-4-32k-0314', 'gpt-4-32k-0613', 'gpt-4-vision-preview', 'gpt-4o-audio-preview-2024-10-01', 'gpt-5.1-mini', 'o1-mini', 'o1-mini-2024-09-12', 'o1-preview', 'o1-preview-2024-09-12'})` ### OpenAIModelName Possible OpenAI model names. Since OpenAI supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the OpenAI docs](https://platform.openai.com/docs/models) for a full list. Using this broader type for the model name instead of the `ChatModel` definition allows this model to be used more easily with other model types (e.g. Ollama, DeepSeek). **Default:** `str | AllModels` ### MCP\_SERVER\_TOOL\_CONNECTOR\_URI\_SCHEME Prefix for OpenAI connector IDs. OpenAI accepts either a URL or a connector ID when passing MCP configuration to a model; by prefixing a URL with `x-openai-connector:`, you can pass a connector ID to a model. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['x-openai-connector'\] **Default:** `'x-openai-connector'` --- # [pydantic_ai.models.openrouter](https://pydantic.dev/docs/ai/api/models/openrouter/) # pydantic\_ai.models.openrouter ## Setup For details on how to set up authentication with this model, see [model configuration for OpenRouter](/docs/ai/models/openrouter). ### OpenRouterProviderConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Represents the 'Provider' object from the OpenRouter API. #### Attributes ##### order List of provider slugs to try in order (e.g. \["anthropic", "openai"\]). [See details](https://openrouter.ai/docs/features/provider-routing#ordering-specific-providers) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterProviderName`\] ##### allow\_fallbacks Whether to allow backup providers when the primary is unavailable. [See details](https://openrouter.ai/docs/features/provider-routing#disabling-fallbacks) **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### require\_parameters Only use providers that support all parameters in your request. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### data\_collection Control whether to use providers that may store data. [See details](https://openrouter.ai/docs/features/provider-routing#requiring-providers-to-comply-with-data-policies) **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['allow', 'deny'\] ##### zdr Restrict routing to only ZDR (Zero Data Retention) endpoints. [See details](https://openrouter.ai/docs/features/provider-routing#zero-data-retention-enforcement) **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### only List of provider slugs to allow for this request. [See details](https://openrouter.ai/docs/features/provider-routing#allowing-only-specific-providers) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterProviderName`\] ##### ignore List of provider slugs to skip for this request.
[See details](https://openrouter.ai/docs/features/provider-routing#ignoring-providers) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### quantizations List of quantization levels to filter by (e.g. \["int4", "int8"\]). [See details](https://openrouter.ai/docs/features/provider-routing#quantization) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['int4', 'int8', 'fp4', 'fp6', 'fp8', 'fp16', 'bf16', 'fp32', 'unknown'\]\] ##### sort Sort providers by price, throughput, or latency (e.g. "price" or "throughput"). [See details](https://openrouter.ai/docs/features/provider-routing#provider-sorting) **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['price', 'throughput', 'latency'\] ##### max\_price The maximum price you are willing to pay for this request. [See details](https://openrouter.ai/docs/features/provider-routing#max-price) **Type:** `_OpenRouterMaxPrice` ### OpenRouterReasoning **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for reasoning tokens in OpenRouter requests. Reasoning tokens allow models to show their step-by-step thinking process. You can configure this using either OpenAI-style effort levels or Anthropic-style token limits, but not both simultaneously. #### Attributes ##### effort OpenAI-style reasoning effort level. Cannot be used with max\_tokens. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['xhigh', 'high', 'medium', 'low', 'minimal', 'none'\] ##### max\_tokens Anthropic-style specific token limit for reasoning. Cannot be used with effort. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### exclude Whether to exclude reasoning tokens from the response. Default is False. All models support this. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### enabled Whether to enable reasoning with default parameters. Default is inferred from effort or max\_tokens. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### OpenRouterUsageConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for OpenRouter usage. ### OpenRouterModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings used for an OpenRouter model request. #### Attributes ##### openrouter\_models A list of fallback models. These models will be tried, in order, if the main model returns an error. [See details](https://openrouter.ai/docs/features/model-routing#the-models-parameter) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### openrouter\_provider OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime. You can customize how your requests are routed using the provider object. [See more](https://openrouter.ai/docs/features/provider-routing) **Type:** `OpenRouterProviderConfig` ##### openrouter\_preset Presets allow you to separate your LLM configuration from your code.
Create and manage presets through the OpenRouter web application to control provider routing, model selection, system prompts, and other parameters, then reference them in OpenRouter API requests. [See more](https://openrouter.ai/docs/features/presets) **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### openrouter\_transforms Helps with prompts that exceed the maximum context size of a model. Transforms work by removing or truncating messages from the middle of the prompt, until the prompt fits within the model's context window. [See more](https://openrouter.ai/docs/features/message-transforms) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterTransforms`\] ##### openrouter\_reasoning Controls the reasoning tokens in the request. The reasoning config object consolidates settings for controlling reasoning strength across different models. [See more](https://openrouter.ai/docs/use-cases/reasoning-tokens) **Type:** `OpenRouterReasoning` ##### openrouter\_usage Controls usage accounting for the request. The usage config object consolidates settings for enabling detailed usage information. [See more](https://openrouter.ai/docs/use-cases/usage-accounting) **Type:** `OpenRouterUsageConfig` ### OpenRouterModel **Bases:** `OpenAIChatModel` Extends `OpenAIChatModel` to capture extra metadata for OpenRouter. #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: str, provider: Literal['openrouter'] | Provider[AsyncOpenAI] = 'openrouter', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize an OpenRouter model. ###### Parameters **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openrouter'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openrouter'` The provider to use for authentication and API access. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` Return the set of builtin tool types this model can handle. OpenRouter supports web search via its plugins system. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### OpenRouterStreamedResponse **Bases:** `OpenAIStreamedResponse` Implementation of `StreamedResponse` for OpenRouter models.
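As a usage sketch tying these settings together: since `OpenRouterProviderConfig` and `OpenRouterReasoning` are `TypedDict`s, they can be written as plain dicts. The model name and routing values below are illustrative:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import (
    OpenRouterModel,
    OpenRouterModelSettings,
)

# illustrative routing preferences; see the attribute docs above
settings = OpenRouterModelSettings(
    openrouter_provider={'order': ['anthropic', 'openai'], 'allow_fallbacks': False},
    openrouter_reasoning={'effort': 'medium'},
)
model = OpenRouterModel('anthropic/claude-sonnet-4.5')
agent = Agent(model, model_settings=settings)
```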
### KnownOpenRouterProviders Known providers in the OpenRouter marketplace. **Default:** `Literal['z-ai', 'cerebras', 'venice', 'moonshotai', 'morph', 'stealth', 'wandb', 'klusterai', 'openai', 'sambanova', 'amazon-bedrock', 'mistral', 'nextbit', 'atoma', 'ai21', 'minimax', 'baseten', 'anthropic', 'featherless', 'groq', 'lambda', 'azure', 'ncompass', 'deepseek', 'hyperbolic', 'crusoe', 'cohere', 'mancer', 'avian', 'perplexity', 'novita', 'siliconflow', 'switchpoint', 'xai', 'inflection', 'fireworks', 'deepinfra', 'inference-net', 'inception', 'atlas-cloud', 'nvidia', 'alibaba', 'friendli', 'infermatic', 'targon', 'ubicloud', 'aion-labs', 'liquid', 'nineteen', 'cloudflare', 'nebius', 'chutes', 'enfer', 'crofai', 'open-inference', 'phala', 'gmicloud', 'meta', 'relace', 'parasail', 'together', 'google-ai-studio', 'google-vertex']` ### OpenRouterProviderName Possible OpenRouter provider names. Since OpenRouter is constantly updating their list of providers, we explicitly list some known providers but allow any name in the type hints. See [the OpenRouter API](https://openrouter.ai/docs/api-reference/list-available-providers) for a full list. **Default:** `str | KnownOpenRouterProviders` ### OpenRouterTransforms Available message transforms for OpenRouter models with limited token windows. Currently only supports 'middle-out', but is expected to grow in the future. **Default:** `Literal['middle-out']` --- # [pydantic_ai.models.outlines](https://pydantic.dev/docs/ai/api/models/outlines/) # pydantic\_ai.models.outlines ## Setup For details on how to set up this model, see [model configuration for Outlines](/docs/ai/models/outlines). ### OutlinesModel **Bases:** `Model` A model that relies on the Outlines library to run non-API-based models. #### Methods ##### \_\_init\_\_ ```python def __init__( model: OutlinesBaseModel | OutlinesAsyncBaseModel, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize an Outlines model. ###### Parameters **`model`** : `OutlinesBaseModel` | `OutlinesAsyncBaseModel` The Outlines model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### from\_transformers `@classmethod` ```python def from_transformers( cls, hf_model: transformers.modeling_utils.PreTrainedModel, hf_tokenizer_or_processor: transformers.PreTrainedTokenizer | transformers.processing_utils.ProcessorMixin, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Create an Outlines model from a Hugging Face model and tokenizer.
###### Parameters **`hf_model`** : `transformers.modeling_utils.PreTrainedModel` The Hugging Face PreTrainedModel or any model that is compatible with the `transformers` API. **`hf_tokenizer_or_processor`** : `transformers.PreTrainedTokenizer` | `transformers.processing_utils.ProcessorMixin` Either a HuggingFace `PreTrainedTokenizer` or any tokenizer that is compatible with the `transformers` API, or a HuggingFace processor inheriting from `ProcessorMixin`. If a tokenizer is provided, a regular text model is created; if a processor is provided, the model is multimodal. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### from\_llamacpp `@classmethod` ```python def from_llamacpp( cls, llama_model: llama_cpp.Llama, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Create an Outlines model from a LlamaCpp model. ###### Parameters **`llama_model`** : `llama_cpp.Llama` The llama\_cpp.Llama model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### from\_mlxlm `@classmethod` ```python def from_mlxlm( cls, mlx_model: nn.Module, mlx_tokenizer: transformers.PreTrainedTokenizer, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Create an Outlines model from an MLXLM model. ###### Parameters **`mlx_model`** : `nn.Module` The nn.Module model to use. **`mlx_tokenizer`** : `transformers.PreTrainedTokenizer` The PreTrainedTokenizer to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use.
Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### from\_sglang `@classmethod` ```python def from_sglang( cls, base_url: str, api_key: str | None = None, model_name: str | None = None, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Create an Outlines model to send requests to an SGLang server. ###### Parameters **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The URL of the SGLang server. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authenticating requests to the SGLang server. **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### from\_vllm\_offline `@classmethod` ```python def from_vllm_offline( cls, vllm_model: Any, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Create an Outlines model from a vLLM offline inference model. ###### Parameters **`vllm_model`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) The vllm.LLM local model to use. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'` The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, a new provider will be created with the default settings. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model profile to use. Defaults to a profile picked by the provider. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default model settings for this model instance. ##### format\_inference\_kwargs ```python def format_inference_kwargs(model_settings: ModelSettings | None) -> dict[str, Any] ``` Format the model settings for the inference kwargs.
###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ### OutlinesStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for Outlines models. #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) --- # [pydantic_ai.models.test](https://pydantic.dev/docs/ai/api/models/test/) # pydantic\_ai.models.test Utility model for quickly testing apps built with Pydantic AI. Here's a minimal example: test\_model\_usage.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel my_agent = Agent('openai:gpt-5.2', instructions='...') async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" m = TestModel() with my_agent.override(model=m): result = await my_agent.run('Testing my agent...') assert result.output == 'success (no tool calls)' assert m.last_model_request_parameters.function_tools == [] ``` See [Unit testing with `TestModel`](/docs/ai/guides/testing#unit-testing-with-testmodel) for detailed documentation. ### TestModel **Bases:** `Model` A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__`, which is derived by the `dataclass` decorator, all methods are private or match those of the base class. #### Attributes ##### call\_tools List of tools to call. If `'all'`, all tools will be called. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['all'\] **Default:** `'all'` ##### custom\_output\_text If set, this text is returned as the final output. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### custom\_output\_args If set, these args will be passed to the output tool. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### seed Seed for generating random data. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### last\_model\_request\_parameters The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. **Type:** `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### model\_name The model name.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, model_name: str = 'test', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize TestModel with optional settings and profile. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]] ``` TestModel supports all builtin tools for testing flexibility. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] ### TestStreamedResponse **Bases:** `StreamedResponse` A structured response that streams test data. #### Attributes ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name Get the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) --- # [pydantic_ai.models.wrapper](https://pydantic.dev/docs/ai/api/models/wrapper/) # pydantic\_ai.models.wrapper ### CompletedStreamedResponse **Bases:** `StreamedResponse` A `StreamedResponse` that wraps an already-completed `ModelResponse`. Used by durable execution integrations (Temporal, Prefect, DBOS) where the actual stream is consumed within a task/activity and only the final response is returned. ### WrapperModel **Bases:** `Model` Model which wraps another model. Does nothing on its own, used as a base class. #### Attributes ##### wrapped The underlying model being wrapped. **Type:** `Model` **Default:** `infer_model(wrapped)` ##### settings Get the settings from the wrapped model. **Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) --- # [pydantic_ai.models.xai](https://pydantic.dev/docs/ai/api/models/xai/) # pydantic\_ai.models.xai ## Setup For details on how to set up authentication with this model, see [model configuration for xAI](/docs/ai/models/xai). xAI model implementation using [xAI SDK](https://github.com/xai-org/xai-sdk-python). ### XaiModelSettings **Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) Settings specific to xAI models. See [xAI SDK documentation](https://docs.x.ai/docs) for more details on these parameters. #### Attributes ##### xai\_logprobs Whether to return log probabilities of the output tokens or not. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_top\_logprobs An integer between 0 and 20 specifying the number of most likely tokens to return at each position. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### xai\_user A unique identifier representing your end-user, which can help xAI to monitor and detect abuse. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### xai\_store\_messages Whether to store messages on xAI's servers for conversation continuity. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_previous\_response\_id The ID of the previous response to continue the conversation. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### xai\_include\_encrypted\_content Whether to include the encrypted content in the response. Corresponds to the `use_encrypted_content` value of the model settings in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_include\_code\_execution\_output Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_include\_web\_search\_output Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_include\_inline\_citations Whether to include inline citations in the response. Corresponds to the `inline_citations` option in the xAI `include` parameter. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_include\_mcp\_output Whether to include the MCP results in the response. Corresponds to the `mcp_call.outputs` value of the `include` parameter in the Responses API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### xai\_reasoning\_effort Reasoning effort level for Grok reasoning models. See [https://docs.x.ai](https://docs.x.ai) for details. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'high'\] ### XaiModel **Bases:** `Model[AsyncClient]` A model that uses the xAI SDK to interact with xAI models. #### Attributes ##### model\_name The model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: XaiModelName, provider: Literal['xai'] | Provider[AsyncClient] = 'xai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) ``` Initialize the xAI model. ###### Parameters **`model_name`** : `XaiModelName` The name of the xAI model to use (e.g., "grok-4-1-fast-non-reasoning") **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['xai'\] | `Provider`\[`AsyncClient`\] _Default:_ `'xai'` The provider to use for API calls. Defaults to `'xai'`. **`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model profile specification. Defaults to a profile picked by the provider based on the model name. **`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model settings. ##### supported\_builtin\_tools `@classmethod` ```python def supported_builtin_tools(cls) -> frozenset[type] ``` Return the set of builtin tool types this model can handle. 
###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\] ##### request `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the xAI model. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the xAI model. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ### XaiStreamedResponse **Bases:** `StreamedResponse` Implementation of `StreamedResponse` for xAI SDK. #### Attributes ##### system The model provider system name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_url Get the provider base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name The model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### XAI\_EFFORT\_MAP Maps unified thinking values to xAI reasoning\_effort. xAI only supports 'low' and 'high'. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'high'\]\] **Default:** `{True: 'high', 'minimal': 'low', 'low': 'low', 'medium': 'high', 'high': 'high', 'xhigh': 'high'}` ### XaiModelName Possible xAI model names. **Default:** `str | ChatModel` --- # [pydantic_ai.ag_ui](https://pydantic.dev/docs/ai/api/pydantic-ai/ag_ui/) # pydantic\_ai.ag\_ui Provides an AG-UI protocol adapter for the Pydantic AI agent. This package provides seamless integration between pydantic-ai agents and ag-ui for building interactive AI applications with streaming event-based communication. ### AGUIApp **Bases:** `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. 
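Example (a minimal sketch; the model name and instructions are placeholders, and serving with `uvicorn` assumes the snippet lives in a module named `my_app`):

```python
from pydantic_ai import Agent
from pydantic_ai.ag_ui import AGUIApp

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')

# Expose the agent as an ASGI app speaking the AG-UI protocol; every
# request is handled by running the agent and streaming events back.
app = AGUIApp(agent)
# Serve with any ASGI server, e.g.: uvicorn my_app:app
```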
#### Methods ##### \_\_init\_\_ ```python def __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None ``` An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use `AGUIAdapter.run_stream()` or `AGUIAdapter.dispatch_request()` instead. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`agent`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] The agent to run. **`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION` AG-UI protocol version controlling thinking/reasoning event format. **`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data). **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent.
**`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`on_complete`** : `OnCompleteFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data. **`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Boolean indicating if debug tracebacks should be returned on errors. **`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of routes to serve incoming HTTP and WebSocket requests. **`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of middleware to run for every request. A Starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. **`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A mapping of either integer status codes or exception class types onto callables which handle the exceptions.
Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. **`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. **`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. **`lifespan`** : `Lifespan`\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. ### StateHandler **Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field. #### Attributes ##### state Get the current state of the agent run. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ### StateDeps **Bases:** `Generic[StateT]` Dependency type that holds state. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. ### handle\_ag\_ui\_request `@async` ```python def handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc[BaseEvent] | None = None, ) -> Response ``` Handle an AG-UI request by running the agent and returning a streaming response. #### Returns `Response` -- A streaming Starlette response with AG-UI protocol events. #### Parameters **`agent`** : `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The agent to run. **`request`** : `Request` The Starlette request (e.g. from FastAPI) containing the AG-UI run input. 
**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION` AG-UI protocol version controlling thinking/reasoning event format. **`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data). **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run.
**`on_complete`** : `OnCompleteFunc`\[`BaseEvent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data. ### run\_ag\_ui ```python def run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc[BaseEvent] | None = None, ) -> AsyncIterator[str] ``` Run the agent with the AG-UI run input and stream AG-UI protocol events. #### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] #### Parameters **`agent`** : `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The agent to run. **`run_input`** : `RunAgentInput` The AG-UI run input containing thread\_id, run\_id, messages, etc. **`accept`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `SSE_CONTENT_TYPE` The accept header value for the run. **`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION` AG-UI protocol version controlling thinking/reasoning event format. **`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data). **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent.
**`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`on_complete`** : `OnCompleteFunc`\[`BaseEvent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data. ### SSE\_CONTENT\_TYPE Content type header value for Server-Sent Events (SSE). **Default:** `'text/event-stream'` ### OnCompleteFunc Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]]` --- # [pydantic_ai.agent](https://pydantic.dev/docs/ai/api/pydantic-ai/agent/) # pydantic\_ai.agent ### Agent **Bases:** `AbstractAgent[AgentDepsT, OutputDataT]` Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take [`AgentDepsT`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentDepsT) and the output type they return, [`OutputDataT`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.OutputDataT). By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` #### Attributes ##### end\_strategy The strategy for handling multiple tool calls when a final result is found. 
- `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped - `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output **Type:** [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) **Default:** `end_strategy` ##### model\_settings Optional model request settings to use for this agent's runs, by default. Can be a static `ModelSettings` dict or a callable that takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns `ModelSettings`. Callables are called before each model request, allowing dynamic per-step settings. Note, if `model_settings` is also provided at run time, those settings will be merged on top of the agent-level settings, with the run-level argument taking priority. **Type:** `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `model_settings` ##### instrument Options to automatically instrument with OpenTelemetry. **Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `instrument` ##### model The default model configured for this agent. **Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### name The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### description A human-readable description of the agent. If the description is a TemplateStr, returns the raw template source. The rendered description is available at runtime via OTel span attributes. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### deps\_type The type of dependencies used by the agent. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type) ##### output\_type The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. **Type:** `OutputSpec`\[`OutputDataT`\] ##### event\_stream\_handler Optional handler for events from the model's streaming response and the agent's execution of tools. **Type:** `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### root\_capability The root capability of the agent, containing all registered capabilities. **Type:** `CombinedCapability`\[`AgentDepsT`\] ##### toolsets All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. 
**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] #### Methods ##### \_\_init\_\_ ```python def __init__( model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[OutputDataT] = str, instructions: AgentInstructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, description: TemplateStr[AgentDepsT] | str | None = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AgentToolset[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[AgentDepsT]] | None = None, ) -> None def __init__( model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[OutputDataT] = str, instructions: AgentInstructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, description: TemplateStr[AgentDepsT] | str | None = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[AgentDepsT]] | None = None, ) -> None ``` Create an agent. ###### Parameters **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The default model to use for this agent; if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently.
**`output_type`** : `OutputSpec`\[`OutputDataT`\] _Default:_ `str` The type of the output data, used to validate the data returned by the model, defaults to `str`. **`instructions`** : `AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Instructions to use for this agent; you can also register instructions via a function with [`instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions) or pass additional, temporary, instructions when executing a run. **`system_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `()` Static system prompts to use for this agent; you can also register system prompts via a function with [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt). **`deps_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AgentDepsT`\] _Default:_ `NoneType` The type used for dependency injection; this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, <return type>]`. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. **`description`** : `TemplateStr`\[`AgentDepsT`\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A human-readable description of the agent, attached to the agent run span as `gen_ai.agent.description` when instrumentation is enabled. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model request settings to use for this agent's runs, by default. Can be a static `ModelSettings` dict or a callable that takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns `ModelSettings`. Callables are called before each model request, allowing dynamic per-step settings. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1` The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the [HTTP Request Retries](/docs/ai/advanced-features/retries) documentation. **`validation_context`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `None` Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) used to validate tool arguments and outputs. **`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The maximum number of retries to allow for output validation, defaults to `retries`.
**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] _Default:_ `()` Tools to register with the agent; you can also register tools via the decorators [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) and [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain). **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] _Default:_ `()` The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. **`prepare_tools`** : `ToolsPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc). **`prepare_output_tools`** : `ToolsPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc). **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AgentToolset`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ToolsetFunc) for more information. **`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` By default, if you provide a [named](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) model, it's evaluated to create a [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance immediately, which checks for the necessary environment variables. Set this to `False` to defer the evaluation until the first run. Useful if you want to [override the model](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) for testing. **`end_strategy`** : [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) _Default:_ `'early'` Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) for more information. **`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Set to `True` to automatically instrument with OpenTelemetry, which will use Logfire if it's configured.
Set to an instance of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all) will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to store with each run. Provide a dictionary of primitives, or a callable returning one computed from the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) on each run. Metadata is resolved when a run starts and recomputed after a successful run finishes so it can reflect the final state. Resolved metadata can be read after the run completes via [`AgentRun.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), [`AgentRunResult.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult), and [`StreamedRunResult.metadata`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), and is attached to the agent run span when instrumentation is enabled. **`history_processors`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`HistoryProcessor`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional handler for events from the model's streaming response and the agent's execution of tools. **`tool_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default timeout in seconds for tool execution. If a tool takes longer than this, the tool is considered to have failed and a retry prompt is returned to the model (counting towards the retry limit). Individual tools can override this with their own timeout. Defaults to None (no timeout). **`max_concurrency`** : `_concurrency.AnyConcurrencyLimit` _Default:_ `None` Optional limit on concurrent agent runs. Can be an integer for simple limiting, a [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) for advanced configuration with backpressure, a [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) for sharing limits across multiple agents, or None (default) for no limiting. When the limit is reached, additional calls to `run()` or `iter()` will wait until a slot becomes available. **`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractCapability`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional list of [capabilities](https://ai.pydantic.dev/capabilities/) to configure the agent with. Custom capabilities can be created by subclassing [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability). 
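Example (a brief sketch of how several of these parameters fit together; the model name, dependency type, and tool are illustrative, not prescribed):

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class Deps:
    user_name: str


agent = Agent(
    'openai:gpt-5.2',  # default model for this agent's runs
    deps_type=Deps,  # parameterizes RunContext for static type checking
    instructions='Answer concisely.',
    retries=2,  # retries allowed for tool calls and output validation
)


@agent.tool
def greet(ctx: RunContext[Deps]) -> str:
    """Return a greeting for the current user."""
    return f'Hello, {ctx.deps.user_name}!'
```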
##### from\_spec `@classmethod` ```python def from_spec( cls, spec: dict[str, Any] | AgentSpec, custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (), model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[Any] = str, instructions: AgentInstructions[Any] = None, system_prompt: str | Sequence[str] = (), name: str | None = None, description: TemplateStr[Any] | str | None = None, model_settings: ModelSettings | None = None, retries: int | None = None, validation_context: Any = None, output_retries: int | None = None, tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[Any]] = (), prepare_tools: ToolsPrepareFunc[Any] | None = None, prepare_output_tools: ToolsPrepareFunc[Any] | None = None, toolsets: Sequence[AgentToolset[Any]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy | None = None, instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[Any] | None = None, history_processors: Sequence[HistoryProcessor[Any]] | None = None, event_stream_handler: EventStreamHandler[Any] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[Any]] | None = None, ) -> Agent[None, str] def from_spec( cls, spec: dict[str, Any] | AgentSpec, deps_type: type[T], custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (), model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[Any] = str, instructions: AgentInstructions[Any] = None, system_prompt: str | Sequence[str] = (), name: str | None = None, description: TemplateStr[Any] | str | None = None, model_settings: ModelSettings | None = None, retries: int | None = None, validation_context: Any = None, output_retries: int | None = None, tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[Any]] = (), prepare_tools: ToolsPrepareFunc[Any] | None = None, prepare_output_tools: ToolsPrepareFunc[Any] | None = None, toolsets: Sequence[AgentToolset[Any]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy | None = None, instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[Any] | None = None, history_processors: Sequence[HistoryProcessor[Any]] | None = None, event_stream_handler: EventStreamHandler[Any] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[Any]] | None = None, ) -> Agent[T, str] ``` Construct an Agent from a spec dict or `AgentSpec`. This allows defining agents declaratively in YAML/JSON/dict form. Keyword arguments supplement the spec: scalar spec fields (like `name`, `retries`) are used as defaults that explicit arguments override, while `capabilities` from both sources are merged. ###### Returns [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- A new Agent instance. ###### Parameters **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` The agent specification, either a dict or an `AgentSpec` instance. 
**`deps_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `type(None)` The type of the dependencies for the agent. When provided, template strings in capabilities (e.g. `"Hello {{name}}"`) are compiled and validated against this type. **`custom_capability_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] _Default:_ `()` Additional capability classes to make available beyond the built-in defaults. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Override the model from the spec. **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `str` The type of the output data, defaults to `str`. **`instructions`** : `AgentInstructions`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `None` Instructions for the agent. **`system_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `()` Static system prompts. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The agent name, overrides spec `name` if provided. **`description`** : `TemplateStr`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The agent description, overrides spec `description` if provided. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model request settings. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default retries for tool calls and output validation, overrides spec `retries` if provided. **`validation_context`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) _Default:_ `None` Pydantic validation context for tool arguments and outputs. **`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Max retries for output validation, overrides spec `output_retries` if provided. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `ToolFuncEither`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), ...\]\] _Default:_ `()` Tools to register with the agent. 
**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] _Default:_ `()` Builtin tools for the agent. **`prepare_tools`** : `ToolsPrepareFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom function to prepare tool definitions. **`prepare_output_tools`** : `ToolsPrepareFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom function to prepare output tool definitions. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AgentToolset`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Toolsets to register with the agent. **`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Defer model evaluation until first run. **`end_strategy`** : [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Strategy for tool calls alongside a final result, overrides spec `end_strategy` if provided. **`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Instrumentation settings, overrides spec `instrument` if provided. **`metadata`** : `AgentMetadata`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Metadata to store with each run, overrides spec `metadata` if provided. **`history_processors`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`HistoryProcessor`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Processors for message history. **`event_stream_handler`** : `EventStreamHandler`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Handler for streaming events. **`tool_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default timeout for tool execution, overrides spec `tool_timeout` if provided. **`max_concurrency`** : `_concurrency.AnyConcurrencyLimit` _Default:_ `None` Limit on concurrent agent runs. **`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Additional capabilities merged with those from the spec. 
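Example (a minimal sketch of declarative construction; the spec uses only the scalar fields called out above, and the model and instructions overrides are illustrative):

```python
from pydantic_ai import Agent

# Scalar spec fields such as `name` and `retries` act as defaults
# that explicit keyword arguments override.
spec = {'name': 'support-agent', 'retries': 2}

agent = Agent.from_spec(
    spec,
    model='openai:gpt-5.2',  # overrides any model set in the spec
    instructions='Be brief and polite.',
)
```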
##### from\_file `@classmethod` ```python def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (), model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[Any] = str, instructions: AgentInstructions[Any] = None, system_prompt: str | Sequence[str] = (), name: str | None = None, description: TemplateStr[Any] | str | None = None, model_settings: ModelSettings | None = None, retries: int | None = None, validation_context: Any = None, output_retries: int | None = None, tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[Any]] = (), prepare_tools: ToolsPrepareFunc[Any] | None = None, prepare_output_tools: ToolsPrepareFunc[Any] | None = None, toolsets: Sequence[AgentToolset[Any]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy | None = None, instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[Any] | None = None, history_processors: Sequence[HistoryProcessor[Any]] | None = None, event_stream_handler: EventStreamHandler[Any] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[Any]] | None = None, ) -> Agent[None, str] def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, deps_type: type[T], custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (), model: models.Model | models.KnownModelName | str | None = None, output_type: OutputSpec[Any] = str, instructions: AgentInstructions[Any] = None, system_prompt: str | Sequence[str] = (), name: str | None = None, description: TemplateStr[Any] | str | None = None, model_settings: ModelSettings | None = None, retries: int | None = None, validation_context: Any = None, output_retries: int | None = None, tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (), builtin_tools: Sequence[AgentBuiltinTool[Any]] = (), prepare_tools: ToolsPrepareFunc[Any] | None = None, prepare_output_tools: ToolsPrepareFunc[Any] | None = None, toolsets: Sequence[AgentToolset[Any]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy | None = None, instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[Any] | None = None, history_processors: Sequence[HistoryProcessor[Any]] | None = None, event_stream_handler: EventStreamHandler[Any] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, capabilities: Sequence[AbstractCapability[Any]] | None = None, ) -> Agent[T, str] ``` Construct an Agent from a YAML or JSON spec file. This is a convenience method equivalent to `Agent.from_spec(AgentSpec.from_file(path), ...)`. The file format is inferred from the extension (`.yaml`/`.yml` or `.json`) unless overridden with the `fmt` argument. All other arguments are forwarded to [`from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec). ###### Returns [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### instrument\_all `@staticmethod` ```python def instrument_all(instrument: InstrumentationSettings | bool = True) -> None ``` Set the instrumentation options for all agents where `instrument` is not set. 
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### iter `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. 
``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set.
**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. At run time, spec values are additive. ##### override ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, metadata: AgentMetadata[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, instructions, metadata, or model settings. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. **`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent.
**`instructions`** : `AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. Note: this also replaces capability-contributed instructions (e.g. from [`get_instructions`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions)). **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The metadata to use instead of the metadata passed to the agent constructor. When set, any per-run `metadata` argument is ignored. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec providing defaults for override. Explicit params take precedence over spec values. When the spec includes `capabilities`, they replace (not merge with) the agent's existing capabilities. To add capabilities without replacing, pass `spec` to `run()` or `iter()` instead. ##### instructions ```python def instructions( func: Callable[[RunContext[AgentDepsT]], str | None], ) -> Callable[[RunContext[AgentDepsT]], str | None] def instructions( func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]] def instructions(func: Callable[[], str | None]) -> Callable[[], str | None] def instructions( func: Callable[[], Awaitable[str | None]], ) -> Callable[[], Awaitable[str | None]] def instructions( ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] ``` Decorator to register an instructions function. Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument. Can decorate sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function.
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` ###### Returns [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\], `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\] | `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\] ##### system\_prompt ```python def system_prompt( func: Callable[[RunContext[AgentDepsT]], str | None], ) -> Callable[[RunContext[AgentDepsT]], str | None] def system_prompt( func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]] def system_prompt(func: Callable[[], str | None]) -> Callable[[], str | None] def system_prompt( func: Callable[[], Awaitable[str | None]], ) -> Callable[[], Awaitable[str | None]] def system_prompt( dynamic: bool = False, ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] ``` Decorator to register a system prompt function. Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument. Can decorate sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`); see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function; see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` ###### Returns [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\], `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\] | `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\] ###### Parameters **`func`** : `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The function to decorate. **`dynamic`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If True, the system prompt will be reevaluated even when `message_history` is provided; see [`SystemPromptPart.dynamic_ref`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart.dynamic_ref) ##### output\_validator ```python def output_validator( func: Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT], ) -> Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT] def output_validator( func: Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]], ) -> Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]] def output_validator( func: Callable[[OutputDataT], OutputDataT], ) -> Callable[[OutputDataT], OutputDataT] def output_validator( func: Callable[[OutputDataT], Awaitable[OutputDataT]], ) -> Callable[[OutputDataT], Awaitable[OutputDataT]] ``` Decorator to register an output validator function. Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. Can decorate sync or async functions.
Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function; see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` ###### Returns `_output.OutputValidatorFunc`\[`AgentDepsT`, `OutputDataT`\] ##### tool ```python def tool( func: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams] def tool( name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, defer_loading: bool = False, include_return_schema: bool | None = None, ) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]] ``` Decorator to register a tool function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. Can decorate sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union, so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[int], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`func`** : `ToolFuncContext`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The tool function to register. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the tool, defaults to the function name. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The description of the tool, defaults to the function docstring. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1.
**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc). **`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` _Default:_ `'auto'` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If True, raise an error if a parameter description is missing. Defaults to False. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema` The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the function requires a sequential/serial execution environment. Defaults to False. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to hide this tool until it's discovered via tool search. 
Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. ##### tool\_plain ```python def tool_plain(func: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams] def tool_plain( name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, defer_loading: bool = False, include_return_schema: bool | None = None, ) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]] ``` Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union, so the signature of functions decorated with `@agent.tool_plain` is obscured. Example: ```python from pydantic_ai import Agent agent = Agent('test') @agent.tool_plain def foobar() -> int: return 123 @agent.tool_plain(retries=2) async def spam() -> float: return 3.14 result = agent.run_sync('foobar') print(result.output) #> {"foobar":123,"spam":3.14} ``` ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`func`** : `ToolFuncPlain`\[`ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The tool function to register. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the tool, defaults to the function name. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The description of the tool, defaults to the function docstring. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. **`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).
**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument -- even though the tool function itself does not take `RunContext` when using `tool_plain`. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` _Default:_ `'auto'` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If True, raise an error if a parameter description is missing. Defaults to False. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema` The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the function requires a sequential/serial execution environment. Defaults to False. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to hide this tool until it's discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. 
**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. ##### toolset ```python def toolset(func: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT] def toolset( per_run_step: bool = True, id: str | None = None, ) -> Callable[[ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT]] ``` Decorator to register a toolset function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument. Can decorate sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`func`** : [`ToolsetFunc`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ToolsetFunc)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The toolset function to register. **`per_run_step`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to re-evaluate the toolset for each run step. Defaults to True. **`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An optional unique ID for the dynamic toolset. Required for use with durable execution environments like Temporal, where the ID identifies the toolset's activities within the workflow. ##### \_\_aenter\_\_ `@async` ```python def __aenter__() -> Self ``` Enter the agent context. This will start all [`MCPServerStdio`s](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) registered as `toolsets` so they are ready to be used, and enter the model so the provider's HTTP client will be closed cleanly on exit. This is a no-op if the agent has already been entered. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ##### set\_mcp\_sampling\_model ```python def set_mcp_sampling_model( model: models.Model | models.KnownModelName | str | None = None, ) -> None ``` Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### to\_web ```python def to_web( models: ModelsParam = None, builtin_tools: list[AbstractBuiltinTool] | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, instructions: str | None = None, html_source: str | Path | None = None, ) -> Starlette ``` Create a Starlette app that serves a web chat UI for this agent. This method returns a pre-configured Starlette application that provides a web-based chat interface for interacting with the agent. By default, the UI is fetched from a CDN and cached on first use. The returned Starlette application can be mounted into a FastAPI app or run directly with any ASGI server (uvicorn, hypercorn, etc.).
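For example, a minimal sketch of serving the UI (assuming `uvicorn` is installed; any ASGI server works):

```python
import uvicorn

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
app = agent.to_web()  # a Starlette app serving the chat UI

if __name__ == '__main__':
    uvicorn.run(app, host='127.0.0.1', port=8000)
```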
Note that the `deps` and `model_settings` will be the same for each request. To provide different `deps` for each request, use the lower-level adapters directly. ###### Returns `Starlette` -- A configured Starlette application ready to be served (e.g., with uvicorn) ###### Parameters **`models`** : `ModelsParam` _Default:_ `None` Additional models to make available in the UI. Can be: - A sequence of model names/instances (e.g., `['openai:gpt-5', 'anthropic:claude-sonnet-4-6']`) - A dict mapping display labels to model names/instances (e.g., `{'GPT 5': 'openai:gpt-5', 'Claude': 'anthropic:claude-sonnet-4-6'}`) The agent's model is always included. Builtin tool support is automatically determined from each model's profile. **`builtin_tools`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Additional builtin tools to make available in the UI. The agent's configured builtin tools are always included. Tool labels in the UI are derived from the tool's `label` property. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for all requests. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for all model requests. **`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional extra instructions to pass to each agent run. **`html_source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Path or URL for the chat UI HTML. Can be: - None (default): Fetches from CDN and caches locally - A Path instance: Reads from the local file - A URL string (http:// or https://): Fetches from the URL - A file path string: Reads from the local file ##### run\_mcp\_servers `@async` `@deprecated` ```python def run_mcp_servers( model: models.Model | models.KnownModelName | str | None = None, ) -> AsyncIterator[None] ``` Run [`MCPServerStdio`s](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) so they can be used by the agent. Deprecated: use [`async with agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__aenter__) instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.set_mcp_sampling_model). Returns: a context manager to start and shut down the servers. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ### AbstractAgent **Bases:** `Generic[AgentDepsT, OutputDataT]`, `ABC` Abstract superclass for [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent), [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent), and your own custom agent implementations. #### Attributes ##### model The default model configured for this agent.
**Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### name The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### description A human-readable description of the agent. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### deps\_type The type of dependencies used by the agent. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type) ##### output\_type The type of data output by agent runs, used to validate the data returned by the model; defaults to `str`. **Type:** `OutputSpec`\[`OutputDataT`\] ##### event\_stream\_handler Optional handler for events from the model's streaming response and the agent's execution of tools. **Type:** `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### toolsets All toolsets registered on the agent. Output tools are not included. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] #### Methods ##### output\_json\_schema ```python def output_json_schema( output_type: OutputSpec[OutputDataT | RunOutputDataT] | None = None, ) -> JsonSchema ``` The JSON schema of the agent's output (or of `output_type`, if given).
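Example (a minimal sketch of inspecting the schema for a structured `output_type`; the exact shape of the returned `JsonSchema` depends on the output spec):

```python
from pydantic import BaseModel

from pydantic_ai import Agent

class CityInfo(BaseModel):
    city: str
    country: str

agent = Agent('openai:gpt-5.2', output_type=CityInfo)
print(agent.output_json_schema())
```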
###### Returns `JsonSchema` ##### run `@async` ```python def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far.
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
At run time, spec values are additive. ##### run\_sync ```python def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far.
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
At run time, spec values are additive. ##### run\_stream `@async` ```python def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, OutputDataT]] def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, RunOutputDataT]] ``` Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with an `event_stream_handler` or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
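            # A hedged note: instead of awaiting the final output, the
            # StreamedRunResult can also be consumed incrementally via its
            # documented streaming methods (e.g. stream_text()); a stream
            # can only be consumed once, so pick one approach per run.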
``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`result.StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set.
**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. At run time, spec values are additive. ##### run\_stream\_sync ```python def run_stream_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, OutputDataT] def run_stream_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, RunOutputDataT] ``` Run the agent with a user prompt in sync streaming mode. This is a convenience method that wraps [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) with `loop.run_until_complete(...)`.
You therefore can't use this method inside async code or if there's an active event loop. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with an `event_stream_handler` or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') def main(): response = agent.run_stream_sync('What is the capital of the UK?') print(response.get_output()) #> The capital of the UK is London. ``` ###### Returns [`result.StreamedRunResultSync`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. 
Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. At run time, spec values are additive. 
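For a per-run structured output, you can pass `output_type` (see the second overload above) and read the validated object once streaming completes. A minimal sketch; the `CityInfo` model here is hypothetical, used only for illustration:

```python
from pydantic import BaseModel

from pydantic_ai import Agent


class CityInfo(BaseModel):  # hypothetical example output type
    city: str
    country: str


agent = Agent(model='openai:gpt-5.2')


def main():
    # output_type applies to this run only
    response = agent.run_stream_sync('What is the capital of the UK?', output_type=CityInfo)
    print(response.get_output())
    #> city='London' country='United Kingdom'
```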
##### run\_stream\_events ```python def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]] def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is not allowed, as it is used internally to produce the event stream.
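For instance, to act only on streamed text deltas and ignore the other event types, you can filter on the event classes shown above (a minimal sketch):

```python
from pydantic_ai import Agent
from pydantic_ai.messages import PartDeltaEvent, TextPartDelta

agent = Agent('openai:gpt-5.2')


async def main():
    async for event in agent.run_stream_events('What is the capital of France?'):
        # Only print incremental text as it arrives; skip start/end/result events
        if isinstance(event, PartDeltaEvent) and isinstance(event.delta, TextPartDelta):
            print(event.delta.content_delta, end='')
```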
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_messages.AgentStreamEvent` | [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- An async iterable of stream events `AgentStreamEvent` and finally an `AgentRunResultEvent` with the final run result. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run.
Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. At run time, spec values are additive. ##### iter `@abstractmethod` `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. 
This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. 
**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. At run time, spec values are additive. ##### override `@abstractmethod` ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). 
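For instance, a minimal sketch overriding the model with `TestModel` so a test never calls a real provider:

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-5.2')


def test_capital_question():
    # Replace the real model with TestModel for the duration of the context
    with agent.override(model=TestModel()):
        result = agent.run_sync('What is the capital of France?')
        assert result.output is not None
```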
###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. **`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec providing defaults for override. ##### parallel\_tool\_call\_execution\_mode `@staticmethod` ```python def parallel_tool_call_execution_mode( mode: tool_manager.ParallelExecutionMode = 'parallel', ) -> Iterator[None] ``` Set the parallel execution mode during the context. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`mode`** : `tool_manager.ParallelExecutionMode` _Default:_ `'parallel'` The execution mode for tool calls: - 'parallel': Run tool calls in parallel, yielding events as they complete (default). - 'sequential': Run tool calls one at a time in order. - 'parallel\_ordered\_events': Run tool calls in parallel, but events are emitted in order, after all calls complete. ##### sequential\_tool\_calls `@deprecated` `@staticmethod` ```python def sequential_tool_calls() -> Iterator[None] ``` Run tool calls sequentially during the context. 
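A minimal sketch, assuming the static context manager is entered directly on the class; `parallel_tool_call_execution_mode('sequential')` (above) is the non-deprecated equivalent:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

# Deprecated form: tool calls run one at a time inside the context
with Agent.sequential_tool_calls():
    agent.run_sync('What is the weather in Paris and in London?')

# Preferred equivalent using the execution-mode context manager
with Agent.parallel_tool_call_execution_mode('sequential'):
    agent.run_sync('What is the weather in Paris and in London?')
```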
###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ##### using\_thread\_executor `@staticmethod` ```python def using_thread_executor(executor: Executor) -> Iterator[None] ``` Use a custom executor for running sync functions in threads during the context. By default, sync tool functions and other sync callbacks are run in threads using `anyio.to_thread.run_sync`, which creates ephemeral threads. In long-running servers (e.g. FastAPI), this can lead to thread accumulation under sustained load. This context manager lets you provide a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) (or any [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor)) to control thread lifecycle: ```python from concurrent.futures import ThreadPoolExecutor from contextlib import asynccontextmanager from pydantic_ai import Agent @asynccontextmanager async def lifespan(app): executor = ThreadPoolExecutor(max_workers=16) with Agent.using_thread_executor(executor): yield executor.shutdown(wait=True) ``` For per-agent configuration, use the [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability instead. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`executor`** : `Executor` The executor to use for running sync functions. ##### is\_model\_request\_node `@staticmethod` ```python def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]] ``` Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. ###### Returns [`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.ModelRequestNode`\[`T`, `S`\]\] ##### is\_call\_tools\_node `@staticmethod` ```python def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]] ``` Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. ###### Returns [`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.CallToolsNode`\[`T`, `S`\]\] ##### is\_user\_prompt\_node `@staticmethod` ```python def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]] ``` Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. ###### Returns [`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.UserPromptNode`\[`T`, `S`\]\] ##### is\_end\_node `@staticmethod` ```python def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]] ``` Check if the node is an `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`.
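These helpers are typically used when iterating an `AgentRun` to branch on the node type while keeping type checkers happy (a minimal sketch):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Each check narrows the node's type for the branch body
            if Agent.is_user_prompt_node(node):
                print('user prompt node')
            elif Agent.is_model_request_node(node):
                print('model request node')
            elif Agent.is_call_tools_node(node):
                print('call tools node')
            elif Agent.is_end_node(node):
                print('end node')
```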
###### Returns [`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`End`\[`result.FinalResult`\[`S`\]\]\] ##### to\_ag\_ui ```python def to_ag_ui( output_type: OutputSpec[OutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT] ``` Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.run_ag_ui) or [`pydantic_ai.ag_ui.handle_ag_ui_request`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.handle_ag_ui_request) instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: Terminal ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](/docs/ai/integrations/ui/ag-ui) for more information. ###### Returns `AGUIApp`\[`AgentDepsT`, `OutputDataT`\] -- An ASGI application for running Pydantic AI agents with AG-UI protocol support. ###### Parameters **`output_type`** : `OutputSpec`\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. 
**`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Boolean indicating if debug tracebacks should be returned on errors. **`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of routes to serve incoming HTTP and WebSocket requests. **`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. **`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. **`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. 
**`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. **`lifespan`** : `Lifespan`\[`AGUIApp`\[`AgentDepsT`, `OutputDataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. ##### to\_a2a ```python def to_a2a( storage: Storage | None = None, broker: Broker | None = None, name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A ``` Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: Terminal ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` ###### Returns `FastA2A` ##### to\_cli `@async` ```python def to_cli( deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None ``` Run the agent in a CLI chat interface. Example: agent\_to\_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`deps`** : `AgentDepsT` _Default:_ `None` The dependencies to pass to the agent. **`prog_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'pydantic-ai'` The name of the program to use for the CLI. Defaults to 'pydantic-ai'. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. 
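To actually start the chat loop from a script, drive the coroutine with `asyncio` (a minimal sketch):

```python
import asyncio

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')


async def main():
    await agent.to_cli()


if __name__ == '__main__':
    asyncio.run(main())
```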
##### to\_cli\_sync ```python def to_cli_sync( deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None ``` Run the agent in a CLI chat interface with the non-async interface. agent\_to\_cli\_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`deps`** : `AgentDepsT` _Default:_ `None` The dependencies to pass to the agent. **`prog_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'pydantic-ai'` The name of the program to use for the CLI. Defaults to 'pydantic-ai'. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. ### WrapperAgent **Bases:** `AbstractAgent[AgentDepsT, OutputDataT]` Agent which wraps another agent. Does nothing on its own, used as a base class. #### Methods ##### iter `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate 
over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. 
**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### override ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. 
**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply as overrides. ### AgentRun **Bases:** `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent). You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) is reached, the run finishes and [`result`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.result) becomes available. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method for more granular control. #### Attributes ##### ctx The current context of the agent run. **Type:** `GraphRunContext`\[`_agent_graph.GraphAgentState`, `_agent_graph.GraphAgentDeps`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ##### next\_node The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. **Type:** `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] ##### result The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node, `result` is populated with an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult). **Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### all\_messages ```python def all_messages() -> list[_messages.ModelMessage] ``` Return all messages for the run so far. Messages from older runs are included. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ##### new\_messages ```python def new_messages() -> list[_messages.ModelMessage] ``` Return new messages for the run so far. Messages from older runs are excluded. 
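For example, after driving a run to completion you can pull out just the messages this run produced (a minimal sketch):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            pass
        # Only messages created by this run, excluding any prior history
        for message in agent_run.new_messages():
            print(type(message).__name__)
```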
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] ##### new\_messages\_json ```python def new_messages_json() -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ##### \_\_aiter\_\_ ```python def __aiter__( ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]] ``` Provide async-iteration over the nodes in the agent run. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]\] ##### \_\_anext\_\_ `@async` ```python def __anext__( ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ``` Advance to the next node automatically based on the last returned node. Note: this uses the graph run's internal iteration which does NOT call node hooks (`before_node_run`, `wrap_node_run`, `after_node_run`, `on_node_run_error`). Use `next()` for capability-hooked iteration, or use `agent.run()` which drives via `next()` automatically. ###### Returns `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] ##### next `@async` ```python def next( node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` ###### Returns `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if the run has completed.
###### Parameters **`node`** : `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] The node to run next in the graph. ##### usage ```python def usage() -> _usage.RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. ###### Returns `_usage.RunUsage` ### AgentRunResult **Bases:** `Generic[OutputDataT]` The final result of an agent run. #### Attributes ##### output The output data from the agent run. **Type:** `OutputDataT` ##### response Return the last response from the message history. **Type:** `_messages.ModelResponse` ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### all\_messages ```python def all_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return the history of messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and set a response to the output tool call. If `None`, the last message will not be modified. ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and set a response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages ```python def new_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and set a response to the output tool call. If `None`, the last message will not be modified.
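Example: a minimal sketch (the prompts and model name are illustrative) of passing `new_messages()` as `message_history` to carry this run's context into a follow-up run:

```python
from pydantic_ai import Agent

agent = Agent(model='openai:gpt-5.2')

result1 = agent.run_sync('Who wrote "The Hobbit"?')

# Pass the first run's messages as history so the model can resolve "he".
result2 = agent.run_sync(
    'When was he born?',
    message_history=result1.new_messages(),
)
print(result2.output)
```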
##### new\_messages\_json ```python def new_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and set a response to the output tool call. If `None`, the last message will not be modified. ##### usage ```python def usage() -> _usage.RunUsage ``` Return the usage of the whole run. ###### Returns `_usage.RunUsage` ##### timestamp ```python def timestamp() -> datetime ``` Return the timestamp of the last response. ###### Returns [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all) - [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel) See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. #### Methods ##### \_\_init\_\_ ```python def __init__( tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3, 4, 5] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, use_aggregated_usage_attribute_names: bool = False, ) ``` Create instrumentation options. ###### Parameters **`tracer_provider`** : `TracerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. **`meter_provider`** : `MeterProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. **`include_binary_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include binary content in the instrumentation events. **`include_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. **`version`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[1, 2, 3, 4, 5\] _Default:_ `DEFAULT_INSTRUMENTATION_VERSION` Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1.
Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. Version 3 is the same as version 2, with additional support for thinking tokens. Version 4 is the same as version 3, with GenAI semantic conventions for multimodal content: URL-based media uses type='uri' with uri and mime\_type fields (and modality for image/audio/video). Inline binary content uses type='blob' with mime\_type and content fields (and modality for image/audio/video). [https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example) Version 5 is the same as version 4, but CallDeferred and ApprovalRequired exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors. **`event_mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['attributes', 'logs'\] _Default:_ `'attributes'` The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. **`logger_provider`** : `LoggerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. **`use_aggregated_usage_attribute_names`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to use `gen_ai.aggregated_usage.*` attribute names for token usage on agent run spans instead of the standard `gen_ai.usage.*` names. Enable this to prevent double-counting in observability backends that aggregate span attributes across parent and child spans. Defaults to False. Note: `gen_ai.aggregated_usage.*` is a custom namespace, not part of the OpenTelemetry Semantic Conventions. It may be updated if OTel introduces an official convention. ##### messages\_to\_otel\_events ```python def messages_to_otel_events( messages: list[ModelMessage], parameters: ModelRequestParameters | None = None, ) -> list[LogRecord] ``` Convert a list of model messages to OpenTelemetry events. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\] -- A list of OpenTelemetry events. ###### Parameters **`messages`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] The messages to convert. **`parameters`** : `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model request parameters. 
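As a usage sketch (assuming an OpenTelemetry SDK such as Logfire has been configured globally, and that `InstrumentationSettings` is imported from `pydantic_ai.models.instrumented`), the options are typically passed via `Agent(instrument=...)` or `Agent.instrument_all()`:

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

# Use the attribute-based format and redact prompt/completion content.
settings = InstrumentationSettings(version=3, include_content=False)

# Instrument a single agent...
agent = Agent('openai:gpt-5.2', instrument=settings)

# ...or every agent in the process.
Agent.instrument_all(settings)
```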
### capture\_run\_messages ```python def capture_run_messages() -> Iterator[list[_messages.ModelMessage]] ``` Context manager to access the messages used in a [`run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), or [`run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) call. Useful when a run may raise an exception; see [model errors](/docs/ai/core-concepts/agent#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` Note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. #### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]\] ### EndStrategy **Default:** `Literal['early', 'exhaustive']` ### RunOutputDataT Type variable for the result data of a run where `output_type` was customized on the run call. **Default:** `TypeVar('RunOutputDataT')` ### EventStreamHandler A function that receives agent [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], AsyncIterable[_messages.AgentStreamEvent]], Awaitable[None]]` --- # [pydantic_ai.builtin_tools](https://pydantic.dev/docs/ai/api/pydantic-ai/builtin_tools/) # pydantic\_ai.builtin\_tools ### AbstractBuiltinTool **Bases:** `ABC` A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. #### Attributes ##### kind Built-in tool identifier; this should be available on all built-in tools as a discriminator. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'unknown_builtin_tool'` ##### unique\_id A unique identifier for the builtin tool. If multiple instances of the same builtin tool can be passed to the model, subclasses should override this property to allow them to be distinguished. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### label Human-readable label for UI display. Subclasses should override this to provide a meaningful label. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### WebSearchTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to search the web for information. The parameters that Pydantic AI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - OpenAI Responses - Groq - Google - xAI - OpenRouter #### Attributes ##### search\_context\_size The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response.
Supported by: - OpenAI Responses - OpenRouter **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\] **Default:** `'medium'` ##### user\_location The `user_location` parameter allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses **Type:** [`WebSearchUserLocation`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchUserLocation) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### blocked\_domains If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see [https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering) - Groq, see [https://console.groq.com/docs/agentic-tooling#search-settings](https://console.groq.com/docs/agentic-tooling#search-settings) - xAI, see [https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters](https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### allowed\_domains If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see [https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering) - Groq, see [https://console.groq.com/docs/agentic-tooling#search-settings](https://console.groq.com/docs/agentic-tooling#search-settings) - OpenAI Responses, see [https://platform.openai.com/docs/guides/tools-web-search](https://platform.openai.com/docs/guides/tools-web-search) - xAI, see [https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters](https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### max\_uses If provided, the tool will stop searching the web after the given number of uses. Supported by: - Anthropic **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind The kind of tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'web_search'` ### WebSearchUserLocation **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses #### Attributes ##### city The city where the user is located. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### country The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB'). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### region The region or state where the user is located. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timezone The timezone of the user's location. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### CodeExecutionTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to execute code. Supported by: - Anthropic - OpenAI Responses - Google - Bedrock (Nova 2.0) - xAI #### Attributes ##### kind The kind of tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'code_execution'` ### WebFetchTool **Bases:** `AbstractBuiltinTool` Allows your agent to access content from URLs. The parameters that Pydantic AI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - Google #### Attributes ##### max\_uses If provided, the tool will stop fetching URLs after the given number of uses. Supported by: - Anthropic **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### allowed\_domains If provided, only these domains will be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see [https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### blocked\_domains If provided, these domains will never be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see [https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering) **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### enable\_citations If True, enables citations for fetched content. Supported by: - Anthropic **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### max\_content\_tokens Maximum content length in tokens for fetched content. Supported by: - Anthropic **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind The kind of tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'web_fetch'` ### UrlContextTool **Bases:** [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool) Deprecated alias for WebFetchTool. Use WebFetchTool instead. Overrides kind to 'url\_context' so old serialized payloads with {"kind": "url\_context", ...} can be deserialized to UrlContextTool for backward compatibility. #### Attributes ##### kind The kind of tool (deprecated value for backward compatibility). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'url_context'` ### ImageGenerationTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to generate images.
Supported by: - OpenAI Responses - Google #### Attributes ##### background Background type for the generated image. Supported by: - OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['transparent', 'opaque', 'auto'\] **Default:** `'auto'` ##### input\_fidelity Control how much effort the model will exert to match the style and features, especially facial features, of input images. Supported by: - OpenAI Responses. Default: 'low'. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['high', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### moderation Moderation level for the generated image. Supported by: - OpenAI Responses **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'low'\] **Default:** `'auto'` ##### output\_compression Compression level for the output image. Supported by: - OpenAI Responses. Only supported for 'jpeg' and 'webp' output formats. Default: 100. - Google (Vertex AI only). Only supported for 'jpeg' output format. Default: 75. Setting this will default `output_format` to 'jpeg' if not specified. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### output\_format The output format of the generated image. Supported by: - OpenAI Responses. Default: 'png'. - Google (Vertex AI only). Default: 'png', or 'jpeg' if `output_compression` is set. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['png', 'webp', 'jpeg'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### partial\_images Number of partial images to generate in streaming mode. Supported by: - OpenAI Responses. Supports 0 to 3. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### quality The quality of the generated image. Supported by: - OpenAI Responses **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'auto'\] **Default:** `'auto'` ##### size The size of the generated image. - OpenAI Responses: 'auto' (default: model selects the size based on the prompt), '1024x1024', '1024x1536', '1536x1024' - Google (Gemini 3 Pro Image and later): '512' (Gemini 3.1 Flash Image only), '1K' (default), '2K', '4K' **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', '1024x1024', '1024x1536', '1536x1024', '512', '1K', '2K', '4K'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### aspect\_ratio The aspect ratio to use for generated images. Supported by: - Google image-generation models (Gemini) - OpenAI Responses (maps '1:1', '2:3', and '3:2' to supported sizes) **Type:** `ImageAspectRatio` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind The kind of tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'image_generation'` ### MemoryTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to use memory. Supported by: - Anthropic #### Attributes ##### kind The kind of tool. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'memory'` ### MCPServerTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to use MCP servers. Supported by: - OpenAI Responses - Anthropic - xAI #### Attributes ##### id A unique identifier for the MCP server. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### url The URL of the MCP server to use. For OpenAI Responses, it is possible to use a `connector_id` by providing the URL as `x-openai-connector:<connector_id>`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### authorization\_token Authorization header to use when making requests to the MCP server. Supported by: - OpenAI Responses - Anthropic - xAI **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### description A description of the MCP server. Supported by: - OpenAI Responses - xAI **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### allowed\_tools A list of tool names the model is allowed to call on the MCP server. Supported by: - OpenAI Responses - Anthropic - xAI **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### headers Optional HTTP headers to send to the MCP server. Use for authentication or other purposes. Supported by: - OpenAI Responses - xAI **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### FileSearchTool **Bases:** `AbstractBuiltinTool` A builtin tool that allows your agent to search through uploaded files using vector search. This tool provides a fully managed Retrieval-Augmented Generation (RAG) system that handles file storage, chunking, embedding generation, and context injection into prompts. Supported by: - OpenAI Responses - Google (Gemini) #### Attributes ##### file\_store\_ids The file store IDs to search through. For OpenAI, these are the IDs of vector stores created via the OpenAI API. For Google, these are file search store names that have been uploaded and processed via the Gemini Files API. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### kind The kind of tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'file_search'` ### BUILTIN\_TOOL\_TYPES Registry of all builtin tool types, keyed by their kind string. This dict is populated automatically via `__init_subclass__` when tool classes are defined. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `{}` ### ImageAspectRatio Supported aspect ratios for image generation tools. **Default:** `Literal['21:9', '16:9', '4:3', '3:2', '1:1', '9:16', '3:4', '2:3', '5:4', '4:5']` ### DEPRECATED\_BUILTIN\_TOOLS Set of deprecated builtin tool types that should not be offered in new UIs.
**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `frozenset({UrlContextTool})` ### SUPPORTED\_BUILTIN\_TOOLS The set of all builtin tool types (excluding deprecated tools). **Default:** `frozenset(cls for cls in (BUILTIN_TOOL_TYPES.values()) if cls not in DEPRECATED_BUILTIN_TOOLS)` --- # [pydantic_ai.capabilities](https://pydantic.dev/docs/ai/api/pydantic-ai/capabilities/) # pydantic\_ai.capabilities ### Toolset **Bases:** `AbstractCapability[AgentDepsT]` A capability that provides a toolset. ### Thinking **Bases:** `AbstractCapability[Any]` Enables and configures model thinking/reasoning. Uses the unified `thinking` setting in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) to work portably across providers. Provider-specific thinking settings (e.g., `anthropic_thinking`, `openai_reasoning_effort`) take precedence when both are set. #### Attributes ##### effort The thinking effort level. - `True`: Enable thinking with the provider's default effort. - `False`: Disable thinking (silently ignored on always-on models). - `'minimal'`/`'low'`/`'medium'`/`'high'`/`'xhigh'`: Enable thinking at a specific effort level. **Type:** `ThinkingLevel` **Default:** `True` ### PrepareTools **Bases:** `AbstractCapability[AgentDepsT]` Capability that filters or modifies tool definitions using a callable. Wraps a [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) as a capability, allowing it to be composed with other capabilities via the capability system. ```python from pydantic_ai import Agent, RunContext from pydantic_ai.capabilities import PrepareTools from pydantic_ai.tools import ToolDefinition async def hide_admin_tools( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: return [td for td in tool_defs if not td.name.startswith('admin_')] agent = Agent('openai:gpt-5', capabilities=[PrepareTools(hide_admin_tools)]) ``` ### WebFetch **Bases:** `BuiltinOrLocalTool[AgentDepsT]` URL fetching capability. Uses the model's builtin URL fetching when available, falling back to a local function tool (markdownify-based fetch by default) when it isn't. The local fallback requires the `web-fetch` optional group: `pip install "pydantic-ai-slim[web-fetch]"` #### Attributes ##### allowed\_domains Only fetch from these domains. Enforced locally when builtin is unavailable. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_domains` ##### blocked\_domains Never fetch from these domains. Enforced locally when builtin is unavailable. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `blocked_domains` ##### max\_uses Maximum number of fetches per run. Requires builtin support. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_uses` ##### enable\_citations Enable citations for fetched content. Builtin-only; ignored by local tools.
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `enable_citations` ##### max\_content\_tokens Maximum content length in tokens. Builtin-only; ignored by local tools. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_content_tokens` ### WebSearch **Bases:** `BuiltinOrLocalTool[AgentDepsT]` Web search capability. Uses the model's builtin web search when available, falling back to a local function tool (DuckDuckGo by default) when it isn't. #### Attributes ##### search\_context\_size Controls how much context is retrieved from the web. Builtin-only; ignored by local tools. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `search_context_size` ##### user\_location Localize search results based on user location. Builtin-only; ignored by local tools. **Type:** [`WebSearchUserLocation`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchUserLocation) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `user_location` ##### blocked\_domains Domains to exclude from results. Requires builtin support. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `blocked_domains` ##### allowed\_domains Only include results from these domains. Requires builtin support. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_domains` ##### max\_uses Maximum number of web searches per run. Requires builtin support. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_uses` ### IncludeToolReturnSchemas **Bases:** `AbstractCapability[AgentDepsT]` Capability that includes return schemas for selected tools. When added to an agent's capabilities, this sets [`include_return_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.include_return_schema) to `True` on matching tool definitions, causing the model to receive return type information for those tools. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured field. For other models, it is injected into the tool description as JSON text. Per-tool overrides (`Tool(..., include_return_schema=False)`) take precedence -- this capability only sets the flag on tools that haven't explicitly opted out. ```python from pydantic_ai import Agent from pydantic_ai.capabilities import IncludeToolReturnSchemas agent = Agent('openai:gpt-5', capabilities=[IncludeToolReturnSchemas()]) ``` #### Attributes ##### tools Which tools should have their return schemas included. - `'all'` (default): every tool gets its return schema included. - `Sequence[str]`: only tools whose names are listed. - `dict[str, Any]`: matches tools whose metadata deeply includes the specified key-value pairs. - Callable `(ctx, tool_def) -> bool`: custom sync or async predicate. 
**Type:** `ToolSelector`\[`AgentDepsT`\] **Default:** `'all'` ### PrefixTools **Bases:** `WrapperCapability[AgentDepsT]` A capability that wraps another capability and prefixes its tool names. Only the wrapped capability's tools are prefixed; other agent tools are unaffected. ```python from pydantic_ai import Agent from pydantic_ai.capabilities import PrefixTools, Toolset from pydantic_ai.toolsets import FunctionToolset toolset = FunctionToolset() agent = Agent( 'openai:gpt-5', capabilities=[ PrefixTools( wrapped=Toolset(toolset), prefix='ns', ), ], ) ``` #### Methods ##### from\_spec `@classmethod` ```python def from_spec(cls, prefix: str, capability: CapabilitySpec) -> PrefixTools[Any] ``` Create from spec with a nested capability specification. ###### Returns `PrefixTools`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ###### Parameters **`prefix`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The prefix to add to tool names (e.g. `'mcp'` turns `'search'` into `'mcp_search'`). **`capability`** : `CapabilitySpec` A capability spec (same format as entries in the `capabilities` list). ### SetToolMetadata **Bases:** `AbstractCapability[AgentDepsT]` Capability that merges metadata key-value pairs onto selected tools. ```python from pydantic_ai import Agent from pydantic_ai.capabilities import SetToolMetadata agent = Agent('openai:gpt-5', capabilities=[SetToolMetadata(code_mode=True)]) ``` ### ThreadExecutor **Bases:** `AbstractCapability[Any]` Use a custom executor for running sync functions in threads. By default, sync tool functions and other sync callbacks are run in threads using `anyio.to_thread.run_sync`, which creates ephemeral threads. In long-running servers (e.g. FastAPI), this can lead to thread accumulation under sustained load. This capability provides a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) (or any [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor)) to use instead, scoped to agent runs: ```python from concurrent.futures import ThreadPoolExecutor from pydantic_ai import Agent from pydantic_ai.capabilities import ThreadExecutor executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker') agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)]) ``` To set an executor for all agents globally, use [`Agent.using_thread_executor()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.using_thread_executor). #### Attributes ##### executor The executor to use for running sync functions. **Type:** `Executor` ### BuiltinOrLocalTool **Bases:** `AbstractCapability[AgentDepsT]` Capability that pairs a provider builtin tool with a local fallback. When the model supports the builtin natively, the local fallback is removed. When the model doesn't support the builtin, it is removed and the local tool stays. Can be used directly:

```python
from pydantic_ai.builtin_tools import WebSearchTool
from pydantic_ai.capabilities import BuiltinOrLocalTool

def my_search_func(query: str) -> str:
    """Local fallback tool, used when the model has no builtin web search."""
    ...

cap = BuiltinOrLocalTool(builtin=WebSearchTool(), local=my_search_func)
```

Or subclassed to set defaults by overriding `_default_builtin`, `_default_local`, and `_requires_builtin`.
The built-in [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), and [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration) capabilities are all subclasses. #### Attributes ##### builtin Configure the provider builtin tool. - `True` (default): use the default builtin tool configuration (subclasses only). - `False`: disable the builtin; always use the local tool. - An `AbstractBuiltinTool` instance: use this specific configuration. - A callable (`BuiltinToolFunc`): dynamically create the builtin per-run via `RunContext`. **Type:** [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\] | [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### local Configure the local fallback tool. - `None` (default): auto-detect a local fallback via `_default_local`. - `False`: disable the local fallback; only use the builtin. - A `Tool` or `AbstractToolset` instance: use this specific local tool. - A bare callable: automatically wrapped in a `Tool`. **Type:** [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[..., [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### BuiltinTool **Bases:** `AbstractCapability[AgentDepsT]` A capability that registers a builtin tool with the agent. Wraps a single [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool) -- either a static [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) instance or a callable that dynamically produces one. When `builtin_tools` is passed to [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), each item is automatically wrapped in a `BuiltinTool` capability. #### Methods ##### from\_spec `@classmethod` ```python def from_spec( cls, tool: AbstractBuiltinTool | None = None, kwargs: Any = {}, ) -> BuiltinTool[Any] ``` Create from spec. Supports two YAML forms: - Flat: `{BuiltinTool: {kind: web_search, search_context_size: high}}` - Explicit: `{BuiltinTool: {tool: {kind: web_search}}}` ###### Returns `BuiltinTool`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ### ImageGeneration **Bases:** `BuiltinOrLocalTool[AgentDepsT]` Image generation capability. Uses the model's builtin image generation when available. When the model doesn't support it and `fallback_model` is provided, falls back to a local tool that delegates to a subagent running the specified image-capable model. Image generation settings (`quality`, `size`, etc.) are forwarded to the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) used by both the builtin and the local fallback subagent. When passing a custom `builtin` instance, its settings are also used for the fallback subagent; capability-level fields override any `builtin` instance settings. 
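A minimal sketch of enabling the fallback (the model names are illustrative; any main model plus any fallback model that supports the `ImageGenerationTool` builtin works, and the keyword arguments are assumed to match the attribute names below):

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import ImageGeneration

# If the agent's model can't generate images natively, a local tool
# delegates to a subagent running the fallback model instead.
agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[
        ImageGeneration(
            fallback_model='openai-responses:gpt-5.4',
            aspect_ratio='16:9',
        )
    ],
)
```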
#### Attributes ##### fallback\_model Model to use for image generation when the agent's model doesn't support it natively. Must be a model that supports image generation via the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) builtin. This requires a conversational model with image generation support, not a dedicated image-only API. Examples: - `'openai-responses:gpt-5.4'` -- OpenAI model with image generation support - `'google-gla:gemini-3-pro-image-preview'` -- Google image generation model Can be a model name string, `Model` instance, or a callable taking `RunContext` that returns a `Model` instance. **Type:** `ImageGenerationFallbackModel` **Default:** `fallback_model` ##### background Background type for the generated image. Supported by: OpenAI Responses. `'transparent'` only supported for `'png'` and `'webp'`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['transparent', 'opaque', 'auto'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `background` ##### input\_fidelity Input fidelity for matching style/features of input images. Supported by: OpenAI Responses. Default: `'low'`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['high', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `input_fidelity` ##### moderation Moderation level for the generated image. Supported by: OpenAI Responses. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `moderation` ##### output\_compression Compression level for the output image. Supported by: OpenAI Responses (jpeg/webp, default: 100), Google Vertex AI (jpeg, default: 75). **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_compression` ##### output\_format Output format of the generated image. Supported by: OpenAI Responses (default: `'png'`), Google Vertex AI. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['png', 'webp', 'jpeg'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_format` ##### quality Quality of the generated image. Supported by: OpenAI Responses. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'auto'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `quality` ##### size Size of the generated image. Supported by: OpenAI Responses (`'auto'`, `'1024x1024'`, `'1024x1536'`, `'1536x1024'`), Google (`'512'`, `'1K'`, `'2K'`, `'4K'`). **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', '1024x1024', '1024x1536', '1536x1024', '512', '1K', '2K', '4K'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `size` ##### aspect\_ratio Aspect ratio for generated images. Supported by: Google (Gemini), OpenAI Responses (maps `'1:1'`, `'2:3'`, `'3:2'` to sizes). **Type:** `ImageAspectRatio` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `aspect_ratio` ### MCP **Bases:** `BuiltinOrLocalTool[AgentDepsT]` MCP server capability. Uses the model's builtin MCP server support when available, connecting directly via HTTP when it isn't. 
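For example (a sketch; the server URL and tool names are placeholders, and the keyword arguments are assumed to match the attribute names below):

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import MCP

# With builtin support (e.g. OpenAI Responses), the provider talks to the
# server; otherwise the agent connects to it directly over HTTP.
agent = Agent(
    'openai:gpt-5',
    capabilities=[
        MCP(
            url='https://mcp.example.com/mcp',
            allowed_tools=['search', 'fetch_page'],
        )
    ],
)
```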
#### Attributes ##### url The URL of the MCP server. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `url` ##### id Unique identifier for the MCP server. Defaults to a slug derived from the URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `id` ##### authorization\_token Authorization header value for MCP server requests. Passed to both builtin and local. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `authorization_token` ##### headers HTTP headers for MCP server requests. Passed to both builtin and local. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `headers` ##### allowed\_tools Filter to only these tools. Applied to both builtin and local. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_tools` ##### description Description of the MCP server. Builtin-only; ignored by local tools. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description` ### HistoryProcessor **Bases:** `AbstractCapability[AgentDepsT]` A capability that processes message history before model requests. ### CombinedCapability **Bases:** `AbstractCapability[AgentDepsT]` A capability that combines multiple capabilities. ### WrapperCapability **Bases:** `AbstractCapability[AgentDepsT]` A capability that wraps another capability and delegates all methods. Analogous to [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) for toolsets. Subclass and override specific methods to modify behavior while delegating the rest. ### HookTimeoutError **Bases:** [`TimeoutError`](https://docs.python.org/3/library/exceptions.html#TimeoutError) Raised when a hook function exceeds its configured timeout. ### CapabilityOrdering Ordering constraints for a capability within a combined capability chain. Capabilities follow middleware semantics: the first capability in the list is the **outermost** layer, wrapping all others. Declare ordering constraints via [`get_ordering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_ordering) to control a capability's position in the chain regardless of how the user lists them. When a [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) is constructed, it topologically sorts its children to satisfy these constraints, preserving user-provided order as a tiebreaker. #### Attributes ##### position Fixed position in the chain, or `None` for user-provided order. **Type:** `CapabilityPosition` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### wraps This capability wraps around (is outside of) these capabilities in the middleware chain. Each entry can be a capability **type** (matches all instances of that type via `issubclass`) or a specific capability **instance** (matches by identity via `is`). 
Note: instance refs use identity (`is`) matching, so if a capability's [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) returns a new instance, refs to the original will no longer match. Use type refs when the target capability uses per-run state isolation. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`CapabilityRef`\] **Default:** `()` ##### wrapped\_by This capability is wrapped by (is inside of) these capabilities in the middleware chain. Each entry can be a capability **type** (matches all instances of that type via `issubclass`) or a specific capability **instance** (matches by identity via `is`). Note: instance refs use identity (`is`) matching, so if a capability's [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) returns a new instance, refs to the original will no longer match. Use type refs when the target capability uses per-run state isolation. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`CapabilityRef`\] **Default:** `()` ##### requires These types must be present in the chain (no ordering implied). **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] **Default:** `()` ### AbstractCapability **Bases:** `ABC`, `Generic[AgentDepsT]` Abstract base class for agent capabilities. A capability is a reusable, composable unit of agent behavior that can provide instructions, model settings, tools, and request/response hooks. Lifecycle: capabilities are passed to an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) at construction time, where most `get_*` methods are called to collect static configuration (instructions, model settings, toolsets, builtin tools). The exception is [`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset), which is called per-run during toolset assembly. Then, on each model request during a run, the [`before_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_model_request) and [`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request) hooks are called to allow dynamic adjustments. See the [capabilities documentation](/docs/ai/api/pydantic-ai/capabilities) for built-in capabilities. [`get_serialization_name`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name) and [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) support YAML/JSON specs (via [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec)); they have sensible defaults and typically don't need to be overridden. #### Attributes ##### has\_wrap\_node\_run Whether this capability (or any sub-capability) overrides wrap\_node\_run. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### has\_wrap\_run\_event\_stream Whether this capability (or any sub-capability) overrides wrap\_run\_event\_stream. 
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) #### Methods ##### apply ```python def apply(visitor: Callable[[AbstractCapability[AgentDepsT]], None]) -> None ``` Run a visitor function on all leaf capabilities in this tree. For a single capability, calls the visitor on itself. Overridden by [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) to recursively visit all child capabilities, and by [`WrapperCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WrapperCapability) to delegate to the wrapped capability. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_serialization\_name `@classmethod` ```python def get_serialization_name(cls) -> str | None ``` Return the name used for spec serialization (CamelCase class name by default). Return None to opt out of spec-based construction. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### from\_spec `@classmethod` ```python def from_spec(cls, args: Any = (), kwargs: Any = {}) -> AbstractCapability[Any] ``` Create from spec arguments. Default: `cls(*args, **kwargs)`. Override when `__init__` takes non-serializable types. ###### Returns `AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### get\_ordering ```python def get_ordering() -> CapabilityOrdering | None ``` Return ordering constraints for this capability, or `None` for default behavior. Override to declare a fixed position (`'outermost'` / `'innermost'`), relative ordering (`wraps` / `wrapped_by` other capability types or instances), or dependency requirements (`requires`). [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) uses these to topologically sort its children at construction time. ###### Returns `CapabilityOrdering` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### for\_run `@async` ```python def for_run(ctx: RunContext[AgentDepsT]) -> AbstractCapability[AgentDepsT] ``` Return the capability instance to use for this agent run. Called once per run, before `get_*()` re-extraction and before any hooks fire. Override to return a fresh instance for per-run state isolation. Default: return `self` (shared across runs). ###### Returns `AbstractCapability`\[`AgentDepsT`\] ##### get\_instructions ```python def get_instructions() -> AgentInstructions[AgentDepsT] | None ``` Return instructions to include in the system prompt, or None. This method is called once at agent construction time. To get dynamic per-request behavior, return a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) or a `TemplateStr` -- not a dynamic string. ###### Returns `AgentInstructions`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_model\_settings ```python def get_model_settings() -> AgentModelSettings[AgentDepsT] | None ``` Return model settings to merge into the agent's defaults, or None. This method is called once at agent construction time. Return a static `ModelSettings` dict when the settings don't change between requests. Return a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) when settings need to vary per step (e.g. based on `ctx.run_step` or `ctx.deps`). 
When the callable is invoked, `ctx.model_settings` contains the merged result of all layers resolved before this capability (model defaults and agent-level settings). The returned dict is merged on top of that. ###### Returns `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_toolset ```python def get_toolset() -> AgentToolset[AgentDepsT] | None ``` Return a toolset to register with the agent, or None. ###### Returns `AgentToolset`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_builtin\_tools ```python def get_builtin_tools() -> Sequence[AgentBuiltinTool[AgentDepsT]] ``` Return builtin tools to register with the agent. ###### Returns [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] ##### get\_wrapper\_toolset ```python def get_wrapper_toolset( toolset: AbstractToolset[AgentDepsT], ) -> AbstractToolset[AgentDepsT] | None ``` Wrap the agent's assembled toolset, or return None to leave it unchanged. Called per-run with the combined non-output toolset (after agent-level [`prepare_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) wrapping). Output tools are added separately and are not included. Unlike the other `get_*` methods which are called once at agent construction, this is called each run (after [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run)). When multiple capabilities provide wrappers, they follow middleware semantics: the first capability in the list wraps outermost (matching `wrap_*` hooks). Use this to apply cross-cutting toolset wrappers like [`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset), [`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset), or custom [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) subclasses. ###### Returns [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### prepare\_tools `@async` ```python def prepare_tools( ctx: RunContext[AgentDepsT], tool_defs: list[ToolDefinition], ) -> list[ToolDefinition] ``` Filter or modify tool definitions visible to the model for this step. The list contains all tool kinds (function, output, unapproved) distinguished by [`tool_def.kind`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.kind). Return a filtered or modified list. Called after the agent-level [`prepare_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) has already run. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\] ##### before\_run `@async` ```python def before_run(ctx: RunContext[AgentDepsT]) -> None ``` Called before the agent run starts. Observe-only; use wrap\_run for modification. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### after\_run `@async` ```python def after_run( ctx: RunContext[AgentDepsT], result: AgentRunResult[Any], ) -> AgentRunResult[Any] ``` Called after the agent run completes. Can modify the result. 
###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### wrap\_run `@async` ```python def wrap_run( ctx: RunContext[AgentDepsT], handler: WrapRunHandler, ) -> AgentRunResult[Any] ``` Wraps the entire agent run. `handler()` executes the run. If `handler()` raises and this method catches the exception and returns a result instead, the error is suppressed and the recovery result is used. If this method does not call `handler()` (short-circuit), the run is skipped and the returned result is used directly. Note: if the caller cancels the run (e.g. by breaking out of an `iter()` loop), this method receives an [`asyncio.CancelledError`](https://docs.python.org/3/library/asyncio-exceptions.html#asyncio.CancelledError). Implementations that hold resources should handle cleanup accordingly. ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### on\_run\_error `@async` ```python def on_run_error( ctx: RunContext[AgentDepsT], error: BaseException, ) -> AgentRunResult[Any] ``` Called when the agent run fails with an exception. This is the error counterpart to [`after_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_run): while `after_run` is called on success, `on_run_error` is called on failure (after [`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run) has had its chance to recover). **Raise** the original `error` (or a different exception) to propagate it. **Return** an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) to suppress the error and recover the run. Not called for `GeneratorExit` or `KeyboardInterrupt`. ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### before\_node\_run `@async` ```python def before_node_run( ctx: RunContext[AgentDepsT], node: AgentNode[AgentDepsT], ) -> AgentNode[AgentDepsT] ``` Called before each graph node executes. Can observe or replace the node. ###### Returns `AgentNode`\[`AgentDepsT`\] ##### after\_node\_run `@async` ```python def after_node_run( ctx: RunContext[AgentDepsT], node: AgentNode[AgentDepsT], result: NodeResult[AgentDepsT], ) -> NodeResult[AgentDepsT] ``` Called after each graph node succeeds. Can modify the result (next node or `End`). ###### Returns `NodeResult`\[`AgentDepsT`\] ##### wrap\_node\_run `@async` ```python def wrap_node_run( ctx: RunContext[AgentDepsT], node: AgentNode[AgentDepsT], handler: WrapNodeRunHandler[AgentDepsT], ) -> NodeResult[AgentDepsT] ``` Wraps execution of each agent graph node (run step). Called for every node in the agent graph (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). `handler(node)` executes the node and returns the next node (or `End`). Override to inspect or modify nodes before execution, inspect or modify the returned next node, call `handler` multiple times (retry), or return a different node to redirect graph progression. Note: this hook fires when using `agent.run()`, `agent.run_stream()`, and when manually driving an [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) run with `next()`, but it does **not** fire when iterating over the run with bare `async for` (which yields stream events, not node results).
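To make the retry pattern concrete, a deliberately simplified sketch (the capability is hypothetical; real code would inspect the node and error types before retrying):

```python
from pydantic_ai.capabilities import AbstractCapability  # assumed import path


class RetryNodeOnce(AbstractCapability):
    async def wrap_node_run(self, ctx, node, handler):
        try:
            return await handler(node)
        except Exception:
            # Calling the handler again re-executes the same node,
            # as described above; give it one more chance.
            return await handler(node)
```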
When using `agent.run()` with `event_stream_handler`, the handler wraps both streaming and graph advancement (i.e. the model call happens inside the wrapper). When using `agent.run_stream()`, the handler wraps only graph advancement -- streaming happens before the wrapper because `run_stream()` must yield the stream to the caller while the stream context is still open, which cannot happen from inside a callback. ###### Returns `NodeResult`\[`AgentDepsT`\] ##### on\_node\_run\_error `@async` ```python def on_node_run_error( ctx: RunContext[AgentDepsT], node: AgentNode[AgentDepsT], error: Exception, ) -> NodeResult[AgentDepsT] ``` Called when a graph node fails with an exception. This is the error counterpart to [`after_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_node_run). **Raise** the original `error` (or a different exception) to propagate it. **Return** a next node or `End` to recover and continue the graph. Useful for recovering from [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) by redirecting to a different node (e.g. retry with different model settings). ###### Returns `NodeResult`\[`AgentDepsT`\] ##### wrap\_run\_event\_stream `@async` ```python def wrap_run_event_stream( ctx: RunContext[AgentDepsT], stream: AsyncIterable[AgentStreamEvent], ) -> AsyncIterable[AgentStreamEvent] ``` Wraps the event stream for a streamed node. Can observe or transform events. Note: when this method is overridden (or [`Hooks.on.event`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks.on) / [`Hooks.on.run_event_stream`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks.on) are registered), `agent.run()` automatically enables streaming mode so this hook fires even without an explicit `event_stream_handler`. ###### Returns [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[[`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent)\] ##### before\_model\_request `@async` ```python def before_model_request( ctx: RunContext[AgentDepsT], request_context: ModelRequestContext, ) -> ModelRequestContext ``` Called before each model request. Can modify messages, settings, and parameters. ###### Returns `ModelRequestContext` ##### after\_model\_request `@async` ```python def after_model_request( ctx: RunContext[AgentDepsT], request_context: ModelRequestContext, response: ModelResponse, ) -> ModelResponse ``` Called after each model response. Can modify the response before further processing. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the response and ask the model to try again. The original response is still appended to message history so the model can see what it said. Retries count against `max_result_retries`. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### wrap\_model\_request `@async` ```python def wrap_model_request( ctx: RunContext[AgentDepsT], request_context: ModelRequestContext, handler: WrapModelRequestHandler, ) -> ModelResponse ``` Wraps the model request. handler() calls the model. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip `on_model_request_error` and directly retry the model request with a retry prompt. 
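As an illustrative sketch of wrapping the request (hypothetical capability; a real implementation would narrow the exception types and decide between re-calling the handler, raising, or raising `ModelRetry`):

```python
from pydantic_ai.capabilities import AbstractCapability  # assumed import path


class RetryModelRequestOnce(AbstractCapability):
    async def wrap_model_request(self, ctx, request_context, handler):
        try:
            return await handler(request_context)
        except Exception:
            # One blunt retry of the same request; real code would
            # filter for transient errors (timeouts, rate limits) here.
            return await handler(request_context)
```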
If the handler was called, the model response is preserved in history for context (same as `after_model_request`). ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### on\_model\_request\_error `@async` ```python def on_model_request_error( ctx: RunContext[AgentDepsT], request_context: ModelRequestContext, error: Exception, ) -> ModelResponse ``` Called when a model request fails with an exception. This is the error counterpart to [`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request). **Raise** the original `error` (or a different exception) to propagate it. **Return** a [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) to suppress the error and use the response as if the model call succeeded. **Raise** [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to retry the model request with a retry prompt instead of recovering or propagating. Not called for [`SkipModelRequest`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) or [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry). ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### before\_tool\_validate `@async` ```python def before_tool_validate( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, ) -> RawToolArgs ``` Modify raw args before validation. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip validation and ask the model to redo the tool call. ###### Returns `RawToolArgs` ##### after\_tool\_validate `@async` ```python def after_tool_validate( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, ) -> ValidatedToolArgs ``` Modify validated args. Called only on successful validation. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the validated args and ask the model to redo the tool call. ###### Returns `ValidatedToolArgs` ##### wrap\_tool\_validate `@async` ```python def wrap_tool_validate( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, handler: WrapToolValidateHandler, ) -> ValidatedToolArgs ``` Wraps tool argument validation. handler() runs the validation. ###### Returns `ValidatedToolArgs` ##### on\_tool\_validate\_error `@async` ```python def on_tool_validate_error( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, error: ValidationError | ModelRetry, ) -> ValidatedToolArgs ``` Called when tool argument validation fails. This is the error counterpart to [`after_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_validate). Fires for `ValidationError` (schema mismatch) and [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) (custom validator rejection). **Raise** the original `error` (or a different exception) to propagate it. **Return** validated args to suppress the error and continue as if validation passed. Not called for [`SkipToolValidation`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation). 
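For example, a sketch that recovers with safe defaults for one specific tool (the tool name and default arguments are hypothetical):

```python
from pydantic_ai.capabilities import AbstractCapability  # assumed import path


class DefaultSearchArgs(AbstractCapability):
    async def on_tool_validate_error(self, ctx, call, tool_def, args, error):
        if tool_def.name == 'search':  # hypothetical tool
            # Suppress the validation failure and continue with defaults.
            return {'query': '', 'limit': 10}
        raise error  # propagate for every other tool
```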
###### Returns `ValidatedToolArgs` ##### before\_tool\_execute `@async` ```python def before_tool_execute( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, ) -> ValidatedToolArgs ``` Modify validated args before execution. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip execution and ask the model to redo the tool call. ###### Returns `ValidatedToolArgs` ##### after\_tool\_execute `@async` ```python def after_tool_execute( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, result: Any, ) -> Any ``` Modify result after execution. Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the tool result and ask the model to redo the tool call. ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### wrap\_tool\_execute `@async` ```python def wrap_tool_execute( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, handler: WrapToolExecuteHandler, ) -> Any ``` Wraps tool execution. `handler()` runs the tool. ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### on\_tool\_execute\_error `@async` ```python def on_tool_execute_error( ctx: RunContext[AgentDepsT], call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, error: Exception, ) -> Any ``` Called when tool execution fails with an exception. This is the error counterpart to [`after_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_execute). **Raise** the original `error` (or a different exception) to propagate it. **Return** any value to suppress the error and use it as the tool result. **Raise** [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to redo the tool call instead of recovering or propagating. Not called for control flow exceptions ([`SkipToolExecution`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution), [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred), [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired)) or retry signals ([`ToolRetryError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ToolRetryError) from [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry)). Use [`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute) to intercept retries. ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### prefix\_tools ```python def prefix_tools(prefix: str) -> PrefixTools[AgentDepsT] ``` Returns a new capability that wraps this one and prefixes its tool names. Only this capability's tools are prefixed; other agent tools are unaffected. ###### Returns `PrefixTools`\[`AgentDepsT`\] ### Hooks **Bases:** `AbstractCapability[AgentDepsT]` Register hook functions via decorators or constructor kwargs. For extension developers building reusable capabilities, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) directly. For application code that needs a few hooks without the ceremony of a subclass, use `Hooks`.
Example using decorators:

```python
hooks = Hooks()

@hooks.on.before_model_request
async def log_request(ctx, request_context):
    print(f'Request: {request_context}')
    return request_context

agent = Agent('openai:gpt-5', capabilities=[hooks])
```

Example using constructor kwargs:

```python
agent = Agent('openai:gpt-5', capabilities=[Hooks(before_model_request=log_request)])
```

#### Attributes ##### on Decorator namespace for registering hook functions. **Type:** `_HookRegistration`\[`AgentDepsT`\] ### AgentNode Type alias for an agent graph node (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'_agent_graph.AgentNode[AgentDepsT, Any]'` ### NodeResult Type alias for the result of executing an agent graph node: either the next node or `End`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'_agent_graph.AgentNode[AgentDepsT, Any] | End[FinalResult[Any]]'` ### WrapRunHandler Handler type for [`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[], Awaitable[AgentRunResult[Any]]]'` ### CAPABILITY\_TYPES Registry of all capability types that have a serialization name, mapping name to class. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] **Default:** `{name: cls for cls in (BuiltinTool, HistoryProcessor, ImageGeneration, IncludeToolReturnSchemas, MCP, PrefixTools, PrepareTools, SetToolMetadata, Thinking, Toolset, WebFetch, WebSearch) if (name := (cls.get_serialization_name())) is not None}` ### WrapNodeRunHandler Handler type for [`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[_agent_graph.AgentNode[AgentDepsT, Any]], Awaitable[_agent_graph.AgentNode[AgentDepsT, Any] | End[FinalResult[Any]]]]'` ### WrapModelRequestHandler Handler type for [`wrap_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_model_request). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[ModelRequestContext], Awaitable[ModelResponse]]'` ### RawToolArgs Type alias for raw (pre-validation) tool arguments. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'str | dict[str, Any]'` ### ValidatedToolArgs Type alias for validated tool arguments. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'dict[str, Any]'` ### WrapToolValidateHandler Handler type for [`wrap_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_validate).
**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[str | dict[str, Any]], Awaitable[dict[str, Any]]]'` ### WrapToolExecuteHandler Handler type for [`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[dict[str, Any]], Awaitable[Any]]'` ### CapabilityPosition Position tier for a capability in the middleware chain. - `'outermost'`: in the outermost tier, before all non-outermost capabilities. Multiple capabilities can declare `'outermost'`; original list order breaks ties within the tier, and `wraps`/`wrapped_by` edges refine order further. - `'innermost'`: in the innermost tier, after all non-innermost capabilities. Same tie-breaking rules apply. **Default:** `Literal['outermost', 'innermost']` ### CapabilityRef Reference to a capability -- either a type (matches all instances of that type) or a specific instance (matches by identity). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'type[AbstractCapability[Any]] | AbstractCapability[Any]'` --- # [pydantic_ai.common_tools](https://pydantic.dev/docs/ai/api/pydantic-ai/common_tools/) # pydantic\_ai.common\_tools ### DuckDuckGoResult **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A DuckDuckGo search result. #### Attributes ##### title The title of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### href The URL of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### body The body of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### DuckDuckGoSearchTool The DuckDuckGo search tool. #### Attributes ##### client The DuckDuckGo search client. **Type:** `DDGS` ##### max\_results The maximum number of results. If None, returns results only from the first response. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_call\_\_ `@async` ```python def __call__(query: str) -> list[DuckDuckGoResult] ``` Searches DuckDuckGo for the given query and returns the results. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`DuckDuckGoResult`\] -- The search results. ###### Parameters **`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The query to search for. ### duckduckgo\_search\_tool ```python def duckduckgo_search_tool( duckduckgo_client: DDGS | None = None, max_results: int | None = None, ) ``` Creates a DuckDuckGo search tool. #### Parameters **`duckduckgo_client`** : `DDGS` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The DuckDuckGo search client. **`max_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The maximum number of results. If None, returns results only from the first response. Exa tools for Pydantic AI agents. Provides web search, content retrieval, and AI-powered answer capabilities using the Exa API, a neural search engine that finds high-quality, relevant results across billions of web pages. 
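Before the individual entries below, a minimal usage sketch (assumes the Exa dependency is installed; the API key is a placeholder):

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import exa_search_tool

agent = Agent('openai:gpt-5.2', tools=[exa_search_tool(api_key='your-api-key')])
result = agent.run_sync('Find recent articles about Pydantic AI.')
print(result.output)
```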
### ExaSearchResult **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) An Exa search result with content. See [Exa Search API documentation](https://docs.exa.ai/reference/search) for more information. #### Attributes ##### title The title of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### url The URL of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### published\_date The published date of the content, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### author The author of the content, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### text The text content of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### ExaAnswerResult **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) An Exa answer result with citations. See [Exa Answer API documentation](https://docs.exa.ai/reference/answer) for more information. #### Attributes ##### answer The AI-generated answer to the query. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### citations Citations supporting the answer. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ### ExaContentResult **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Content retrieved from a URL. See [Exa Contents API documentation](https://docs.exa.ai/reference/get-contents) for more information. #### Attributes ##### url The URL of the content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### title The title of the page. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### text The text content of the page. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### author The author of the content, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### published\_date The published date of the content, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ### ExaSearchTool The Exa search tool. #### Attributes ##### client The Exa async client. **Type:** `AsyncExa` ##### num\_results The number of results to return. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### max\_characters Maximum characters of text content per result, or None for no limit. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_call\_\_ `@async` ```python def __call__( query: str, search_type: Literal['auto', 'keyword', 'neural', 'fast', 'deep'] = 'auto', ) -> list[ExaSearchResult] ``` Searches Exa for the given query and returns the results with content. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaSearchResult`\] -- The search results with text content. 
###### Parameters **`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The search query to execute with Exa. **`search_type`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'keyword', 'neural', 'fast', 'deep'\] _Default:_ `'auto'` The type of search to perform. 'auto' automatically chooses the best search type, 'keyword' for exact matches, 'neural' for semantic search, 'fast' for speed-optimized search, 'deep' for comprehensive multi-query search. ### ExaFindSimilarTool The Exa find similar tool. #### Attributes ##### client The Exa async client. **Type:** `AsyncExa` ##### num\_results The number of results to return. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) #### Methods ##### \_\_call\_\_ `@async` ```python def __call__(url: str, exclude_source_domain: bool = True) -> list[ExaSearchResult] ``` Finds pages similar to the given URL and returns them with content. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaSearchResult`\] -- Similar pages with text content. ###### Parameters **`url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The URL to find similar pages for. **`exclude_source_domain`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to exclude results from the same domain as the input URL. Defaults to True. ### ExaGetContentsTool The Exa get contents tool. #### Attributes ##### client The Exa async client. **Type:** `AsyncExa` #### Methods ##### \_\_call\_\_ `@async` ```python def __call__(urls: list[str]) -> list[ExaContentResult] ``` Gets the content of the specified URLs. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaContentResult`\] -- The content of each URL. ###### Parameters **`urls`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A list of URLs to get content for. ### ExaAnswerTool The Exa answer tool. #### Attributes ##### client The Exa async client. **Type:** `AsyncExa` #### Methods ##### \_\_call\_\_ `@async` ```python def __call__(query: str) -> ExaAnswerResult ``` Generates an AI-powered answer to the query with citations. ###### Returns `ExaAnswerResult` -- An answer with supporting citations from web sources. ###### Parameters **`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The question to answer. ### ExaToolset **Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) A toolset that provides Exa search tools with a shared client. This is more efficient than creating individual tools when using multiple Exa tools, as it shares a single API client across all tools. Example: ```python from pydantic_ai import Agent from pydantic_ai.common_tools.exa import ExaToolset toolset = ExaToolset(api_key='your-api-key') agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` #### Methods ##### \_\_init\_\_ ```python def __init__( api_key: str, num_results: int = 5, max_characters: int | None = None, include_search: bool = True, include_find_similar: bool = True, include_get_contents: bool = True, include_answer: bool = True, id: str | None = None, ) ``` Creates an Exa toolset with a shared client. ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The Exa API key. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). 
**`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5` The number of results to return for search and find\_similar. Defaults to 5. **`max_characters`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). **`include_search`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the search tool. Defaults to True. **`include_find_similar`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the find\_similar tool. Defaults to True. **`include_get_contents`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the get\_contents tool. Defaults to True. **`include_answer`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the answer tool. Defaults to True. **`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the toolset, used for durable execution environments. ### exa\_search\_tool ```python def exa_search_tool( api_key: str, num_results: int = 5, max_characters: int | None = None, ) -> Tool[Any] def exa_search_tool( client: AsyncExa, num_results: int = 5, max_characters: int | None = None, ) -> Tool[Any] ``` Creates an Exa search tool. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). **`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. **`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5` The number of results to return. Defaults to 5. **`max_characters`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). ### exa\_find\_similar\_tool ```python def exa_find_similar_tool(api_key: str, num_results: int = 5) -> Tool[Any] def exa_find_similar_tool(client: AsyncExa, num_results: int = 5) -> Tool[Any] ``` Creates an Exa find similar tool. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). 
**`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. **`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5` The number of similar results to return. Defaults to 5. ### exa\_get\_contents\_tool ```python def exa_get_contents_tool(api_key: str) -> Tool[Any] def exa_get_contents_tool(client: AsyncExa) -> Tool[Any] ``` Creates an Exa get contents tool. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). **`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. ### exa\_answer\_tool ```python def exa_answer_tool(api_key: str) -> Tool[Any] def exa_answer_tool(client: AsyncExa) -> Tool[Any] ``` Creates an Exa answer tool. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). **`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. ### TavilySearchResult **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A Tavily search result. See [Tavily Search Endpoint documentation](https://docs.tavily.com/api-reference/endpoint/search) for more information. #### Attributes ##### title The title of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### url The URL of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### content A short description of the search result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### score The relevance score of the search result. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ### TavilySearchTool The Tavily search tool. #### Attributes ##### client The Tavily search client. **Type:** `AsyncTavilyClient` ##### max\_results The maximum number of results. If None, the Tavily default is used.
**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### \_\_call\_\_ `@async` ```python def __call__( query: str, search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = 'basic', topic: Literal['general', 'news', 'finance'] = 'general', time_range: Literal['day', 'week', 'month', 'year'] | None = None, include_domains: list[str] | None = None, exclude_domains: list[str] | None = None, ) -> list[TavilySearchResult] ``` Searches Tavily for the given query and returns the results. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`TavilySearchResult`\] -- A list of search results from Tavily. ###### Parameters **`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The search query to execute with Tavily. **`search_depth`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['basic', 'advanced', 'fast', 'ultra-fast'\] _Default:_ `'basic'` The depth of the search. **`topic`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['general', 'news', 'finance'\] _Default:_ `'general'` The category of the search. **`time_range`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['day', 'week', 'month', 'year'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The time range back from the current date to filter results. **`include_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` List of domains to specifically include in the search results. **`exclude_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` List of domains to specifically exclude from the search results. ### tavily\_search\_tool ```python def tavily_search_tool( api_key: str, max_results: int | None = None, search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = _UNSET, topic: Literal['general', 'news', 'finance'] = _UNSET, time_range: Literal['day', 'week', 'month', 'year'] | None = _UNSET, include_domains: list[str] | None = _UNSET, exclude_domains: list[str] | None = _UNSET, ) -> Tool[Any] def tavily_search_tool( client: AsyncTavilyClient, max_results: int | None = None, search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = _UNSET, topic: Literal['general', 'news', 'finance'] = _UNSET, time_range: Literal['day', 'week', 'month', 'year'] | None = _UNSET, include_domains: list[str] | None = _UNSET, exclude_domains: list[str] | None = _UNSET, ) -> Tool[Any] ``` Creates a Tavily search tool. `max_results` is always developer-controlled and does not appear in the LLM tool schema. Other parameters, when provided, are fixed for all searches and hidden from the LLM's tool schema. Parameters left unset remain available for the LLM to set per-call. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Tavily API key. 
Required if `client` is not provided. You can get one by signing up at [https://app.tavily.com/home](https://app.tavily.com/home). **`client`** : `AsyncTavilyClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing AsyncTavilyClient. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tool instances. **`max_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The maximum number of results. If None, the Tavily default is used. **`search_depth`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['basic', 'advanced', 'fast', 'ultra-fast'\] _Default:_ `_UNSET` The depth of the search. **`topic`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['general', 'news', 'finance'\] _Default:_ `_UNSET` The category of the search. **`time_range`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['day', 'week', 'month', 'year'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET` The time range back from the current date to filter results. **`include_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET` List of domains to specifically include in the search results. **`exclude_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET` List of domains to specifically exclude from the search results. --- # [pydantic_ai — Concurrency](https://pydantic.dev/docs/ai/api/pydantic-ai/concurrency/) # pydantic\_ai -- Concurrency ### ConcurrencyLimitedModel **Bases:** `WrapperModel` A model wrapper that limits concurrent requests to the underlying model. This wrapper applies concurrency limiting at the model level, ensuring that the number of concurrent requests to the model does not exceed the configured limit. This is useful for: - Respecting API rate limits - Managing resource usage - Sharing a concurrency pool across multiple models Example usage: ```python from pydantic_ai import Agent from pydantic_ai.models.concurrency import ConcurrencyLimitedModel # Limit to 5 concurrent requests model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5) agent = Agent(model) # Or share a limiter across multiple models from pydantic_ai import ConcurrencyLimiter # noqa E402 shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool') model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter) model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter) ``` #### Methods ##### \_\_init\_\_ ```python def __init__( wrapped: Model | KnownModelName, limiter: int | ConcurrencyLimit | AbstractConcurrencyLimiter, ) ``` Initialize the ConcurrencyLimitedModel. ###### Parameters **`wrapped`** : `Model` | `KnownModelName` The model to wrap, either a Model instance or a known model name. 
**`limiter`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) | [`AbstractConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AbstractConcurrencyLimiter) The concurrency limit configuration. Can be: - An `int`: Simple limit on concurrent operations (unlimited queue). - A `ConcurrencyLimit`: Full configuration with optional backpressure. - An `AbstractConcurrencyLimiter`: A pre-created limiter for sharing across models. ##### request `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model with concurrency limiting. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### count\_tokens `@async` ```python def count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Count tokens with concurrency limiting. ###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model with concurrency limiting. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ### limit\_model\_concurrency ```python def limit_model_concurrency( model: Model | KnownModelName, limiter: AnyConcurrencyLimit, ) -> Model ``` Wrap a model with concurrency limiting. This is a convenience function to wrap a model with concurrency limiting. If the limiter is None, the model is returned unchanged. Example: ```python from pydantic_ai.models.concurrency import limit_model_concurrency model = limit_model_concurrency('openai:gpt-4o', limiter=5) ``` #### Returns `Model` -- The wrapped model with concurrency limiting, or the original model if limiter is None. #### Parameters **`model`** : `Model` | `KnownModelName` The model to wrap. **`limiter`** : [`AnyConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AnyConcurrencyLimit) The concurrency limit configuration. ### AbstractConcurrencyLimiter **Bases:** `ABC` Abstract base class for concurrency limiters. Subclass this to create custom concurrency limiters (e.g., Redis-backed distributed limiters). Example: ```python from pydantic_ai.concurrency import AbstractConcurrencyLimiter class RedisConcurrencyLimiter(AbstractConcurrencyLimiter): def __init__(self, redis_client, key: str, max_running: int): self._redis = redis_client self._key = key self._max_running = max_running async def acquire(self, source: str) -> None: # Implement Redis-based distributed locking ... def release(self) -> None: # Release the Redis lock ... ``` #### Methods ##### acquire `@abstractmethod` `@async` ```python def acquire(source: str) -> None ``` Acquire a slot, waiting if necessary. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) Identifier for observability (e.g., 'model:gpt-4o'). ##### release `@abstractmethod` ```python def release() -> None ``` Release a slot. 
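Once implemented, a custom limiter plugs in wherever a limiter is accepted (a sketch continuing the hypothetical `RedisConcurrencyLimiter` from the example above):

```python
from pydantic_ai.models.concurrency import ConcurrencyLimitedModel

# `RedisConcurrencyLimiter` and `redis_client` come from the sketch above.
limiter = RedisConcurrencyLimiter(redis_client, key='llm-pool', max_running=10)
model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=limiter)
```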
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ### ConcurrencyLimiter **Bases:** [`AbstractConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AbstractConcurrencyLimiter) A concurrency limiter that tracks waiting operations for observability. This class wraps an anyio.CapacityLimiter and tracks the number of waiting operations. When an operation has to wait to acquire a slot, a span is created for observability purposes. #### Attributes ##### name Name of the limiter for observability. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### waiting\_count Number of operations currently waiting to acquire a slot. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### running\_count Number of operations currently running. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### available\_count Number of slots available. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### max\_running Maximum concurrent operations allowed. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) #### Methods ##### \_\_init\_\_ ```python def __init__( max_running: int, max_queued: int | None = None, name: str | None = None, tracer: Tracer | None = None, ) ``` Initialize the ConcurrencyLimiter. ###### Parameters **`max_running`** : [`int`](https://docs.python.org/3/library/functions.html#int) Maximum number of concurrent operations. **`max_queued`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Maximum queue depth before raising ConcurrencyLimitExceeded. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for this limiter, used for observability when sharing a limiter across multiple models or agents. **`tracer`** : `Tracer` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` OpenTelemetry tracer for span creation. ##### from\_limit `@classmethod` ```python def from_limit( cls, limit: int | ConcurrencyLimit, name: str | None = None, tracer: Tracer | None = None, ) -> Self ``` Create a ConcurrencyLimiter from a ConcurrencyLimit configuration. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A configured ConcurrencyLimiter. ###### Parameters **`limit`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) Either an int for simple limiting or a ConcurrencyLimit for full config. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for this limiter, used for observability. **`tracer`** : `Tracer` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` OpenTelemetry tracer for span creation. ##### acquire `@async` ```python def acquire(source: str) -> None ``` Acquire a slot, creating a span if waiting is required. 
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) Identifier for the source of this acquisition (e.g., 'agent:my-agent' or 'model:gpt-4'). ##### release ```python def release() -> None ``` Release a slot. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ### ConcurrencyLimit Configuration for concurrency limiting with optional backpressure. #### Constructor Parameters **`max_running`** : [`int`](https://docs.python.org/3/library/functions.html#int) Maximum number of concurrent operations allowed. **`max_queued`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Maximum number of operations waiting in the queue. If None, the queue is unlimited. If exceeded, raises `ConcurrencyLimitExceeded`. ### AnyConcurrencyLimit Type alias for concurrency limit configuration. Can be: - An `int`: Simple limit on concurrent operations (unlimited queue). - A `ConcurrencyLimit`: Full configuration with optional backpressure. - An `AbstractConcurrencyLimiter`: A pre-created limiter instance for sharing across multiple models/agents. - `None`: No concurrency limiting (default). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'int | ConcurrencyLimit | AbstractConcurrencyLimiter | None'` ### ConcurrencyLimitExceeded **Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError) Error raised when the concurrency queue depth exceeds max\_queued. --- # [pydantic_ai.direct](https://pydantic.dev/docs/ai/api/pydantic-ai/direct/) # pydantic\_ai.direct Methods for making imperative requests to language models with minimal abstraction. These methods allow you to make requests to LLMs where the only abstraction is input and output schema translation so you can use all models with the same API. These methods are thin wrappers around [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) implementations. ### StreamedResponseSync A synchronous wrapper around async streaming responses: it runs the async producer in a background thread and exposes a synchronous iterator. This class must be used as a context manager with the `with` statement. #### Attributes ##### response Get the current state of the response. **Type:** [`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### model\_name Get the model name of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp Get the timestamp of the response. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) #### Methods ##### \_\_iter\_\_ ```python def __iter__() -> Iterator[messages.ModelResponseStreamEvent] ``` Stream the response as an iterable of [`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)s. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`messages.ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)\] ##### get ```python def get() -> messages.ModelResponse ``` Build a ModelResponse from the data received from the stream so far.
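For example (a sketch; `model_request_stream_sync`, documented later on this page, is one way to obtain such a stream):

```python
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_stream_sync

messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')]
with model_request_stream_sync('openai:gpt-5-mini', messages) as stream:
    for _event in stream:
        # get() rebuilds the response from everything received so far.
        partial = stream.get()
        print(len(partial.parts))
    print(stream.get())  # the fully accumulated ModelResponse
```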
###### Returns [`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### usage ```python def usage() -> RequestUsage ``` Get the usage of the response so far. ###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ### model\_request `@async` ```python def model_request( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse ``` Make a non-streamed request to a model. model\_request\_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request async def main(): model_response = await model_request( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1) ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details. #### Returns [`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- The model response and token usage associated with the request. #### Parameters **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] Messages to send to the model **`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model settings **`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model request parameters **`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used. ### model\_request\_sync ```python def model_request_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse ``` Make a synchronous, non-streamed request to a model.
This is a convenience method that wraps [`model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. model\_request\_sync\_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1) ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details. #### Returns [`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- The model response and token usage associated with the request. #### Parameters **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] Messages to send to the model **`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model settings **`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model request parameters **`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used. ### model\_request\_stream ```python def model_request_stream( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> AbstractAsyncContextManager[models.StreamedResponse] ``` Make a streamed async request to a model. 
model\_request\_stream\_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1) async with model_request_stream('openai:gpt-5-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' ), ), ] ''' ``` See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details. #### Returns `AbstractAsyncContextManager`\[[`models.StreamedResponse`](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse)\] -- A [stream response](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse) async context manager. #### Parameters **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] Messages to send to the model **`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model settings **`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model request parameters **`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used. ### model\_request\_stream\_sync ```python def model_request_stream_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> StreamedResponseSync ``` Make a streamed synchronous request to a model. This is the synchronous version of [`model_request_stream`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream). It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. 
model\_request\_stream\_sync\_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream_sync messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-5-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' ), ), ] ''' ``` #### Returns `StreamedResponseSync` -- A [sync stream response](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.StreamedResponseSync) context manager. #### Parameters **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] Messages to send to the model **`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model settings **`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` optional model request parameters **`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used. --- # [pydantic_ai.durable_exec](https://pydantic.dev/docs/ai/api/pydantic-ai/durable_exec/) # pydantic\_ai.durable\_exec ### PydanticAIWorkflow Temporal Workflow base class that provides `__pydantic_ai_agents__` for direct agent registration. ### TemporalRunContext **Bases:** `RunContext[AgentDepsT]` The [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent). 
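For example, a minimal sketch of such a subclass (assuming the `pydantic_ai.durable_exec.temporal` import path; the serialized `prompt` attribute and the agent setup are illustrative, not part of the API reference):

```python
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent, TemporalRunContext


class PromptRunContext(TemporalRunContext):
    """Hypothetical subclass that also makes `ctx.prompt` available inside activities."""

    @classmethod
    def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]:
        # Start from the default serialized attributes and add `prompt` (illustrative choice).
        return {**super().serialize_run_context(ctx), 'prompt': ctx.prompt}


agent = Agent('openai:gpt-5.2', name='my_agent')
temporal_agent = TemporalAgent(agent, run_context_type=PromptRunContext)
```

Attributes included in the returned dictionary become available on the run context reconstructed by `deserialize_run_context` inside the activity.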
#### Methods ##### serialize\_run\_context `@classmethod` ```python def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any] ``` Serialize the run context to a `dict[str, Any]`. ###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### deserialize\_run\_context `@classmethod` ```python def deserialize_run_context( cls, ctx: dict[str, Any], deps: Any, ) -> TemporalRunContext[Any] ``` Deserialize the run context from a `dict[str, Any]`. ###### Returns `TemporalRunContext`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ### LogfirePlugin **Bases:** `SimplePlugin` Temporal client plugin for Logfire. ### TemporalAgent **Bases:** `WrapperAgent[AgentDepsT, OutputDataT]` #### Methods ##### \_\_init\_\_ ```python def __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], name: str | None = None, models: Mapping[str, Model] | None = None, provider_factory: TemporalProviderFactory | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[[AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], AbstractAgent[AgentDepsT, Any] | None], AbstractToolset[AgentDepsT]] = temporalize_toolset, ) ``` Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. ###### Parameters **`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] The agent to wrap. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. **`models`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `Model`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional mapping of model instances to register with the agent. Keys define the names that can be referenced at runtime and the values are `Model` instances. Registered model instances can be passed directly to `run(model=...)`. If the wrapped agent doesn't have a model set and none is provided to `run()`, the first model in this mapping will be used as the default. **`provider_factory`** : `TemporalProviderFactory` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callable used when instantiating models from provider strings (those supplied at runtime). The callable receives the provider name and the current run context, allowing custom configuration such as injecting API keys stored on `deps`. 
Note: This factory is only used inside Temporal workflows. Outside workflows, model strings are resolved using the default provider behavior. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use instead of the one set on the wrapped agent. **`activity_config`** : `ActivityConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. **`model_activity_config`** : `ActivityConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Temporal activity config to use for model request activities. This is merged with the base activity config. **`toolset_activity_config`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. **`tool_activity_config`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\]\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool must be defined as an `async` function, since non-async tools are run in threads, which are non-deterministic and thus not supported outside of activities. **`run_context_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`TemporalRunContext`\[`AgentDepsT`\]\] _Default:_ `TemporalRunContext[AgentDepsT]` The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute.
**`temporalize_toolset_func`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\], [`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig`, [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\]\], [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AgentDepsT`\], [`type`](https://docs.python.org/3/glossary.html#term-type)\[`TemporalRunContext`\[`AgentDepsT`\]\], `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)\], [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] _Default:_ `temporalize_toolset` Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs, the deps type, the run context type, and the wrapped agent (or `None`). ##### run `@async` ```python def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Run the agent with a user prompt in async mode.
This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. Inside workflows, only registered model instances, registered names, or provider strings are valid. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. 
**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### run\_sync ```python def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. 
**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### run\_stream `@async` ```python def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]] def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
``` ###### Returns `AbstractAsyncContextManager`\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run.
**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### run\_stream\_events ```python def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]] def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run. 
Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is not accepted, as this method uses it internally to produce the event stream. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_messages.AgentStreamEvent` | [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- An async iterable of stream events `AgentStreamEvent` and finally an `AgentRunResultEvent` with the final run result. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run.
**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
##### iter `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, _deprecated_kwargs: Never = {}, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, _deprecated_kwargs: Never = {}, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. 
``` ###### Returns `AbstractAsyncContextManager`\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run.
**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### override ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. **`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. 
**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply as overrides. ### TemporalWrapperToolset **Bases:** `WrapperToolset[AgentDepsT]`, `ABC` ### PydanticAIPlugin **Bases:** `SimplePlugin` Temporal client and worker plugin for Pydantic AI. ### AgentPlugin **Bases:** `SimplePlugin` Temporal worker plugin for a specific Pydantic AI agent. ### StepConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for a step in the DBOS workflow. ### DBOSMCPServer **Bases:** `DBOSMCPToolset[AgentDepsT]` A wrapper for MCPServer that integrates with DBOS, turning call\_tool and get\_tools into DBOS steps. Tool definitions are cached across steps to avoid redundant MCP server round-trips, respecting the wrapped server's `cache_tools` setting. ### DBOSModel **Bases:** `WrapperModel` A wrapper for Model that integrates with DBOS, turning request and request\_stream into DBOS steps. ### DBOSAgent **Bases:** `WrapperAgent[AgentDepsT, OutputDataT]`, `DBOSConfiguredInstance` #### Methods ##### \_\_init\_\_ ```python def __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, parallel_execution_mode: DBOSParallelExecutionMode = 'parallel_ordered_events', ) ``` Wrap an agent to enable it to be used inside DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. ###### Parameters **`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] The agent to wrap. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use instead of the one set on the wrapped agent. **`mcp_step_config`** : `StepConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base DBOS step config to use for MCP server steps. If no config is provided, the default DBOS settings are used. **`model_step_config`** : `StepConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The DBOS step config to use for model request steps. If no config is provided, the default DBOS settings are used. **`parallel_execution_mode`** : `DBOSParallelExecutionMode` _Default:_ `'parallel_ordered_events'` The mode for executing tool calls: - 'parallel\_ordered\_events' (default): Run tool calls in parallel, but events are emitted in order, after all calls complete. - 'sequential': Run tool calls one at a time in order.
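As a rough sketch of how the wrapper is typically used, assuming the `dbos` package's `DBOS.workflow()` decorator and the `pydantic_ai.durable_exec.dbos` import path (the agent name, model, and workflow body are illustrative):

```python
from dbos import DBOS

from pydantic_ai import Agent
from pydantic_ai.durable_exec.dbos import DBOSAgent

agent = Agent('openai:gpt-5.2', name='my_agent')
dbos_agent = DBOSAgent(agent)  # model requests and MCP calls now run as DBOS steps


@DBOS.workflow()
async def agent_workflow(prompt: str) -> str:
    # Each step inside the workflow is durably recorded, so a crashed
    # workflow can resume without repeating completed model requests.
    result = await dbos_agent.run(prompt)
    return result.output
```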
##### run `@async` ```python def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
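Several of these per-run parameters can be combined on a single call; for example, a per-run `output_type` together with `usage_limits` (a sketch, assuming `UsageLimits` is importable from `pydantic_ai.usage`):

```python
from pydantic_ai import Agent
from pydantic_ai.usage import UsageLimits

agent = Agent('openai:gpt-5.2')


async def main():
    result = await agent.run(
        'How many continents are there? Answer with just the number.',
        output_type=int,  # per-run override of the agent's output type
        usage_limits=UsageLimits(request_limit=3),  # cap model requests for this run
    )
    print(result.output)
    #> 7
```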
##### run\_sync ```python def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
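A common use of `message_history` with `run_sync` is continuing a conversation across runs; a minimal sketch:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

first = agent.run_sync('What is the capital of Italy?')
print(first.output)
#> The capital of Italy is Rome.

# Pass the accumulated messages back in to continue the same conversation.
second = agent.run_sync('How many people live there?', message_history=first.all_messages())
print(second.output)
```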
##### run\_stream `@async` ```python def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]] def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
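Inside the context manager you can consume the output incrementally rather than waiting for the full result; a sketch using the streamed response's `stream_text` method:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.run_stream('Name three EU capitals.') as response:
        # Print text deltas as they arrive instead of awaiting the full output.
        async for chunk in response.stream_text(delta=True):
            print(chunk, end='', flush=True)
```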
##### run\_stream\_events ```python def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]] def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is not allowed, as it is used internally to produce the event stream.
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_messages.AgentStreamEvent` | [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- An async iterable of stream events (`AgentStreamEvent`), followed by a final `AgentRunResultEvent` carrying the run result. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run. `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.
**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### iter `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, _deprecated_kwargs: Never = {}, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, _deprecated_kwargs: Never = {}, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. 
The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. 
**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### override ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. **`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. 
**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply as overrides. ### DBOSParallelExecutionMode The mode for executing tool calls in DBOS durable workflows. This is a subset of the ParallelExecutionMode because 'parallel' cannot guarantee deterministic ordering. **Default:** `Literal['sequential', 'parallel_ordered_events']` ### TaskConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for a task in Prefect. These options are passed to the `@task` decorator. #### Attributes ##### retries Maximum number of retries for the task. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### retry\_delay\_seconds Delay between retries in seconds. Can be a single value or a list for custom backoff. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`float`](https://docs.python.org/3/library/functions.html#float)\] ##### timeout\_seconds Maximum time in seconds for the task to complete. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### cache\_policy Prefect cache policy for the task. **Type:** `CachePolicy` ##### persist\_result Whether to persist the task result. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### result\_storage Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`. **Type:** `ResultStorage` ##### log\_prints Whether to log print statements from the task. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### PrefectFunctionToolset **Bases:** `PrefectWrapperToolset[AgentDepsT]` A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks. 
#### Methods ##### call\_tool `@async` ```python def call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool, wrapped as a Prefect task with a descriptive name. ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ### PrefectMCPServer **Bases:** `PrefectWrapperToolset[AgentDepsT]`, `ABC` A wrapper for MCPServer that integrates with Prefect, turning call\_tool and get\_tools into Prefect tasks. #### Methods ##### call\_tool `@async` ```python def call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult ``` Call an MCP tool, wrapped as a Prefect task with a descriptive name. ###### Returns `ToolResult` ### PrefectModel **Bases:** `WrapperModel` A wrapper for Model that integrates with Prefect, turning request and request\_stream into Prefect tasks. #### Methods ##### request `@async` ```python def request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a model request, wrapped as a Prefect task when in a flow. ###### Returns [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) ##### request\_stream `@async` ```python def request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\] ### PrefectAgent **Bases:** `WrapperAgent[AgentDepsT, OutputDataT]` #### Methods ##### \_\_init\_\_ ```python def __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: dict[str, TaskConfig | None] | None = None, event_stream_handler_task_config: TaskConfig | None = None, prefectify_toolset_func: Callable[[AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT]] = prefectify_toolset, ) ``` Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. ###### Parameters **`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] The agent to wrap. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use instead of the one set on the wrapped agent. 
**`mcp_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base Prefect task config to use for MCP server tasks. If no config is provided, Prefect's default settings are used. **`model_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Prefect task config to use for model request tasks. If no config is provided, Prefect's default settings are used. **`tool_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The default Prefect task config to use for tool calls. If no config is provided, Prefect's default settings are used. **`tool_task_config_by_name`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Per-tool task configuration. Keys are tool names; values are `TaskConfig` or `None` (`None` disables task wrapping for that tool). **`event_stream_handler_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Prefect task config to use for the event stream handler task. If no config is provided, Prefect's default settings are used. **`prefectify_toolset_func`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\], `TaskConfig`, `TaskConfig`, [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None)\]\], [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] _Default:_ `prefectify_toolset` Optional function used to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods requiring I/O into Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the MCP task config, the default tool task config, and the per-tool task config by name.
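A minimal sketch of wrapping an agent for Prefect; the `pydantic_ai.durable_exec.prefect` import path is an assumption here, and the `TaskConfig` keys used are the ones documented above:

```python
from prefect import flow

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent  # assumed import path

agent = Agent('openai:gpt-5.2', name='prefect_agent')

# Retry failed model request tasks twice, waiting five seconds between attempts.
prefect_agent = PrefectAgent(agent, model_task_config={'retries': 2, 'retry_delay_seconds': 5.0})


@flow
async def answer(prompt: str) -> str:
    result = await prefect_agent.run(prompt)
    return result.output
```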
##### run `@async` ```python def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
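The `toolsets` parameter makes extra tools available for a single run without registering them on the agent; a sketch using `FunctionToolset`:

```python
import random

from pydantic_ai import Agent
from pydantic_ai.toolsets import FunctionToolset


def roll_dice() -> int:
    """Roll a six-sided die and return the result."""
    return random.randint(1, 6)


dice_toolset = FunctionToolset(tools=[roll_dice])
agent = Agent('openai:gpt-5.2')


async def main():
    # The toolset applies to this run only; the agent itself is unchanged.
    result = await agent.run('Roll the die and report the number.', toolsets=[dice_toolset])
    print(result.output)
```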
##### run\_sync ```python def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[OutputDataT] def run_sync( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AgentRunResult[RunOutputDataT] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` ###### Returns [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
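The `deps` parameter supplies the dependencies declared via the agent's `deps_type`, which instructions and tools can read from the run context; a minimal sketch:

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class Deps:
    user_name: str


agent = Agent('openai:gpt-5.2', deps_type=Deps)


@agent.instructions
def personalize(ctx: RunContext[Deps]) -> str:
    return f'Address the user as {ctx.deps.user_name}.'


result = agent.run_sync('Say hello.', deps=Deps(user_name='Ada'))
print(result.output)
```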
##### run\_stream `@async` ```python def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]] def run_stream( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. 
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. 
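The `event_stream_handler` can be combined with `run_stream` to observe events while still streaming the final output. A minimal sketch, assuming the handler is an async callable that receives the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events, per the `EventStreamHandler` type:

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, AgentStreamEvent, RunContext

agent = Agent('openai:gpt-5.2')


async def log_events(
    ctx: RunContext[None], events: AsyncIterable[AgentStreamEvent]
) -> None:
    # Receives every event up until the final result is found.
    async for event in events:
        print(type(event).__name__)


async def main():
    async with agent.run_stream(
        'What is the capital of the UK?', event_stream_handler=log_events
    ) as response:
        print(await response.get_output())
```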
##### run\_stream\_events ```python def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]] def run_stream_events( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is not allowed, as this method uses it internally to produce the event stream.
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_messages.AgentStreamEvent` | [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- An async iterable of stream events `AgentStreamEvent`, and finally an `AgentRunResultEvent` with the final run result. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.
**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### iter `@async` ```python def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]] def iter( user_prompt: str | Sequence[_messages.UserContent] | None = None, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: AgentModelSettings[AgentDepsT] | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None, spec: dict[str, Any] | AgentSpec | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. 
The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run. ###### Parameters **`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` User input to start/continue the conversation. **`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. 
**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply for this run. ##### override ```python def override( name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET, model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET, spec: dict[str, Any] | AgentSpec | None = None, ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures). ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The name to use instead of the name passed to the agent constructor and agent run. **`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET` The dependencies to use instead of the dependencies passed to the agent run. 
**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The model to use instead of the model passed to the agent run. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The toolsets to use instead of the toolsets passed to the agent constructor and agent run. **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET` The tools to use instead of the tools registered with the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The instructions to use instead of the instructions registered with the agent. **`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored. **`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional agent spec to apply as overrides. --- # [pydantic_ai.embeddings](https://pydantic.dev/docs/ai/api/pydantic-ai/embeddings/) # pydantic\_ai.embeddings ### EmbeddingSettings **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings), [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings)) extend this with additional provider-prefixed options. #### Attributes ##### dimensions The number of dimensions for the output embeddings. Supported by: - OpenAI - Cohere - Google - Sentence Transformers - Bedrock - VoyageAI **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### truncate Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens) and [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens) to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. 
Supported by: - Cohere - Bedrock (Cohere and Nova models) - VoyageAI **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### extra\_headers Extra headers to send to the model. Supported by: - OpenAI - Cohere **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### extra\_body Extra body to send to the model. Supported by: - OpenAI - Cohere **Type:** [`object`](https://docs.python.org/3/glossary.html#term-object) ### EmbeddingModel **Bases:** `ABC` Abstract base class for embedding models. Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations: - [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel) - [`CohereEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingModel) - [`GoogleEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingModel) - [`BedrockEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel) - [`SentenceTransformerEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel) #### Attributes ##### settings Get the default settings for this model. **Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### base\_url The base URL for the provider API, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### model\_name The name of the embedding model. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The embedding model provider/system identifier (e.g., 'openai', 'cohere'). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__(settings: EmbeddingSettings | None = None) -> None ``` Initialize the model with optional settings. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific settings that will be used as defaults for this model. ##### embed `@abstractmethod` `@async` ```python def embed( inputs: str | Sequence[str], input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Generate embeddings for the given inputs. ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings and metadata. ###### Parameters **`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A single string or sequence of strings to embed.
**`input_type`** : `EmbedInputType` Whether the inputs are queries or documents. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to override the model's defaults. ##### prepare\_embed ```python def prepare_embed( inputs: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> tuple[list[str], EmbeddingSettings] ``` Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. ###### Returns [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\], [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)\] -- A tuple of (normalized inputs list, merged settings). ###### Parameters **`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A single string or sequence of strings. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to merge with defaults. ##### max\_input\_tokens `@async` ```python def max_input_tokens() -> int | None ``` Get the maximum number of tokens that can be input to the model. ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) -- The maximum token count, or `None` if unknown. ##### count\_tokens `@async` ```python def count_tokens(text: str) -> int ``` Count the number of tokens in the given text. ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) -- The number of tokens. ###### Parameters **`text`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The text to tokenize and count. ###### Raises - `NotImplementedError` -- If the model doesn't support token counting. - `UserError` -- If the model or tokenizer is not supported. ### WrapperEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. #### Attributes ##### wrapped The underlying embedding model being wrapped. **Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) **Default:** `infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped` ##### settings Get the settings from the wrapped embedding model. **Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_init\_\_ ```python def __init__(wrapped: EmbeddingModel | str) ``` Initialize the wrapper with an embedding model. 
###### Parameters **`wrapped`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to wrap. Can be an [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance or a model name string (e.g., `'openai:text-embedding-3-small'`). ### EmbeddingResult The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` #### Attributes ##### embeddings The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\]\] ##### inputs The original input texts that were embedded. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### input\_type Whether the inputs were embedded as queries or documents. **Type:** `EmbedInputType` ##### model\_name The name of the model that generated these embeddings. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name The name of the provider (e.g., 'openai', 'cohere'). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp When the embedding request was made. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### usage Token usage statistics for this request. **Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)` ##### provider\_details Provider-specific details from the response. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_response\_id Unique identifier for this response from the provider, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### \_\_getitem\_\_ ```python def __getitem__(item: int | str) -> Sequence[float] ``` Get the embedding for an input by index or by the original input text. ###### Returns [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\] -- The embedding vector for the specified input. 
###### Parameters **`item`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) Either an integer index or the original input string. ###### Raises - `IndexError` -- If the index is out of range. - `ValueError` -- If the string is not found in the inputs. ##### cost ```python def cost() -> genai_types.PriceCalculation ``` Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. ###### Returns `genai_types.PriceCalculation` -- A price calculation object with `total_price`, `input_price`, and other cost details. ###### Raises - `LookupError` -- If pricing data is not available for this model/provider. ### TestEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) A mock embedding model for testing. This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` #### Attributes ##### last\_settings The settings used in the most recent embed call. **Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### model\_name The embedding model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: str = 'test', provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ) ``` Initialize the test embedding model. ###### Parameters **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'` The model name to report in results. **`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'` The provider name to report in results. **`dimensions`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `8` The number of dimensions for the generated embeddings. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional default settings for the model. ### InstrumentedEmbeddingModel **Bases:** `WrapperEmbeddingModel` Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. #### Attributes ##### instrumentation\_settings Instrumentation settings for this model. **Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()` ### Embedder High-level interface for generating text embeddings. The `Embedder` class provides a convenient way to generate vector embeddings from text using various embedding model providers. 
It handles model inference, settings management, and optional OpenTelemetry instrumentation. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is machine learning?') print(result.embeddings[0][:5]) # First 5 dimensions #> [1.0, 1.0, 1.0, 1.0, 1.0] ``` #### Attributes ##### instrument Options to automatically instrument with OpenTelemetry. Set to `True` to use default instrumentation settings, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) to customize. If this isn't set, then the last value set by [`Embedder.instrument_all()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.instrument_all) will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. **Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `instrument` ##### model The embedding model used by this embedder. **Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model: EmbeddingModel | KnownEmbeddingModelName | str, settings: EmbeddingSettings | None = None, defer_model_check: bool = True, instrument: InstrumentationSettings | bool | None = None, ) -> None ``` Initialize an Embedder. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`model`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The embedding model to use. Can be specified as: - A model name string in the format `'provider:model-name'` (e.g., `'openai:text-embedding-3-small'`) - An [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for all embed calls. **`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to defer model validation until first use. Set to `False` to validate the model immediately on construction. **`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` OpenTelemetry instrumentation settings. Set to `True` to enable with defaults, or pass an [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) instance to customize. 
If `None`, uses the value from [`Embedder.instrument_all()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.instrument_all). ##### instrument\_all `@staticmethod` ```python def instrument_all(instrument: InstrumentationSettings | bool = True) -> None ``` Set the default instrumentation options for all embedders where `instrument` is not explicitly set. This is useful for enabling instrumentation globally without modifying each embedder individually. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Instrumentation settings to use as the default. Set to `True` for default settings, `False` to disable, or pass an [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) instance to customize. ##### override ```python def override( model: EmbeddingModel | KnownEmbeddingModelName | str | _utils.Unset = _utils.UNSET, ) -> Iterator[None] ``` Context manager to temporarily override the embedding model. Useful for testing or dynamically switching models. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Temporarily use a different model with embedder.override(model='openai:text-embedding-3-large'): result = await embedder.embed_query('test') print(len(result.embeddings[0])) # 3072 dimensions for large model #> 3072 ``` ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ###### Parameters **`model`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET` The embedding model to use within this context. ##### embed\_query `@async` ```python def embed_query( query: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Embed one or more query texts. Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document. ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings and metadata about the operation. ###### Parameters **`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A single query string or sequence of query strings to embed. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to override the embedder's default settings for this call.
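`embed_query()` pairs naturally with [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents), documented below, in a retrieval flow. A minimal sketch, with cosine similarity computed by hand to keep the example dependency-free:

```python
import math

from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


def cosine(a, b):
    # Plain-Python cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


async def main():
    docs = await embedder.embed_documents(
        ['Paris is the capital of France.', 'The Nile is a river in Africa.']
    )
    query = await embedder.embed_query('What is the capital of France?')
    # Rank documents by similarity to the query embedding.
    scores = [cosine(query.embeddings[0], d) for d in docs.embeddings]
    print(docs.inputs[max(range(len(scores)), key=scores.__getitem__)])
```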
##### embed\_documents `@async` ```python def embed_documents( documents: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Embed one or more document texts. Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document. ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings and metadata about the operation. ###### Parameters **`documents`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A single document string or sequence of document strings to embed. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to override the embedder's default settings for this call. ##### embed `@async` ```python def embed( inputs: str | Sequence[str], input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Embed text inputs with explicit input type specification. This is the low-level embedding method. For most use cases, prefer [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query) or [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents). ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings and metadata about the operation. ###### Parameters **`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] A single string or sequence of strings to embed. **`input_type`** : `EmbedInputType` The type of input, either `'query'` or `'document'`. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to override the embedder's default settings for this call. ##### max\_input\_tokens `@async` ```python def max_input_tokens() -> int | None ``` Get the maximum number of tokens the model can accept as input. ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) -- The maximum token count, or `None` if the limit is unknown for this model. ##### count\_tokens `@async` ```python def count_tokens(text: str) -> int ``` Count the number of tokens in the given text. ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) -- The number of tokens in the text.
###### Parameters **`text`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The text to tokenize and count. ###### Raises - `NotImplementedError` -- If the model doesn't support token counting. - `UserError` -- If the model or tokenizer is not supported. ##### embed\_query\_sync ```python def embed_query_sync( query: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Synchronous version of [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query). ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) ##### embed\_documents\_sync ```python def embed_documents_sync( documents: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Synchronous version of [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents). ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) ##### embed\_sync ```python def embed_sync( inputs: str | Sequence[str], input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult ``` Synchronous version of [`embed()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed). ###### Returns [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) ##### max\_input\_tokens\_sync ```python def max_input_tokens_sync() -> int | None ``` Synchronous version of [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens). ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### count\_tokens\_sync ```python def count_tokens_sync(text: str) -> int ``` Synchronous version of [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens). ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) ### instrument\_embedding\_model ```python def instrument_embedding_model( model: EmbeddingModel, instrument: InstrumentationSettings | bool, ) -> EmbeddingModel ``` Instrument an embedding model with OpenTelemetry/logfire. #### Returns [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) ### merge\_embedding\_settings ```python def merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None, ) -> EmbeddingSettings | None ``` Merge two sets of embedding settings, with overrides taking precedence. #### Returns [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) -- Merged settings, or `None` if both inputs are `None`. #### Parameters **`base`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) Base settings (typically from the embedder or model). **`overrides`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) Settings that should override the base (typically per-call settings). 
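As noted under the `truncate` setting above, [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens) and [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens) can be combined to implement custom truncation logic. A minimal sketch that trims a document to the model's token limit by binary search on character length (note that `count_tokens()` may raise `NotImplementedError` for some models):

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def embed_truncated(text: str):
    limit = await embedder.max_input_tokens()
    if limit is not None and await embedder.count_tokens(text) > limit:
        # Find the longest character prefix that fits within the token limit.
        lo, hi = 0, len(text)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if await embedder.count_tokens(text[:mid]) <= limit:
                lo = mid
            else:
                hi = mid - 1
        text = text[:lo]
    return await embedder.embed_documents(text)
```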
### infer\_embedding\_model ```python def infer_embedding_model( model: EmbeddingModel | KnownEmbeddingModelName | str, provider_factory: Callable[[str], Provider[Any]] = infer_provider, ) -> EmbeddingModel ``` Infer the model from the name. #### Returns [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) ### KnownEmbeddingModelName Known model names that can be used with the `model` parameter of [`Embedder`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder). `KnownEmbeddingModelName` is provided as a concise way to specify an embedding model. **Default:** `TypeAliasType('KnownEmbeddingModelName', Literal['google-gla:gemini-embedding-001', 'google-gla:gemini-embedding-2-preview', 'google-vertex:gemini-embedding-001', 'google-vertex:gemini-embedding-2-preview', 'google-vertex:text-embedding-005', 'google-vertex:text-multilingual-embedding-002', 'openai:text-embedding-ada-002', 'openai:text-embedding-3-small', 'openai:text-embedding-3-large', 'cohere:embed-v4.0', 'cohere:embed-english-v3.0', 'cohere:embed-english-light-v3.0', 'cohere:embed-multilingual-v3.0', 'cohere:embed-multilingual-light-v3.0', 'voyageai:voyage-4-large', 'voyageai:voyage-4', 'voyageai:voyage-4-lite', 'voyageai:voyage-3-large', 'voyageai:voyage-3.5', 'voyageai:voyage-3.5-lite', 'voyageai:voyage-code-3', 'voyageai:voyage-finance-2', 'voyageai:voyage-law-2', 'voyageai:voyage-code-2', 'bedrock:amazon.titan-embed-text-v1', 'bedrock:amazon.titan-embed-text-v2:0', 'bedrock:cohere.embed-english-v3', 'bedrock:cohere.embed-multilingual-v3', 'bedrock:cohere.embed-v4:0', 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0'])`
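A short usage sketch, assuming `infer_embedding_model` is importable from `pydantic_ai.embeddings` as this page suggests:

```python
from pydantic_ai.embeddings import infer_embedding_model

# Resolve a 'provider:model-name' string to a concrete EmbeddingModel.
model = infer_embedding_model('openai:text-embedding-3-small')
print(model.system, model.model_name)
#> openai text-embedding-3-small
```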
###### Raises - `NotImplementedError` -- If the model doesn't support token counting. - `UserError` -- If the model or tokenizer is not supported. ### EmbeddingResult The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` #### Attributes ##### embeddings The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\]\] ##### inputs The original input texts that were embedded. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### input\_type Whether the inputs were embedded as queries or documents. **Type:** `EmbedInputType` ##### model\_name The name of the model that generated these embeddings. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name The name of the provider (e.g., 'openai', 'cohere'). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp When the embedding request was made. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### usage Token usage statistics for this request. **Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)` ##### provider\_details Provider-specific details from the response. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_response\_id Unique identifier for this response from the provider, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### \_\_getitem\_\_ ```python def __getitem__(item: int | str) -> Sequence[float] ``` Get the embedding for an input by index or by the original input text. ###### Returns [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\] -- The embedding vector for the specified input. ###### Parameters **`item`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) Either an integer index or the original input string. ###### Raises - `IndexError` -- If the index is out of range. - `ValueError` -- If the string is not found in the inputs. 
##### cost ```python def cost() -> genai_types.PriceCalculation ``` Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. ###### Returns `genai_types.PriceCalculation` -- A price calculation object with `total_price`, `input_price`, and other cost details. ###### Raises - `LookupError` -- If pricing data is not available for this model/provider. ### EmbedInputType The type of input to the embedding model. - `'query'`: Text that will be used as a search query - `'document'`: Text that will be stored and searched against Some embedding models optimize differently for queries vs documents. **Default:** `Literal['query', 'document']` ### EmbeddingSettings **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings), [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings)) extend this with additional provider-prefixed options. #### Attributes ##### dimensions The number of dimensions for the output embeddings. Supported by: - OpenAI - Cohere - Google - Sentence Transformers - Bedrock - VoyageAI **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### truncate Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens) and [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens) to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. Supported by: - Cohere - Bedrock (Cohere and Nova models) - VoyageAI **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### extra\_headers Extra headers to send to the model. Supported by: - OpenAI - Cohere **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### extra\_body Extra body to send to the model. Supported by: - OpenAI - Cohere **Type:** [`object`](https://docs.python.org/3/glossary.html#term-object) ### merge\_embedding\_settings ```python def merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None, ) -> EmbeddingSettings | None ``` Merge two sets of embedding settings, with overrides taking precedence. #### Returns [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) -- Merged settings, or `None` if both inputs are `None`. 
#### Parameters **`base`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) Base settings (typically from the embedder or model). **`overrides`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) Settings that should override the base (typically per-call settings). ### OpenAIEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for an OpenAI embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported. ### OpenAIEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) OpenAI embedding model implementation. This model works with OpenAI's embeddings API and any [OpenAI-compatible providers](/docs/ai/models/openai#openai-compatible-models). Example: ```python from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel from pydantic_ai.providers.openai import OpenAIProvider # Using OpenAI directly model = OpenAIEmbeddingModel('text-embedding-3-small') # Using an OpenAI-compatible provider model = OpenAIEmbeddingModel( 'text-embedding-3-small', provider=OpenAIProvider(base_url='https://my-provider.com/v1'), ) ``` #### Attributes ##### model\_name The embedding model name. **Type:** `OpenAIEmbeddingModelName` ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: OpenAIEmbeddingModelName, provider: OpenAIEmbeddingsCompatibleProvider | Literal['openai'] | Provider[AsyncOpenAI] = 'openai', settings: EmbeddingSettings | None = None, ) ``` Initialize an OpenAI embedding model. ###### Parameters **`model_name`** : `OpenAIEmbeddingModelName` The name of the OpenAI model to use. See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available options. **`provider`** : `OpenAIEmbeddingsCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'` The provider to use for authentication and API access. Can be: - `'openai'` (default): Uses the standard OpenAI API - A provider name string (e.g., `'azure'`, `'deepseek'`) - A [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) instance for custom configuration See [OpenAI-compatible providers](/docs/ai/models/openai#openai-compatible-models) for a list of supported providers. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model. ### OpenAIEmbeddingModelName Possible OpenAI embeddings model names. See the [OpenAI embeddings documentation](https://platform.openai.com/docs/guides/embeddings) for available models. 
**Default:** `str | LatestOpenAIEmbeddingModelNames` ### CohereEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for a Cohere embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Cohere-specific settings prefixed with `cohere_`. #### Attributes ##### cohere\_max\_tokens The maximum number of tokens to embed. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### cohere\_input\_type The Cohere-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'search_query'`, `'search_document'`, `'classification'`, `'clustering'`, and `'image'`. **Type:** `CohereEmbedInputType` ##### cohere\_truncate The truncation strategy to use: - `'NONE'` (default): Raise an error if input exceeds max tokens. - `'END'`: Truncate the end of the input text. - `'START'`: Truncate the start of the input text. Note: This setting overrides the standard `truncate` boolean setting when specified. **Type:** `V2EmbedRequestTruncate` ### CohereEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Cohere embedding model implementation. This model works with Cohere's embeddings API, which offers multilingual support and various model sizes. Example: ```python from pydantic_ai.embeddings.cohere import CohereEmbeddingModel model = CohereEmbeddingModel('embed-v4.0') ``` #### Attributes ##### base\_url The base URL for the provider API, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_name The embedding model name. **Type:** `CohereEmbeddingModelName` ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: CohereEmbeddingModelName, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', settings: EmbeddingSettings | None = None, ) ``` Initialize a Cohere embedding model. ###### Parameters **`model_name`** : `CohereEmbeddingModelName` The name of the Cohere model to use. See [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cohere'\] | `Provider`\[`AsyncClientV2`\] _Default:_ `'cohere'` The provider to use for authentication and API access. Can be: - `'cohere'` (default): Uses the standard Cohere API - A [`CohereProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.cohere.CohereProvider) instance for custom configuration **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model. ### LatestCohereEmbeddingModelNames Latest Cohere embeddings models. See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models and their capabilities. 
**Default:** `Literal['embed-v4.0', 'embed-english-v3.0', 'embed-english-light-v3.0', 'embed-multilingual-v3.0', 'embed-multilingual-light-v3.0']` ### CohereEmbeddingModelName Possible Cohere embeddings model names. **Default:** `str | LatestCohereEmbeddingModelNames` ### GoogleEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for a Google embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Google-specific settings prefixed with `google_`. #### Attributes ##### google\_task\_type The task type for the embedding. Overrides the automatic task type selection based on `input_type`. See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available options. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### google\_title Optional title for the content being embedded. Only applicable when task\_type is `RETRIEVAL_DOCUMENT`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### GoogleEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Google embedding model implementation. This model works with Google's embeddings API via the `google-genai` SDK, supporting both the Gemini API (Google AI Studio) and Vertex AI. Example: ```python from pydantic_ai.embeddings.google import GoogleEmbeddingModel from pydantic_ai.providers.google import GoogleProvider # Using Gemini API (requires GOOGLE_API_KEY env var) model = GoogleEmbeddingModel('gemini-embedding-001') # Using Vertex AI model = GoogleEmbeddingModel( 'gemini-embedding-001', provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'), ) ``` #### Attributes ##### model\_name The embedding model name. **Type:** `GoogleEmbeddingModelName` ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: GoogleEmbeddingModelName, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', settings: EmbeddingSettings | None = None, ) ``` Initialize a Google embedding model. ###### Parameters **`model_name`** : `GoogleEmbeddingModelName` The name of the Google model to use. See [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['google-gla', 'google-vertex'\] | `Provider`\[`Client`\] _Default:_ `'google-gla'` The provider to use for authentication and API access. Can be: - `'google-gla'` (default): Uses the Gemini API (Google AI Studio) - `'google-vertex'`: Uses Vertex AI - A [`GoogleProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.google.GoogleProvider) instance for custom configuration **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model. ### LatestGoogleGLAEmbeddingModelNames Latest Google Gemini API (GLA) embedding models. 
See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models and their capabilities. **Default:** `Literal['gemini-embedding-001', 'gemini-embedding-2-preview']` ### LatestGoogleVertexEmbeddingModelNames Latest Google Vertex AI embedding models. See the [Vertex AI Embeddings documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) for available models and their capabilities. **Default:** `Literal['gemini-embedding-001', 'gemini-embedding-2-preview', 'text-embedding-005', 'text-multilingual-embedding-002']` ### LatestGoogleEmbeddingModelNames All latest Google embedding models (union of GLA and Vertex AI models). **Default:** `LatestGoogleGLAEmbeddingModelNames | LatestGoogleVertexEmbeddingModelNames` ### GoogleEmbeddingModelName Possible Google embeddings model names. **Default:** `str | LatestGoogleEmbeddingModelNames` ### BedrockEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for a Bedrock embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Bedrock-specific settings prefixed with `bedrock_`. All settings are optional - if not specified, model defaults are used. **Note on `dimensions` parameter support:** - **Titan v1** (`amazon.titan-embed-text-v1`): Not supported (fixed: 1536) - **Titan v2** (`amazon.titan-embed-text-v2:0`): Supported (default: 1024, accepts 256/384/1024) - **Cohere v3** (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`): Not supported (fixed: 1024) - **Cohere v4** (`cohere.embed-v4:0`): Supported (default: 1536, accepts 256/512/1024/1536) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: 3072, accepts 256/384/1024/3072) Unsupported settings are silently ignored. **Note on `truncate` parameter support:** - **Titan models** (`amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`): Not supported - **Cohere models** (all versions): Supported (default: `False`, maps to `'END'` when `True`) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: `False`, maps to `'END'` when `True`) For fine-grained truncation control, use model-specific settings: `bedrock_cohere_truncate` or `bedrock_nova_truncate`. #### Attributes ##### bedrock\_titan\_normalize Whether to normalize embedding vectors for Titan models. **Supported by:** `amazon.titan-embed-text-v2:0` (default: `True`) **Not supported by:** `amazon.titan-embed-text-v1` (silently ignored) When enabled, vectors are normalized for direct cosine similarity calculations. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### bedrock\_cohere\_max\_tokens The maximum number of tokens to embed for Cohere models. **Supported by:** `cohere.embed-v4:0` (default: 128000) **Not supported by:** `cohere.embed-english-v3`, `cohere.embed-multilingual-v3` (silently ignored) **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### bedrock\_cohere\_input\_type The input type for Cohere models. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`. 
**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['search\_document', 'search\_query', 'classification', 'clustering'\] ##### bedrock\_cohere\_truncate The truncation strategy for Cohere models. Overrides base `truncate` setting. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['NONE', 'START', 'END'\] ##### bedrock\_nova\_truncate The truncation strategy for Nova models. Overrides base `truncate` setting. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['NONE', 'START', 'END'\] ##### bedrock\_nova\_embedding\_purpose The embedding purpose for Nova models. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`. Note: Multimodal-specific purposes (`'IMAGE_RETRIEVAL'`, `'VIDEO_RETRIEVAL'`, `'DOCUMENT_RETRIEVAL'`, `'AUDIO_RETRIEVAL'`) are not supported as this embedding client only accepts text input. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['GENERIC\_INDEX', 'GENERIC\_RETRIEVAL', 'TEXT\_RETRIEVAL', 'CLASSIFICATION', 'CLUSTERING'\] ##### bedrock\_inference\_profile An [inference profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html) ARN to use as the `modelId` in API requests. When set, this value is used as the `modelId` in `invoke_model` API calls instead of the base `model_name`. This allows you to pass the base model name (e.g. `'amazon.titan-embed-text-v2:0'`) as `model_name` for detecting model capabilities, while routing requests through an inference profile for cost tracking or cross-region inference. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### bedrock\_max\_concurrency Maximum number of concurrent requests for models that don't support batch embedding. **Applies to:** `amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`, `amazon.nova-2-multimodal-embeddings-v1:0` When embedding multiple texts with models that only support single-text requests, this controls how many requests run in parallel. Defaults to 5. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ### BedrockEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Bedrock embedding model implementation. This model works with AWS Bedrock's embedding models including Amazon Titan Embeddings and Cohere Embed models. 
Example: ```python from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider # Using default AWS credentials model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0') # Using explicit credentials model = BedrockEmbeddingModel( 'cohere.embed-english-v3', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='...', aws_secret_access_key='...', ), ) ``` #### Attributes ##### base\_url The base URL for the provider API. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_name The embedding model name. **Type:** `BedrockEmbeddingModelName` ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: BedrockEmbeddingModelName, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', settings: EmbeddingSettings | None = None, ) ``` Initialize a Bedrock embedding model. ###### Parameters **`model_name`** : `BedrockEmbeddingModelName` The name of the Bedrock embedding model to use. See [Bedrock embedding models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available options. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['bedrock'\] | `Provider`\[`BaseClient`\] _Default:_ `'bedrock'` The provider to use for authentication and API access. Can be: - `'bedrock'` (default): Uses default AWS credentials - A [`BedrockProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.bedrock.BedrockProvider) instance for custom configuration **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model. ##### max\_input\_tokens `@async` ```python def max_input_tokens() -> int | None ``` Get the maximum number of tokens that can be input to the model. ###### Returns [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) ### LatestBedrockEmbeddingModelNames Latest Bedrock embedding model names. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available embedding models. **Default:** `Literal['amazon.titan-embed-text-v1', 'amazon.titan-embed-text-v2:0', 'cohere.embed-english-v3', 'cohere.embed-multilingual-v3', 'cohere.embed-v4:0', 'amazon.nova-2-multimodal-embeddings-v1:0']` ### BedrockEmbeddingModelName Possible Bedrock embedding model names. **Default:** `str | LatestBedrockEmbeddingModelNames` ### VoyageAIEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for a VoyageAI embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus VoyageAI-specific settings prefixed with `voyageai_`. #### Attributes ##### voyageai\_input\_type The VoyageAI-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'query'`, `'document'`, or `'none'` for direct embedding without prefix. 
**Type:** `VoyageAIEmbedInputType` ### VoyageAIEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) VoyageAI embedding model implementation. VoyageAI provides state-of-the-art embedding models optimized for retrieval, with specialized models for code, finance, and legal domains. Example: ```python from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel model = VoyageAIEmbeddingModel('voyage-3.5') ``` #### Attributes ##### base\_url The base URL for the provider API. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_name The embedding model name. **Type:** `VoyageAIEmbeddingModelName` ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: VoyageAIEmbeddingModelName, provider: Literal['voyageai'] | Provider[AsyncClient] = 'voyageai', settings: EmbeddingSettings | None = None, ) ``` Initialize a VoyageAI embedding model. ###### Parameters **`model_name`** : `VoyageAIEmbeddingModelName` The name of the VoyageAI model to use. See [VoyageAI models](https://docs.voyageai.com/docs/embeddings) for available options. **`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['voyageai'\] | `Provider`\[`AsyncClient`\] _Default:_ `'voyageai'` The provider to use for authentication and API access. Can be: - `'voyageai'` (default): Uses the standard VoyageAI API - A [`VoyageAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.voyageai.VoyageAIProvider) instance for custom configuration **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model. ### LatestVoyageAIEmbeddingModelNames Latest VoyageAI embedding models. See [VoyageAI Embeddings](https://docs.voyageai.com/docs/embeddings) for available models and their capabilities. **Default:** `Literal['voyage-4-large', 'voyage-4', 'voyage-4-lite', 'voyage-3-large', 'voyage-3.5', 'voyage-3.5-lite', 'voyage-code-3', 'voyage-finance-2', 'voyage-law-2', 'voyage-code-2']` ### VoyageAIEmbeddingModelName Possible VoyageAI embedding model names. **Default:** `str | LatestVoyageAIEmbeddingModelNames` ### VoyageAIEmbedInputType VoyageAI embedding input types. - `'query'`: For search queries; prepends retrieval-optimized prefix. - `'document'`: For documents; prepends document retrieval prefix. - `'none'`: Direct embedding without any prefix. **Default:** `Literal['query', 'document', 'none']` ### SentenceTransformersEmbeddingSettings **Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) Settings used for a Sentence-Transformers embedding model request. All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Sentence-Transformers-specific settings prefixed with `sentence_transformers_`. #### Attributes ##### sentence\_transformers\_device Device to run inference on. Examples: `'cpu'`, `'cuda'`, `'cuda:0'`, `'mps'` (Apple Silicon). 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### sentence\_transformers\_normalize\_embeddings Whether to L2-normalize embeddings. When `True`, all embeddings will have unit length, which is useful for cosine similarity calculations. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### sentence\_transformers\_batch\_size Batch size to use during encoding. Larger batches may be faster but require more memory. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ### SentenceTransformerEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Local embedding model using the `sentence-transformers` library. This model runs embeddings locally on your machine, which is useful for: - Privacy-sensitive applications where data shouldn't leave your infrastructure - Reducing API costs for high-volume embedding workloads - Offline or air-gapped environments Models are downloaded from Hugging Face on first use. See the [Sentence-Transformers documentation](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) for available models. Example: ```python from sentence_transformers import SentenceTransformer from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformerEmbeddingModel, ) # Using a model name (downloads from Hugging Face) model = SentenceTransformerEmbeddingModel('all-MiniLM-L6-v2') # Using an existing SentenceTransformer instance st_model = SentenceTransformer('all-MiniLM-L6-v2') model = SentenceTransformerEmbeddingModel(st_model) ``` #### Attributes ##### base\_url No base URL -- runs locally. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### model\_name The embedding model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The embedding model provider/system identifier. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model: SentenceTransformer | str, settings: EmbeddingSettings | None = None, ) -> None ``` Initialize a Sentence-Transformers embedding model. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`model`** : `SentenceTransformer` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to use. Can be: - A model name from Hugging Face (e.g., `'all-MiniLM-L6-v2'`) - A local path to a saved model - An existing `SentenceTransformer` instance **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Model-specific [`SentenceTransformersEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformersEmbeddingSettings) to use as defaults for this model. ### TestEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) A mock embedding model for testing. This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. 
Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` #### Attributes ##### last\_settings The settings used in the most recent embed call. **Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### model\_name The embedding model name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### system The embedding model provider. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_init\_\_ ```python def __init__( model_name: str = 'test', provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ) ``` Initialize the test embedding model. ###### Parameters **`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'` The model name to report in results. **`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'` The provider name to report in results. **`dimensions`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `8` The number of dimensions for the generated embeddings. **`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional default settings for the model. ### WrapperEmbeddingModel **Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. #### Attributes ##### wrapped The underlying embedding model being wrapped. **Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) **Default:** `infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped` ##### settings Get the settings from the wrapped embedding model. **Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_init\_\_ ```python def __init__(wrapped: EmbeddingModel | str) ``` Initialize the wrapper with an embedding model. ###### Parameters **`wrapped`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The model to wrap. Can be an [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance or a model name string (e.g., `'openai:text-embedding-3-small'`). ### InstrumentedEmbeddingModel **Bases:** `WrapperEmbeddingModel` Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. 
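Example (a minimal sketch -- the `pydantic_ai.embeddings` import path for this class is an assumption based on this page, and the model name is illustrative):

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import InstrumentedEmbeddingModel  # import path assumed

# Wrap a model so embedding requests emit OpenTelemetry spans; per
# WrapperEmbeddingModel, a model name string is resolved via
# infer_embedding_model().
model = InstrumentedEmbeddingModel('openai:text-embedding-3-small')
embedder = Embedder(model)


async def main():
    result = await embedder.embed_query('What is AI?')
    print(result.usage.input_tokens)
```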
#### Attributes ##### instrumentation\_settings Instrumentation settings for this model. **Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()` ### instrument\_embedding\_model ```python def instrument_embedding_model( model: EmbeddingModel, instrument: InstrumentationSettings | bool, ) -> EmbeddingModel ``` Instrument an embedding model with OpenTelemetry/logfire. #### Returns [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) --- # [pydantic_ai.exceptions](https://pydantic.dev/docs/ai/api/pydantic-ai/exceptions/) # pydantic\_ai.exceptions ### ModelRetry **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise to request a model retry. Can be raised from tool functions, output validators, and capability hooks (such as `after_model_request`, `after_tool_execute`, etc.) to send a retry prompt back to the model asking it to try again. #### Attributes ##### message The message to return to the model. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` #### Methods ##### \_\_get\_pydantic\_core\_schema\_\_ `@classmethod` ```python def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema ``` Pydantic core schema to allow `ModelRetry` to be (de)serialized. ###### Returns `core_schema.CoreSchema` ### CallDeferred **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise when a tool call should be deferred. See [tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information. #### Constructor Parameters **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`. ### ApprovalRequired **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more information. #### Constructor Parameters **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`. ### SkipModelRequest **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise in before/wrap model request hooks to skip the model call. The provided response will be used instead of calling the model. Note: when raised in `before_model_request`, any message history modifications made by earlier capabilities in that hook will not be persisted to the agent's message history, since the request preparation is aborted. 
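As an illustration of the exceptions above, here is a minimal sketch of raising [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) from a tool function (the tool logic and model name are illustrative):

```python
from pydantic_ai import Agent, ModelRetry

agent = Agent(model='openai:gpt-5.2')


@agent.tool_plain
def lookup_user_id(name: str) -> int:
    """Return the ID of a known user."""
    users = {'alice': 1, 'bob': 2}
    if name.lower() not in users:
        # Sent back to the model as a retry prompt, asking it to try again
        raise ModelRetry(f'No user named {name!r}; ask the user for the exact spelling.')
    return users[name.lower()]
```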
### SkipToolValidation **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise in before/wrap tool validate hooks to skip validation. The provided args will be used as the validated arguments. ### SkipToolExecution **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception to raise in before/wrap tool execute hooks to skip execution. The provided result will be used as the tool result. ### UserError **Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError) Error caused by a usage mistake by the application developer -- You! #### Attributes ##### message Description of the mistake. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### AgentRunError **Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError) Base class for errors occurring during an agent run. #### Attributes ##### message The error message. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### UsageLimitExceeded **Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError) Error raised when a Model's usage exceeds the specified limits. ### ConcurrencyLimitExceeded **Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError) Error raised when the concurrency queue depth exceeds max\_queued. ### UnexpectedModelBehavior **Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError) Error caused by unexpected Model behavior, e.g. an unexpected response code. #### Attributes ##### message Description of the unexpected behavior. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ##### body The body of the response, if available. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `json.dumps(json.loads(body), indent=2)` ### ContentFilterError **Bases:** [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) Raised when content filtering is triggered by the model provider resulting in an empty response. ### ModelAPIError **Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError) Raised when a model provider API request fails. #### Attributes ##### model\_name The name of the model associated with the error. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `model_name` ### ModelHTTPError **Bases:** [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError) Raised when a model provider response has a status code of 4xx or 5xx. #### Attributes ##### status\_code The HTTP status code returned by the API. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `status_code` ##### body The body of the response, if available. **Type:** [`object`](https://docs.python.org/3/glossary.html#term-object) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `body` ### FallbackExceptionGroup **Bases:** `ExceptionGroup[Any]` A group of exceptions that can be raised when all fallback models fail. 
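A minimal sketch of handling these request errors when using fallback models (`FallbackModel` lives in `pydantic_ai.models.fallback`; the model names are illustrative, and `except*` requires Python 3.11+):

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.models.fallback import FallbackModel

# Try the first model; fall back to the second if the request fails.
model = FallbackModel('openai:gpt-5.2', 'anthropic:claude-sonnet-4-6')
agent = Agent(model)

try:
    result = agent.run_sync('Hello')
    print(result.output)
except* ModelHTTPError as group:  # all models failed: a FallbackExceptionGroup
    for exc in group.exceptions:
        print(f'{exc.model_name} failed with status {exc.status_code}')
```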
### ToolRetryError **Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) Exception used to signal a `ToolRetry` message should be returned to the LLM. ### IncompleteToolCall **Bases:** [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) Error raised when a model stops due to token limit while emitting a tool call. --- # [pydantic_ai.ext](https://pydantic.dev/docs/ai/api/pydantic-ai/ext/) # pydantic\_ai.ext ### LangChainToolset **Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) A toolset that wraps LangChain tools. ### tool\_from\_langchain ```python def tool_from_langchain(langchain_tool: LangChainTool) -> Tool ``` Creates a Pydantic AI tool proxy from a LangChain tool. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) -- A Pydantic AI tool that corresponds to the LangChain tool. #### Parameters **`langchain_tool`** : `LangChainTool` The LangChain tool to wrap. ### ACIToolset **Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) A toolset that wraps ACI.dev tools. ### tool\_from\_aci ```python def tool_from_aci(aci_function: str, linked_account_owner_id: str) -> Tool ``` Creates a Pydantic AI tool proxy from an ACI.dev function. #### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) -- A Pydantic AI tool that corresponds to the ACI.dev tool. #### Parameters **`aci_function`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The ACI.dev function to wrap. **`linked_account_owner_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The ACI user ID to execute the function on behalf of. --- # [pydantic_ai.format_prompt](https://pydantic.dev/docs/ai/api/pydantic-ai/format_prompt/) # pydantic\_ai.format\_prompt ### format\_as\_xml ```python def format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = 'item', none_str: str = 'null', indent: str | None = ' ', include_field_info: Literal['once'] | bool = False, ) -> str ``` Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `Decimal`, `date`, `datetime`, `time`, `timedelta`, `UUID`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Example: format\_as\_xml\_example.py ```python from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' <user> <name>John</name> <height>6</height> <weight>200</weight> </user> ''' ``` #### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- XML representation of the object. #### Parameters **`obj`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) Python object to serialize to XML. **`root_tag`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Outer tag to wrap the XML in; use `None` to omit the outer tag. **`item_tag`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'item'` Tag to use for each item in an iterable (e.g. list); this is overridden by the class name for dataclasses and Pydantic models. **`none_str`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'null'` String to use for `None` values.
**`indent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `' '` Indentation string to use for pretty printing. **`include_field_info`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['once'\] | [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to include attributes like Pydantic `Field` attributes and dataclasses `field()` `metadata` as XML attributes. In both cases the allowed `Field` attributes and `field()` metadata keys are `title` and `description`. If a field is repeated in the data (e.g. in a list), setting `'once'` includes the attributes only on the first occurrence of the XML element for that field. --- # [pydantic_ai.function_signature](https://pydantic.dev/docs/ai/api/pydantic-ai/function_signature/) # pydantic\_ai.function\_signature Generate function signatures from functions and JSON schemas. This module provides utilities to represent tool definitions as human-readable function signatures, which LLMs can understand more easily than raw JSON schemas. Used by code mode to present tools as callable functions. ### SimpleTypeExpr A simple named type like `str`, `int`, `Any`, `None`. ### LiteralTypeExpr A Literal type expression like `Literal['a', 'b']` or `Literal[42]`. ### GenericTypeExpr A generic type expression like `list[User]`, `dict[str, User]`, `tuple[int, str]`. ### UnionTypeExpr A union type expression like `User | None`, `str | int`. ### TypeFieldSignature A single field in a TypedDict-style type definition. #### Methods ##### \_\_str\_\_ ```python def __str__() -> str ``` Render this field as a line in a TypedDict class body. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### TypeSignature A TypedDict-style class definition with named fields. #### Attributes ##### display\_name The type name, with tool-name prefix applied if rendering context is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### \_\_str\_\_ ```python def __str__() -> str ``` Return the type name (for use in type expressions like `def foo(x: User)`). ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### render\_definition ```python def render_definition( owner_name: str | None = None, conflicting_type_names: frozenset[str] = frozenset(), ) -> str ``` Render the full TypedDict class definition. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ###### Parameters **`owner_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The owning tool name, used to build prefixed type names for conflicting types (e.g. `get_user_Address`). **`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset()` Set of type names that need tool-name prefixes (from `get_conflicting_type_names`). Only effective when `owner_name` is also provided. ##### structurally\_equal ```python def structurally_equal(other: TypeSignature) -> bool ``` Compare two TypeSignatures structurally, ignoring descriptions. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### FunctionParam A single parameter in a function signature.
#### Methods ##### \_\_str\_\_ ```python def __str__() -> str ``` Render this parameter as a function parameter string. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### FunctionSignature Function signature shape with referenced type definitions. This class holds the structural data (params, return type, referenced types) needed to render a function signature. Name and description can be overridden at render time (e.g. from a `ToolDefinition`). #### Attributes ##### params Function parameters, all rendered as keyword-only (JSON schema doesn't distinguish positional/keyword). **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `FunctionParam`\] **Default:** `field(default_factory=(dict[str, FunctionParam]))` ##### return\_type The return type expression. **Type:** `TypeExpr` ##### referenced\_types TypedDict class definitions needed by the signature. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`TypeSignature`\] **Default:** `field(default_factory=(list[TypeSignature]))` ##### is\_async Whether the underlying function is async. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` #### Methods ##### render ```python def render( body: str, name: str | None = None, description: str | None = None, is_async: bool | None = None, conflicting_type_names: frozenset[str] = frozenset(), ) -> str ``` Render the signature with a specific body. Sets `_type_name_overrides` so that dedup-prefixed types resolve correctly during rendering. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ###### Parameters **`body`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The function body (e.g. `'...'` or `'return await tool()'`). **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The function name (also used for dedup prefix resolution). Falls back to `self.name`. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional docstring to include. Falls back to `self.description`. **`is_async`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Override async rendering. If `None`, uses `self.is_async`. **`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset()` Set of type names that need tool-name prefixes (from `get_conflicting_type_names`). ##### from\_schema `@classmethod` ```python def from_schema( cls, name: str, parameters_schema: dict[str, Any], return_schema: dict[str, Any] | None = None, ) -> FunctionSignature ``` Build a FunctionSignature from JSON schemas. `name` is stored on the resulting signature and also used for generating fallback type names (e.g. `GetUserAddress`) when the schema has no `title`. Parameter and return schemas are processed independently -- each resolves `$ref`s against its own `$defs`. Name collisions between parameter and return types (e.g. both define a `User` `$def` with different structures) are handled by `get_conflicting_type_names` at a later stage. 
###### Returns `FunctionSignature` ##### get\_conflicting\_type\_names `@staticmethod` ```python def get_conflicting_type_names(signatures: list[FunctionSignature]) -> frozenset[str] ``` Identify TypedDict name conflicts across multiple tool signatures. Each signature keeps all its referenced types (so it remains self-contained), but identical types (same name and structure) are unified to the same object instance. Returns the set of type names that have conflicts (same name, different structure) and need tool-name prefixes at render time. Pass this set to `FunctionSignature.render(conflicting_type_names=...)`. Use `collect_unique_referenced_types()` when rendering to emit each definition once. ###### Returns [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### collect\_unique\_referenced\_types `@staticmethod` ```python def collect_unique_referenced_types( signatures: list[FunctionSignature], ) -> list[TypeSignature] ``` Collect unique TypeSignature objects from signatures, deduplicating by identity. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`TypeSignature`\] ##### render\_type\_definitions `@staticmethod` ```python def render_type_definitions( signatures: list[FunctionSignature], conflicting_type_names: frozenset[str], ) -> list[str] ``` Render unique TypedDict definitions for a set of function signatures. For types whose names conflict across signatures (as identified by `get_conflicting_type_names`), each definition is rendered with a tool-name prefix (e.g. `get_user_Address`). ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] -- A list of rendered TypedDict class definitions as strings. ###### Parameters **`signatures`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[`FunctionSignature`\] The function signatures (after `get_conflicting_type_names`). **`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] The set returned by `get_conflicting_type_names`. ### TypeExpr A type expression node in the signature's type tree. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'TypeSignature | SimpleTypeExpr | LiteralTypeExpr | GenericTypeExpr | UnionTypeExpr'` --- # [pydantic_ai.mcp](https://pydantic.dev/docs/ai/api/pydantic-ai/mcp/) # pydantic\_ai.mcp ### MCPError **Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError) Raised when an MCP server returns an error response. This exception wraps error responses from MCP servers, following the ErrorData schema from the MCP specification. #### Attributes ##### message The error message. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ##### code The error code returned by the server. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `code` ##### data Additional information about the error, if provided by the server. 
**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `data` #### Methods ##### from\_mcp\_sdk `@classmethod` ```python def from_mcp_sdk(cls, error: mcp_exceptions.McpError) -> MCPError ``` Create an MCPError from an MCP SDK McpError. ###### Returns `MCPError` ###### Parameters **`error`** : `mcp_exceptions.McpError` An McpError from the MCP SDK. ### ResourceAnnotations Additional properties describing MCP entities. See the [resource annotations in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). #### Attributes ##### audience Intended audience for this entity. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`mcp_types.Role`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### priority Priority level for this entity, ranging from 0.0 to 1.0. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`float`](https://docs.python.org/3/library/functions.html#float), `Field`(`ge`\=0.0, `le`\=1.0)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### from\_mcp\_sdk `@classmethod` ```python def from_mcp_sdk(cls, mcp_annotations: mcp_types.Annotations) -> ResourceAnnotations ``` Convert from MCP SDK Annotations to ResourceAnnotations. ###### Returns `ResourceAnnotations` ###### Parameters **`mcp_annotations`** : `mcp_types.Annotations` The MCP SDK annotations object. ### BaseResource **Bases:** `ABC` Base class for MCP resources. #### Attributes ##### name The programmatic name of the resource. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### title Human-readable title for UI contexts. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### description A description of what this resource represents. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### mime\_type The MIME type of the resource, if known. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### annotations Optional annotations for the resource. **Type:** `ResourceAnnotations` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### metadata Optional metadata for the resource. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### Resource **Bases:** `BaseResource` A resource that can be read from an MCP server. See the [resources in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources). #### Attributes ##### uri The URI of the resource. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### size The size of the raw resource content in bytes (before base64 encoding), if known. 
**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### from\_mcp\_sdk `@classmethod` ```python def from_mcp_sdk(cls, mcp_resource: mcp_types.Resource) -> Resource ``` Convert from MCP SDK Resource to PydanticAI Resource. ###### Returns `Resource` ###### Parameters **`mcp_resource`** : `mcp_types.Resource` The MCP SDK Resource object. ### ResourceTemplate **Bases:** `BaseResource` A template for parameterized resources on an MCP server. See the [resource templates in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#resource-templates). #### Attributes ##### uri\_template URI template (RFC 6570) for constructing resource URIs. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### from\_mcp\_sdk `@classmethod` ```python def from_mcp_sdk(cls, mcp_template: mcp_types.ResourceTemplate) -> ResourceTemplate ``` Convert from MCP SDK ResourceTemplate to PydanticAI ResourceTemplate. ###### Returns `ResourceTemplate` ###### Parameters **`mcp_template`** : `mcp_types.ResourceTemplate` The MCP SDK ResourceTemplate object. ### ServerCapabilities Capabilities that an MCP server supports. #### Attributes ##### experimental Experimental, non-standard capabilities that the server supports. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### logging Whether the server supports sending log messages to the client. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### prompts Whether the server offers any prompt templates. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### prompts\_list\_changed Whether the server will emit notifications when the list of prompts changes. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### resources Whether the server offers any resources to read. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### resources\_list\_changed Whether the server will emit notifications when the list of resources changes. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### tools Whether the server offers any tools to call. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### tools\_list\_changed Whether the server will emit notifications when the list of tools changes. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### completions Whether the server offers autocompletion suggestions for prompts and resources. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` #### Methods ##### from\_mcp\_sdk `@classmethod` ```python def from_mcp_sdk( cls, mcp_capabilities: mcp_types.ServerCapabilities, ) -> ServerCapabilities ``` Convert from MCP SDK ServerCapabilities to PydanticAI ServerCapabilities. ###### Returns `ServerCapabilities` ###### Parameters **`mcp_capabilities`** : `mcp_types.ServerCapabilities` The MCP SDK ServerCapabilities object. ### MCPServer **Bases:** `AbstractToolset[Any]`, `ABC` Base class for attaching agents to MCP servers. 
See [https://modelcontextprotocol.io](https://modelcontextprotocol.io) for more information. #### Attributes ##### tool\_prefix A prefix to add to all tools that are registered with the server. If set, a trailing underscore (`_`) is added, e.g. with `tool_prefix='foo'`, a tool named `bar` will be registered as `foo_bar`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_prefix` ##### log\_level The log level to set when connecting to the server, if any. See [https://modelcontextprotocol.io/specification/2025-03-26/server/utilities/logging#logging](https://modelcontextprotocol.io/specification/2025-03-26/server/utilities/logging#logging) for more details. If `None`, no log level will be set. **Type:** `mcp_types.LoggingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `log_level` ##### log\_handler A handler for logging messages from the server. **Type:** `LoggingFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `log_handler` ##### timeout The timeout in seconds to wait for the client to initialize. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) **Default:** `timeout` ##### read\_timeout Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) **Default:** `read_timeout` ##### process\_tool\_call Hook to customize tool calling and optionally pass extra metadata; see the sketch after these attributes. **Type:** `ProcessToolCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `process_tool_call` ##### allow\_sampling Whether to allow MCP sampling through this client. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `allow_sampling` ##### sampling\_model The model to use for sampling. **Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `sampling_model` ##### max\_retries The maximum number of times to retry a tool call. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `max_retries` ##### elicitation\_callback Callback function to handle elicitation requests from the server. **Type:** `ElicitationFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `elicitation_callback` ##### cache\_tools Whether to cache the list of tools. When enabled (default), tools are fetched once and cached until either: - The server sends a `notifications/tools/list_changed` notification - `MCPServer.__aexit__` is called (when the last context exits) Set to `False` for servers that change tools dynamically without sending notifications. Note: When using durable execution (Temporal, DBOS), tool definitions are additionally cached at the wrapper level across activities/steps, to avoid redundant MCP connections. This wrapper-level cache is not invalidated by `tools/list_changed` notifications. Set to `False` to disable all caching if tools may change during a workflow. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `cache_tools` ##### cache\_resources Whether to cache the list of resources. When enabled (default), resources are fetched once and cached until either: - The server sends a `notifications/resources/list_changed` notification - `MCPServer.__aexit__` is called (when the last context exits) Set to `False` for servers that change resources dynamically without sending notifications. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `cache_resources` ##### include\_instructions Whether to include the server's instructions in the agent's instructions. Defaults to `False` for backward compatibility. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `include_instructions` ##### include\_return\_schema Whether to include return schemas in tool definitions sent to the model. When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `include_return_schema` ##### server\_info Access the information sent by the MCP server during initialization. **Type:** `mcp_types.Implementation` ##### capabilities Access the capabilities advertised by the MCP server during initialization. **Type:** `ServerCapabilities` ##### instructions Access the instructions sent by the MCP server during initialization. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### is\_running Check if the MCP server is running. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)
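For example, a `process_tool_call` hook can wrap every MCP tool call to attach request-level metadata. A minimal sketch, following the `ProcessToolCallback` and `CallToolFunc` shapes documented at the end of this page (the server command is a hypothetical placeholder):

```python
from typing import Any

from pydantic_ai import RunContext
from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult


async def process_tool_call(
    ctx: RunContext[Any],
    call_tool: CallToolFunc,
    name: str,
    tool_args: dict[str, Any],
) -> ToolResult:
    """Forward the run's deps as request-level metadata (assumes they are JSON-serializable)."""
    return await call_tool(name, tool_args, {'deps': ctx.deps})


server = MCPServerStdio(
    'python',
    args=['my_mcp_server.py'],  # hypothetical stdio server script
    process_tool_call=process_tool_call,
)
```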
#### Methods ##### client\_streams `@abstractmethod` `@async` ```python def client_streams( ) -> AsyncIterator[tuple[MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage]]] ``` Create the streams for the MCP server. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`MemoryObjectReceiveStream`\[`SessionMessage` | [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)\], `MemoryObjectSendStream`\[`SessionMessage`\]\]\] ##### get\_instructions `@async` ```python def get_instructions(ctx: RunContext[Any]) -> messages.InstructionPart | None ``` Return the MCP server's instructions for how to use its tools. If [`include_instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_instructions) is `True`, returns the [`instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.instructions) sent by the MCP server during initialization. Otherwise, returns `None`. Instructions from external servers are marked as dynamic since they may change between connections. ###### Returns [`messages.InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- An `InstructionPart` with the server's instructions if `include_instructions` is enabled, otherwise `None`. ###### Parameters **`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The run context for this agent run.
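For example, a minimal sketch of opting in to server-provided instructions with `include_instructions` (the server command is a hypothetical placeholder):

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Forward the server's own instructions into the agent's instructions.
server = MCPServerStdio(
    'python',
    args=['my_mcp_server.py'],  # hypothetical stdio server script
    include_instructions=True,
)
agent = Agent('openai:gpt-5.2', toolsets=[server])
```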
##### list\_tools `@async` ```python def list_tools() -> list[mcp_types.Tool] ``` Retrieve tools that are currently active on the server. Tools are cached by default, with cache invalidation on: - `notifications/tools/list_changed` notifications from the server - `__aexit__` when the last context exits Set `cache_tools=False` for servers that change tools without sending notifications. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`mcp_types.Tool`\] ##### direct\_call\_tool `@async` ```python def direct_call_tool( name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult ``` Call a tool on the server. ###### Returns `ToolResult` -- The result of the tool call. ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the tool to call. **`args`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The arguments to pass to the tool. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional request-level metadata. ###### Raises - `ModelRetry` -- If the tool call fails. ##### list\_resources `@async` ```python def list_resources() -> list[Resource] ``` Retrieve resources that are currently present on the server. Resources are cached by default, with cache invalidation on: - `notifications/resources/list_changed` notifications from the server - `__aexit__` when the last context exits Set `cache_resources=False` for servers that change resources without sending notifications. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Resource`\] ###### Raises - `MCPError` -- If the server returns an error. ##### list\_resource\_templates `@async` ```python def list_resource_templates() -> list[ResourceTemplate] ``` Retrieve resource templates that are currently present on the server. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ResourceTemplate`\] ###### Raises - `MCPError` -- If the server returns an error. ##### read\_resource `@async` ```python def read_resource( uri: str, ) -> str | messages.BinaryContent | list[str | messages.BinaryContent] def read_resource( uri: Resource, ) -> str | messages.BinaryContent | list[str | messages.BinaryContent] ``` Read the contents of a specific resource by URI. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\] -- The resource contents. If the resource has a single content item, returns that item directly. [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\] -- If the resource has multiple content items, returns a list of items. ###### Parameters **`uri`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Resource` The URI of the resource to read, or a Resource object. ###### Raises - `MCPError` -- If the server returns an error.
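Putting `list_resources` and `read_resource` together, a sketch that enumerates and reads every resource a server exposes (the server command is a hypothetical placeholder and assumes the server advertises the resources capability):

```python
import asyncio

from pydantic_ai.mcp import MCPServerStdio


async def main():
    server = MCPServerStdio('python', args=['my_mcp_server.py'])  # hypothetical server
    async with server:  # starts the subprocess and initializes the session
        for resource in await server.list_resources():
            content = await server.read_resource(resource.uri)
            print(resource.name, resource.mime_type, content)


asyncio.run(main())
```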
##### \_\_aenter\_\_ `@async` ```python def __aenter__() -> Self ``` Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio), the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ### MCPServerStdio **Bases:** `MCPServer` Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) for more information. Note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. #### Attributes ##### command The command to run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `command` ##### args The arguments to pass to the command. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `args` ##### env The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ` (see the sketch below). **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `env` ##### cwd The working directory to use when spawning the process. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `cwd`
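For example, spawning a local server script with the parent process's environment and an explicit working directory (the script and paths are hypothetical):

```python
import os

from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(
    'python',
    args=['my_mcp_server.py'],  # hypothetical server script
    env=os.environ,  # the subprocess inherits no environment variables by default
    cwd='/srv/my-project',  # hypothetical working directory
)
```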
#### Methods ##### \_\_init\_\_ ```python def __init__( command: str, args: Sequence[str], env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, cache_tools: bool = True, cache_resources: bool = True, include_instructions: bool = False, include_return_schema: bool | None = None, id: str | None = None, client_info: mcp_types.Implementation | None = None, ) ``` Build a new MCP server. ###### Parameters **`command`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The command to run. **`args`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] The arguments to pass to the command. **`env`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The environment variables to set in the subprocess. **`cwd`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The working directory to use when spawning the process. **`tool_prefix`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A prefix to add to all tools that are registered with the server. **`log_level`** : `mcp_types.LoggingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The log level to set when connecting to the server, if any. **`log_handler`** : `LoggingFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A handler for logging messages from the server. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `5` The timeout in seconds to wait for the client to initialize. **`read_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `5 * 60` Maximum time in seconds to wait for new messages before timing out. **`process_tool_call`** : `ProcessToolCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Hook to customize tool calling and optionally pass extra metadata. **`allow_sampling`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to allow MCP sampling through this client. **`sampling_model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The model to use for sampling. **`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1` The maximum number of times to retry a tool call.
**`elicitation_callback`** : `ElicitationFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Callback function to handle elicitation requests from the server. **`cache_tools`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to cache the list of tools. See [`MCPServer.cache_tools`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.cache_tools). **`cache_resources`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to cache the list of resources. See [`MCPServer.cache_resources`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.cache_resources). **`include_instructions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to include the server's instructions in the agent's instructions. See [`MCPServer.include_instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_instructions). **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include return schemas in tool definitions. See [`MCPServer.include_return_schema`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_return_schema). **`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. **`client_info`** : `mcp_types.Implementation` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Information describing the MCP client implementation. ### MCPServerSSE **Bases:** `_MCPServerHTTP` An MCP server that connects over HTTP using the SSE transport. This class implements the SSE transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` ### MCPServerHTTP **Bases:** `MCPServerSSE` An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` ### MCPServerStreamableHTTP **Bases:** `_MCPServerHTTP` An MCP server that connects over HTTP using the Streamable HTTP transport.
This class implements the Streamable HTTP transport from the MCP specification. See [https://modelcontextprotocol.io/introduction#streamable-http](https://modelcontextprotocol.io/introduction#streamable-http) for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` ### MCPServerConfig **Bases:** `BaseModel` Configuration for MCP servers. ### load\_mcp\_servers ```python def load_mcp_servers( config_path: str | Path, ) -> list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE] ``` Load MCP servers from a configuration file. Environment variables can be referenced in the configuration file using: - `${VAR_NAME}` syntax - expands to the value of VAR\_NAME, raises error if not defined - `${VAR_NAME:-default}` syntax - expands to VAR\_NAME if set, otherwise uses the default value #### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`MCPServerStdio` | `MCPServerStreamableHTTP` | `MCPServerSSE`\] -- A list of MCP servers. #### Parameters **`config_path`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` The path to the configuration file. #### Raises - `FileNotFoundError` -- If the configuration file does not exist. - `ValidationError` -- If the configuration file does not match the schema. - `ValueError` -- If an environment variable referenced in the configuration is not defined and no default value is provided. ### ToolResult The result type of an MCP tool call. **Default:** `str | messages.BinaryContent | dict[str, Any] | list[Any] | Sequence[str | messages.BinaryContent | dict[str, Any] | list[Any]]` ### CallToolFunc A function type that represents a tool call. **Default:** `Callable[[str, dict[str, Any], dict[str, Any] | None], Awaitable[ToolResult]]` ### ProcessToolCallback A process tool callback. It accepts a run context, the original tool call function, a tool name, and arguments. Allows wrapping an MCP server tool call to customize it, including adding extra request metadata. **Default:** `Callable[[RunContext[Any], CallToolFunc, str, dict[str, Any]], Awaitable[ToolResult]]` --- # [pydantic_ai.messages](https://pydantic.dev/docs/ai/api/pydantic-ai/messages/) # pydantic\_ai.messages The structure of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage) can be shown as a graph:
```mermaid
graph RL
    SystemPromptPart(SystemPromptPart) --- ModelRequestPart
    UserPromptPart(UserPromptPart) --- ModelRequestPart
    ToolReturnPart(ToolReturnPart) --- ModelRequestPart
    RetryPromptPart(RetryPromptPart) --- ModelRequestPart
    TextPart(TextPart) --- ModelResponsePart
    ToolCallPart(ToolCallPart) --- ModelResponsePart
    ThinkingPart(ThinkingPart) --- ModelResponsePart
    ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
    ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
    ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
    ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage<br>(Union)")
```
### SystemPromptPart A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. #### Attributes ##### content The content of the prompt. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### timestamp The timestamp of the prompt. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### dynamic\_ref The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt) for more information. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system-prompt'\] **Default:** `'system-prompt'` ### FileUrl **Bases:** `ABC` Abstract base class for any URL-based file. #### Attributes ##### url The URL of the file. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### force\_download Controls whether the file is downloaded and how SSRF protection is applied: - If `False`, the URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata). - If `True`, the file is always downloaded with SSRF protection (blocks private IPs and cloud metadata). - If `'allow-local'`, the file is always downloaded, allowing private IPs but still blocking cloud metadata. **Type:** `ForceDownloadMode` **Default:** `False` ##### vendor\_metadata Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: [https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing](https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing) - `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### media\_type Return the media type of the file, based on the URL or the provided `media_type`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### identifier The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `FileUrl`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### format The file format. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)
### VideoUrl **Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) A URL to a video. #### Attributes ##### url The URL of the video. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['video-url'\] **Default:** `'video-url'` ##### is\_youtube True if the URL has a YouTube domain. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### format The file format of the video. The choice of supported formats was based on the Bedrock Converse API; other APIs don't require a format. **Type:** `VideoFormat` ### AudioUrl **Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) A URL to an audio file. #### Attributes ##### url The URL of the audio file. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['audio-url'\] **Default:** `'audio-url'` ##### format The file format of the audio file. **Type:** `AudioFormat` ### ImageUrl **Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) A URL to an image. #### Attributes ##### url The URL of the image. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['image-url'\] **Default:** `'image-url'` ##### format The file format of the image. The choice of supported formats was based on the Bedrock Converse API; other APIs don't require a format. **Type:** `ImageFormat` ### DocumentUrl **Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) A URL to a document. #### Attributes ##### url The URL of the document. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['document-url'\] **Default:** `'document-url'` ##### format The file format of the document. The choice of supported formats was based on the Bedrock Converse API; other APIs don't require a format. **Type:** `DocumentFormat` ### TextContent String content that is tagged with additional metadata. This is useful for including metadata that can be accessed programmatically by the application, but is not sent to the LLM. #### Attributes ##### content The content that is sent to the LLM. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### metadata Additional data that can be accessed programmatically by the application but is not sent to the LLM. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None` ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text-content'\] **Default:** `'text-content'` ### BinaryContent Binary content, e.g. an audio or image file.
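Example (a sketch reading a hypothetical local file with `from_path`, documented under Methods below):

```python
from pathlib import Path

from pydantic_ai import BinaryContent

content = BinaryContent.from_path(Path('photo.png'))  # media type inferred from the extension
print(content.media_type)  # image/png
print(content.is_image)  # True
print(content.format)  # png
```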
#### Attributes ##### data The binary file data. Use `.base64` to get the base64-encoded string. **Type:** [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) ##### media\_type The media type of the binary data. **Type:** `AudioMediaType` | `ImageMediaType` | `DocumentMediaType` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### vendor\_metadata Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: [https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing](https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing) - `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['binary'\] **Default:** `'binary'` ##### identifier Identifier for the binary content, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `BinaryContent`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### data\_uri Convert the `BinaryContent` to a data URI. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### base64 Return the binary data as a base64-encoded string. Default encoding is UTF-8. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### is\_audio Return `True` if the media type is an audio type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### is\_image Return `True` if the media type is an image type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### is\_video Return `True` if the media type is a video type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### is\_document Return `True` if the media type is a document type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### format The file format of the binary content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### narrow\_type `@staticmethod` ```python def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage ``` Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.
###### Returns [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage) ##### from\_data\_uri `@classmethod` ```python def from_data_uri(cls, data_uri: str) -> BinaryContent ``` Create a `BinaryContent` from a data URI. ###### Returns [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) ##### from\_path `@classmethod` ```python def from_path(cls, path: PathLike[str]) -> BinaryContent ``` Create a `BinaryContent` from a path. Defaults to 'application/octet-stream' if the media type cannot be inferred. ###### Returns [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) ###### Raises - `FileNotFoundError` -- if the file does not exist. - `PermissionError` -- if the file cannot be read. ### BinaryImage **Bases:** [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) Binary content that's guaranteed to be an image. ### CachePoint A cache point marker for prompt caching. Can be inserted into UserPromptPart.content to mark cache boundaries. Models that don't support caching will filter these out. Supported by: - Anthropic - Amazon Bedrock (Converse API) #### Attributes ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cache-point'\] **Default:** `'cache-point'` ##### ttl The cache time-to-live, either "5m" (5 minutes) or "1h" (1 hour). Supported by: - Anthropic. See [https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration](https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration) for more information. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\] **Default:** `'5m'` ### UploadedFile A reference to a file uploaded to a provider's file storage by ID. This allows referencing files that have been uploaded via provider-specific file APIs rather than providing the file content directly. Supported by: - [`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel) - [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel) - [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel) - [`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel) - [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (GLA: [Files API](https://ai.google.dev/gemini-api/docs/files) URIs, Vertex: GCS `gs://` URIs) - [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) #### Attributes ##### file\_id The provider-specific file identifier. For most providers, this is the file ID returned by the provider's upload API. For GoogleModel (Vertex), this must be a GCS URI (`gs://bucket/path`). For GoogleModel (GLA), this must be a Google Files API URI (`https://generativelanguage.googleapis.com/...`). For BedrockConverseModel, this must be an S3 URI (`s3://bucket/key`). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name The provider this file belongs to. This is required because file IDs are not portable across providers, and using a file ID with the wrong provider will always result in an error. 
Tip: Use `model.system` to get the provider name dynamically. **Type:** `UploadedFileProviderName` ##### vendor\_metadata Vendor-specific metadata for the file. The expected shape of this dictionary depends on the provider: Supported by: - `GoogleModel`: used as `video_metadata` for video files **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind Type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['uploaded-file'\] **Default:** `'uploaded-file'` ##### media\_type Return the media type of the file, inferred from `file_id` if not explicitly provided. Note: Inference relies on the file extension in `file_id`. For opaque file IDs (e.g., `'file-abc123'`), the media type will default to `'application/octet-stream'`. Inference relies on Python's `mimetypes` module, whose results may vary across platforms. Required by some providers (e.g., Bedrock) for certain file types. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### identifier The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `UploadedFile`. This identifier is only automatically passed to the model when the `UploadedFile` is returned by a tool. If you're passing the `UploadedFile` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `UploadedFile`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### format A general-purpose media-type-to-format mapping. Maps media types to format strings (e.g. `'image/png'` -> `'png'`). Covers image, video, audio, and document types. Currently used by Bedrock, which requires explicit format strings. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### ToolReturn **Bases:** `Generic[_ToolReturnValueT]` A structured tool return that separates the tool result from additional content sent to the model. Can be parameterized with a type to enable return schema generation: - `ToolReturn[User]` -- generates a return schema for `User` - `ToolReturn` (bare) -- no return schema generated #### Attributes ##### return\_value The return value to be used in the tool response. **Type:** `ToolReturnContent` ##### content Content sent to the model as a separate `UserPromptPart`. Use this when you want content to appear outside the tool result message. For multimodal content that should be sent natively in the tool result, return it directly from the tool function or include it in `return_value`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### metadata Additional data accessible by the application but not sent to the LLM. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`
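A sketch of a tool that uses `ToolReturn` to keep the model-facing result, extra prompt content, and application-only metadata separate (the tool itself is hypothetical):

```python
from pydantic_ai import Agent
from pydantic_ai.messages import ToolReturn

agent = Agent('openai:gpt-5.2')


@agent.tool_plain
def read_sensor() -> ToolReturn:
    """Hypothetical sensor-reading tool."""
    return ToolReturn(
        return_value={'temperature_c': 21.5},  # the tool result sent to the model
        content='Reading taken from the lab sensor; values are in Celsius.',  # extra UserPromptPart
        metadata={'raw_adc': 1234},  # application-only; never sent to the model
    )
```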
### UserPromptPart A user prompt, generally written by the end user. Content comes from the `user_prompt` parameter of [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), and [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream). #### Attributes ##### content The content of the prompt. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\] ##### timestamp The timestamp of the prompt. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['user-prompt'\] **Default:** `'user-prompt'` ### BaseToolReturnPart Base class for tool return parts. #### Attributes ##### tool\_name The name of the tool that was called. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### content The tool return content, which may include multimodal files. **Type:** `ToolReturnContent` ##### tool\_call\_id The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)` ##### metadata Additional data accessible by the application but not sent to the LLM. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None` ##### timestamp The timestamp when the tool returned. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### outcome The outcome of the tool call. - `'success'`: The tool executed successfully. - `'failed'`: The tool raised an error during execution. - `'denied'`: The tool call was denied by the approval mechanism. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['success', 'failed', 'denied'\] **Default:** `'success'` ##### files The multimodal file parts from `content` (`ImageUrl`, `AudioUrl`, `DocumentUrl`, `VideoUrl`, `BinaryContent`). **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\] #### Methods ##### content\_items ```python def content_items(mode: Literal['raw'] = 'raw') -> list[ToolReturnContent] def content_items(mode: Literal['str']) -> list[str | MultiModalContent] def content_items(mode: Literal['jsonable']) -> list[Any | MultiModalContent] ``` Return content as a flat list for iteration, with optional serialization.
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ToolReturnContent`\] | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\] | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\] ###### Parameters **`mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['raw', 'str', 'jsonable'\] _Default:_ `'raw'` Controls serialization of non-file items: - `'raw'`: No serialization. Returns items as-is. - `'str'`: Non-file items are serialized to strings via `tool_return_ta`. File items (`MultiModalContent`) pass through unchanged. - `'jsonable'`: Non-file items are serialized to JSON-compatible Python objects via `tool_return_ta`. File items pass through unchanged. ##### model\_response\_str ```python def model_response_str() -> str ``` Return a string representation of the data content for the model. This excludes multimodal files - use `.files` to get those separately. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### model\_response\_object ```python def model_response_object() -> dict[str, Any] ``` Return a dictionary representation of the data content, wrapping non-dict types appropriately. This excludes multimodal files - use `.files` to get those separately. Gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict. ###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### model\_response\_str\_and\_user\_content ```python def model_response_str_and_user_content() -> tuple[str, list[UserContent]] ``` Build a text-only tool result with multimodal files extracted for a trailing user message. For providers whose tool result API only accepts text. Multimodal files are referenced by identifier in the tool result text ('See file {id}.') and included in full in the returned file content list ('This is file {id}:' followed by the file). ###### Returns [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`list`](https://docs.python.org/3/glossary.html#term-list)\[`UserContent`\]\] ##### has\_content ```python def has_content() -> bool ``` Return `True` if the tool return has content. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### ToolReturnPart **Bases:** [`BaseToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolReturnPart) A tool return message, this encodes the result of running a tool. #### Attributes ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool-return'\] **Default:** `'tool-return'` ### BuiltinToolReturnPart **Bases:** [`BaseToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolReturnPart) A tool return message from a built-in tool. #### Attributes ##### provider\_name The name of the provider that generated the response. 
Required to be set when `provider_details` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin-tool-return'\] **Default:** `'builtin-tool-return'` ### RetryPromptPart A message back to a model asking it to try again. This can be sent for a number of reasons: - Pydantic validation of tool arguments failed, here content is derived from a Pydantic [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError) - a tool raised a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception - no tool was found for the tool name - the model returned plain text when a structured response was expected - Pydantic validation of a structured response failed, here content is derived from a Pydantic [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError) - an output validator raised a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception #### Attributes ##### content Details of why and how the model should retry. If the retry was triggered by a [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError), this will be a list of error details. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`pydantic_core.ErrorDetails`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ErrorDetails)\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### tool\_name The name of the tool that was called, if any. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_call\_id The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)` ##### timestamp The timestamp when the retry was triggered. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### part\_kind Part type identifier, this is available on all parts as a discriminator.
**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['retry-prompt'\] **Default:** `'retry-prompt'` #### Methods ##### model\_response ```python def model_response() -> str ``` Return a string message describing why the retry is requested. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### InstructionPart A single instruction block with metadata about its origin. Instructions are composed of one or more parts, each of which can be static (from a literal string) or dynamic (from a function, template, or toolset). This distinction allows model implementations to make intelligent caching decisions -- e.g. Anthropic's prompt caching can cache the static prefix while leaving dynamic instructions uncached. #### Attributes ##### content The text content of this instruction block. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### dynamic Whether this instruction came from a dynamic source (function, template, or toolset). Static instructions (`dynamic=False`) come from literal strings passed to `Agent(instructions=...)`. Dynamic instructions (`dynamic=True`) come from `@agent.instructions` functions, `TemplateStr`, or toolset `get_instructions()` methods. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### part\_kind Part type identifier, used as a discriminator for deserialization. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['instruction'\] **Default:** `'instruction'` #### Methods ##### join `@staticmethod` ```python def join(parts: Sequence[InstructionPart]) -> str | None ``` Join instruction parts into a single string, separated by double newlines. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### sorted `@staticmethod` ```python def sorted(parts: Sequence[InstructionPart]) -> list[InstructionPart] ``` Sort instruction parts with static (`dynamic=False`) before dynamic, preserving relative order. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] ### ModelRequest A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model. #### Attributes ##### parts The parts of the user message. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelRequestPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequestPart)\] ##### timestamp The timestamp when the request was sent to the model. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### instructions The instructions string for this request, rendered from structured instruction parts. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### kind Message type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['request'\] **Default:** `'request'` ##### run\_id The unique identifier of the agent run in which this message originated. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### metadata Additional data that can be accessed programmatically by the application but is not sent to the LLM. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### user\_text\_prompt `@classmethod` ```python def user_text_prompt( cls, user_prompt: str, instructions: str | None = None, ) -> ModelRequest ``` Create a `ModelRequest` with a single user prompt as text. ###### Returns [`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) ### TextPart A plain text response from a model. #### Attributes ##### content The text content of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### id An optional identifier of the text part. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\] **Default:** `'text'` #### Methods ##### has\_content ```python def has_content() -> bool ``` Return `True` if the text content is non-empty. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### ThinkingPart A thinking response from a model. #### Attributes ##### content The thinking content of the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### id The identifier of the thinking part. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### signature The signature of the thinking. 
Supported by: - Anthropic (corresponds to the `signature` field) - Bedrock (corresponds to the `signature` field) - Google (corresponds to the `thought_signature` field) - OpenAI (corresponds to the `encrypted_content` field) When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the response. Signatures are only sent back to the same provider. Required to be set when `provider_details`, `id` or `signature` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['thinking'\] **Default:** `'thinking'` #### Methods ##### has\_content ```python def has_content() -> bool ``` Return `True` if the thinking content is non-empty. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### CompactionPart A compaction part that summarizes previous conversation history. Compaction parts contain an opaque or readable summary of prior messages, produced by provider-specific compaction mechanisms. They must be round-tripped back to the same provider in subsequent requests. For Anthropic, `content` contains a readable text summary. For OpenAI, `content` is `None` and the encrypted data is stored in `provider_details`. #### Attributes ##### content The compaction summary text, if available. For Anthropic: a readable text summary of compacted messages. For OpenAI: `None` (the compacted content is encrypted and stored in `provider_details`). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### id The identifier of the compaction part. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the compaction. Compaction data is only sent back to the same provider. Required to be set when `provider_details` or `id` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. For OpenAI: contains `encrypted_content` and other fields from `ResponseCompactionItem`. 
When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['compaction'\] **Default:** `'compaction'` #### Methods ##### has\_content ```python def has_content() -> bool ``` Return `True` if the compaction content is non-empty. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### FilePart A file response from a model. #### Attributes ##### content The file content of the response. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent), [`pydantic.AfterValidator`](https://docs.pydantic.dev/latest/api/pydantic/functional_validators/#pydantic.functional_validators.AfterValidator)(`BinaryImage.narrow_type`)\] ##### id The identifier of the file part. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['file'\] **Default:** `'file'` #### Methods ##### has\_content ```python def has_content() -> bool ``` Return `True` if the file content is non-empty. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### BaseToolCallPart A tool call from a model. #### Attributes ##### tool\_name The name of the tool to call. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### args The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_call\_id The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)` ##### id An optional identifier of the tool call part, separate from the tool call ID. This is used by some APIs like OpenAI Responses. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the response. Builtin tool calls are only sent back to the same provider. Required to be set when `provider_details` or `id` is set. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### args\_as\_dict ```python def args_as_dict(raise_if_invalid: bool = False) -> dict[str, Any] ``` Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. ###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ###### Parameters **`raise_if_invalid`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If `True`, a `ValueError` or `AssertionError` caused by malformed JSON in `args` will be re-raised. When `False` (the default), malformed JSON is handled gracefully by returning `{'INVALID_JSON': ''}` so that the value can still be sent to a model API (e.g. during a retry flow) without crashing. ##### args\_as\_json\_str ```python def args_as_json_str() -> str ``` Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### has\_content ```python def has_content() -> bool ``` Return `True` if the tool call has content. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### ToolCallPart **Bases:** [`BaseToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolCallPart) A tool call from a model. 
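For illustration, here's a minimal sketch of reading tool call arguments in either form via the methods inherited from `BaseToolCallPart`; the tool name and arguments here are hypothetical:

```python
from pydantic_ai.messages import ToolCallPart

# Arguments may arrive as a JSON string or a dict, depending on the provider
call = ToolCallPart(tool_name='get_weather', args='{"city": "London"}')
print(call.args_as_dict())
#> {'city': 'London'}
print(call.args_as_json_str())
#> {"city": "London"}
```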
#### Attributes ##### part\_kind Part type identifier, this is available on all parts as a discriminator. Note that this is different from `ToolCallPartDelta.part_delta_kind`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool-call'\] **Default:** `'tool-call'` ### BuiltinToolCallPart **Bases:** [`BaseToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolCallPart) A tool call to a built-in tool. #### Attributes ##### part\_kind Part type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin-tool-call'\] **Default:** `'builtin-tool-call'` ### ModelResponse A response from a model, e.g. a message from the model to the Pydantic AI app. #### Attributes ##### parts The parts of the model message. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart)\] ##### usage Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. **Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)` ##### model\_name The name of the model that generated the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### timestamp The timestamp when the response was received locally. This is always a high-precision local datetime. Provider-specific timestamps (if available) are stored in `provider_details['timestamp']`. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)` ##### kind Message type identifier, this is available on all parts as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['response'\] **Default:** `'response'` ##### provider\_name The name of the LLM provider that generated the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_url The base URL of the LLM provider that generated the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None), [`pydantic.Field`](https://docs.pydantic.dev/latest/api/pydantic/fields/#pydantic.fields.Field)(`validation_alias`\=([`pydantic.AliasChoices`](https://docs.pydantic.dev/latest/api/pydantic/aliases/#pydantic.aliases.AliasChoices)(`provider_details`, `vendor_details`)))\] **Default:** `None` ##### provider\_response\_id The request ID as specified by the model provider. This can be used to track the specific request to the model.
**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None), [`pydantic.Field`](https://docs.pydantic.dev/latest/api/pydantic/fields/#pydantic.fields.Field)(`validation_alias`\=([`pydantic.AliasChoices`](https://docs.pydantic.dev/latest/api/pydantic/aliases/#pydantic.aliases.AliasChoices)(`provider_response_id`, `vendor_id`)))\] **Default:** `None` ##### finish\_reason Reason the model finished generating the response, normalized to OpenTelemetry values. **Type:** [`FinishReason`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinishReason) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### run\_id The unique identifier of the agent run in which this message originated. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### metadata Additional data that can be accessed programmatically by the application but is not sent to the LLM. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### text Get the text in the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### thinking Get the thinking in the response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### files Get the files in the response. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\] ##### images Get the images in the response. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage)\] ##### tool\_calls Get the tool calls in the response. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] ##### builtin\_tool\_calls Get the builtin tool calls and results in the response. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart), [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart)\]\] #### Methods ##### price `@deprecated` ```python def price() -> genai_types.PriceCalculation ``` ###### Returns `genai_types.PriceCalculation` ##### cost ```python def cost() -> genai_types.PriceCalculation ``` Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). ###### Returns `genai_types.PriceCalculation` ##### otel\_events ```python def otel_events(settings: InstrumentationSettings) -> list[LogRecord] ``` Return OpenTelemetry events for the response. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\] ### TextPartDelta A partial update (delta) for a `TextPart` to append new text content. 
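For illustration, a minimal sketch of applying a text delta; the strings here are hypothetical:

```python
from pydantic_ai.messages import TextPart, TextPartDelta

part = TextPart(content='Hello')
delta = TextPartDelta(content_delta=', world')
# apply() returns a new TextPart with the delta appended; the original is unchanged
part = delta.apply(part)
print(part.content)
#> Hello, world
```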
#### Attributes ##### content\_delta The incremental text content to add to the existing `TextPart` content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_name The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial TextPart does not have a `provider_name` or it has changed. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_delta\_kind Part delta type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\] **Default:** `'text'` #### Methods ##### apply ```python def apply(part: ModelResponsePart) -> TextPart ``` Apply this text delta to an existing `TextPart`. ###### Returns [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -- A new `TextPart` with updated text content. ###### Parameters **`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) The existing model response part, which must be a `TextPart`. ###### Raises - `ValueError` -- If `part` is not a `TextPart`. ### ThinkingPartDelta A partial update (delta) for a `ThinkingPart` to append new thinking content. #### Attributes ##### content\_delta The incremental thinking content to add to the existing `ThinkingPart` content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### signature\_delta Optional signature delta. Note this is never treated as a delta -- it can replace None. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name Optional provider name for the thinking part. Signatures are only sent back to the same provider. Required to be set when `provider_details` is set and the initial ThinkingPart does not have a `provider_name` or it has changed. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. Can be a dict to merge with existing details, or a callable that takes the existing details and returns updated details. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** `ProviderDetailsDelta` **Default:** `None` ##### part\_delta\_kind Part delta type identifier, used as a discriminator. 
**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['thinking'\] **Default:** `'thinking'` #### Methods ##### apply ```python def apply(part: ModelResponsePart) -> ThinkingPart def apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` Apply this thinking delta to an existing `ThinkingPart`. ###### Returns [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) | [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) -- A new `ThinkingPart` with updated thinking content. ###### Parameters **`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) | [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) The existing model response part, which must be a `ThinkingPart`. ###### Raises - `ValueError` -- If `part` is not a `ThinkingPart`. ### ToolCallPartDelta A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID. #### Attributes ##### tool\_name\_delta Incremental text to add to the existing tool name, if any. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### args\_delta Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_call\_id Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta -- it can replace None, but otherwise if a non-matching value is provided an error will be raised. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_name The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial ToolCallPart does not have a `provider_name` or it has changed. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_details Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### part\_delta\_kind Part delta type identifier, used as a discriminator. Note that this is different from `ToolCallPart.part_kind`. 
**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool\_call'\] **Default:** `'tool_call'` #### Methods ##### as\_part ```python def as_part() -> ToolCallPart | None ``` Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. ###### Returns [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. ##### apply ```python def apply(part: ModelResponsePart) -> ToolCallPart | BuiltinToolCallPart def apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta ``` Apply this delta to a part or delta, returning a new part or delta with the changes applied. ###### Returns [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) | [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) | [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) -- Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`. ###### Parameters **`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) | [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) The existing model response part or delta to update. ###### Raises - `ValueError` -- If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`. - `UnexpectedModelBehavior` -- If applying JSON deltas to dict arguments or vice versa. ### PartStartEvent An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. #### Attributes ##### index The index of the part within the overall response parts list. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### part The newly started `ModelResponsePart`. **Type:** [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) ##### previous\_part\_kind The kind of the previous part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'compaction', 'file'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_start'\] **Default:** `'part_start'` ### PartDeltaEvent An event indicating a delta update for an existing part. #### Attributes ##### index The index of the part within the overall response parts list. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### delta The delta to apply to the specified part. **Type:** [`ModelResponsePartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePartDelta) ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_delta'\] **Default:** `'part_delta'` ### PartEndEvent An event indicating that a part is complete. 
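Together with `PartStartEvent` and `PartDeltaEvent`, this is enough to reassemble a response's parts on the consumer side. A minimal sketch, assuming a hypothetical event sequence that would normally come from a streamed model response:

```python
from pydantic_ai.messages import (
    ModelResponsePart,
    PartDeltaEvent,
    PartStartEvent,
    TextPart,
    TextPartDelta,
)

# Hypothetical events; in practice these are produced while streaming a response
events = [
    PartStartEvent(index=0, part=TextPart(content='The answer')),
    PartDeltaEvent(index=0, delta=TextPartDelta(content_delta=' is 42.')),
]

parts: dict[int, ModelResponsePart] = {}
for event in events:
    if isinstance(event, PartStartEvent):
        # A repeated start event with the same index replaces the earlier part
        parts[event.index] = event.part
    elif isinstance(event, PartDeltaEvent):
        parts[event.index] = event.delta.apply(parts[event.index])

print(parts[0].content)
#> The answer is 42.
```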
#### Attributes ##### index The index of the part within the overall response parts list. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### part The complete `ModelResponsePart`. **Type:** [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) ##### next\_part\_kind The kind of the next part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'compaction', 'file'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_end'\] **Default:** `'part_end'` ### FinalResultEvent An event indicating that the response to the current model request matches the output schema and will produce a result. #### Attributes ##### tool\_name The name of the output tool that was called. `None` if the result is from text content and not from a tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### tool\_call\_id The tool call ID, if any, that this result is associated with. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['final\_result'\] **Default:** `'final_result'` ### FunctionToolCallEvent An event indicating the start of a call to a function tool. #### Attributes ##### part The (function) tool call to make. **Type:** [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) ##### args\_valid Whether the tool arguments passed validation. See the [custom validation docs](https://ai.pydantic.dev/tools-advanced/#args-validator) for more info. - `True`: Schema validation and custom validation (if configured) both passed; args are guaranteed valid. - `False`: Validation was performed and failed. - `None`: Validation was not performed. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['function\_tool\_call'\] **Default:** `'function_tool_call'` ##### tool\_call\_id An ID used for matching details about the call to its result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### call\_id An ID used for matching details about the call to its result. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### FunctionToolResultEvent An event indicating the result of a function tool call. #### Attributes ##### result The result of the call to the function tool. **Type:** [`ToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolReturnPart) | [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart) ##### content The content that will be sent to the model as a UserPromptPart following the result.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['function\_tool\_result'\] **Default:** `'function_tool_result'` ##### tool\_call\_id An ID used to match the result to its original call. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### BuiltinToolCallEvent An event indicating the start of a call to a built-in tool. #### Attributes ##### part The built-in tool call to make. **Type:** [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin\_tool\_call'\] **Default:** `'builtin_tool_call'` ### BuiltinToolResultEvent An event indicating the result of a built-in tool call. #### Attributes ##### result The result of the call to the built-in tool. **Type:** [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin\_tool\_result'\] **Default:** `'builtin_tool_result'` ### is\_multi\_modal\_content ```python def is_multi_modal_content(obj: Any) -> TypeGuard[MultiModalContent] ``` Check if obj is a MultiModalContent type, enabling type narrowing. #### Returns [`TypeGuard`](https://docs.python.org/3/library/typing.html#typing.TypeGuard)\[[`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\] ### FinishReason Reason the model finished generating the response, normalized to OpenTelemetry values. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['stop', 'length', 'content_filter', 'tool_call', 'error']` ### ForceDownloadMode Type for the force\_download parameter on FileUrl subclasses. - `False`: The URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata). - `True`: The file is always downloaded with SSRF protection (blocks private IPs and cloud metadata). - `'allow-local'`: The file is always downloaded, allowing private IPs but still blocking cloud metadata. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `bool | Literal['allow-local']` ### ProviderDetailsDelta Type for provider\_details input: can be a static dict, a callback to update existing details, or None. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[str, Any] | Callable[[dict[str, Any] | None], dict[str, Any]] | None` ### UploadedFileProviderName Provider names supported by [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile). **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['anthropic', 'openai', 'google-gla', 'google-vertex', 'bedrock', 'xai']` ### MultiModalContent Union of all multi-modal content types with a discriminator for Pydantic validation.
**Default:** `Annotated[ImageUrl | AudioUrl | DocumentUrl | VideoUrl | BinaryContent | UploadedFile, pydantic.Discriminator('kind')]` ### RETURN\_VALUE\_KEY Key used to wrap non-dict tool return values in `model_response_object()`. **Default:** `'return_value'` ### ModelRequestPart A message part sent by Pydantic AI to a model. **Default:** `Annotated[SystemPromptPart | UserPromptPart | ToolReturnPart | RetryPromptPart, pydantic.Discriminator('part_kind')]` ### ModelResponsePart A message part returned by a model. **Default:** `Annotated[TextPart | ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart | ThinkingPart | CompactionPart | FilePart, pydantic.Discriminator('part_kind')]` ### ModelMessage Any message sent to or returned by a model. **Default:** `Annotated[ModelRequest | ModelResponse, pydantic.Discriminator('kind')]` ### ModelMessagesTypeAdapter Pydantic [`TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter) for (de)serializing messages. **Default:** `pydantic.TypeAdapter(list[ModelMessage], config=(pydantic.ConfigDict(defer_build=True, ser_json_bytes='base64', val_json_bytes='base64')))` ### ModelResponsePartDelta A partial update (delta) for any model response part. **Default:** `Annotated[TextPartDelta | ThinkingPartDelta | ToolCallPartDelta, pydantic.Discriminator('part_delta_kind')]` ### ModelResponseStreamEvent An event in the model response stream, starting a new part, applying a delta to an existing one, indicating a part is complete, or indicating the final result. **Default:** `Annotated[PartStartEvent | PartDeltaEvent | PartEndEvent | FinalResultEvent, pydantic.Discriminator('event_kind')]` ### HandleResponseEvent An event yielded when handling a model response, indicating tool calls and results. **Default:** `Annotated[FunctionToolCallEvent | FunctionToolResultEvent | BuiltinToolCallEvent | BuiltinToolResultEvent, pydantic.Discriminator('event_kind')]` ### AgentStreamEvent An event in the agent stream: model response stream events and response-handling events. **Default:** `Annotated[ModelResponseStreamEvent | HandleResponseEvent, pydantic.Discriminator('event_kind')]` --- # [pydantic_ai.output](https://pydantic.dev/docs/ai/api/pydantic-ai/output/) # pydantic\_ai.output ### ToolOutput **Bases:** `Generic[OutputDataT]` Marker class to use a tool for output and optionally customize the tool. Example: tool\_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-5.2', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` #### Attributes ##### output An output type or function. **Type:** `OutputTypeOrFunction`\[`OutputDataT`\] **Default:** `type_` ##### name The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### description The description of the tool that will be passed to the model. 
If not specified, the docstring of the output type or function will be used. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description` ##### max\_retries The maximum number of retries for this specific output tool. Overrides the agent-level `retries`/`output_retries` for this tool. If not set, the agent-level value is used as the default. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_retries` ##### strict Whether to use strict mode for the tool. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `strict` ### NativeOutput **Bases:** `Generic[OutputDataT]` Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: native\_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` #### Attributes ##### outputs The output types or functions. **Type:** `OutputTypeOrFunction`\[`OutputDataT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`OutputTypeOrFunction`\[`OutputDataT`\]\] **Default:** `outputs` ##### name The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### description The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description` ##### strict Whether to use strict mode for the output, if the model supports it. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `strict` ##### template Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If no template is specified but the model's profile indicates that it requires the schema to be sent as a prompt, the default template specified on the profile will be used. Set to `False` to disable the schema prompt entirely. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `template` ### PromptedOutput **Bases:** `Generic[OutputDataT]` Marker class to use a prompt to tell the model what to output and optionally customize the prompt. 
Example: prompted\_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` #### Attributes ##### outputs The output types or functions. **Type:** `OutputTypeOrFunction`\[`OutputDataT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`OutputTypeOrFunction`\[`OutputDataT`\]\] **Default:** `outputs` ##### name The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### description The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description` ##### template Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. Set to `False` to disable the schema prompt entirely. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `template` ### TextOutput **Bases:** `Generic[OutputDataT]` Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` #### Attributes ##### output\_function The function that will be called to process the model's plain text output. The function must take a single string argument. **Type:** `TextOutputFunc`\[`OutputDataT`\] ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with the same tool call IDs. See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information. #### Attributes ##### calls Tool calls that require external execution. 
**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))` ##### approvals Tool calls that require human-in-the-loop approval. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))` ##### metadata Metadata for deferred tool calls, keyed by `tool_call_id`. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))` ### StructuredDict ```python def StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None, ) -> type[JsonSchemaValue] ``` Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Example: structured\_dict.py ```python from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-5.2', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` #### Returns [`type`](https://docs.python.org/3/glossary.html#term-type)\[`JsonSchemaValue`\] #### Parameters **`json_schema`** : `JsonSchemaValue` A JSON schema of type `object` defining the structure of the dictionary content. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name of the structured output. If not provided, the `title` field of the JSON schema will be used if it's present. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional description of the structured output. If not provided, the `description` field of the JSON schema will be used if it's present. ### OutputDataT Covariant type variable for the output data type of a run. **Default:** `TypeVar('OutputDataT', default=str, covariant=True)` --- # [pydantic_ai.profiles](https://pydantic.dev/docs/ai/api/pydantic-ai/profiles/) # pydantic\_ai.profiles ### ModelProfile Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used. #### Attributes ##### supports\_tools Whether the model supports tools. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### supports\_tool\_return\_schema Whether the model natively supports tool return schemas. When True, the model's API accepts a structured return schema alongside each tool definition. When False, return schemas are injected as JSON text into tool descriptions as a fallback. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### supports\_json\_schema\_output Whether the model supports JSON schema output. 
This is also referred to as 'native' support for structured output. Relates to the `NativeOutput` output type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### supports\_json\_object\_output Whether the model supports a dedicated mode to enforce JSON output, without necessarily sending a schema. E.g. [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode). Relates to the `PromptedOutput` output type. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### supports\_image\_output Whether the model supports image output. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### default\_structured\_output\_mode The default structured output mode to use for the model. **Type:** `StructuredOutputMode` **Default:** `'tool'` ##### prompted\_output\_template The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `dedent("\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n ")` ##### native\_output\_requires\_schema\_in\_instructions Whether to add the prompted output template in native structured output mode. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### json\_schema\_transformer The transformer to use to make JSON schemas for tools and structured output compatible with the model. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`JsonSchemaTransformer`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### supports\_thinking Whether the model supports thinking/reasoning configuration. When False, the unified `thinking` setting in `ModelSettings` is silently ignored. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### thinking\_always\_enabled Whether the model always uses thinking/reasoning (e.g., OpenAI o-series, DeepSeek R1). When True, `thinking=False` is silently ignored since the model cannot disable thinking. Implies `supports_thinking=True`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### thinking\_tags The tags used to indicate thinking parts in the model's output. Defaults to `('<think>', '</think>')`. **Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `('<think>', '</think>')` ##### ignore\_streamed\_leading\_whitespace Whether to ignore leading whitespace when streaming a response. This is a workaround for models that emit `<think>` tags or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` as a valid `output_type`. This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### supported\_builtin\_tools The set of builtin tool types that this model/profile supports. Defaults to ALL builtin tools. Profile functions should explicitly restrict this based on model capabilities.
**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `field(default_factory=(lambda: SUPPORTED_BUILTIN_TOOLS))` #### Methods ##### from\_profile `@classmethod` ```python def from_profile(cls, profile: ModelProfile | None) -> Self ``` Build a ModelProfile subclass instance from a ModelProfile instance. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ##### update ```python def update(profile: ModelProfile | None) -> Self ``` Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ### OpenAIModelProfile **Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) Profile for models used with `OpenAIChatModel`. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### openai\_chat\_thinking\_field Non-standard field name used by some providers for model thinking content in Chat Completions API responses. Plenty of providers use custom field names for thinking content. Ollama and newer versions of vLLM use `reasoning`, while DeepSeek, older vLLM and some others use `reasoning_content`. Note that the thinking field configured here is currently limited to `str` type content. If `openai_chat_send_back_thinking_parts` is set to `'field'`, this field must be set to a non-None value. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### openai\_chat\_send\_back\_thinking\_parts Whether the model includes thinking content in requests. This can be: - `'auto'` (default): Automatically detects how to send thinking content. If thinking was received in a custom field (tracked via `ThinkingPart.id` and `ThinkingPart.provider_name`), it's sent back in that same field. Otherwise, it's sent using tags. Only the `reasoning` and `reasoning_content` fields are checked by default when receiving responses. If your provider uses a different field name, you must explicitly set `openai_chat_thinking_field` to that field name. - `'tags'`: The thinking content is included in the main `content` field, enclosed within thinking tags as specified in the `thinking_tags` profile option. - `'field'`: The thinking content is included in a separate field specified by `openai_chat_thinking_field`. - `False`: No thinking content is sent in the request. Defaults to `'auto'` to ensure thinking is sent back in the format expected by the model/provider. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'tags', 'field', [`False`](https://docs.python.org/3/library/constants.html#False)\] **Default:** `'auto'` ##### openai\_supports\_strict\_tool\_definition This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### openai\_supports\_sampling\_settings Turn off to avoid sending sampling settings like `temperature` and `top_p` to models that don't support them, such as OpenAI's o-series reasoning models.
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### openai\_unsupported\_model\_settings A list of model settings that are not supported by this model. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `()` ##### openai\_supports\_tool\_choice\_required Whether the provider accepts the value `tool_choice='required'` in the request payload. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### openai\_system\_prompt\_role The role to use for the system prompt message. If not provided, defaults to `'system'`. **Type:** `OpenAISystemPromptRole` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### openai\_chat\_supports\_web\_search Whether the model supports web search in the Chat Completions API. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_chat\_audio\_input\_encoding The encoding to use for audio input in Chat Completions requests. - `'base64'`: Raw base64 encoded string. (Default, used by OpenAI) - `'uri'`: Data URI (e.g. `data:audio/wav;base64,...`). **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['base64', 'uri'\] **Default:** `'base64'` ##### openai\_chat\_supports\_file\_urls Whether the Chat API supports file URLs directly in the `file_data` field. OpenAI's native Chat API only supports base64-encoded data, but some providers like OpenRouter support passing URLs directly. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_supports\_encrypted\_reasoning\_content Whether the model supports including encrypted reasoning content in the response. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_supports\_reasoning Whether the model supports reasoning (o-series, GPT-5+). When True, sampling parameters may need to be dropped depending on the reasoning\_effort setting. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_supports\_reasoning\_effort\_none Whether the model supports sampling parameters (temperature, top\_p, etc.) when reasoning\_effort='none'. Models like GPT-5.1 and GPT-5.2 default to reasoning\_effort='none' and support sampling params in that mode. When reasoning is enabled (low/medium/high/xhigh), sampling params are not supported. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_responses\_requires\_function\_call\_status\_none Whether the Responses API requires the `status` field on function tool calls to be `None`. This is required by vLLM Responses API versions before [https://github.com/vllm-project/vllm/pull/26706](https://github.com/vllm-project/vllm/pull/26706). See [https://github.com/pydantic/pydantic-ai/issues/3245](https://github.com/pydantic/pydantic-ai/issues/3245) for more details. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### openai\_chat\_supports\_document\_input Whether the Chat Completions API supports document content parts (`type='file'`). Some OpenAI-compatible providers (e.g. Azure) do not support document input via the Chat Completions API.
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ### OpenAIJsonSchemaTransformer **Bases:** `JsonSchemaTransformer` Recursively handle the schema to make it compatible with OpenAI strict mode. See [https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode) for more details, but this basically just requires: - `additionalProperties` must be set to false for each object in the parameters - all fields in properties must be marked as required ### openai\_model\_profile ```python def openai_model_profile(model_name: str) -> ModelProfile ``` Get the model profile for an OpenAI model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) ### OPENAI\_REASONING\_EFFORT\_MAP Maps unified thinking values to OpenAI reasoning\_effort strings. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `{True: 'medium', False: 'none', 'minimal': 'minimal', 'low': 'low', 'medium': 'medium', 'high': 'high', 'xhigh': 'xhigh'}` ### SAMPLING\_PARAMS Sampling parameter names that are incompatible with reasoning. These parameters are not supported when reasoning is enabled (reasoning\_effort != 'none'). See [https://platform.openai.com/docs/guides/reasoning](https://platform.openai.com/docs/guides/reasoning) for details. **Default:** `('temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'openai_logprobs', 'openai_top_logprobs')` ### AnthropicModelProfile **Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) Profile for models used with `AnthropicModel`. ALL FIELDS MUST BE `anthropic_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### anthropic\_supports\_adaptive\_thinking Whether the model supports adaptive thinking (Sonnet 4.6+, Opus 4.6+). When True, unified `thinking` translates to `{'type': 'adaptive'}`. When False, it translates to `{'type': 'enabled', 'budget_tokens': N}`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### anthropic\_supports\_effort Whether the model supports the `effort` parameter in `output_config` (Opus 4.5+, Sonnet 4.6+). When True and the unified thinking level is a string (e.g. 'high'), it is also mapped to `output_config.effort`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ### anthropic\_model\_profile ```python def anthropic_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for an Anthropic model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### ANTHROPIC\_THINKING\_BUDGET\_MAP Maps unified thinking values to Anthropic budget\_tokens for non-adaptive models. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `{True: 10000, 'minimal': 1024, 'low': 2048, 'medium': 10000, 'high': 16384, 'xhigh': 32768}` ### GoogleModelProfile **Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) Profile for models used with `GoogleModel`. 
ALL FIELDS MUST BE `google_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### google\_supports\_native\_output\_with\_builtin\_tools Whether the model supports native output with builtin tools. See [https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured\_outputs\_with\_tools](https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools) **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### google\_supported\_mime\_types\_in\_tool\_returns MIME types supported in native FunctionResponseDict.parts. See [https://ai.google.dev/gemini-api/docs/function-calling#multimodal-function-responses](https://ai.google.dev/gemini-api/docs/function-calling#multimodal-function-responses) **Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), ...\] **Default:** `()` ##### google\_supports\_thinking\_level Whether the model uses `thinking_level` (enum: LOW/MEDIUM/HIGH) instead of `thinking_budget` (int). Gemini 3+ models use `thinking_level`; Gemini 2.5 uses `thinking_budget`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ### GoogleJsonSchemaTransformer **Bases:** `JsonSchemaTransformer` Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini supports [a subset of OpenAPI v3.0.3](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations). ### google\_model\_profile ```python def google_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Google model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### meta\_model\_profile ```python def meta_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Meta model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### amazon\_model\_profile ```python def amazon_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for an Amazon model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### deepseek\_model\_profile ```python def deepseek_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a DeepSeek model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### GrokModelProfile **Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) Profile for Grok models (used with both GrokProvider and XaiProvider). ALL FIELDS MUST BE `grok_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### grok\_supports\_builtin\_tools Whether the model supports builtin tools (web\_search, code\_execution, mcp). **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### grok\_supports\_tool\_choice\_required Whether the provider accepts the value `tool_choice='required'` in the request payload. 
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ### grok\_model\_profile ```python def grok_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Grok model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### mistral\_model\_profile ```python def mistral_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Mistral model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### qwen\_model\_profile ```python def qwen_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Qwen model. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) --- # [pydantic_ai.providers](https://pydantic.dev/docs/ai/api/pydantic-ai/providers/) # pydantic\_ai.providers ### Provider **Bases:** `ABC`, `Generic[InterfaceClient]` Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. An interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. When used as an async context manager, providers that create their own HTTP client will close it on exit. This is handled automatically when using [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) as a context manager. #### Attributes ##### name The provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### base\_url The base URL for the provider API. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### client The client for the provider. **Type:** `InterfaceClient` #### Methods ##### model\_profile `@staticmethod` ```python def model_profile(model_name: str) -> ModelProfile | None ``` The model profile for the named model, if available. 
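For example, you can look up the profile a provider will use for a given model name. A minimal sketch (assuming the `OPENAI_API_KEY` environment variable is set so the provider can be constructed): provider\_model\_profile.py ```python
from pydantic_ai.providers.openai import OpenAIProvider

provider = OpenAIProvider()  # reads OPENAI_API_KEY from the environment
profile = provider.model_profile('gpt-5.2')
if profile is not None:
    # Inspect any of the ModelProfile fields documented above
    print(profile.supports_json_schema_output)
```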
###### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### gateway\_provider ```python def gateway_provider( upstream_provider: Literal['openai', 'openai-chat', 'openai-responses', 'chat', 'responses'], route: str | None = None, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> Provider[AsyncOpenAI] def gateway_provider( upstream_provider: Literal['groq'], route: str | None = None, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> Provider[AsyncGroq] def gateway_provider( upstream_provider: Literal['anthropic'], route: str | None = None, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> Provider[AsyncAnthropicClient] def gateway_provider( upstream_provider: Literal['bedrock', 'converse'], route: str | None = None, api_key: str | None = None, base_url: str | None = None, ) -> Provider[BaseClient] def gateway_provider( upstream_provider: Literal['gemini', 'google-vertex'], route: str | None = None, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> Provider[GoogleClient] def gateway_provider( upstream_provider: str, route: str | None = None, api_key: str | None = None, base_url: str | None = None, ) -> Provider[Any] ``` Create a new Gateway provider. #### Returns `Provider`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] #### Parameters **`upstream_provider`** : `UpstreamProvider` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The upstream provider to use. **`route`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the provider or routing group to use to handle the request. If not provided, the default routing group for the API format will be used. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication. If not provided, the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable will be used if available. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base URL to use for the Gateway. If not provided, the `PYDANTIC_AI_GATEWAY_BASE_URL` environment variable will be used if available. Otherwise, defaults to `https://gateway.pydantic.dev/proxy`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The HTTP client to use for the Gateway. ### AnthropicProvider **Bases:** `Provider[AsyncAnthropicClient]` Provider for Anthropic API. #### Methods ##### \_\_init\_\_ ```python def __init__(anthropic_client: AsyncAnthropicClient | None = None) -> None def __init__( api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Anthropic provider. 
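A minimal construction sketch (the key shown is illustrative; in practice you can rely on the `ANTHROPIC_API_KEY` environment variable instead): anthropic\_provider\_example.py ```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

# Build a model with an explicit provider instance
provider = AnthropicProvider(api_key='your-anthropic-api-key')  # illustrative key
model = AnthropicModel('claude-sonnet-4-6', provider=provider)
agent = Agent(model)
```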
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `ANTHROPIC_API_KEY` environment variable will be used if available. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base URL to use for the Anthropic API. **`anthropic_client`** : `AsyncAnthropicClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing Anthropic client to use. Accepts [`AsyncAnthropic`](https://github.com/anthropics/anthropic-sdk-python), [`AsyncAnthropicBedrock`](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock), [`AsyncAnthropicFoundry`](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry), or [`AsyncAnthropicVertex`](https://docs.anthropic.com/en/api/claude-on-vertex-ai). If provided, the `api_key` and `http_client` arguments will be ignored. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### GoogleProvider **Bases:** `Provider[Client]` Provider for Google. #### Methods ##### \_\_init\_\_ ```python def __init__( api_key: str, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None def __init__( credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | str | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None def __init__(client: Client) -> None def __init__( vertexai: bool = False, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None ``` Create a new Google provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. **`credentials`** : `Credentials` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. **`project`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE\_CLOUD\_PROJECT). Applies to the Vertex AI API only. **`location`** : `VertexAILocation` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['global'\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only.
**`vertexai`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False` unless `location`, `project`, or `credentials` are provided. **`client`** : `Client` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A pre-initialized client to use. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base URL for the Google API. ### VertexAILocation Regions available for Vertex AI. More details [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#genai-locations). **Default:** `Literal['asia-east1', 'asia-east2', 'asia-northeast1', 'asia-northeast3', 'asia-south1', 'asia-southeast1', 'australia-southeast1', 'europe-central2', 'europe-north1', 'europe-southwest1', 'europe-west1', 'europe-west2', 'europe-west3', 'europe-west4', 'europe-west6', 'europe-west8', 'europe-west9', 'me-central1', 'me-central2', 'me-west1', 'northamerica-northeast1', 'southamerica-east1', 'us-central1', 'us-east1', 'us-east4', 'us-east5', 'us-south1', 'us-west1', 'us-west4']` ### OpenAIProvider **Bases:** `Provider[AsyncOpenAI]` Provider for OpenAI API. #### Methods ##### \_\_init\_\_ ```python def __init__(openai_client: AsyncOpenAI) -> None def __init__( base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new OpenAI provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### XaiProvider **Bases:** `Provider[AsyncClient]` Provider for xAI API (native xAI SDK). #### Methods ##### \_\_init\_\_ ```python def __init__() -> None def __init__(api_key: str) -> None def __init__(xai_client: AsyncClient) -> None ``` Create a new xAI provider. 
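A minimal construction sketch based on the overloads above (assuming the import path mirrors the other providers, i.e. `pydantic_ai.providers.xai`): xai\_provider\_example.py ```python
from pydantic_ai.providers.xai import XaiProvider

# With no arguments, the XAI_API_KEY environment variable is used
provider = XaiProvider()

# Or pass the key explicitly (illustrative value)
provider = XaiProvider(api_key='your-xai-api-key')
```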
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `XAI_API_KEY` environment variable will be used if available. **`xai_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `xai_sdk.AsyncClient` to use. This takes precedence over `api_key`. ### DeepSeekProvider **Bases:** `Provider[AsyncOpenAI]` Provider for DeepSeek API. ### BedrockModelProfile **Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. #### Attributes ##### bedrock\_thinking\_variant Which thinking API shape to use for unified thinking translation. - `'anthropic'`: Uses `{'thinking': {'type': 'enabled', 'budget_tokens': N}}` - `'openai'`: Uses `{'reasoning_effort': 'low'|'medium'|'high'}` - `'qwen'`: Uses `{'reasoning_config': 'low'|'high'}` - `None`: No unified thinking support. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['anthropic', 'openai', 'qwen'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### BedrockProvider **Bases:** `Provider[BaseClient]` Provider for AWS Bedrock. #### Methods ##### \_\_init\_\_ ```python def __init__(bedrock_client: BaseClient) -> None def __init__( api_key: str, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None def __init__( aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None ``` Initialize the Bedrock provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`bedrock_client`** : `BaseClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. **`aws_access_key_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The AWS access key ID. If not set, the `AWS_ACCESS_KEY_ID` environment variable will be used if available. **`aws_secret_access_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The AWS secret access key. If not set, the `AWS_SECRET_ACCESS_KEY` environment variable will be used if available. **`aws_session_token`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The AWS session token. If not set, the `AWS_SESSION_TOKEN` environment variable will be used if available. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key for the Bedrock client.
Can be used instead of `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token`. If not set, the `AWS_BEARER_TOKEN_BEDROCK` environment variable will be used if available. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base URL for the Bedrock client. **`region_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The AWS region name. If not set, the `AWS_DEFAULT_REGION` environment variable will be used if available. **`profile_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The AWS profile name. **`aws_read_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The read timeout for the Bedrock client. **`aws_connect_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The connect timeout for the Bedrock client. ### bedrock\_amazon\_model\_profile ```python def bedrock_amazon_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for an Amazon model used via Bedrock. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### bedrock\_deepseek\_model\_profile ```python def bedrock_deepseek_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a DeepSeek model used via Bedrock. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### remove\_bedrock\_geo\_prefix ```python def remove_bedrock_geo_prefix(model_name: str) -> str ``` Remove the inference geographic prefix from the model ID if present. Bedrock supports cross-region inference using geographic prefixes like 'us.', 'eu.', 'apac.', etc. This function strips those prefixes. #### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### GroqProvider **Bases:** `Provider[AsyncGroq]` Provider for Groq API. #### Methods ##### \_\_init\_\_ ```python def __init__(groq_client: AsyncGroq | None = None) -> None def __init__( api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Groq provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url.
**`groq_client`** : `AsyncGroq` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### groq\_moonshotai\_model\_profile ```python def groq_moonshotai_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a MoonshotAI model used with the Groq provider. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### meta\_groq\_model\_profile ```python def meta_groq_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Meta model used with the Groq provider. #### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ### AzureProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Azure OpenAI API. See [https://azure.microsoft.com/en-us/products/ai-foundry](https://azure.microsoft.com/en-us/products/ai-foundry) for more information. #### Methods ##### \_\_init\_\_ ```python def __init__(openai_client: AsyncAzureOpenAI) -> None def __init__( azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Azure provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`azure_endpoint`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. **`api_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. Not required (and not sent) when `azure_endpoint` targets the [Azure OpenAI v1 GA API](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle) (i.e. a path ending in `/v1`, such as `https://<resource>.openai.azure.com/openai/v1/`) or an Azure AI Foundry serverless model endpoint (`*.models.ai.azure.com`), both of which reject the `api-version` query parameter. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. **`openai_client`** : `AsyncAzureOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`.
**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### CohereProvider **Bases:** `Provider[AsyncClientV2]` Provider for Cohere API. #### Methods ##### \_\_init\_\_ ```python def __init__( api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Cohere provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. **`cohere_client`** : `AsyncClientV2` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### VoyageAIProvider **Bases:** `Provider[AsyncClient]` Provider for VoyageAI API. #### Methods ##### \_\_init\_\_ ```python def __init__(voyageai_client: AsyncClient) -> None def __init__(api_key: str | None = None) -> None ``` Create a new VoyageAI provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `VOYAGE_API_KEY` environment variable will be used if available. **`voyageai_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [AsyncClient](https://github.com/voyage-ai/voyageai-python) client to use. If provided, `api_key` must be `None`. ### CerebrasProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Cerebras API. #### Methods ##### \_\_init\_\_ ```python def __init__() -> None def __init__(api_key: str) -> None def __init__(api_key: str, http_client: httpx.AsyncClient) -> None def __init__(http_client: httpx.AsyncClient) -> None def __init__(openai_client: AsyncOpenAI | None = None) -> None ``` Create a new Cerebras provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `CEREBRAS_API_KEY` environment variable will be used if available. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### MistralProvider **Bases:** `Provider[Mistral]` Provider for Mistral API. 
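A minimal usage sketch (the key shown is illustrative; the `MISTRAL_API_KEY` environment variable can be used instead): mistral\_provider\_example.py ```python
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
from pydantic_ai.providers.mistral import MistralProvider

model = MistralModel(
    'mistral-large-latest',
    provider=MistralProvider(api_key='your-mistral-api-key'),  # illustrative key
)
agent = Agent(model)
```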
#### Methods ##### \_\_init\_\_ ```python def __init__(mistral_client: Mistral | None = None) -> None def __init__( api_key: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Mistral provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. **`mistral_client`** : `Mistral` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `Mistral` client to use. If provided, `api_key` and `http_client` must be `None`. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base url for the Mistral requests. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### FireworksProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Fireworks AI API. ### GrokProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Grok API (OpenAI-compatible interface). ### TogetherProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Together AI API. ### HerokuProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Heroku API. ### GitHubProvider **Bases:** `Provider[AsyncOpenAI]` Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See [https://docs.github.com/en/github-models](https://docs.github.com/en/github-models) for more information. #### Methods ##### \_\_init\_\_ ```python def __init__() -> None def __init__(api_key: str) -> None def __init__(api_key: str, http_client: httpx.AsyncClient) -> None def __init__(openai_client: AsyncOpenAI | None = None) -> None ``` Create a new GitHub Models provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### OpenRouterProvider **Bases:** `Provider[AsyncOpenAI]` Provider for OpenRouter API. #### Methods ##### \_\_init\_\_ ```python def __init__(openai_client: AsyncOpenAI) -> None def __init__( api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Configure the provider with either an API key or a prebuilt client.
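Since OpenRouter exposes an OpenAI-compatible API, the provider is typically paired with `OpenAIChatModel`. A minimal sketch (the model slug and key are illustrative): openrouter\_provider\_example.py ```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openrouter import OpenRouterProvider

model = OpenAIChatModel(
    'anthropic/claude-sonnet-4.5',  # illustrative OpenRouter model slug
    provider=OpenRouterProvider(api_key='your-openrouter-api-key'),
)
agent = Agent(model)
```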
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` OpenRouter API key. Falls back to `OPENROUTER_API_KEY` when omitted and required unless `openai_client` is provided. **`app_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional url for app attribution. Falls back to `OPENROUTER_APP_URL` when omitted. **`app_title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional title for app attribution. Falls back to `OPENROUTER_APP_TITLE` when omitted. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Existing `AsyncOpenAI` client to reuse instead of creating one internally. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom `httpx.AsyncClient` to pass into the `AsyncOpenAI` constructor when building a client. ###### Raises - `UserError` -- If no API key is available and no `openai_client` is provided. ### VercelProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Vercel AI Gateway API. ### HuggingFaceProvider **Bases:** `Provider[AsyncInferenceClient]` Provider for Hugging Face. #### Methods ##### \_\_init\_\_ ```python def __init__(base_url: str, api_key: str | None = None) -> None def __init__(provider_name: str, api_key: str | None = None) -> None def __init__(hf_client: AsyncInferenceClient, api_key: str | None = None) -> None def __init__( hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None, ) -> None def __init__( hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None, ) -> None def __init__(api_key: str | None = None) -> None ``` Create a new Hugging Face provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base url for the Hugging Face requests. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. **`hf_client`** : `AsyncInferenceClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. **`http_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. **`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Name of the provider to use for inference. 
Available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). Defaults to "auto", which selects the first of the providers available for the model, sorted by the user's order in [https://hf.co/settings/inference-providers](https://hf.co/settings/inference-providers). If `base_url` is passed, then `provider_name` is not used. ### MoonshotAIProvider **Bases:** `Provider[AsyncOpenAI]` Provider for MoonshotAI platform (Kimi models). ### OllamaProvider **Bases:** `Provider[AsyncOpenAI]` Provider for local or remote Ollama API. #### Methods ##### \_\_init\_\_ ```python def __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Create a new Ollama provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An existing `httpx.AsyncClient` to use for making HTTP requests. ### LiteLLMProvider **Bases:** `Provider[AsyncOpenAI]` Provider for LiteLLM API. #### Methods ##### \_\_init\_\_ ```python def __init__(api_key: str | None = None, api_base: str | None = None) -> None def __init__( api_key: str | None = None, api_base: str | None = None, http_client: AsyncHTTPClient | None = None, ) -> None def __init__(openai_client: AsyncOpenAI) -> None ``` Initialize a LiteLLM provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` API key for the model provider. If None, LiteLLM will try to get it from environment variables. **`api_base`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Base URL for the model provider. Use this for custom endpoints or self-hosted models. **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Pre-configured OpenAI client. If provided, other parameters are ignored. **`http_client`** : `AsyncHTTPClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom HTTP client to use. ### NebiusProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Nebius AI Studio API.
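OpenAI-compatible providers like the ones above are typically paired with `OpenAIChatModel`. A minimal sketch using the `OllamaProvider` documented above (the model name is illustrative and must be available in your Ollama instance): ollama\_provider\_example.py ```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

model = OpenAIChatModel(
    'qwen3',  # illustrative; any model pulled into your Ollama instance
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(model)
```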
### OVHcloudProvider **Bases:** `Provider[AsyncOpenAI]` Provider for OVHcloud AI Endpoints. ### AlibabaProvider **Bases:** `Provider[AsyncOpenAI]` Provider for Alibaba Cloud Model Studio (DashScope) OpenAI-compatible API. ### SambaNovaProvider **Bases:** `Provider[AsyncOpenAI]` Provider for SambaNova AI models. SambaNova uses an OpenAI-compatible API. #### Attributes ##### name Return the provider name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### base\_url Return the base URL. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### client Return the AsyncOpenAI client. **Type:** `AsyncOpenAI` #### Methods ##### model\_profile `@staticmethod` ```python def model_profile(model_name: str) -> ModelProfile | None ``` Get model profile for SambaNova models. SambaNova serves models from multiple families including Meta Llama, DeepSeek, Qwen, and Mistral. Model profiles are matched based on model name prefixes. ###### Returns [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### \_\_init\_\_ ```python def __init__( api_key: str | None = None, base_url: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None ``` Initialize SambaNova provider. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` SambaNova API key. If not provided, reads from SAMBANOVA\_API\_KEY env var. **`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom API base URL. Defaults to [https://api.sambanova.ai/v1](https://api.sambanova.ai/v1) **`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional pre-configured OpenAI client **`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom httpx.AsyncClient for making HTTP requests ###### Raises - `UserError` -- If API key is not provided and SAMBANOVA\_API\_KEY env var is not set --- # [pydantic_ai.result](https://pydantic.dev/docs/ai/api/pydantic-ai/result/) # pydantic\_ai.result ### StreamedRunResult **Bases:** `Generic[AgentDepsT, OutputDataT]` Result of a streamed run that returns structured data via a tool call. #### Attributes ##### is\_complete Whether the stream has all been received. This is set to `True` when one of [`stream_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output), [`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text), [`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) or [`get_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.get_output) completes. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `field(default=False, init=False)` ##### response Return the current state of the response. **Type:** `_messages.ModelResponse` ##### metadata Metadata associated with this agent run, if configured. 
**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### all\_messages ```python def all_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return the history of messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages ```python def new_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages\_json ```python def new_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message.
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### stream `@async` `@deprecated` ```python def stream(debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT] ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputDataT`\] ##### stream\_output `@async` ```python def stream_output(debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT] ``` Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputDataT`\] -- An async iterable of the response data. ###### Parameters **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. ##### stream\_text `@async` ```python def stream_text( delta: bool = False, debounce_by: float | None = 0.1, ) -> AsyncIterator[str] ``` Stream the text result as an async iterable. Note Result validators will NOT be called on the text result if `delta=True`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ###### Parameters **`delta`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` if `True`, yield each chunk of text as it is received; if `False` (default), yield the full text up to the current point. **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. ##### stream\_structured `@async` `@deprecated` ```python def stream_structured( debounce_by: float | None = 0.1, ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]] ``` ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\] ##### stream\_responses `@async` ```python def stream_responses( debounce_by: float | None = 0.1, ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]] ``` Stream the response as an async iterable of Structured LLM Messages. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\] -- An async iterable of the structured response message and whether that is the last message.
###### Parameters **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. ##### get\_output `@async` ```python def get_output() -> OutputDataT ``` Stream the whole response, validate and return it. ###### Returns `OutputDataT` ##### usage ```python def usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. ###### Returns [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) ##### timestamp ```python def timestamp() -> datetime ``` Get the timestamp of the response. ###### Returns [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ##### validate\_structured\_output `@async` `@deprecated` ```python def validate_structured_output( message: _messages.ModelResponse, allow_partial: bool = False, ) -> OutputDataT ``` ###### Returns `OutputDataT` ##### validate\_response\_output `@async` ```python def validate_response_output( message: _messages.ModelResponse, allow_partial: bool = False, ) -> OutputDataT ``` Validate a structured result message. ###### Returns `OutputDataT` ### StreamedRunResultSync **Bases:** `Generic[AgentDepsT, OutputDataT]` Synchronous wrapper for [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) that only exposes sync methods. #### Attributes ##### response Return the current state of the response. **Type:** `_messages.ModelResponse` ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### is\_complete Whether the stream has all been received. This is set to `True` when one of [`stream_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_output), [`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_text), [`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_responses) or [`get_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.get_output) completes. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) #### Methods ##### all\_messages ```python def all_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return the history of messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message.
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages ```python def new_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages\_json ```python def new_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### stream\_output ```python def stream_output(debounce_by: float | None = 0.1) -> Iterator[OutputDataT] ``` Stream the output as an iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[`OutputDataT`\] -- An iterable of the response data. ###### Parameters **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the output chunks by. `None` means no debouncing.
Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. ##### stream\_text ```python def stream_text(delta: bool = False, debounce_by: float | None = 0.1) -> Iterator[str] ``` Stream the text result as an iterable. Note Result validators will NOT be called on the text result if `delta=True`. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ###### Parameters **`delta`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` if `True`, yield each chunk of text as it is received; if `False` (default), yield the full text up to the current point. **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. ##### stream\_responses ```python def stream_responses( debounce_by: float | None = 0.1, ) -> Iterator[tuple[_messages.ModelResponse, bool]] ``` Stream the response as an iterable of Structured LLM Messages. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\] -- An iterable of the structured response message and whether that is the last message. ###### Parameters **`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1` how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. ##### get\_output ```python def get_output() -> OutputDataT ``` Stream the whole response, validate and return it. ###### Returns `OutputDataT` ##### usage ```python def usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. ###### Returns [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) ##### timestamp ```python def timestamp() -> datetime ``` Get the timestamp of the response. ###### Returns [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ##### validate\_response\_output ```python def validate_response_output( message: _messages.ModelResponse, allow_partial: bool = False, ) -> OutputDataT ``` Validate a structured result message. ###### Returns `OutputDataT` --- # [pydantic_ai.retries](https://pydantic.dev/docs/ai/api/pydantic-ai/retries/) # pydantic\_ai.retries Retry utilities based on tenacity, especially for HTTP requests. This module provides HTTP transport wrappers and wait strategies that integrate with the tenacity library to add retry capabilities to HTTP requests. The transports can be used with HTTP clients that support custom transports (such as httpx), while the wait strategies can be used with any tenacity retry decorator.
The module includes: - TenacityTransport: Synchronous HTTP transport with retry capabilities - AsyncTenacityTransport: Asynchronous HTTP transport with retry capabilities - wait\_retry\_after: Wait strategy that respects HTTP Retry-After headers ### RetryConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. #### Attributes ##### sleep A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float)\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\] ##### stop A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`. **Type:** `StopBaseT` ##### wait A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`. **Type:** `WaitBaseT` ##### retry A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`. **Type:** `SyncRetryBaseT` | `RetryBaseT` ##### before A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\] ##### after A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\] ##### before\_sleep An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### reraise Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`. 
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### retry\_error\_cls The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RetryError`\] ##### retry\_error\_callback An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) ### TenacityTransport **Bases:** `BaseTransport` Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate\_response function can be used to convert HTTP responses into exceptions that trigger retries. #### Constructor Parameters **`wrapped`** : `BaseTransport` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The underlying transport to wrap and add retry functionality to. **`config`** : `RetryConfig` The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. **`validate_response`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`Response`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. #### Methods ##### handle\_request ```python def handle_request(request: Request) -> Response ``` Handle an HTTP request with retry logic. ###### Returns `Response` -- The HTTP response. ###### Parameters **`request`** : `Request` The HTTP request to handle. ###### Raises - `RuntimeError` -- If the retry controller did not make any attempts. - `Exception` -- Any exception raised by the wrapped transport or validation function. ### AsyncTenacityTransport **Bases:** `AsyncBaseTransport` Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate\_response function can be used to convert HTTP responses into exceptions that trigger retries. 
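For illustration, here is a minimal sketch of wiring the transport into an `httpx.AsyncClient` (assuming `tenacity` is installed; the resulting client could then be passed to a model provider via its `http_client` parameter):

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import retry_if_exception_type, stop_after_attempt

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after

transport = AsyncTenacityTransport(
    config=RetryConfig(
        # Retry whenever validate_response (below) raises for an error status:
        retry=retry_if_exception_type(HTTPStatusError),
        # Respect Retry-After headers, waiting at most 120s between attempts:
        wait=wait_retry_after(max_wait=120),
        stop=stop_after_attempt(5),
        reraise=True,
    ),
    # Convert error responses into exceptions so they can trigger retries:
    validate_response=lambda response: response.raise_for_status(),
)
client = AsyncClient(transport=transport)
```

With `reraise=True`, the original `HTTPStatusError` is raised once the attempts are exhausted, rather than a tenacity `RetryError`.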
#### Constructor Parameters **`wrapped`** : `AsyncBaseTransport` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The underlying async transport to wrap and add retry functionality to. **`config`** : `RetryConfig` The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. **`validate_response`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`Response`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. #### Methods ##### handle\_async\_request `@async` ```python def handle_async_request(request: Request) -> Response ``` Handle an async HTTP request with retry logic. ###### Returns `Response` -- The HTTP response. ###### Parameters **`request`** : `Request` The HTTP request to handle. ###### Raises - `RuntimeError` -- If the retry controller did not make any attempts. - `Exception` -- Any exception raised by the wrapped transport or validation function. ### wait\_retry\_after ```python def wait_retry_after( fallback_strategy: Callable[[RetryCallState], float] | None = None, max_wait: float = 300, ) -> Callable[[RetryCallState], float] ``` Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry #### Returns [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`float`](https://docs.python.org/3/library/functions.html#float)\] -- A wait function that can be used with tenacity retry decorators. #### Parameters **`fallback_strategy`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. **`max_wait`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `300` Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). --- # [pydantic_ai.run](https://pydantic.dev/docs/ai/api/pydantic-ai/run/) # pydantic\_ai.run ### AgentRun **Bases:** `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent). You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) is reached, the run finishes and [`result`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.result) becomes available. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method for more granular control. #### Attributes ##### ctx The current context of the agent run. **Type:** `GraphRunContext`\[`_agent_graph.GraphAgentState`, `_agent_graph.GraphAgentDeps`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ##### next\_node The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. **Type:** `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] ##### result The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node, `result` is populated with an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult). **Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### all\_messages ```python def all_messages() -> list[_messages.ModelMessage] ``` Return all messages for the run so far. Messages from older runs are included. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ##### new\_messages ```python def new_messages() -> list[_messages.ModelMessage] ``` Return new messages for the run so far. Messages from older runs are excluded. 
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] ##### new\_messages\_json ```python def new_messages_json() -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ##### \_\_aiter\_\_ ```python def __aiter__( ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]] ``` Provide async-iteration over the nodes in the agent run. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]\] ##### \_\_anext\_\_ `@async` ```python def __anext__( ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ``` Advance to the next node automatically based on the last returned node. Note: this uses the graph run's internal iteration which does NOT call node hooks (`before_node_run`, `wrap_node_run`, `after_node_run`, `on_node_run_error`). Use `next()` for capability-hooked iteration, or use `agent.run()` which drives via `next()` automatically. ###### Returns `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] ##### next `@async` ```python def next( node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` ###### Returns `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if the run has completed.
###### Parameters **`node`** : `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] The node to run next in the graph. ##### usage ```python def usage() -> _usage.RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. ###### Returns `_usage.RunUsage` ### AgentRunResult **Bases:** `Generic[OutputDataT]` The final result of an agent run. #### Attributes ##### output The output data from the agent run. **Type:** `OutputDataT` ##### response Return the last response from the message history. **Type:** `_messages.ModelResponse` ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### all\_messages ```python def all_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return the history of messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### all\_messages\_json ```python def all_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### new\_messages ```python def new_messages( output_tool_return_content: str | None = None, ) -> list[_messages.ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.
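For example, a short sketch of carrying one run's messages into the next run as history (the model name is illustrative):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result = agent.run_sync('What is the capital of France?')
# Pass only this run's messages as history for a follow-up run:
followup = agent.run_sync(
    'And what is its population?',
    message_history=result.new_messages(),
)
print(followup.output)
```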
##### new\_messages\_json ```python def new_messages_json(output_tool_return_content: str | None = None) -> bytes ``` Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) as JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages. ###### Parameters **`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. ##### usage ```python def usage() -> _usage.RunUsage ``` Return the usage of the whole run. ###### Returns `_usage.RunUsage` ##### timestamp ```python def timestamp() -> datetime ``` Return the timestamp of the last response. ###### Returns [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) ### AgentRunResultEvent **Bases:** `Generic[OutputDataT]` An event indicating that the agent run has ended, containing the final result of the run. #### Attributes ##### result The result of the run. **Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\] ##### event\_kind Event type identifier, used as a discriminator. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['agent\_run\_result'\] **Default:** `'agent_run_result'` --- # [pydantic_ai.settings](https://pydantic.dev/docs/ai/api/pydantic-ai/settings/) # pydantic\_ai.settings ### ModelSettings **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. #### Attributes ##### max\_tokens The maximum number of tokens to generate before stopping. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - MCP Sampling - Outlines (all providers) - xAI **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### temperature Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) - xAI **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### top\_p An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top\_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) - xAI **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### timeout Override the client-level default timeout for a request, in seconds.
Supported by: - Gemini - Anthropic - OpenAI - Groq - Mistral - xAI **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | `Timeout` ##### parallel\_tool\_calls Whether to allow parallel tool calls. Supported by: - OpenAI (some models, not o1) - Groq - Anthropic - xAI **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### seed The random seed to use for the model, theoretically allowing for deterministic results. Supported by: - OpenAI - Groq - Cohere - Mistral - Gemini - Outlines (LlamaCpp, VLLMOffline) **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### presence\_penalty Penalize new tokens based on whether they have appeared in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral - Outlines (LlamaCpp, SgLang, VLLMOffline) - xAI **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### frequency\_penalty Penalize new tokens based on their existing frequency in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral - Outlines (LlamaCpp, SgLang, VLLMOffline) - xAI **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### logit\_bias Modify the likelihood of specified tokens appearing in the completion. Supported by: - OpenAI - Groq - Outlines (Transformers, LlamaCpp, VLLMOffline) **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] ##### stop\_sequences Sequences that will cause the model to stop generating. Supported by: - OpenAI - Anthropic - Bedrock - Mistral - Groq - Cohere - Google - xAI **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### extra\_headers Extra headers to send to the model. Supported by: - OpenAI - Anthropic - Gemini - Groq - xAI **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### thinking Enable or configure thinking/reasoning for the model. - `True`: Enable thinking with the provider's default effort level. - `False`: Disable thinking (silently ignored if the model always thinks). - `'minimal'`/`'low'`/`'medium'`/`'high'`/`'xhigh'`: Enable thinking at a specific effort level. When omitted, the model uses its default behavior (which may include thinking for reasoning models). Provider-specific thinking settings (e.g., `anthropic_thinking`, `openai_reasoning_effort`) take precedence over this unified field. Supported by: - Anthropic - OpenAI - Gemini - Groq - Bedrock - OpenRouter - Cerebras - xAI **Type:** `ThinkingLevel` ##### extra\_body Extra body to send to the model. Supported by: - OpenAI - Anthropic - Groq - Outlines (all providers) **Type:** [`object`](https://docs.python.org/3/glossary.html#term-object) --- # [pydantic_ai.tools](https://pydantic.dev/docs/ai/api/pydantic-ai/tools/) # pydantic\_ai.tools ### RunContext **Bases:** `Generic[RunContextAgentDepsT]` Information about the current call. #### Attributes ##### deps Dependencies for the agent. **Type:** `RunContextAgentDepsT` ##### model The model used in this run. **Type:** `Model` ##### usage LLM usage associated with the run. 
**Type:** [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) ##### agent The agent running this context, or `None` if not set. **Type:** `AbstractAgent`\[`RunContextAgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `field(default=None, repr=False)` ##### prompt The original user prompt passed to the run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### messages Messages exchanged in the conversation so far. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] **Default:** `field(default_factory=(list[_messages.ModelMessage]))` ##### validation\_context Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) for tool args and run outputs. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None` ##### tracer The tracer to use for tracing the run. **Type:** `Tracer` **Default:** `field(default_factory=NoOpTracer)` ##### trace\_include\_content Whether to include the content of the messages in the trace. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### instrumentation\_version Instrumentation settings version, if instrumentation is enabled. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `DEFAULT_INSTRUMENTATION_VERSION` ##### retries Number of retries for each tool so far. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `field(default_factory=(dict[str, int]))` ##### tool\_call\_id The ID of the tool call. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_name Name of the tool being called. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### retry Number of retries so far. For tool calls, this is the number of retries of the specific tool. For output validation, this is the number of output validation retries. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### max\_retries The maximum number of retries allowed. For tool calls, this is the maximum retries for the specific tool. For output validation, this is the maximum output validation retries. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### run\_step The current step in the run. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### tool\_call\_approved Whether a tool call that required approval has now been approved. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### tool\_call\_metadata Metadata from `DeferredToolResults.metadata[tool_call_id]`, available when `tool_call_approved=True`. 
**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None` ##### partial\_output Whether the output passed to an output validator is partial. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### run\_id The unique identifier for the agent run. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### metadata Metadata associated with this agent run, if configured. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### model\_settings The resolved model settings for the current run step. Populated before each model request, after all model settings layers (model defaults, agent-level, capability, and run-level) have been merged. Available in model request hooks (`before_model_request`, `wrap_model_request`, `after_model_request`). Currently `None` in tool hooks, output validators, and during agent construction. **Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### tool\_manager The tool manager for the current run step. Provides access to tool validation and execution, including tracing and capability hooks. Useful for toolsets that need to dispatch tool calls programmatically (e.g. code execution sandboxes). Not available in `TemporalRunContext` -- it is not serializable across Temporal activity boundaries. **Type:** `ToolManager`\[`RunContextAgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### last\_attempt Whether this is the last attempt at running this tool before an error is raised. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with the same tool call IDs. See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information. #### Attributes ##### calls Tool calls that require external execution. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))` ##### approvals Tool calls that require human-in-the-loop approval. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))` ##### metadata Metadata for deferred tool calls, keyed by `tool_call_id`.
**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))` ### ToolApproved Indicates that a tool call has been approved and that the tool function should be executed. #### Attributes ##### override\_args Optional tool call arguments to use instead of the original arguments. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### ToolDenied Indicates that a tool call has been denied and that a denial message should be returned to the model. #### Attributes ##### message The message to return to the model. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'The tool call was denied.'` ### DeferredToolResults Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object from the previous run. See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information. #### Attributes ##### calls Map of tool call IDs to results for tool calls that required external execution. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `DeferredToolCallResult` | [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] **Default:** `field(default_factory=(dict[str, DeferredToolCallResult | Any]))` ##### approvals Map of tool call IDs to results for tool calls that required human-in-the-loop approval. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`bool`](https://docs.python.org/3/library/functions.html#bool) | `DeferredToolApprovalResult`\] **Default:** `field(default_factory=(dict[str, bool | DeferredToolApprovalResult]))` ##### metadata Metadata for deferred tool calls, keyed by `tool_call_id`. Each value will be available in the tool's RunContext as `tool_call_metadata`. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))` ### Tool **Bases:** `Generic[ToolAgentDepsT]` A tool function for an agent. #### Attributes ##### function\_schema The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. 
**Type:** `_function_schema.FunctionSchema` **Default:** `function_schema or _function_schema.function_schema(function, schema_generator, tool_name=(self.name), takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions)` #### Methods ##### \_\_init\_\_ ```python def __init__( function: ToolFuncEither[ToolAgentDepsT, ToolParams], takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[ToolAgentDepsT] | None = None, args_validator: ArgsValidatorFunc[ToolAgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, defer_loading: bool = False, include_return_schema: bool | None = None, function_schema: _function_schema.FunctionSchema | None = None, ) ``` Create a new tool instance. Example usage: ```python from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` ###### Parameters **`function`** : `ToolFuncEither`\[`ToolAgentDepsT`, `ToolParams`\] The Python function to call as the tool. **`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether the function takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) first argument, this is inferred if unset. **`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Maximum number of retries allowed for this tool, set to the agent default if `None`. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Name of the tool, inferred from the function if `None`. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Description of the tool, inferred from the function if `None`. **`prepare`** : `ToolPrepareFunc`\[`ToolAgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc). 
**`args_validator`** : `ArgsValidatorFunc`\[`ToolAgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` _Default:_ `'auto'` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If True, raise an error if a parameter description is missing. Defaults to False. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema` The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the function requires a sequential/serial execution environment. Defaults to False. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to hide this tool until it's discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. 
If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. **`function_schema`** : `_function_schema.FunctionSchema` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The function schema to use for the tool. If not provided, it will be generated. ##### from\_schema `@classmethod` ```python def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, args_validator: ArgsValidatorFunc[Any, ...] | None = None, ) -> Self ``` Creates a Pydantic tool from a function and a JSON schema. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A Pydantic tool that calls the function ###### Parameters **`function`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[..., [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The function to call. This will be called with keywords only. Schema validation of the arguments is skipped, but a custom `args_validator` will still run if provided. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The unique name of the tool that clearly communicates its purpose **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. **`json_schema`** : `JsonSchemaValue` The schema for the function arguments **`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` An optional boolean parameter indicating whether the function accepts the context object as an argument. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the function requires a sequential/serial execution environment. Defaults to False. **`args_validator`** : `ArgsValidatorFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), ...\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). ##### prepare\_tool\_def `@async` ```python def prepare_tool_def(ctx: RunContext[ToolAgentDepsT]) -> ToolDefinition | None ``` Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. ###### Returns [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) | [`None`](https://docs.python.org/3/library/constants.html#None) -- Return a `ToolDefinition` or `None` if the tool should not be registered for this run. ### ToolDefinition Definition of a tool passed to a model. This is used for both function tools and output tools. #### Attributes ##### name The name of the tool.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### parameters\_json\_schema The JSON schema for the tool's parameters. **Type:** `ObjectJsonSchema` **Default:** `field(default_factory=(lambda: {'type': 'object', 'properties': {}}))` ##### description The description of the tool. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### outer\_typed\_dict\_key The key in the outer `TypedDict` that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### strict Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters\_json\_schema. Note: this is currently supported by OpenAI and Anthropic models. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### sequential Whether this tool requires a sequential/serial execution environment. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### kind The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. **Type:** `ToolKind` **Default:** `field(default='function')` ##### metadata Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization. For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### timeout Timeout in seconds for tool execution. If the tool takes longer than this, a retry prompt is returned to the model. Defaults to None (no timeout). **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### defer\_loading Whether this tool should be hidden from the model until discovered via tool search.
See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### prefer\_builtin If set, this function tool is a local fallback for the builtin tool with the given unique\_id. When the model supports the corresponding builtin tool natively, this function tool is removed from the request. When the model does not support the builtin, the builtin is removed and this function tool stays. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### return\_schema The JSON schema for the tool's return value. For models that natively support return schemas (e.g. Google Gemini), this is passed as a structured field in the API request. For other models, it is injected into the tool's description as JSON text. Only included when `include_return_schema` resolves to `True`. **Type:** `ObjectJsonSchema` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### include\_return\_schema Whether to include the return schema in the tool definition sent to the model. When `True`, the `return_schema` will be preserved and sent to the model. When `False`, the `return_schema` will be cleared before sending. When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### function\_signature The function signature shape for this tool. If not provided, computed from the JSON schema in `__post_init__`. Name and description are not stored on the signature -- pass them at render time via `sig.render(body, name=td.name, description=td.description)`. **Type:** `FunctionSignature` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `field(default=None, repr=False, compare=False)` ##### defer Whether calls to this tool will be deferred. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more info. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) #### Methods ##### render\_signature ```python def render_signature(body: str, kwargs: Any = {}) -> str ``` Render the function signature with this tool's name and description. Convenience wrapper around `self.function_signature.render()` that supplies `name` and `description` from this tool definition. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### matches\_tool\_selector `@async` ```python def matches_tool_selector( selector: ToolSelector[AgentDepsT], ctx: RunContext[AgentDepsT], tool_def: ToolDefinition, ) -> bool ``` Check whether a tool definition matches a [`ToolSelector`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolSelector). #### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) -- `True` if the tool matches the selector. #### Parameters **`selector`** : `ToolSelector`\[`AgentDepsT`\] The selector to check against. **`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\] The current run context. **`tool_def`** : [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) The tool definition to test. 
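Example -- a minimal, hypothetical sketch of checking a `dict` selector (see [`ToolSelector`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolSelector)) against a tool's metadata; the tool definition and deps type are made up, the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) would be supplied by an agent run, and we assume `matches_tool_selector` is importable from `pydantic_ai.tools` as documented here:

```python
from pydantic_ai import RunContext
from pydantic_ai.tools import ToolDefinition, matches_tool_selector

# Hypothetical tool definition, tagged with metadata by its toolset.
tool_def = ToolDefinition(
    name='get_forecast',
    metadata={'category': 'weather', 'region': 'eu'},
)


async def is_weather_tool(ctx: RunContext[None]) -> bool:
    # A dict selector does a deep inclusion check against the tool's metadata,
    # so this matches even though the metadata has the extra 'region' key.
    return await matches_tool_selector({'category': 'weather'}, ctx, tool_def)
```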
### AgentDepsT Type variable for agent dependencies. **Default:** `TypeVar('AgentDepsT', default=None, contravariant=True)` ### ToolParams Retrieval function param spec. **Default:** `ParamSpec('ToolParams', default=...)` ### SystemPromptFunc A function that may or may not take `RunContext` as an argument, and may or may not be async. Functions which return None are excluded from model requests. Usage `SystemPromptFunc[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], str | None] | Callable[[RunContext[AgentDepsT]], Awaitable[str | None]] | Callable[[], str | None] | Callable[[], Awaitable[str | None]]` ### ToolFuncContext A tool function that takes `RunContext` as the first argument. Usage `ToolFuncContext[AgentDepsT, ToolParams]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[Concatenate[RunContext[AgentDepsT], ToolParams], Any]` ### ToolFuncPlain A tool function that does not take `RunContext` as the first argument. Usage `ToolFuncPlain[ToolParams]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[ToolParams, Any]` ### ToolFuncEither Either kind of tool function. This is just a union of [`ToolFuncContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolFuncContext) and [`ToolFuncPlain`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolFuncPlain). Usage `ToolFuncEither[AgentDepsT, ToolParams]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `ToolFuncContext[AgentDepsT, ToolParams] | ToolFuncPlain[ToolParams]` ### ArgsValidatorFunc A function that validates tool arguments before execution. The validator receives the same typed parameters as the tool function, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument for dependency access. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[Concatenate[RunContext[AgentDepsT], ToolParams], Awaitable[None]] | Callable[Concatenate[RunContext[AgentDepsT], ToolParams], None]` ### ToolPrepareFunc Definition of a function that can prepare a tool definition at call time. Both sync and async functions are accepted. See [tool docs](/docs/ai/tools-toolsets/tools-advanced#tool-prepare) for more information. Example -- here `only_if_42` is valid as a `ToolPrepareFunc`: ```python from pydantic_ai import RunContext, Tool from pydantic_ai.tools import ToolDefinition def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' hitchhiker = Tool(hitchhiker, prepare=only_if_42) ``` Usage `ToolPrepareFunc[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], 'ToolDefinition'], Union[Awaitable['ToolDefinition | None'], 'ToolDefinition', None]]` ### ToolsPrepareFunc Definition of a function that can prepare the tool definition of all tools for each step. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step.
Both sync and async functions are accepted. Example -- here `turn_on_strict_if_openai` is valid as a `ToolsPrepareFunc`: ```python from dataclasses import replace from pydantic_ai import Agent, RunContext from pydantic_ai.tools import ToolDefinition def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs agent = Agent('openai:gpt-5.2', prepare_tools=turn_on_strict_if_openai) ``` Usage `ToolsPrepareFunc[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], list['ToolDefinition']], Awaitable['list[ToolDefinition] | None'] | list['ToolDefinition'] | None]` ### ToolSelectorFunc A callable that decides whether a tool matches a selection criterion. Receives the run context and a tool definition, returns `True` if the tool is selected. Both sync and async functions are accepted. Usage `ToolSelectorFunc[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], 'ToolDefinition'], bool | Awaitable[bool]]` ### ToolSelector Specifies which tools a capability or toolset wrapper should apply to. - `'all'`: matches every tool (default for most capabilities). - `Sequence[str]`: matches tools whose names are in the sequence. - `dict[str, Any]`: matches tools whose [`metadata`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.metadata) contains all the specified key-value pairs (deep inclusion check -- nested dicts are compared recursively, and the tool's metadata may have additional keys). - `Callable[[RunContext, ToolDefinition], bool | Awaitable[bool]]`: custom sync or async predicate. The first three forms are serializable for use in agent specs (YAML/JSON). Usage `ToolSelector[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['all'] | Sequence[str] | dict[str, Any] | ToolSelectorFunc[AgentDepsT]` ### BuiltinToolFunc Definition of a function that can prepare a builtin tool at call time. This is useful if you want to customize the builtin tool based on the run context (e.g. user dependencies), or omit it completely from a step. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], Awaitable[AbstractBuiltinTool | None] | AbstractBuiltinTool | None]` ### AgentBuiltinTool A builtin tool or a function that dynamically produces one. This is a convenience alias for `AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]` ### DocstringFormat Supported docstring formats. - `'google'` -- [Google-style](https://google.github.io/styleguide/pyguide.html#381-docstrings) docstrings. - `'numpy'` -- [Numpy-style](https://numpydoc.readthedocs.io/en/latest/format.html) docstrings. - `'sphinx'` -- [Sphinx-style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format) docstrings. - `'auto'` -- Automatically infer the format based on the structure of the docstring. 
**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['google', 'numpy', 'sphinx', 'auto']` ### DeferredToolApprovalResult Result for a tool call that required human-in-the-loop approval. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Annotated[ToolApproved | ToolDenied, Discriminator('kind')]` ### DeferredToolCallResult Result for a tool call that required external execution. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Annotated[Annotated[ToolReturn, Tag('tool-return')] | Annotated[ModelRetry, Tag('model-retry')] | Annotated[RetryPromptPart, Tag('retry-prompt')], Discriminator(_deferred_tool_call_result_discriminator)]` ### DeferredToolResult Result for a tool call that required approval or external execution. **Default:** `DeferredToolApprovalResult | DeferredToolCallResult` ### ToolAgentDepsT Type variable for agent dependencies for a tool. **Default:** `TypeVar('ToolAgentDepsT', default=object, contravariant=True)` ### ObjectJsonSchema Type representing JSON schema of an object, e.g. where `"type": "object"`. This type is used to define tool parameters (aka arguments) in [ToolDefinition](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition). With PEP 728 this should be a TypedDict with `type: Literal['object']` and `extra_items=Any`. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[str, Any]` ### ToolKind Kind of tool. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['function', 'output', 'external', 'unapproved']` --- # [pydantic_ai.toolsets](https://pydantic.dev/docs/ai/api/pydantic-ai/toolsets/) # pydantic\_ai.toolsets ### AbstractToolset **Bases:** `ABC`, `Generic[AgentDepsT]` A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information. #### Attributes ##### id An ID for the toolset that is unique among all toolsets registered with the same agent. If you're writing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### label The name of the toolset for use in error messages. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### tool\_name\_conflict\_hint A hint for how to avoid name conflicts with other toolsets for use in error messages. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### for\_run `@async` ```python def for_run(ctx: RunContext[AgentDepsT]) -> AbstractToolset[AgentDepsT] ``` Return the toolset to use for this agent run. Called once per run, before `__aenter__`. Override this to return a fresh instance for per-run state isolation. Default: return `self` (shared across runs).
###### Returns [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] ##### for\_run\_step `@async` ```python def for_run_step(ctx: RunContext[AgentDepsT]) -> AbstractToolset[AgentDepsT] ``` Return the toolset to use for this run step. Called at the start of each run step. Override this to return a modified instance for per-step state transitions. If returning a new instance, you are responsible for managing any lifecycle transitions (exiting old inner toolsets, entering new ones). Default: return `self` (no per-step changes). ###### Returns [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] ##### \_\_aenter\_\_ `@async` ```python def __aenter__() -> Self ``` Enter the toolset context. This is where you can set up network connections in a concrete implementation. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ##### \_\_aexit\_\_ `@async` ```python def __aexit__(args: Any = ()) -> bool | None ``` Exit the toolset context. This is where you can tear down network connections in a concrete implementation. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_instructions `@async` ```python def get_instructions( ctx: RunContext[AgentDepsT], ) -> str | InstructionPart | Sequence[str | InstructionPart] | None ``` Return instructions for how to use this toolset's tools. Override this method to provide instructions that help the agent understand how to use the tools in this toolset effectively. Simple implementations can return a plain `str`; advanced implementations can return [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) objects to indicate whether each instruction block is static or dynamic for caching purposes. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- Instruction string, `InstructionPart`, list of either, or `None`. Plain `str` values are treated as dynamic instructions by default. ###### Parameters **`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\] The run context for this agent run. ##### get\_tools `@abstractmethod` `@async` ```python def get_tools(ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]] ``` The tools that are available in this toolset.
###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ToolsetTool`\[`AgentDepsT`\]\] ##### call\_tool `@abstractmethod` `@async` ```python def call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool with the given arguments. ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the tool to call. **`tool_args`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The arguments to pass to the tool. **`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\] The run context. **`tool`** : `ToolsetTool`\[`AgentDepsT`\] The tool definition returned by [`get_tools`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_tools) that was called. ##### apply ```python def apply(visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### visit\_and\_replace ```python def visit_and_replace( visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]], ) -> AbstractToolset[AgentDepsT] ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function. ###### Returns [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] ##### filtered ```python def filtered( filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool | Awaitable[bool]], ) -> FilteredToolset[AgentDepsT] ``` Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](/docs/ai/tools-toolsets/toolsets#filtering-tools) for more information. ###### Returns [`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset)\[`AgentDepsT`\] ##### prefixed ```python def prefixed(prefix: str) -> PrefixedToolset[AgentDepsT] ``` Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](/docs/ai/tools-toolsets/toolsets#prefixing-tool-names) for more information. ###### Returns [`PrefixedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PrefixedToolset)\[`AgentDepsT`\] ##### prepared ```python def prepared(prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT] ``` Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](/docs/ai/tools-toolsets/toolsets#preparing-tool-definitions) for more information. ###### Returns [`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset)\[`AgentDepsT`\] ##### renamed ```python def renamed(name_map: dict[str, str]) -> RenamedToolset[AgentDepsT] ``` Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. 
See [toolset docs](/docs/ai/tools-toolsets/toolsets#renaming-tools) for more information. ###### Returns [`RenamedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.RenamedToolset)\[`AgentDepsT`\] ##### approval\_required ```python def approval_required( approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = lambda ctx, tool_def, tool_args: True, ) -> ApprovalRequiredToolset[AgentDepsT] ``` Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval) for more information. ###### Returns [`ApprovalRequiredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ApprovalRequiredToolset)\[`AgentDepsT`\] ##### defer\_loading ```python def defer_loading( tool_names: Sequence[str] | None = None, ) -> DeferredLoadingToolset[AgentDepsT] ``` Returns a new toolset that marks tools for deferred loading, hiding them until discovered via tool search. See [toolset docs](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more information. ###### Returns [`DeferredLoadingToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.DeferredLoadingToolset)\[`AgentDepsT`\] ###### Parameters **`tool_names`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional sequence of tool names to mark for deferred loading. If `None`, all tools are marked for deferred loading. ##### include\_return\_schemas ```python def include_return_schemas() -> IncludeReturnSchemasToolset[AgentDepsT] ``` Returns a new toolset that sets `include_return_schema=True` on all tools. This causes the model to receive return type information for the tools in this toolset. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured field. For other models, it is injected into the tool description as JSON text. This is the toolset-level equivalent of the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability, which can be used to enable return schemas across all toolsets or a subset matched by a [`ToolSelector`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolSelector). ###### Returns [`IncludeReturnSchemasToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.IncludeReturnSchemasToolset)\[`AgentDepsT`\] ##### with\_metadata ```python def with_metadata(metadata: Any = {}) -> SetMetadataToolset[AgentDepsT] ``` Returns a new toolset that merges the given metadata onto all tools. ###### Returns [`SetMetadataToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.SetMetadataToolset)\[`AgentDepsT`\] ### CombinedToolset **Bases:** `AbstractToolset[AgentDepsT]` A toolset that combines multiple toolsets. See [toolset docs](/docs/ai/tools-toolsets/toolsets#combining-toolsets) for more information. ### ExternalToolset **Bases:** `AbstractToolset[AgentDepsT]` A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](/docs/ai/tools-toolsets/toolsets#external-toolset) for more information. ### ApprovalRequiredToolset **Bases:** `WrapperToolset[AgentDepsT]` A toolset that requires (some) calls to tools it contains to be approved. 
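Example -- a minimal sketch of obtaining such a toolset via the `approval_required()` method documented above; the tool and the approval policy are hypothetical:

```python
from typing import Any

from pydantic_ai import Agent, FunctionToolset, RunContext
from pydantic_ai.tools import ToolDefinition

toolset = FunctionToolset()


@toolset.tool_plain
def delete_file(path: str) -> str:
    """Delete the file at the given path."""
    return f'deleted {path}'


def needs_approval(ctx: RunContext[Any], tool_def: ToolDefinition, tool_args: dict[str, Any]) -> bool:
    # Hypothetical policy: only deletions outside /tmp need human sign-off.
    return not str(tool_args.get('path', '')).startswith('/tmp/')


agent = Agent('test', toolsets=[toolset.approval_required(needs_approval)])
```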
See [toolset docs](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval) for more information. ### FilteredToolset **Bases:** `WrapperToolset[AgentDepsT]` A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. Both sync and async filter functions are accepted. See [toolset docs](/docs/ai/tools-toolsets/toolsets#filtering-tools) for more information. ### FunctionToolset **Bases:** `AbstractToolset[AgentDepsT]` A toolset that lets Python functions be used as tools. See [toolset docs](/docs/ai/tools-toolsets/toolsets#function-toolset) for more information. #### Methods ##### \_\_init\_\_ ```python def __init__( tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], max_retries: int = 1, timeout: float | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, defer_loading: bool = False, include_return_schema: bool | None = None, id: str | None = None, instructions: str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None = None, ) ``` Build a new function toolset. ###### Parameters **`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] _Default:_ `[]` The tools to add to the toolset. **`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1` The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If a tool takes longer than this, a retry prompt is returned to the model. Individual tools can override this with their own timeout. Defaults to None (no timeout). **`docstring_format`** : `DocstringFormat` _Default:_ `'auto'` Format of tool docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema` The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. 
**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to hide tools from the model until discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. Applies to all tools, unless overridden when adding a tool. **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include return schemas in tool definitions sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. Applies to all tools, unless overridden when adding a tool. **`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. **`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `SystemPromptFunc`\[`AgentDepsT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | `SystemPromptFunc`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Instructions for this toolset that are automatically included in the model request. Can be a string, a function (sync or async, with or without `RunContext`), or a sequence of these. 
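Example -- a minimal sketch of building a toolset from plain functions with toolset-wide defaults; the math tools themselves are hypothetical:

```python
from pydantic_ai import Agent, FunctionToolset


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


# Toolset-wide settings apply to every tool unless overridden when adding a tool.
toolset = FunctionToolset(
    [add, multiply],
    max_retries=2,
    instructions='Use the math tools rather than doing arithmetic yourself.',
)

agent = Agent('test', toolsets=[toolset])
```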
##### tool ```python def tool( func: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams] def tool( name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, defer_loading: bool | None = None, include_return_schema: bool | None = None, ) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]] ``` Decorator to register a tool function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. Can decorate sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union, so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`func`** : `ToolFuncContext`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The tool function to register. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the tool, defaults to the function name. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The description of the tool, defaults to the function docstring. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. **`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc). **`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution.
The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to hide this tool until it's discovered via tool search. 
See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset. **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset. ##### tool\_plain ```python def tool_plain(func: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams] def tool_plain( name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, defer_loading: bool | None = None, include_return_schema: bool | None = None, ) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]] ``` Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union, so the signature of functions decorated with `@toolset.tool_plain` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset toolset = FunctionToolset() @toolset.tool_plain def foobar(x: int) -> int: return x + 1 @toolset.tool_plain(retries=2) async def spam(y: float) -> float: return y * 2.0 agent = Agent('test', toolsets=[toolset]) result = agent.run_sync('foobar') print(result.output) #> {"foobar":1,"spam":0.0} ``` ###### Returns [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ###### Parameters **`func`** : `ToolFuncPlain`\[`ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The tool function to register. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the tool, defaults to the function name. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The description of the tool, defaults to the function docstring. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The number of retries to allow for this tool, defaults to the toolset's default retries, which defaults to 1. **`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).
**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument -- even though the tool function itself does not take `RunContext` when using `tool_plain`. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. **`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. 
**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to hide this tool until it's discovered via tool search. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset. **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset. ##### instructions ```python def instructions(func: SystemPromptFunc[AgentDepsT]) -> SystemPromptFunc[AgentDepsT] ``` Decorator to register an instructions function for this toolset. The function can be sync or async, and can optionally take a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. Example: ```python from pydantic_ai import FunctionToolset, RunContext toolset = FunctionToolset[int]() @toolset.instructions def my_instructions(ctx: RunContext[int]) -> str: return 'Always use the search tool when looking for information.' @toolset.tool def search(ctx: RunContext[int], query: str) -> str: return f'Results for: {query}' ``` ###### Returns `SystemPromptFunc`\[`AgentDepsT`\] ###### Parameters **`func`** : `SystemPromptFunc`\[`AgentDepsT`\] The instructions function to register. ##### add\_function ```python def add_function( func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, defer_loading: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, include_return_schema: bool | None = None, ) -> Tool[AgentDepsT] ``` Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema). ###### Returns [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] ###### Parameters **`func`** : `ToolFuncEither`\[`AgentDepsT`, `ToolParams`\] The tool function to register. **`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether the function takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. If `None`, this is inferred from the function signature. 
**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the tool, defaults to the function name. **`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The description of the tool, defaults to the function docstring. **`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. **`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc). **`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc). **`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset. **`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. **`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. **`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset. **`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. 
**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. **`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to hide this tool until it's discovered via tool search. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). **`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset. ##### add\_tool ```python def add_tool(tool: Tool[AgentDepsT]) -> None ``` Add a tool to the toolset. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`tool`** : [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] The tool to add. ### IncludeReturnSchemasToolset **Bases:** `PreparedToolset[AgentDepsT]` A toolset that sets `include_return_schema=True` on all its tools. See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information. ### DeferredLoadingToolset **Bases:** `PreparedToolset[AgentDepsT]` A toolset that marks tools for deferred loading, hiding them from the model until discovered via tool search. See [toolset docs](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more information. #### Attributes ##### tool\_names Optional set of tool names to mark for deferred loading. If `None`, all tools are marked for deferred loading. **Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_names` ### PrefixedToolset **Bases:** `WrapperToolset[AgentDepsT]` A toolset that prefixes the names of the tools it contains. See [toolset docs](/docs/ai/tools-toolsets/toolsets#prefixing-tool-names) for more information.
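For example, a minimal sketch of wrapping a toolset with `PrefixedToolset` (the toolset, tool, and prefix names are illustrative, and the exact prefixed-name format shown in the comment is an assumption):

```python
from pydantic_ai.toolsets import FunctionToolset, PrefixedToolset

weather_toolset = FunctionToolset()

@weather_toolset.tool
def current_temperature(city: str) -> str:
    """Get the current temperature for a city."""
    return f'The temperature in {city} is 21 degrees.'

# Expose the same tools under prefixed names,
# e.g. 'current_temperature' becomes 'weather_current_temperature'.
prefixed = PrefixedToolset(weather_toolset, prefix='weather')
```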
### RenamedToolset **Bases:** `WrapperToolset[AgentDepsT]` A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](/docs/ai/tools-toolsets/toolsets#renaming-tools) for more information. ### SetMetadataToolset **Bases:** `PreparedToolset[AgentDepsT]` A toolset that merges metadata key-value pairs onto all its tools. See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information. ### PreparedToolset **Bases:** `WrapperToolset[AgentDepsT]` A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](/docs/ai/tools-toolsets/toolsets#preparing-tool-definitions) for more information. ### WrapperToolset **Bases:** `AbstractToolset[AgentDepsT]` A toolset that wraps another toolset and delegates to it. See [toolset docs](/docs/ai/tools-toolsets/toolsets#wrapping-a-toolset) for more information. #### Methods ##### get\_instructions `@async` ```python def get_instructions( ctx: RunContext[AgentDepsT], ) -> str | InstructionPart | Sequence[str | InstructionPart] | None ``` Delegate instructions to the wrapped toolset. This explicit delegation ensures type safety and proper propagation of the instructions from wrapped toolsets to the agent's system prompt. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ### ToolsetFunc A sync/async function which takes a run context and returns a toolset. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], AbstractToolset[AgentDepsT] | None | Awaitable[AbstractToolset[AgentDepsT] | None]]` ### FastMCPToolset **Bases:** `AbstractToolset[AgentDepsT]` A FastMCP Toolset that uses the FastMCP Client to call tools from a local or remote MCP Server. The Toolset can accept a FastMCP Client, a FastMCP Transport, or any other object from which a FastMCP Transport can be created. See [https://gofastmcp.com/clients/transports](https://gofastmcp.com/clients/transports) for a full list of transports available. #### Attributes ##### client The FastMCP client to use. **Type:** `Client`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### max\_retries The maximum number of retries to attempt if a tool call fails. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `max_retries` ##### tool\_error\_behavior The behavior to take when a tool error occurs. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['model\_retry', 'error'\] **Default:** `tool_error_behavior` ##### include\_instructions Whether to include the server's instructions in the agent's instructions. Defaults to `False` for backward compatibility. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `include_instructions` ##### include\_return\_schema Whether to include return schemas in tool definitions sent to the model. When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.
**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `include_return_schema` ##### instructions Access the instructions sent by the FastMCP server during initialization. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### get\_instructions `@async` ```python def get_instructions(ctx: RunContext[AgentDepsT]) -> messages.InstructionPart | None ``` Return the FastMCP server's instructions for how to use its tools. If [`include_instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset.include_instructions) is `True`, returns the [`instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset.instructions) sent by the FastMCP server during initialization. Otherwise, returns `None`. Instructions from external servers are marked as dynamic since they may change between connections. ###### Returns [`messages.InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- An `InstructionPart` with the server's instructions if `include_instructions` is enabled, otherwise `None`. ###### Parameters **`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\] The run context for this agent run. --- # [pydantic_ai.usage](https://pydantic.dev/docs/ai/api/pydantic-ai/usage/) # pydantic\_ai.usage ### UsageBase #### Attributes ##### input\_tokens Number of input/prompt tokens. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`int`](https://docs.python.org/3/library/functions.html#int), `Field`(`validation_alias`\=(`AliasChoices`(`input_tokens`, `request_tokens`)))\] **Default:** `0` ##### cache\_write\_tokens Number of tokens written to the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### cache\_read\_tokens Number of tokens read from the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### output\_tokens Number of output/completion tokens. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`int`](https://docs.python.org/3/library/functions.html#int), `Field`(`validation_alias`\=(`AliasChoices`(`output_tokens`, `response_tokens`)))\] **Default:** `0` ##### input\_audio\_tokens Number of audio input tokens. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### cache\_audio\_read\_tokens Number of audio tokens read from the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### output\_audio\_tokens Number of audio output tokens. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### details Any extra details returned by the model. 
**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\], `BeforeValidator`([`lambda`](https://docs.python.org/3/glossary.html#term-lambda) `d`: `d` [`or`](https://docs.python.org/3/reference/expressions.html#or) {})\] **Default:** `dataclasses.field(default_factory=(dict[str, int]))` ##### total\_tokens Sum of `input_tokens + output_tokens`. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) #### Methods ##### \_\_copy\_\_ ```python def __copy__() -> UsageBase ``` Shallow copy that also copies mutable fields like `details`. ###### Returns `UsageBase` ##### opentelemetry\_attributes ```python def opentelemetry_attributes() -> dict[str, int] ``` Get the token usage values as OpenTelemetry attributes. ###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] ##### has\_values ```python def has_values() -> bool ``` Whether any values are set and non-zero. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ### RequestUsage **Bases:** `UsageBase` LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). #### Methods ##### incr ```python def incr(incr_usage: RequestUsage) -> None ``` Increment the usage in place. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`incr_usage`** : [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) The usage to increment by. ##### \_\_add\_\_ ```python def __add__(other: RequestUsage) -> RequestUsage ``` Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. ###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ##### extract `@classmethod` ```python def extract( cls, data: Any, provider: str, provider_url: str, provider_fallback: str, api_flavor: str = 'default', details: dict[str, Any] | None = None, ) -> RequestUsage ``` Extract usage information from the response data using genai-prices. ###### Returns [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) ###### Parameters **`data`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) The response data from the model API. **`provider`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The actual provider ID **`provider_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The provider base\_url **`provider_fallback`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. **`api_flavor`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'default'` The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. 
**`details`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Becomes the `details` field on the returned `RequestUsage` for convenience. ### RunUsage **Bases:** `UsageBase` LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. #### Attributes ##### requests Number of requests made to the LLM API. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### tool\_calls Number of successful tool calls executed during the run. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### input\_tokens Total number of input/prompt tokens. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### cache\_write\_tokens Total number of tokens written to the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### cache\_read\_tokens Total number of tokens read from the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### input\_audio\_tokens Total number of audio input tokens. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### cache\_audio\_read\_tokens Total number of audio tokens read from the cache. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### output\_tokens Total number of output/completion tokens. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0` ##### details Any extra details returned by the model. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `dataclasses.field(default_factory=(dict[str, int]))` #### Methods ##### incr ```python def incr(incr_usage: RunUsage | RequestUsage) -> None ``` Increment the usage in place. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`incr_usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) The usage to increment by. ##### \_\_add\_\_ ```python def __add__(other: RunUsage | RequestUsage) -> RunUsage ``` Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. ###### Returns [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) ### Usage **Bases:** [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) Deprecated alias for `RunUsage`. ### UsageLimits Limits on model usage. The request count is tracked by pydantic\_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. #### Attributes ##### request\_limit The maximum number of requests allowed to the model. 
**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `request_limit` ##### tool\_calls\_limit The maximum number of successful tool calls allowed to be executed. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_calls_limit` ##### input\_tokens\_limit The maximum number of input/prompt tokens allowed. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `input_tokens_limit if input_tokens_limit is not None else request_tokens_limit` ##### output\_tokens\_limit The maximum number of output/response tokens allowed. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_tokens_limit if output_tokens_limit is not None else response_tokens_limit` ##### total\_tokens\_limit The maximum number of tokens allowed in requests and responses combined. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `total_tokens_limit` ##### count\_tokens\_before\_request If True, perform a token counting pass before sending the request to the model, to enforce `input_tokens_limit` ahead of time. This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default. Supported by: - Anthropic - Google - Bedrock Converse Support for OpenAI is in development: [https://github.com/pydantic/pydantic-ai/issues/3430](https://github.com/pydantic/pydantic-ai/issues/3430) **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `count_tokens_before_request` #### Methods ##### has\_token\_limits ```python def has_token_limits() -> bool ``` Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### check\_before\_request ```python def check_before_request(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### check\_tokens ```python def check_tokens(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### check\_before\_tool\_call ```python def check_before_tool_call(projected_usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None)
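For example, a minimal sketch of applying limits to an agent run (the model and prompt are illustrative; exceeding a limit raises `UsageLimitExceeded`):

```python
from pydantic_ai import Agent
from pydantic_ai.usage import UsageLimits

agent = Agent(model='openai:gpt-5.2')

# Allow at most 5 model requests and 10,000 total tokens for this run.
result = agent.run_sync(
    'What is the capital of France?',
    usage_limits=UsageLimits(request_limit=5, total_tokens_limit=10_000),
)
print(result.usage())
```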
--- # [pydantic_evals.dataset](https://pydantic.dev/docs/ai/api/pydantic_evals/dataset/) # pydantic\_evals.dataset Dataset management for pydantic evals. This module provides functionality for creating, loading, saving, and evaluating datasets of test cases. Each case must have inputs, and can optionally have a name, expected output, metadata, and case-specific evaluators. Datasets can be loaded from and saved to YAML or JSON files, and can be evaluated against a task function to produce an evaluation report. ### Case **Bases:** `Generic[InputsT, OutputT, MetadataT]` A single row of a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset). Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` #### Attributes ##### name Name of the case. This is used to identify the case in the report and can be used to filter cases. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### inputs Inputs to the task. This is the input to the task that will be evaluated. **Type:** `InputsT` **Default:** `inputs` ##### metadata Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. **Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `metadata` ##### expected\_output Expected output of the task. This is the expected output of the task that will be evaluated. **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `expected_output` ##### evaluators Evaluators to be used just on this case. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `list(evaluators)` #### Methods ##### \_\_init\_\_ ```python def __init__( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) ``` Initialize a new test case. ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. **`inputs`** : `InputsT` The inputs to the task being evaluated. **`metadata`** : `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the case, which can be used by evaluators. **`expected_output`** : `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional expected output of the task, used for comparison in evaluators. **`evaluators`** : [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\], ...\] _Default:_ `()` Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. ### Dataset **Bases:** `BaseModel`, `Generic[InputsT, OutputT, MetadataT]` A dataset of test [cases](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case). Datasets allow you to organize a collection of test cases and evaluate them against a task function.
They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( name='uppercase_tests', cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` #### Attributes ##### name Name of the dataset. This will be required in a future version. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### cases List of test cases in the dataset. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Case`\[`InputsT`, `OutputT`, `MetadataT`\]\] ##### evaluators List of evaluators to be used on all cases in the dataset. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `[]` ##### report\_evaluators Evaluators that operate on the full report to produce experiment-wide analyses. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `[]` #### Methods ##### \_\_init\_\_ ```python def __init__( *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), report_evaluators: Sequence[ReportEvaluator[InputsT, OutputT, MetadataT]] = (), ) ``` Initialize a new dataset with test cases and optional evaluators. ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Name for the dataset. Omitting this is deprecated and will raise an error in a future version. **`cases`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Case`\[`InputsT`, `OutputT`, `MetadataT`\]\] Sequence of test cases to include in the dataset. **`evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] _Default:_ `()` Optional sequence of evaluators to apply to all cases in the dataset. **`report_evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] _Default:_ `()` Optional sequence of report evaluators that run on the full evaluation report.
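The `evaluate` and `evaluate_sync` methods documented below run the task against every case. As a minimal sketch, reusing the `dataset` and `uppercase` task from the example above (the experiment name is illustrative):

```python
# Run each case twice, with at most 4 cases in flight at a time.
report = dataset.evaluate_sync(
    uppercase,
    name='uppercase-experiment',
    max_concurrency=4,
    repeat=2,
)
report.print()
```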
##### evaluate `@async` ```python def evaluate( task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, task_name: str | None = None, metadata: dict[str, Any] | None = None, repeat: int = 1, lifecycle: type[CaseLifecycle[InputsT, OutputT, MetadataT]] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. ###### Returns `EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\] -- A report containing the results of the evaluation. ###### Parameters **`task`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], `OutputT`\] The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the experiment being run; this is used to identify the experiment in the report. If omitted, the task\_name will be used; if that is not specified, the name of the task function is used. **`max_concurrency`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. **`progress`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to show a progress bar for the evaluation. Defaults to `True`. **`retry_task`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional retry configuration for the task execution. **`retry_evaluators`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional retry configuration for evaluator execution. **`task_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional override for the name of the task being executed; otherwise the name of the task function will be used. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional dict of experiment metadata. **`repeat`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1` Number of times to run each case. When > 1, each case is run multiple times and results are grouped by the original case name for aggregation. Defaults to 1.
**`lifecycle`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`CaseLifecycle`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional lifecycle class for per-case setup, context preparation, and teardown hooks. A new instance is created for each case. See [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle). ##### evaluate\_sync ```python def evaluate_sync( task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, task_name: str | None = None, metadata: dict[str, Any] | None = None, repeat: int = 1, lifecycle: type[CaseLifecycle[InputsT, OutputT, MetadataT]] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) provided for convenience. ###### Returns `EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\] -- A report containing the results of the evaluation. ###### Parameters **`task`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], `OutputT`\] The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The name of the experiment being run; this is used to identify the experiment in the report. If omitted, the task\_name will be used; if that is not specified, the name of the task function is used. **`max_concurrency`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. **`progress`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to show a progress bar for the evaluation. Defaults to `True`. **`retry_task`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional retry configuration for the task execution. **`retry_evaluators`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional retry configuration for evaluator execution. **`task_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional override for the name of the task being executed; otherwise the name of the task function will be used. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional dict of experiment metadata.
**`repeat`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1` Number of times to run each case. When > 1, each case is run multiple times and results are grouped by the original case name for aggregation. Defaults to 1. **`lifecycle`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`CaseLifecycle`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional lifecycle class for per-case setup, context preparation, and teardown hooks. A new instance is created for each case. See [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle). ##### add\_case ```python def add_case( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None ``` Adds a case to the dataset. This is a convenience method for creating a [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) and adding it to the dataset. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for the case. If not provided, a generic name will be assigned. **`inputs`** : `InputsT` The inputs to the task being evaluated. **`metadata`** : `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata for the case, which can be used by evaluators. **`expected_output`** : `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The expected output of the task, used for comparison in evaluators. **`evaluators`** : [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\], ...\] _Default:_ `()` Tuple of evaluators specific to this case, in addition to dataset-level evaluators. ##### add\_evaluator ```python def add_evaluator( evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None ``` Adds an evaluator to the dataset or a specific case. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`evaluator`** : `Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\] The evaluator to add. **`specific_case`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. ###### Raises - `ValueError` -- If `specific_case` is provided but no case with that name exists in the dataset. ##### from\_file `@classmethod` ```python def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self ``` Load a dataset from a file. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance loaded from the file. ###### Parameters **`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) Path to the file to load.
**`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. **`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones. ###### Raises - `ValidationError` -- If the file cannot be parsed as a valid dataset. - `ValueError` -- If the format cannot be inferred from the file extension. ##### from\_text `@classmethod` ```python def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (), default_name: str | None = None, ) -> Self ``` Load a dataset from a string. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance parsed from the string. ###### Parameters **`contents`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The string content to parse. **`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] _Default:_ `'yaml'` Format of the content. Must be either 'yaml' or 'json'. **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. **`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones. **`default_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default name of the dataset, to be used if not specified in the serialized contents. ###### Raises - `ValidationError` -- If the content cannot be parsed as a valid dataset. ##### from\_dict `@classmethod` ```python def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (), default_name: str | None = None, ) -> Self ``` Load a dataset from a dictionary. 
###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance created from the dictionary. ###### Parameters **`data`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] Dictionary representation of the dataset. **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. **`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones. **`default_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Default name of the dataset, to be used if not specified in the data. ###### Raises - `ValidationError` -- If the dictionary cannot be converted to a valid dataset. ##### to\_file ```python def to_file( path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (), ) ``` Save the dataset to a file. ###### Parameters **`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) Path to save the dataset to. **`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. **`schema_path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `DEFAULT_SCHEMA_PATH_TEMPLATE` Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom evaluator classes to include in the schema. **`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom report evaluator classes to include in the schema. 
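As a minimal sketch of the serialization round trip (file names are illustrative):

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='my_dataset',
    cases=[Case(name='one', inputs={'x': 1}, expected_output=1)],
)

# Writes ./my_cases.yaml, plus ./my_cases_schema.json
# per the default schema path template.
dataset.to_file('my_cases.yaml')

# Load it back; the format is inferred from the '.yaml' extension.
loaded = Dataset.from_file('my_cases.yaml')
```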
##### model\_json\_schema\_with\_evaluators `@classmethod` ```python def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any] ``` Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. ###### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- A dictionary representing the JSON schema. ###### Parameters **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom evaluator classes to include in the schema. **`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Custom report evaluator classes to include in the schema. ### set\_eval\_attribute ```python def set_eval_attribute(name: str, value: Any) -> None ``` Set an attribute on the current task run. #### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the attribute. **`value`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) The value of the attribute. ### increment\_eval\_metric ```python def increment_eval_metric(name: str, amount: int | float) -> None ``` Increment a metric on the current task run. #### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the metric. **`amount`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float) The amount to increment by.
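For example, a minimal sketch of recording telemetry from inside a task (the task and the attribute/metric names are illustrative); the recorded values surface as the `attributes` and `metrics` described on `EvaluatorContext` below:

```python
from pydantic_evals.dataset import increment_eval_metric, set_eval_attribute

async def classify(inputs: dict) -> str:
    # Recorded on the current task run; evaluators can read these
    # via ctx.attributes and ctx.metrics.
    set_eval_attribute('prompt_variant', 'v2')
    increment_eval_metric('characters_processed', len(str(inputs)))
    return 'positive'
```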
### InputsT Generic type for the inputs to the task being evaluated. **Default:** `TypeVar('InputsT', default=Any)` ### OutputT Generic type for the expected output of the task being evaluated. **Default:** `TypeVar('OutputT', default=Any)` ### MetadataT Generic type for the metadata associated with the task being evaluated. **Default:** `TypeVar('MetadataT', default=Any)` ### DEFAULT\_DATASET\_PATH Default path for saving/loading datasets. **Default:** `'./test_cases.yaml'` ### DEFAULT\_SCHEMA\_PATH\_TEMPLATE Default template for schema file paths, where {stem} is replaced with the dataset filename stem. **Default:** `'./{stem}_schema.json'` --- # [pydantic_evals.evaluators](https://pydantic.dev/docs/ai/api/pydantic_evals/evaluators/) # pydantic\_evals.evaluators ### ReportEvaluatorContext **Bases:** `Generic[InputsT, OutputT, MetadataT]` Context for report-level evaluation, containing the full experiment results. #### Attributes ##### name The experiment name. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### report The full evaluation report. **Type:** `EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\] ##### experiment\_metadata Experiment-level metadata. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ### Equals **Bases:** `Evaluator[object, object, object]` Check if the output exactly equals the provided value. ### EvaluatorContext **Bases:** `Generic[InputsT, OutputT, MetadataT]` Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` #### Attributes ##### name The name of the case. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### inputs The inputs provided to the task for this case. **Type:** `InputsT` ##### metadata Metadata associated with the case, if provided. May be None if no metadata was specified. **Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### expected\_output The expected output for the case, if provided. May be None if no expected output was specified. **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### output The actual output produced by the task for this case. **Type:** `OutputT` ##### duration The duration of the task run for this case. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### attributes Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### metrics Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float)\] ##### span\_tree Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. **Type:** `SpanTree` ### EvaluationReason The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. #### Constructor Parameters **`value`** : `EvaluationScalar` The scalar result of the evaluation (boolean, integer, float, or string).
**`reason`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` An optional explanation of the evaluation result. ### ReportEvaluator **Bases:** `BaseEvaluator`, `Generic[InputsT, OutputT, MetadataT]` Base class for experiment-wide evaluators that analyze full reports. Unlike case-level Evaluators which assess individual task outputs, ReportEvaluators see all case results together and produce experiment-wide analyses like confusion matrices, precision-recall curves, or scalar statistics. #### Methods ##### evaluate `@abstractmethod` ```python def evaluate( ctx: ReportEvaluatorContext[InputsT, OutputT, MetadataT], ) -> ReportAnalysis | list[ReportAnalysis] | Awaitable[ReportAnalysis | list[ReportAnalysis]] ``` Evaluate the full report and return experiment-wide analysis/analyses. ###### Returns `ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\] | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\]\] ##### evaluate\_async `@async` ```python def evaluate_async( ctx: ReportEvaluatorContext[InputsT, OutputT, MetadataT], ) -> ReportAnalysis | list[ReportAnalysis] ``` Evaluate, handling both sync and async implementations. ###### Returns `ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\] ### EqualsExpected **Bases:** `Evaluator[object, object, object]` Check if the output exactly equals the expected output. ### EvaluationResult **Bases:** `Generic[EvaluationScalarT]` The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. #### Constructor Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The name of the evaluation. **`value`** : `EvaluationScalarT` The scalar result of the evaluation. **`reason`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) An optional explanation of the evaluation result. **`source`** : `EvaluatorSpec` The spec of the evaluator that produced this result. #### Methods ##### downcast ```python def downcast(*value_types: type[T]) -> EvaluationResult[T] | None ``` Attempt to downcast this result to a more specific type. ###### Returns `EvaluationResult`\[`T`\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- A downcast version of this result if the value is an instance of one of the given types, otherwise `None`. ###### Parameters **`*value_types`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`T`\] _Default:_ `()` The types to check the value against. ### Contains **Bases:** `Evaluator[object, object, object]` Check if the output contains the expected output. For strings, checks if expected\_output is a substring of output. For lists/tuples, checks if expected\_output is in output. For dicts, checks if all key-value pairs in expected\_output are in output. For model-like types (BaseModel, dataclasses), converts to a dict and checks key-value pairs. Note: case\_sensitive only applies when both the value and output are strings. ### EvaluatorFailure Represents a failure raised during the execution of an evaluator.
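As a concrete illustration of the built-in `Contains` evaluator described above, a minimal sketch (the `value` and `case_sensitive` field names are assumptions based on the description above):

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Contains

dataset = Dataset(
    name='contains_demo',
    cases=[Case(name='greeting', inputs='greet the user')],
    # Passes if the task output contains 'hello', ignoring case.
    evaluators=[Contains(value='hello', case_sensitive=False)],
)
```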
### ConfusionMatrixEvaluator **Bases:** `ReportEvaluator` Computes a confusion matrix from case data. ### Evaluator **Bases:** `BaseEvaluator`, `Generic[InputsT, OutputT, MetadataT]` Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` #### Methods ##### name `@classmethod` `@deprecated` ```python def name(cls) -> str ``` `name` has been renamed; use `get_serialization_name` instead. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### get\_default\_evaluation\_name ```python def get_default_evaluation_name() -> str ``` Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### evaluate `@abstractmethod` ```python def evaluate( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput | Awaitable[EvaluatorOutput] ``` Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable\[EvaluatorOutput\]. ###### Returns `EvaluatorOutput` | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`EvaluatorOutput`\] -- The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. ###### Parameters **`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\] The context containing the inputs, outputs, and metadata for evaluation. ##### evaluate\_sync ```python def evaluate_sync(ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput ``` Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run\_until\_complete. ###### Returns `EvaluatorOutput` -- The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. ###### Parameters **`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\] The context containing the inputs, outputs, and metadata for evaluation.
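An async implementation is equally valid and is awaited by `evaluate_async` below; a minimal sketch (the evaluator name and its threshold field are illustrative) that returns an `EvaluationReason` to attach an explanation to the result:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext

@dataclass
class UnderBudget(Evaluator):
    max_seconds: float = 1.0

    async def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        # ctx.duration is the task run time for this case.
        within_budget = ctx.duration <= self.max_seconds
        return EvaluationReason(
            value=within_budget,
            reason=f'task took {ctx.duration:.3f}s (budget {self.max_seconds}s)',
        )
```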
### IsInstance **Bases:** `Evaluator[object, object, object]` Check if the output is an instance of a type with the given name. ### MaxDuration **Bases:** `Evaluator[object, object, object]` Check if the execution time is under the specified maximum. ### PrecisionRecallEvaluator **Bases:** `ReportEvaluator` Computes a precision-recall curve from case data. Returns both a `PrecisionRecall` chart and a `ScalarResult` with the AUC value. The AUC is computed at full resolution (every unique score threshold) for accuracy, while the chart points are downsampled to `n_thresholds` for display. ### OutputConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Configuration for the score and assertion outputs of the LLMJudge evaluator. ### LLMJudge **Bases:** `Evaluator[object, object, object]` Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-5.2', but can be overridden by calling [`set_default_judge_model`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.llm_as_a_judge.set_default_judge_model). ### ROCAUCEvaluator **Bases:** `ReportEvaluator` Computes an ROC curve and AUC from case data. Returns a `LinePlot` with the ROC curve (plus a dashed random-baseline diagonal) and a `ScalarResult` with the AUC value. ### HasMatchingSpan **Bases:** `Evaluator[object, object, object]` Check if the span tree contains a span that matches the specified query. ### KolmogorovSmirnovEvaluator **Bases:** `ReportEvaluator` Computes a Kolmogorov-Smirnov plot and statistic from case data. Plots the empirical CDFs of the score distribution for positive and negative cases, and computes the KS statistic (maximum vertical distance between the two CDFs). Returns a `LinePlot` with the two CDF curves and a `ScalarResult` with the KS statistic. ### EvaluatorSpec The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: - `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments - `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` - `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` A concrete example of these forms is shown below. **Default:** `NamedSpec` ### EvaluatorOutput Type for the output of an evaluator, which can be a scalar, an EvaluationReason, or a mapping of names to either. **Default:** `EvaluationScalar | EvaluationReason | Mapping[str, EvaluationScalar | EvaluationReason]`
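As referenced above, the `EvaluatorSpec` short forms correspond to values like the following once a YAML or JSON dataset file has been parsed (a sketch; the `rubric` keyword for `LLMJudge` is an assumption based on its documented rubric-based behavior):

```python
# The three serializable short forms for specifying evaluators:
evaluator_specs = [
    'EqualsExpected',                                # __init__ takes no arguments
    {'Contains': 'Paris'},                           # single positional argument
    {'LLMJudge': {'rubric': 'Answer is concise.'}},  # keyword arguments (assumed)
]
```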
### GradingOutput **Bases:** `BaseModel` The output of a grading operation. ### judge\_output `@async` ```python def judge_output( output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. #### Returns `GradingOutput` ### judge\_input\_output `@async` ```python def judge_input_output( inputs: Any, output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. #### Returns `GradingOutput` ### judge\_input\_output\_expected `@async` ```python def judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs, the expected output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. #### Returns `GradingOutput` ### judge\_output\_expected `@async` ```python def judge_output_expected( output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. #### Returns `GradingOutput` ### set\_default\_judge\_model ```python def set_default_judge_model(model: models.Model | models.KnownModelName) -> None ``` Set the default model used for judging. This model is used if `None` is passed to the `model` argument of the `judge_*` functions above. #### Returns [`None`](https://docs.python.org/3/library/constants.html#None)
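For example, judging a single output against a rubric (a sketch; the import path is assumed from the `pydantic_evals.evaluators.llm_as_a_judge` module referenced by the links above):

```python
import asyncio

# Import path assumed from the module referenced in this documentation.
from pydantic_evals.evaluators.llm_as_a_judge import judge_output


async def main():
    grading = await judge_output(
        output='The capital of France is Paris.',
        rubric='The response names the correct capital city.',
    )
    # GradingOutput describes whether (and why) the rubric was satisfied.
    print(grading)


asyncio.run(main())
```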
--- # [pydantic_evals.generation](https://pydantic.dev/docs/ai/api/pydantic_evals/generation/) # pydantic\_evals.generation Utilities for generating example datasets for pydantic\_evals. This module provides functions for generating sample datasets for testing and examples, using LLMs to create realistic test data with proper structure. ### generate\_dataset `@async` ```python def generate_dataset( dataset_type: type[Dataset[InputsT, OutputT, MetadataT]], path: Path | str | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), model: models.Model | models.KnownModelName = 'openai:gpt-5.2', n_examples: int = 3, extra_instructions: str | None = None, ) -> Dataset[InputsT, OutputT, MetadataT] ``` Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. #### Returns `Dataset`\[`InputsT`, `OutputT`, `MetadataT`\] -- A properly structured Dataset object with generated test cases. #### Parameters **`dataset_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`Dataset`\[`InputsT`, `OutputT`, `MetadataT`\]\] The type of dataset to generate, with the desired input, output, and metadata types. **`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional path to save the generated dataset. If provided, the dataset will be saved to this location. **`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()` Optional sequence of custom evaluator classes to include in the schema. **`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) _Default:_ `'openai:gpt-5.2'` The Pydantic AI model to use for generation. Defaults to 'openai:gpt-5.2'. **`n_examples`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `3` Number of examples to generate. Defaults to 3. **`extra_instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional instructions to provide to the LLM. #### Raises - `ValidationError` -- If the LLM's response cannot be parsed as a valid dataset. ### InputsT Generic type for the inputs to the task being evaluated. **Default:** `TypeVar('InputsT', default=Any)` ### OutputT Generic type for the expected output of the task being evaluated. **Default:** `TypeVar('OutputT', default=Any)` ### MetadataT Generic type for the metadata associated with the task being evaluated. **Default:** `TypeVar('MetadataT', default=Any)` --- # [pydantic_evals.lifecycle](https://pydantic.dev/docs/ai/api/pydantic_evals/lifecycle/) # pydantic\_evals.lifecycle Case lifecycle hooks for pydantic evals. This module provides the [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) class, which allows defining setup, context preparation, and teardown hooks that run at different stages of case evaluation. ### CaseLifecycle **Bases:** `Generic[InputsT, OutputT, MetadataT]` Per-case lifecycle hooks for evaluation. A new instance is created for each case during evaluation. Subclass and override any methods you need -- all methods are no-ops by default. The evaluation flow for each case is: 1. `setup()` -- called before task execution 2. Task runs 3. `prepare_context()` -- called after task, before evaluators; can enrich metrics/attributes 4. Evaluators run 5. `teardown()` -- called after evaluators complete; receives the full result Exceptions raised by `setup()` or `prepare_context()` are caught and recorded as a `ReportCaseFailure`; `teardown()` is still called afterward so you can clean up. Exceptions raised by `teardown()` propagate to the caller and may abort the evaluation. If your teardown may raise and you don't want it to crash the evaluation run, handle exceptions within your `teardown()` implementation itself.
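Example (a minimal sketch; `create_test_database` is a hypothetical helper, and the mutable `ctx.attributes` enrichment is an assumption based on the `prepare_context` description below):

```python
from pydantic_evals.lifecycle import CaseLifecycle


class DatabaseLifecycle(CaseLifecycle):
    async def setup(self) -> None:
        # Provision a per-case resource before the task runs.
        # create_test_database is a hypothetical helper for this sketch.
        self.db = await create_test_database(self.case.name)

    async def prepare_context(self, ctx):
        # Enrich the evaluator context before evaluators run
        # (assumes the context exposes a mutable `attributes` mapping).
        ctx.attributes['db_rows'] = await self.db.row_count()
        return ctx

    async def teardown(self, result) -> None:
        # Always called, even if setup() or prepare_context() raised.
        db = getattr(self, 'db', None)
        if db is not None:
            await db.drop()
```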
#### Constructor Parameters **`case`** : `Case`\[`InputsT`, `OutputT`, `MetadataT`\] The case being evaluated. Available as `self.case` in all hooks. #### Attributes ##### case The case being evaluated. **Type:** `Case`\[`InputsT`, `OutputT`, `MetadataT`\] #### Methods ##### setup `@async` ```python def setup() -> None ``` Called before task execution. Override to perform per-case resource setup (e.g., create a test database, start a service). The case metadata is available via `self.case.metadata`. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### prepare\_context `@async` ```python def prepare_context( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorContext[InputsT, OutputT, MetadataT] ``` Called after the task completes, before evaluators run. Override to enrich the evaluator context with additional metrics or attributes derived from the task output, span tree, or external state. ###### Returns `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\] -- The (possibly modified) evaluator context to pass to evaluators. ###### Parameters **`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\] The evaluator context produced by the task run. ##### teardown `@async` ```python def teardown( result: ReportCase[InputsT, OutputT, MetadataT] | ReportCaseFailure[InputsT, OutputT, MetadataT], ) -> None ``` Called after evaluators complete. Override to perform per-case resource cleanup. The result is provided so that teardown logic can vary based on success/failure (e.g., keep resources up for inspection on failure). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`result`** : `ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\] | `ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\] The evaluation result -- either a `ReportCase` (success) or `ReportCaseFailure`. --- # [pydantic_evals.online](https://pydantic.dev/docs/ai/api/pydantic_evals/online/) # pydantic\_evals.online Online evaluation -- attach evaluators to live functions for automatic background evaluation. This module provides the infrastructure for running evaluators on production (or staging) traffic. The same `Evaluator` instances used with `Dataset.evaluate()` work here, the difference is in how they are wired up (decorator vs dataset) rather than what they are. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import evaluate @dataclass class IsNonEmpty(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) @evaluate(IsNonEmpty()) async def my_function(x: int) -> int: return x ``` ### SamplingContext Context available when deciding whether to sample an evaluator. Contains the information available _before_ the decorated function runs -- the evaluator instance, function inputs, config metadata, and a per-call random seed. The function's output and duration are not yet available at sampling time. #### Attributes ##### evaluator The evaluator being sampled. **Type:** `Evaluator` ##### inputs The inputs to the decorated function. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### metadata Metadata from the [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig), if set. 
**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### call\_seed A uniform random value in \[0, 1) generated once per decorated function call. Shared across all evaluators for the same call. In `'correlated'` sampling mode this is used automatically; in `'independent'` mode it is available for custom `sample_rate` callables that want to implement their own correlated logic. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ### SpanReference Identifies a span that evaluation results should be associated with. Used by sinks to associate evaluation results with the original function execution span. #### Attributes ##### trace\_id The trace ID of the span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### span\_id The span ID of the span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### EvaluationSink **Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) Protocol for evaluation result destinations. Implementations receive evaluation results and can send them to any backend (Logfire annotations, custom callback, stdout, etc.). #### Methods ##### submit `@async` ```python def submit( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, span_reference: SpanReference | None, ) -> None ``` Submit evaluation results to the sink. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`results`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationResult`\] Evaluation results from successful evaluator runs. **`failures`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluatorFailure`\] Failures from evaluator runs that raised exceptions. **`context`** : `EvaluatorContext` The full evaluator context for the function call. **`span_reference`** : `SpanReference` | [`None`](https://docs.python.org/3/library/constants.html#None) Reference to the OTel span for the function call, if available. ### CallbackSink An `EvaluationSink` that delegates to a user-provided callable. The callback receives the results, failures, and context. The span\_reference is not passed to the callback -- use a custom `EvaluationSink` implementation if you need it. ### OnlineEvaluator Wraps an `Evaluator` with per-evaluator online configuration. Different evaluators often need different settings -- a cheap heuristic should run on 100% of traffic while an expensive LLM judge might run on only 1%. #### Attributes ##### evaluator The evaluator to run. **Type:** `Evaluator` ##### sample\_rate Probability of running this evaluator (0.0-1.0), or a callable returning a float or bool. When a callable, it receives a [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) with the function inputs, config metadata, and evaluator name -- but not the output or duration (which aren't available yet at sampling time). Defaults to `None`, which uses the config's `default_sample_rate` at each call. Set explicitly to override. 
**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### max\_concurrency Maximum number of concurrent evaluations for this evaluator. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `10` ##### sink Override sink(s) for this evaluator. If `None`, the config's `default_sink` is used. **Type:** `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_max\_concurrency Called when an evaluation is dropped because `max_concurrency` was reached. Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async. If `None` (the default), dropped evaluations are silently ignored. **Type:** `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_sampling\_error Called synchronously when a `sample_rate` callable raises an exception. Receives the exception and the evaluator. Must be sync (not async), since sampling runs before the decorated function. If set, the evaluator is skipped. If `None`, uses the config's `on_sampling_error` default. If neither is set, the exception propagates to the caller. **Type:** `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_error Called when an exception occurs in a sink or on\_max\_concurrency callback. Receives the exception, evaluator context, evaluator instance, and a location string (`'sink'` or `'on_max_concurrency'`). Can be sync or async. If `None`, uses the config's `on_error` default. If neither is set, exceptions are silently suppressed. **Type:** `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### EvaluatorContextSource **Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) Protocol for retrieving stored evaluator contexts. Implementations reconstruct [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) objects from stored traces (e.g., Logfire). The batch method allows fetching contexts for multiple spans in a single call. #### Methods ##### fetch `@async` ```python def fetch(span: SpanReference) -> EvaluatorContext ``` Fetch an evaluator context for a single span. ###### Returns `EvaluatorContext` -- The evaluator context for the span. ###### Parameters **`span`** : `SpanReference` Reference to the span to fetch context for. ##### fetch\_many `@async` ```python def fetch_many(spans: Sequence[SpanReference]) -> list[EvaluatorContext] ``` Fetch evaluator contexts for multiple spans in a single batch. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorContext`\] -- Evaluator contexts in the same order as the input spans. ###### Parameters **`spans`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`SpanReference`\] References to the spans to fetch context for. ### OnlineEvalConfig Holds cross-evaluator defaults for online evaluation. 
Create instances for different evaluation configurations, or use the global `DEFAULT_CONFIG` via the module-level `evaluate()` and `configure()` functions. #### Attributes ##### default\_sink Default sink(s) for evaluators that don't specify their own. **Type:** `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### default\_sample\_rate Default sample rate for evaluators that don't specify their own. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] **Default:** `1.0` ##### sampling\_mode Controls how per-evaluator sample rates interact for a single call. - `'independent'` (default): each evaluator decides independently. - `'correlated'`: a shared random seed is used so that lower-rate evaluators' calls are a subset of higher-rate ones, minimising total overhead. See [`SamplingMode`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingMode) for details. **Type:** `SamplingMode` **Default:** `'independent'` ##### enabled Whether online evaluation is enabled for this config. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### metadata Optional metadata to include in evaluator contexts. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_max\_concurrency Default handler called when an evaluation is dropped because `max_concurrency` was reached. Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async. If `None` (the default), dropped evaluations are silently ignored. Per-evaluator `OnlineEvaluator.on_max_concurrency` overrides this default. **Type:** `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_sampling\_error Default handler called synchronously when a `sample_rate` callable raises. Receives the exception and the evaluator. Must be sync (not async). If set, the evaluator is skipped. If `None` (the default), the exception propagates to the caller. Per-evaluator `OnlineEvaluator.on_sampling_error` overrides this default. **Type:** `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### on\_error Default handler called when an exception occurs in a sink or on\_max\_concurrency callback. Receives the exception, evaluator context, evaluator instance, and a location string (`'sink'` or `'on_max_concurrency'`). Can be sync or async. If `None` (the default), exceptions are silently suppressed. Per-evaluator `OnlineEvaluator.on_error` overrides this default. **Type:** `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### evaluate ```python def evaluate( *evaluators: Evaluator | OnlineEvaluator, ) -> Callable[[Callable[_P, _R]], Callable[_P, _R]] ``` Decorator to attach online evaluators to a function.
Bare `Evaluator` instances are auto-wrapped in `OnlineEvaluator` at decoration time (so concurrency semaphores are shared across calls). Their `sample_rate` defaults to `None`, which resolves to the config's `default_sample_rate` at each call -- so changes to the config after decoration take effect. ###### Returns [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\], [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\] -- A decorator that wraps the function with online evaluation. ###### Parameters **`*evaluators`** : `Evaluator` | `OnlineEvaluator` _Default:_ `()` Evaluators to attach. Can be `Evaluator` or `OnlineEvaluator` instances. ### disable\_evaluation ```python def disable_evaluation() -> Iterator[None] ``` Context manager to disable all online evaluation in the current context. When active, decorated functions still execute normally but no evaluators are dispatched. #### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\] ### run\_evaluators `@async` ```python def run_evaluators( evaluators: Sequence[Evaluator], context: EvaluatorContext, ) -> tuple[list[EvaluationResult], list[EvaluatorFailure]] ``` Run evaluators on a context and return results. Useful for re-running evaluators from stored data. #### Returns [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluationResult`\], [`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorFailure`\]\] -- A tuple of (results, failures). #### Parameters **`evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Evaluator`\] The evaluators to run. **`context`** : `EvaluatorContext` The evaluator context to evaluate against. ### evaluate ```python def evaluate( *evaluators: Evaluator | OnlineEvaluator, ) -> Callable[[Callable[_P, _R]], Callable[_P, _R]] ``` Decorator to attach online evaluators to a function using the global default config. Equivalent to `DEFAULT_CONFIG.evaluate(...)`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import evaluate @dataclass class IsNonEmpty(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) @evaluate(IsNonEmpty()) async def my_function(x: int) -> int: return x ``` #### Returns [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\], [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\] -- A decorator that wraps the function with online evaluation. #### Parameters **`*evaluators`** : `Evaluator` | `OnlineEvaluator` _Default:_ `()` Evaluators to attach. Can be `Evaluator` or `OnlineEvaluator` instances.
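Per-evaluator settings come from wrapping evaluators in `OnlineEvaluator` rather than passing them bare (a sketch, assuming `OnlineEvaluator`'s documented attributes are accepted as keyword arguments):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator, evaluate


@dataclass
class IsNonEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


@dataclass
class IsShort(Evaluator):
    # Stand-in for a more expensive check, e.g. an LLM judge.
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) < 500


@evaluate(
    IsNonEmpty(),  # bare: resolves to the config's default_sample_rate
    # Keyword arguments assumed to match the documented attribute names.
    OnlineEvaluator(evaluator=IsShort(), sample_rate=0.01, max_concurrency=2),
)
async def my_function(x: int) -> int:
    return x
```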
### configure ```python def configure( default_sink: EvaluationSink | Sequence[EvaluationSink | SinkCallback] | SinkCallback | None | Unset = UNSET, default_sample_rate: float | Callable[[SamplingContext], float | bool] | Unset = UNSET, sampling_mode: SamplingMode | Unset = UNSET, enabled: bool | Unset = UNSET, metadata: dict[str, Any] | None | Unset = UNSET, on_max_concurrency: OnMaxConcurrencyCallback | None | Unset = UNSET, on_sampling_error: OnSamplingErrorCallback | None | Unset = UNSET, on_error: OnErrorCallback | None | Unset = UNSET, ) -> None ``` Configure the global default `OnlineEvalConfig`. Only provided values are updated; unset arguments are ignored. Pass `None` explicitly to clear `default_sink`, `metadata`, `on_max_concurrency`, `on_sampling_error`, or `on_error`. #### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Parameters **`default_sink`** : `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET` Default sink(s) for evaluators. Pass `None` to clear. **`default_sample_rate`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | `Unset` _Default:_ `UNSET` Default sample rate for evaluators. **`sampling_mode`** : `SamplingMode` | `Unset` _Default:_ `UNSET` Sampling mode (`'independent'` or `'correlated'`). **`enabled`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | `Unset` _Default:_ `UNSET` Whether online evaluation is enabled. **`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET` Metadata to include in evaluator contexts. Pass `None` to clear. **`on_max_concurrency`** : `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET` Default handler for dropped evaluations. Pass `None` to clear. **`on_sampling_error`** : `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET` Default handler for sample\_rate exceptions. Pass `None` to clear. **`on_error`** : `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET` Default handler for pipeline exceptions. Pass `None` to clear. ### wait\_for\_evaluations `@async` ```python def wait_for_evaluations(timeout: float = 30.0) -> None ``` Wait for all pending background evaluation tasks and threads to complete. This is useful in tests to deterministically wait for background evaluators to finish instead of relying on timing-based sleeps. For async decorated functions, evaluators run as tasks on the caller's event loop and are awaited directly. For sync decorated functions, evaluators run in background threads which are joined with the given timeout. 
#### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Parameters **`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `30.0` Maximum seconds to wait for each background thread. Defaults to 30. ### OnErrorLocation The location within the online evaluation pipeline where an error occurred. **Default:** `Literal['sink', 'on_max_concurrency']` ### SamplingMode Controls how per-evaluator sample rates interact across evaluators for a single call. - `'independent'` (default): Each evaluator flips its own coin. With N evaluators each at rate _r_, the probability of _any_ evaluation overhead is `1 − (1−r)^N`. - `'correlated'`: A single random seed is generated per call and shared across evaluators. An evaluator runs when `call_seed < rate`, so lower-rate evaluators' calls are always a subset of higher-rate ones. The probability of _any_ overhead equals `max(rate_i)`. **Default:** `Literal['independent', 'correlated']` ### OnMaxConcurrencyCallback Callback invoked when an evaluation is dropped due to concurrency limits. Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async. **Default:** `Callable[[EvaluatorContext], None | Awaitable[None]]` ### OnSamplingErrorCallback Callback invoked when a `sample_rate` callable raises an exception. Called synchronously before the decorated function runs. Receives the exception and the evaluator whose `sample_rate` failed. Must be sync (not async). If set, the evaluator is skipped. If not set, the exception propagates to the caller. **Default:** `Callable[[Exception, Evaluator], None]` ### OnErrorCallback Callback invoked when an exception occurs in the online evaluation pipeline. Receives the exception, the evaluator context, the evaluator instance, and a location string indicating where the error occurred. Can be sync or async. **Default:** `Callable[[Exception, EvaluatorContext, Evaluator, OnErrorLocation], None | Awaitable[None]]` ### SinkCallback Type alias for bare callables accepted wherever an `EvaluationSink` is expected. Auto-wrapped in `CallbackSink` when passed as a `sink` parameter. **Default:** `Callable[[Sequence[EvaluationResult], Sequence[EvaluatorFailure], EvaluatorContext], None | Awaitable[None]]` ### DEFAULT\_CONFIG The global default `OnlineEvalConfig` instance. Module-level functions like `evaluate()` and `configure()` delegate to this instance. **Default:** `OnlineEvalConfig()` --- # [pydantic_evals.otel](https://pydantic.dev/docs/ai/api/pydantic_evals/otel/) # pydantic\_evals.otel ### SpanQuery **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. #### Attributes ##### stop\_recursing\_when If present, stop recursing through ancestors or descendants at nodes that match this condition. **Type:** `SpanQuery` ### SpanNode A node in the span tree; provides references to parents/children for easy traversal and queries. #### Attributes ##### duration Return the span's duration as a timedelta, or None if start/end not set. **Type:** `timedelta` ##### descendants Return all descendants of this node in DFS order. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] ##### ancestors Return all ancestors of this node. 
**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] #### Methods ##### add\_child ```python def add_child(child: SpanNode) -> None ``` Attach a child node to this node's list of children. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### find\_children ```python def find_children(predicate: SpanQuery | SpanPredicate) -> list[SpanNode] ``` Return all immediate children that satisfy the given predicate. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] ##### first\_child ```python def first_child(predicate: SpanQuery | SpanPredicate) -> SpanNode | None ``` Return the first immediate child that satisfies the given predicate, or None if none match. ###### Returns `SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### any\_child ```python def any_child(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if there is at least one child that satisfies the predicate. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### find\_descendants ```python def find_descendants( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> list[SpanNode] ``` Return all descendant nodes that satisfy the given predicate in DFS order. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] ##### first\_descendant ```python def first_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> SpanNode | None ``` Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match. ###### Returns `SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### any\_descendant ```python def any_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> bool ``` Returns `True` if there is at least one descendant that satisfies the predicate. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### find\_ancestors ```python def find_ancestors( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> list[SpanNode] ``` Return all ancestors that satisfy the given predicate. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] ##### first\_ancestor ```python def first_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> SpanNode | None ``` Return the closest ancestor that satisfies the given predicate, or `None` if none match. ###### Returns `SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### any\_ancestor ```python def any_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None, ) -> bool ``` Returns True if any ancestor satisfies the predicate. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### matches ```python def matches(query: SpanQuery | SpanPredicate) -> bool ``` Check if the span node matches the query conditions or predicate. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### repr\_xml ```python def repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the node. Optionally includes children, trace\_id, span\_id, start\_timestamp, and duration. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str)
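For instance, a plain callable can be used anywhere a `SpanQuery` is accepted (a sketch; it assumes `SpanNode` is importable from `pydantic_evals.otel` and exposes the span name as a `name` attribute, neither of which is spelled out in this reference):

```python
from pydantic_evals.otel import SpanNode  # import path assumed


def first_tool_span(node: SpanNode) -> SpanNode | None:
    # Walk the subtree in DFS order, matching with a callable predicate.
    # Assumption: each SpanNode exposes the span name as `.name`.
    return node.first_descendant(lambda child: 'tool' in child.name)
```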
### SpanTree A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). #### Methods ##### add\_spans ```python def add_spans(spans: list[SpanNode]) -> None ``` Add a list of spans to the tree, rebuilding the tree structure. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### find ```python def find(predicate: SpanQuery | SpanPredicate) -> list[SpanNode] ``` Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\] ##### first ```python def first(predicate: SpanQuery | SpanPredicate) -> SpanNode | None ``` Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found. ###### Returns `SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### any ```python def any(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if any node in the tree matches the predicate. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### \_\_iter\_\_ ```python def __iter__() -> Iterator[SpanNode] ``` Return an iterator over all nodes in the tree. ###### Returns [`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[`SpanNode`\] ##### repr\_xml ```python def repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the tree, optionally including children, trace\_id, span\_id, duration, and timestamps. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) --- # [pydantic_evals.reporting](https://pydantic.dev/docs/ai/api/pydantic_evals/reporting/) # pydantic\_evals.reporting ### ConfusionMatrix **Bases:** `BaseModel` A confusion matrix comparing expected vs predicted labels across cases. #### Attributes ##### class\_labels Ordered list of class labels (used for both axes). **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### matrix `matrix[expected_idx][predicted_idx]` = count of cases with the corresponding expected and predicted labels. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`int`](https://docs.python.org/3/library/functions.html#int)\]\] ### PrecisionRecallPoint **Bases:** `BaseModel` A single point on a precision-recall curve. ### PrecisionRecallCurve **Bases:** `BaseModel` A single precision-recall curve. #### Attributes ##### name Name of this curve (e.g., experiment name or evaluator name). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### points Points on the curve, ordered by threshold.
**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`PrecisionRecallPoint`\] ##### auc Area under the precision-recall curve. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### PrecisionRecall **Bases:** `BaseModel` Precision-recall curve data across cases. #### Attributes ##### curves One or more curves. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`PrecisionRecallCurve`\] ### ScalarResult **Bases:** `BaseModel` A single scalar statistic (e.g., F1 score, accuracy, BLEU). #### Attributes ##### unit Optional unit label (e.g., '%', 'ms'). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### ReportCase **Bases:** `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report. #### Attributes ##### name The name of the [case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### inputs The inputs to the task, from [`Case.inputs`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.inputs). **Type:** `InputsT` ##### metadata Any metadata associated with the case, from [`Case.metadata`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.metadata). **Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### expected\_output The expected output of the task, from [`Case.expected_output`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.expected_output). **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### output The output of the task execution. **Type:** `OutputT` ##### source\_case\_name The original case name before run-indexing. Serves as the aggregation key for multi-run experiments. None when repeat == 1. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### trace\_id The trace ID of the case span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### span\_id The span ID of the case span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### TableResult **Bases:** `BaseModel` A generic table of data (fallback for custom analyses). #### Attributes ##### columns Column headers. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### rows Row data, one list per row. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)\]\] ### LinePlot **Bases:** `BaseModel` A generic XY line plot with labeled axes, supporting multiple curves. 
Use this for ROC curves, KS plots, calibration curves, or any custom line chart that doesn't fit the specific PrecisionRecall type. #### Attributes ##### x\_label Label for the x-axis. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### y\_label Label for the y-axis. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### x\_range Optional fixed range for x-axis (min, max). **Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`float`](https://docs.python.org/3/library/functions.html#float), [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### y\_range Optional fixed range for y-axis (min, max). **Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`float`](https://docs.python.org/3/library/functions.html#float), [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### curves One or more curves to plot. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`LinePlotCurve`\] ### ReportCaseFailure **Bases:** `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report that failed due to an error during task execution. #### Attributes ##### name The name of the [case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### inputs The inputs to the task, from [`Case.inputs`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.inputs). **Type:** `InputsT` ##### metadata Any metadata associated with the case, from [`Case.metadata`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.metadata). **Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### expected\_output The expected output of the task, from [`Case.expected_output`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.expected_output). **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### error\_message The message of the exception that caused the failure. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### error\_stacktrace The stacktrace of the exception that caused the failure. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### source\_case\_name The original case name before run-indexing. Serves as the aggregation key for multi-run experiments. None when repeat == 1. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### trace\_id The trace ID of the case span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### span\_id The span ID of the case span. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### ReportCaseGroup **Bases:** `Generic[InputsT, OutputT, MetadataT]` Grouped results from running the same case multiple times. This is a computed view, not stored data. Obtain via `EvaluationReport.case_groups()`. #### Attributes ##### name The original case name (shared across all runs). 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### inputs The inputs (same for all runs). **Type:** `InputsT` ##### metadata The metadata (same for all runs). **Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### expected\_output The expected output (same for all runs). **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) ##### runs Individual run results. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\]\] ##### failures Runs that failed with exceptions. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\]\] ##### summary Aggregated statistics across runs. **Type:** `ReportCaseAggregate` ### ReportCaseAggregate **Bases:** `BaseModel` A synthetic case that summarizes a set of cases. #### Methods ##### average `@staticmethod` ```python def average(cases: list[ReportCase]) -> ReportCaseAggregate ``` Produce a synthetic "summary" case by averaging quantitative attributes. ###### Returns `ReportCaseAggregate` ##### average\_from\_aggregates `@staticmethod` ```python def average_from_aggregates( aggregates: list[ReportCaseAggregate], ) -> ReportCaseAggregate ``` Average across multiple aggregates (used for multi-run experiment summaries). ###### Returns `ReportCaseAggregate` ### EvaluationReport **Bases:** `Generic[InputsT, OutputT, MetadataT]` A report of the results of evaluating a model on a set of cases. #### Attributes ##### name The name of the report. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### cases The cases in the report. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\]\] ##### failures The failures in the report. These are cases where task execution raised an exception. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `field(default_factory=(list[ReportCaseFailure[InputsT, OutputT, MetadataT]]))` ##### analyses Experiment-wide analyses produced by report evaluators. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\] **Default:** `field(default_factory=(list[ReportAnalysis]))` ##### report\_evaluator\_failures Failures from report evaluators that raised exceptions. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorFailure`\] **Default:** `field(default_factory=(list[EvaluatorFailure]))` ##### experiment\_metadata Metadata associated with the specific experiment represented by this report. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### trace\_id The trace ID of the evaluation. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### span\_id The span ID of the evaluation. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` #### Methods ##### case\_groups ```python def case_groups() -> list[ReportCaseGroup[InputsT, OutputT, MetadataT]] | None ``` Group cases by source\_case\_name and compute per-group aggregates. Returns None if no cases have source\_case\_name set (i.e., single-run experiment). ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCaseGroup`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### render ```python def render( width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, include_analyses: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> str ``` Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### print ```python def print( width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, include_analyses: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> None ``` Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### console\_table ```python def console_table( baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, with_title: bool = True, ) -> Table ``` Return a table containing the data from this report. If a baseline is provided, returns a diff between this report and the baseline report. Optionally include input and output details. ###### Returns `Table` ##### failures\_table ```python def failures_table( include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table ``` Return a table containing the failures in this report. ###### Returns `Table` ##### \_\_str\_\_ ```python def __str__() -> str ``` Return a string representation of the report. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### RenderValueConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A configuration for rendering values in an Evaluation report. ### RenderNumberConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. #### Attributes ##### value\_formatter The logic to use for formatting values. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. - You can also use a custom string format spec, e.g. '{:.3f}' - You can also use a custom function, e.g. lambda x: f'{x:.3f}' **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### diff\_formatter The logic to use for formatting details about the diff. The strings produced by the value\_formatter will always be included in the reports, but the diff\_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. - You can also use a custom string format spec, e.g. '{:+.3f}' - You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. - You can also use None to never generate extra diff text. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int), [`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### diff\_atol The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) >= self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### diff\_rtol The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### diff\_increase\_style The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### diff\_decrease\_style The style to apply to diffed values that have a significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)
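These configs are passed per evaluation name, e.g. via the `score_configs` and `metric_configs` parameters of `EvaluationReport.print` and `render` (a sketch; `report` stands for an `EvaluationReport` produced by a prior evaluation):

```python
# `report` is assumed to be an EvaluationReport from a previous evaluation run.
# Render the 'accuracy' score with three decimal places and treat changes
# smaller than 0.01 as insignificant in diff output.
report.print(
    score_configs={
        'accuracy': {'value_formatter': '{:.3f}', 'diff_atol': 0.01},
    },
)
```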
If this function returns None, no extra diff text will be added. - You can also use None to never generate extra diff text. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int), [`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### diff\_atol The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) >= self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### diff\_rtol The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) ##### diff\_increase\_style The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### diff\_decrease\_style The style to apply to diffed values that have a significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### ReportCaseRenderer #### Methods ##### build\_base\_table ```python def build_base_table(title: str) -> Table ``` Build and return a Rich Table for the diff output. ###### Returns `Table` ##### build\_failures\_table ```python def build_failures_table(title: str) -> Table ``` Build and return a Rich Table for the failures output. ###### Returns `Table` ##### build\_row ```python def build_row(case: ReportCase) -> list[str] ``` Build a table row for a single case. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### build\_aggregate\_row ```python def build_aggregate_row(aggregate: ReportCaseAggregate) -> list[str] ``` Build a table row for an aggregated case. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### build\_diff\_row ```python def build_diff_row(new_case: ReportCase, baseline: ReportCase) -> list[str] ``` Build a diff table row comparing a case to its baseline.
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### build\_diff\_aggregate\_row ```python def build_diff_aggregate_row( new: ReportCaseAggregate, baseline: ReportCaseAggregate, ) -> list[str] ``` Build a diff table row comparing an aggregate to its baseline. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### build\_failure\_row ```python def build_failure_row(case: ReportCaseFailure) -> list[str] ``` Build a table row for a single case failure. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ### EvaluationRenderer A class for rendering an EvalReport or the diff between two EvalReports. #### Methods ##### build\_table ```python def build_table(report: EvaluationReport, with_title: bool = True) -> Table ``` Build a table for the report. ###### Returns `Table` -- A Rich Table object ###### Parameters **`report`** : `EvaluationReport` The evaluation report to render **`with_title`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the title in the table (default True) ##### build\_diff\_table ```python def build_diff_table( report: EvaluationReport, baseline: EvaluationReport, with_title: bool = True, ) -> Table ``` Build a diff table comparing report to baseline. ###### Returns `Table` -- A Rich Table object ###### Parameters **`report`** : `EvaluationReport` The evaluation report to compare **`baseline`** : `EvaluationReport` The baseline report to compare against **`with_title`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include the title in the table (default True) ### ReportAnalysis Discriminated union of all report-level analysis types. **Default:** `Annotated[ConfusionMatrix | PrecisionRecall | ScalarResult | TableResult | LinePlot, Discriminator('type')]` --- # [pydantic_graph.beta](https://pydantic.dev/docs/ai/api/pydantic_graph/beta/) # pydantic\_graph.beta The next version of the pydantic-graph framework with enhanced graph execution capabilities. This module provides a parallel control flow graph execution framework with support for: - 'Step' nodes for task execution - 'Decision' nodes for conditional branching - 'Fork' nodes for parallel execution coordination - 'Join' nodes and 'Reducer's for re-joining parallel executions - Mermaid diagram generation for graph visualization ### TypeExpression **Bases:** `Generic[T]` A workaround for type checker limitations when using complex type expressions. This class serves as a wrapper for types that cannot normally be used in positions requiring `type[T]`, such as `Any`, `Union[...]`, or `Literal[...]`. It provides a way to pass these complex type expressions to functions expecting concrete types. ### StepContext **Bases:** `Generic[StateT, DepsT, InputT]` Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. #### Attributes ##### inputs The input data for this step. This must be a property to ensure correct variance behavior. **Type:** `InputT` ### StartNode **Bases:** `Generic[OutputT]` Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. #### Attributes ##### id Fixed identifier for the start node.
**Default:** `NodeID('__start__')` ### EndNode **Bases:** `Generic[InputT]` Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. #### Attributes ##### id Fixed identifier for the end node. **Default:** `NodeID('__end__')` ### GraphBuilder **Bases:** `Generic[StateT, DepsT, GraphInputT, GraphOutputT]` A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. #### Attributes ##### name Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### state\_type The type of the graph state. **Type:** `TypeOrTypeExpression`\[`StateT`\] **Default:** `state_type` ##### deps\_type The type of the dependencies. **Type:** `TypeOrTypeExpression`\[`DepsT`\] **Default:** `deps_type` ##### input\_type The type of the graph input data. **Type:** `TypeOrTypeExpression`\[`GraphInputT`\] **Default:** `input_type` ##### output\_type The type of the graph output data. **Type:** `TypeOrTypeExpression`\[`GraphOutputT`\] **Default:** `output_type` ##### auto\_instrument Whether to automatically create instrumentation spans. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `auto_instrument` ##### start\_node Get the start node for the graph. **Type:** `StartNode`\[`GraphInputT`\] ##### end\_node Get the end node for the graph. **Type:** `EndNode`\[`GraphOutputT`\] #### Methods ##### \_\_init\_\_ ```python def __init__( name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ) ``` Initialize a graph builder. ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **`state_type`** : `TypeOrTypeExpression`\[`StateT`\] _Default:_ `NoneType` The type of the graph state **`deps_type`** : `TypeOrTypeExpression`\[`DepsT`\] _Default:_ `NoneType` The type of the dependencies **`input_type`** : `TypeOrTypeExpression`\[`GraphInputT`\] _Default:_ `NoneType` The type of the graph input data **`output_type`** : `TypeOrTypeExpression`\[`GraphOutputT`\] _Default:_ `NoneType` The type of the graph output data **`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to automatically create instrumentation spans ##### step ```python def step( node_id: str | None = None, label: str | None = None, ) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] def step( call: StepFunction[StateT, DepsT, InputT, OutputT], node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, OutputT] ``` Create a step from a step function.
This method can be used as a decorator or called directly to create a step node from an async function. ###### Returns `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\] -- Either a Step instance or a decorator function ###### Parameters **`call`** : `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The step function to wrap **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional human-readable label ##### stream ```python def stream( node_id: str | None = None, label: str | None = None, ) -> Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]] def stream( call: StreamFunction[StateT, DepsT, InputT, OutputT], node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] def stream( call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]] ``` Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. ###### Returns `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\]\] -- Either a Step instance or a decorator function ###### Parameters **`call`** : `StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The step function to wrap **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional human-readable label ##### add ```python def add(edges: EdgePath[StateT, DepsT] = ()) -> None ``` Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`*edges`** : `EdgePath`\[`StateT`, `DepsT`\] _Default:_ `()` The edge paths to add to the graph ##### add\_edge ```python def add_edge( source: Source[T], destination: Destination[T], label: str | None = None, ) -> None ``` Add a simple edge between two nodes. 
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : `Source`\[`T`\] The source node **`destination`** : `Destination`\[`T`\] The destination node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label for the edge ##### add\_mapping\_edge ```python def add_mapping_edge( source: Source[Iterable[T]], map_to: Destination[T], pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None ``` Add an edge that maps iterable data across parallel paths. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : `Source`\[[`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[`T`\]\] The source node that produces iterable data **`map_to`** : `Destination`\[`T`\] The destination node that receives individual items **`pre_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label before the map operation **`post_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label after the map operation **`fork_id`** : `ForkID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the fork node produced for this map operation **`downstream_join_id`** : `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. ##### edge\_from ```python def edge_from( sources: Source[SourceOutputT] = (), ) -> EdgePathBuilder[StateT, DepsT, SourceOutputT] ``` Create an edge path builder starting from the given source nodes. ###### Returns `EdgePathBuilder`\[`StateT`, `DepsT`, `SourceOutputT`\] -- An EdgePathBuilder for constructing the complete edge path ###### Parameters **`*sources`** : `Source`\[`SourceOutputT`\] _Default:_ `()` The source nodes to start the edge path from ##### decision ```python def decision( note: str | None = None, node_id: str | None = None, ) -> Decision[StateT, DepsT, Never] ``` Create a new decision node. ###### Returns `Decision`\[`StateT`, `DepsT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A new Decision node with no branches ###### Parameters **`note`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional note to describe the decision logic **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node produced for this decision logic ##### match ```python def match( source: TypeOrTypeExpression[SourceT], matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never] ``` Create a decision branch matcher. 
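For example, here is a speculative sketch of branching on the runtime type of a step's output, based only on the `decision`, `match`, `branch`, and `to` signatures documented here; the step names and bodies are illustrative, and it assumes a `Decision` is a valid `add_edge` destination:

decision\_sketch.py
```python
from pydantic_graph.beta import GraphBuilder, StepContext

gb = GraphBuilder(input_type=int, output_type=str)


@gb.step
async def classify(ctx: StepContext[None, None, int]) -> int | str:
    # Return the int itself for odd inputs, and a str for even ones.
    return ctx.inputs if ctx.inputs % 2 else 'even'


@gb.step
async def describe_odd(ctx: StepContext[None, None, int]) -> str:
    return f'odd: {ctx.inputs}'


# Route on the runtime type of `classify`'s output:
# ints go to `describe_odd`, strs go straight to the end node.
route = (
    gb.decision(note='odd or even?')
    .branch(gb.match(int).to(describe_odd))
    .branch(gb.match(str).to(gb.end_node))
)

gb.add_edge(gb.start_node, classify)
gb.add_edge(classify, route)
gb.add_edge(describe_odd, gb.end_node)
```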
###### Returns `DecisionBranchBuilder`\[`StateT`, `DepsT`, `SourceT`, `SourceT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A DecisionBranchBuilder for constructing the branch ###### Parameters **`source`** : `TypeOrTypeExpression`\[`SourceT`\] The type or type expression to match against **`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom matching function ##### match\_node ```python def match_node( source: type[SourceNodeT], matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT] ``` Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. ###### Returns `DecisionBranch`\[`SourceNodeT`\] -- A DecisionBranch for the BaseNode type ###### Parameters **`source`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`SourceNodeT`\] The BaseNode subclass to match against **`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom matching function ##### node ```python def node( node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT] ``` Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. ###### Returns `EdgePath`\[`StateT`, `DepsT`\] -- An EdgePath representing the node and its connections ###### Parameters **`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `GraphOutputT`\]\] The BaseNode subclass to integrate ###### Raises - `GraphSetupError` -- If the node type is missing required type hints ##### build ```python def build( validate_graph_structure: bool = True, ) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT] ``` Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. ###### Returns `Graph`\[`StateT`, `DepsT`, `GraphInputT`, `GraphOutputT`\] -- A complete Graph instance ready for execution ###### Parameters **`validate_graph_structure`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to perform validation of the graph structure. See the docstring of `_validate_graph_structure` for more details. ###### Raises - `ValueError` -- If the graph structure is invalid (e.g., join without parent fork) ### Graph **Bases:** `Generic[StateT, DepsT, InputT, OutputT]` A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. #### Attributes ##### name Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### state\_type The type of the graph state. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\] ##### deps\_type The type of the dependencies. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`DepsT`\] ##### input\_type The type of the input data. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`InputT`\] ##### output\_type The type of the output data. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`OutputT`\] ##### auto\_instrument Whether to automatically create instrumentation spans. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### nodes All nodes in the graph indexed by their ID. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, `AnyNode`\] ##### edges\_by\_source Outgoing paths from each source node. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Path`\]\] ##### parent\_forks Parent fork information for each join node. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, `ParentFork`\[`NodeID`\]\] ##### intermediate\_join\_nodes For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, [`set`](https://docs.python.org/3/reference/expressions.html#set)\[`JoinID`\]\] #### Methods ##### get\_parent\_fork ```python def get_parent_fork(join_id: JoinID) -> ParentFork[NodeID] ``` Get the parent fork information for a join node. ###### Returns `ParentFork`\[`NodeID`\] -- The parent fork information for the join ###### Parameters **`join_id`** : `JoinID` The ID of the join node ###### Raises - `RuntimeError` -- If the join ID is not found or has no parent fork ##### is\_final\_join ```python def is_final_join(join_id: JoinID) -> bool ``` Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) -- True if the join is final, False if it's non-final ###### Parameters **`join_id`** : `JoinID` The ID of the join node ##### run `@async` ```python def run( state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT ``` Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. 
###### Returns `OutputT` -- The final output from the graph execution ###### Parameters **`state`** : `StateT` _Default:_ `None` The graph state instance **`deps`** : `DepsT` _Default:_ `None` The dependencies instance **`inputs`** : `InputT` _Default:_ `None` The input data for the graph **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional span for tracing/instrumentation **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### iter `@async` ```python def iter( state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]] ``` Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `OutputT`\]\] ###### Parameters **`state`** : `StateT` _Default:_ `None` The graph state instance **`deps`** : `DepsT` _Default:_ `None` The dependencies instance **`inputs`** : `InputT` _Default:_ `None` The input data for the graph **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional span for tracing/instrumentation **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### render ```python def render( title: str | None = None, direction: StateDiagramDirection | None = None, ) -> str ``` Render the graph as a Mermaid diagram string. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram representation ###### Parameters **`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional title for the diagram **`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional direction for the diagram layout ##### \_\_str\_\_ ```python def __str__() -> str ``` Return a Mermaid diagram representation of the graph. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram of the graph ### StepNode **Bases:** `BaseNode[StateT, DepsT, Any]` A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.Step) with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. #### Attributes ##### step The step to execute. **Type:** `Step`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### inputs The inputs bound to this step. 
**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) #### Methods ##### run `@async` ```python def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the step node. ###### Returns `BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of step execution ###### Parameters **`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\] The graph execution context ###### Raises - `NotImplementedError` -- Always raised as StepNode is not meant to be run directly --- # [pydantic_graph.beta.decision](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_decision/) # pydantic\_graph.beta.decision Decision node implementation for conditional branching in graph execution. This module provides the Decision node type and related classes for implementing conditional branching logic in parallel control flow graphs. Decision nodes allow the graph to choose different execution paths based on runtime conditions. ### Decision **Bases:** `Generic[StateT, DepsT, HandledT]` Decision node for conditional branching in graph execution. A Decision node evaluates conditions and routes execution to different branches based on the input data type or custom matching logic. #### Attributes ##### id Unique identifier for this decision node. **Type:** `NodeID` ##### branches List of branches that can be taken from this decision. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`DecisionBranch`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] ##### note Optional documentation note for this decision. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### branch ```python def branch(branch: DecisionBranch[T]) -> Decision[StateT, DepsT, HandledT | T] ``` Add a new branch to this decision. ###### Returns `Decision`\[`StateT`, `DepsT`, `HandledT` | `T`\] -- A new Decision with the additional branch. ###### Parameters **`branch`** : `DecisionBranch`\[`T`\] The branch to add to this decision. ### DecisionBranch **Bases:** `Generic[SourceT]` Represents a single branch within a decision node. Each branch defines the conditions under which it should be taken and the path to follow when those conditions are met. Note: with the current design, it is actually _critical_ that this class is invariant in SourceT for the sake of type-checking that inputs to a Decision are actually handled. See the `# type: ignore` comment in `tests.graph.beta.test_graph_edge_cases.test_decision_no_matching_branch` for an example of how this works. #### Attributes ##### source The expected type of data for this branch. This is necessary for exhaustiveness-checking when handling the inputs to a decision node. **Type:** `TypeOrTypeExpression`\[`SourceT`\] ##### matches An optional predicate function used to determine whether input data matches this branch. 
If `None`, default logic is used which attempts to check the value for type-compatibility with the `source` type: - If `source` is `Any` or `object`, the branch will always match - If `source` is a `Literal` type, this branch will match if the value is one of the parametrizing literal values - If `source` is any other type, the value will be checked for matching using `isinstance` Inputs are tested against each branch of a decision node in order, and the path of the first matching branch is used to handle the input value. **Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### path The execution path to follow when an input value matches this branch of a decision node. This can include transforming, mapping, and broadcasting the output before sending to the next node or nodes. The path can also include position-aware labels which are used when generating mermaid diagrams. **Type:** `Path` ##### destinations The destination nodes that can be referenced by DestinationMarker in the path. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AnyDestinationNode`\] ### DecisionBranchBuilder **Bases:** `Generic[StateT, DepsT, OutputT, SourceT, HandledT]` Builder for constructing decision branches with fluent API. This builder provides methods to configure branches with destinations, forks, and transformations in a type-safe manner. Instances of this class should be created using [`GraphBuilder.match`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder), not created directly. #### Methods ##### to ```python def to( destination: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]], extra_destinations: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]] = (), fork_id: str | None = None, ) -> DecisionBranch[SourceT] ``` Set the destination(s) for this branch. ###### Returns `DecisionBranch`\[`SourceT`\] -- A completed DecisionBranch with the specified destinations. ###### Parameters **`destination`** : `DestinationNode`\[`StateT`, `DepsT`, `OutputT`\] | [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] The primary destination node. **`*extra_destinations`** : `DestinationNode`\[`StateT`, `DepsT`, `OutputT`\] | [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] _Default:_ `()` Additional destination nodes. **`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional node ID to use for the resulting broadcast fork if multiple destinations are provided. ##### broadcast ```python def broadcast( get_forks: Callable[[Self], Sequence[DecisionBranch[SourceT]]], fork_id: str | None = None, ) -> DecisionBranch[SourceT] ``` Broadcast this decision branch into multiple destinations. ###### Returns `DecisionBranch`\[`SourceT`\] -- A completed DecisionBranch with the specified destinations. 
###### Parameters **`get_forks`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\], [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`DecisionBranch`\[`SourceT`\]\]\] The callback that will return a sequence of decision branches to broadcast to. **`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional node ID to use for the resulting broadcast fork. ##### transform ```python def transform( func: TransformFunction[StateT, DepsT, OutputT, NewOutputT], ) -> DecisionBranchBuilder[StateT, DepsT, NewOutputT, SourceT, HandledT] ``` Apply a transformation to the branch's output. ###### Returns `DecisionBranchBuilder`\[`StateT`, `DepsT`, `NewOutputT`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where the provided transform is applied prior to generating the final output. ###### Parameters **`func`** : `TransformFunction`\[`StateT`, `DepsT`, `OutputT`, `NewOutputT`\] Transformation function to apply. ##### map ```python def map( fork_id: str | None = None, downstream_join_id: str | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, T, SourceT, HandledT] ``` Spread the branch's output. To do this, the current output must be iterable, and any subsequent steps in the path being built for this branch will be applied to each item of the current output in parallel. ###### Returns `DecisionBranchBuilder`\[`StateT`, `DepsT`, `T`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where mapping is performed prior to generating the final output. ###### Parameters **`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the fork, defaults to a generated value **`downstream_join_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID of a downstream join node which is involved when mapping empty iterables ##### label ```python def label( label: str, ) -> DecisionBranchBuilder[StateT, DepsT, OutputT, SourceT, HandledT] ``` Apply a label to the branch at the current point in the path being built. These labels are only used in generated mermaid diagrams. ###### Returns `DecisionBranchBuilder`\[`StateT`, `DepsT`, `OutputT`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where the label has been applied at the end of the current path being built. ###### Parameters **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The label to apply. ### StateT Type variable for graph state. **Default:** `TypeVar('StateT', infer_variance=True)` ### DepsT Type variable for graph dependencies. **Default:** `TypeVar('DepsT', infer_variance=True)` ### HandledT Type variable used to track types handled by the branches of a Decision. **Default:** `TypeVar('HandledT', infer_variance=True)` ### T Generic type variable. **Default:** `TypeVar('T', infer_variance=True)` ### SourceT Type variable for source data for a DecisionBranch. **Default:** `TypeVar('SourceT', infer_variance=True)` ### OutputT Type variable for the output data of a node. **Default:** `TypeVar('OutputT', infer_variance=True)` ### NewOutputT Type variable for transformed output. 
**Default:** `TypeVar('NewOutputT', infer_variance=True)` --- # [pydantic_graph.beta.graph](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_graph/) # pydantic\_graph.beta.graph Core graph execution engine for the next version of the pydantic-graph library. This module provides the main `Graph` class and `GraphRun` execution engine that handles the orchestration of nodes, edges, and parallel execution paths in the graph-based workflow system. ### EndMarker **Bases:** `Generic[OutputT]` A marker indicating the end of graph execution with a final value. EndMarker is used internally to signal that the graph has completed execution and carries the final output value. ### ErrorMarker A marker indicating that a graph node raised an exception. Yielded by the graph iterator instead of raising immediately, allowing the caller to recover by sending new tasks via `GraphRun.next()` or `GraphRun.override_next()`. If the caller does not override, the error is re-raised on the next iteration. #### Attributes ##### error The exception raised by the node. **Type:** [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException) ### JoinItem An item representing data flowing into a join operation. JoinItem carries input data from a parallel execution path to a join node, along with metadata about which execution 'fork' it originated from. #### Attributes ##### join\_id The ID of the join node this item is targeting. **Type:** `JoinID` ##### inputs The input data for the join operation. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### fork\_stack The stack of ForkStackItems that led to producing this join item. **Type:** `ForkStack` ### Graph **Bases:** `Generic[StateT, DepsT, InputT, OutputT]` A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. #### Attributes ##### name Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### state\_type The type of the graph state. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\] ##### deps\_type The type of the dependencies. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`DepsT`\] ##### input\_type The type of the input data. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`InputT`\] ##### output\_type The type of the output data. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`OutputT`\] ##### auto\_instrument Whether to automatically create instrumentation spans. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### nodes All nodes in the graph indexed by their ID. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, `AnyNode`\] ##### edges\_by\_source Outgoing paths from each source node. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Path`\]\] ##### parent\_forks Parent fork information for each join node.
**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, `ParentFork`\[`NodeID`\]\] ##### intermediate\_join\_nodes For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, [`set`](https://docs.python.org/3/reference/expressions.html#set)\[`JoinID`\]\] #### Methods ##### get\_parent\_fork ```python def get_parent_fork(join_id: JoinID) -> ParentFork[NodeID] ``` Get the parent fork information for a join node. ###### Returns `ParentFork`\[`NodeID`\] -- The parent fork information for the join ###### Parameters **`join_id`** : `JoinID` The ID of the join node ###### Raises - `RuntimeError` -- If the join ID is not found or has no parent fork ##### is\_final\_join ```python def is_final_join(join_id: JoinID) -> bool ``` Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) -- True if the join is final, False if it's non-final ###### Parameters **`join_id`** : `JoinID` The ID of the join node ##### run `@async` ```python def run( state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT ``` Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. ###### Returns `OutputT` -- The final output from the graph execution ###### Parameters **`state`** : `StateT` _Default:_ `None` The graph state instance **`deps`** : `DepsT` _Default:_ `None` The dependencies instance **`inputs`** : `InputT` _Default:_ `None` The input data for the graph **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional span for tracing/instrumentation **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### iter `@async` ```python def iter( state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]] ``` Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `OutputT`\]\] ###### Parameters **`state`** : `StateT` _Default:_ `None` The graph state instance **`deps`** : `DepsT` _Default:_ `None` The dependencies instance **`inputs`** : `InputT` _Default:_ `None` The input data for the graph **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional span for tracing/instrumentation **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. 
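Putting these pieces together, here is a minimal end-to-end sketch (assuming `GraphBuilder` and `StepContext` are importable from `pydantic_graph.beta`, and that a plain async function taking a `StepContext` is a valid step function):

graph\_run\_sketch.py
```python
import asyncio

from pydantic_graph.beta import GraphBuilder, StepContext

gb = GraphBuilder(input_type=int, output_type=int)


@gb.step
async def double(ctx: StepContext[None, None, int]) -> int:
    # `ctx.inputs` carries the data routed into this step.
    return ctx.inputs * 2


# Wire start -> double -> end with simple edges.
gb.add_edge(gb.start_node, double)
gb.add_edge(double, gb.end_node)

graph = gb.build()
print(asyncio.run(graph.run(inputs=21)))
#> 42
```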
##### render ```python def render( title: str | None = None, direction: StateDiagramDirection | None = None, ) -> str ``` Render the graph as a Mermaid diagram string. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram representation ###### Parameters **`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional title for the diagram **`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional direction for the diagram layout ##### \_\_str\_\_ ```python def __str__() -> str ``` Return a Mermaid diagram representation of the graph. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram of the graph ### GraphTaskRequest A request to run a task representing the execution of a node in the graph. GraphTaskRequest encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within. #### Attributes ##### node\_id The ID of the node to execute. **Type:** `NodeID` ##### inputs The input data for the node. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ##### fork\_stack Stack of forks that have been entered. Used by the GraphRun to decide when to proceed through joins. **Type:** `ForkStack` **Default:** `field(repr=False)` ### GraphTask **Bases:** `GraphTaskRequest` A task representing the execution of a node in the graph. GraphTask encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within, and has a unique ID to identify the task within the graph run. #### Attributes ##### task\_id Unique identifier for this task. **Type:** `TaskID` **Default:** `field(repr=False)` ### GraphRun **Bases:** `Generic[StateT, DepsT, OutputT]` A single execution instance of a graph. GraphRun manages the execution state for a single run of a graph, including task scheduling, fork/join coordination, and result tracking. #### Attributes ##### graph The graph being executed. **Default:** `graph` ##### state The graph state instance. **Default:** `state` ##### deps The dependencies instance. **Default:** `deps` ##### inputs The initial input data. **Default:** `inputs` ##### next\_task Get the next task(s) to be executed. **Type:** `EndMarker`\[`OutputT`\] | `ErrorMarker` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\] ##### output Get the final output if the graph has completed. **Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_init\_\_ ```python def __init__( graph: Graph[StateT, DepsT, InputT, OutputT], state: StateT, deps: DepsT, inputs: InputT, traceparent: str | None, ) ``` Initialize a graph run. 
###### Parameters **`graph`** : `Graph`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] The graph to execute **`state`** : `StateT` The graph state instance **`deps`** : `DepsT` The dependencies instance **`inputs`** : `InputT` The input data for the graph **`traceparent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) Optional trace parent for instrumentation ##### \_\_aiter\_\_ ```python def __aiter__() -> AsyncIterator[EndMarker[OutputT] | Sequence[GraphTask]] ``` Return self as an async iterator. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\]\] -- Self for async iteration ##### \_\_anext\_\_ `@async` ```python def __anext__() -> EndMarker[OutputT] | Sequence[GraphTask] ``` Get the next item in the async iteration. ###### Returns `EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\] -- The next execution result from the graph ###### Raises - `Exception` -- If a node raised an error and the caller has not recovered via `override_next()`. ##### next `@async` ```python def next( value: EndMarker[OutputT] | Sequence[GraphTaskRequest] | None = None, ) -> EndMarker[OutputT] | Sequence[GraphTask] ``` Advance the graph execution by one step. This method allows for sending a value to the iterator, which is useful for resuming iteration or overriding intermediate results. ###### Returns `EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\] -- The next execution result: either an EndMarker, or sequence of GraphTasks ###### Parameters **`value`** : `EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTaskRequest`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional value to send to the iterator ##### override\_next ```python def override_next(value: Sequence[GraphTaskRequest] | EndMarker[OutputT]) -> None ``` Override the next pending step, allowing the graph to continue after an `End` or error. This is used by hook systems (like `after_node_run` or `on_node_run_error`) to redirect the graph to a new node when the current step produced an `End` result or raised an error, or to signal early completion by passing an `EndMarker`. Must only be called between iterations (not while an iteration is in flight). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`value`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTaskRequest`\] | `EndMarker`\[`OutputT`\] New task requests to execute next, or an `EndMarker` to signal completion. ### StateT Type variable for graph state. **Default:** `TypeVar('StateT', infer_variance=True)` ### DepsT Type variable for graph dependencies. **Default:** `TypeVar('DepsT', infer_variance=True)` ### InputT Type variable for graph inputs. **Default:** `TypeVar('InputT', infer_variance=True)` ### OutputT Type variable for graph outputs. **Default:** `TypeVar('OutputT', infer_variance=True)` --- # [pydantic_graph.beta.graph_builder](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_graph_builder/) # pydantic\_graph.beta.graph\_builder Graph builder for constructing executable graph definitions. 
This module provides the GraphBuilder class and related utilities for constructing typed, executable graph definitions with steps, joins, decisions, and edge routing. ### GraphBuilder **Bases:** `Generic[StateT, DepsT, GraphInputT, GraphOutputT]` A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. #### Attributes ##### name Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name` ##### state\_type The type of the graph state. **Type:** `TypeOrTypeExpression`\[`StateT`\] **Default:** `state_type` ##### deps\_type The type of the dependencies. **Type:** `TypeOrTypeExpression`\[`DepsT`\] **Default:** `deps_type` ##### input\_type The type of the graph input data. **Type:** `TypeOrTypeExpression`\[`GraphInputT`\] **Default:** `input_type` ##### output\_type The type of the graph output data. **Type:** `TypeOrTypeExpression`\[`GraphOutputT`\] **Default:** `output_type` ##### auto\_instrument Whether to automatically create instrumentation spans. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `auto_instrument` ##### start\_node Get the start node for the graph. **Type:** `StartNode`\[`GraphInputT`\] ##### end\_node Get the end node for the graph. **Type:** `EndNode`\[`GraphOutputT`\] #### Methods ##### \_\_init\_\_ ```python def __init__( name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ) ``` Initialize a graph builder. ###### Parameters **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **`state_type`** : `TypeOrTypeExpression`\[`StateT`\] _Default:_ `NoneType` The type of the graph state **`deps_type`** : `TypeOrTypeExpression`\[`DepsT`\] _Default:_ `NoneType` The type of the dependencies **`input_type`** : `TypeOrTypeExpression`\[`GraphInputT`\] _Default:_ `NoneType` The type of the graph input data **`output_type`** : `TypeOrTypeExpression`\[`GraphOutputT`\] _Default:_ `NoneType` The type of the graph output data **`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to automatically create instrumentation spans ##### step ```python def step( node_id: str | None = None, label: str | None = None, ) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] def step( call: StepFunction[StateT, DepsT, InputT, OutputT], node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, OutputT] ``` Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function.
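The overloads above suggest two ways to create a step; here is a hedged sketch of both forms (the function names and node IDs are illustrative):

step\_forms.py
```python
from pydantic_graph.beta import GraphBuilder, StepContext

gb = GraphBuilder(input_type=str, output_type=str)


# Decorator form, with an explicit node ID and label:
@gb.step(node_id='shout', label='Upper-case the input')
async def shout(ctx: StepContext[None, None, str]) -> str:
    return ctx.inputs.upper()


# Direct-call form, wrapping an existing async function:
async def trim(ctx: StepContext[None, None, str]) -> str:
    return ctx.inputs.strip()


trim_step = gb.step(trim, node_id='trim')
```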
###### Returns `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\] -- Either a Step instance or a decorator function ###### Parameters **`call`** : `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The step function to wrap **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional human-readable label ##### stream ```python def stream( node_id: str | None = None, label: str | None = None, ) -> Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]] def stream( call: StreamFunction[StateT, DepsT, InputT, OutputT], node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] def stream( call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]] ``` Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. ###### Returns `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\]\] -- Either a Step instance or a decorator function ###### Parameters **`call`** : `StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The step function to wrap **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional human-readable label ##### add ```python def add(edges: EdgePath[StateT, DepsT] = ()) -> None ``` Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`*edges`** : `EdgePath`\[`StateT`, `DepsT`\] _Default:_ `()` The edge paths to add to the graph ##### add\_edge ```python def add_edge( source: Source[T], destination: Destination[T], label: str | None = None, ) -> None ``` Add a simple edge between two nodes. 
###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : `Source`\[`T`\] The source node **`destination`** : `Destination`\[`T`\] The destination node **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label for the edge ##### add\_mapping\_edge ```python def add_mapping_edge( source: Source[Iterable[T]], map_to: Destination[T], pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None ``` Add an edge that maps iterable data across parallel paths. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`source`** : `Source`\[[`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[`T`\]\] The source node that produces iterable data **`map_to`** : `Destination`\[`T`\] The destination node that receives individual items **`pre_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label before the map operation **`post_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional label after the map operation **`fork_id`** : `ForkID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the fork node produced for this map operation **`downstream_join_id`** : `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. ##### edge\_from ```python def edge_from( *sources: Source[SourceOutputT], ) -> EdgePathBuilder[StateT, DepsT, SourceOutputT] ``` Create an edge path builder starting from the given source nodes. ###### Returns `EdgePathBuilder`\[`StateT`, `DepsT`, `SourceOutputT`\] -- An EdgePathBuilder for constructing the complete edge path ###### Parameters **`*sources`** : `Source`\[`SourceOutputT`\] _Default:_ `()` The source nodes to start the edge path from ##### decision ```python def decision( note: str | None = None, node_id: str | None = None, ) -> Decision[StateT, DepsT, Never] ``` Create a new decision node. ###### Returns `Decision`\[`StateT`, `DepsT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A new Decision node with no branches ###### Parameters **`note`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional note to describe the decision logic **`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional ID for the node produced for this decision logic ##### match ```python def match( source: TypeOrTypeExpression[SourceT], matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never] ``` Create a decision branch matcher.
###### Returns `DecisionBranchBuilder`\[`StateT`, `DepsT`, `SourceT`, `SourceT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A DecisionBranchBuilder for constructing the branch ###### Parameters **`source`** : `TypeOrTypeExpression`\[`SourceT`\] The type or type expression to match against **`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom matching function ##### match\_node ```python def match_node( source: type[SourceNodeT], matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT] ``` Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. ###### Returns `DecisionBranch`\[`SourceNodeT`\] -- A DecisionBranch for the BaseNode type ###### Parameters **`source`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`SourceNodeT`\] The BaseNode subclass to match against **`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom matching function ##### node ```python def node( node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT] ``` Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. ###### Returns `EdgePath`\[`StateT`, `DepsT`\] -- An EdgePath representing the node and its connections ###### Parameters **`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `GraphOutputT`\]\] The BaseNode subclass to integrate ###### Raises - `GraphSetupError` -- If the node type is missing required type hints ##### build ```python def build( validate_graph_structure: bool = True, ) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT] ``` Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. ###### Returns `Graph`\[`StateT`, `DepsT`, `GraphInputT`, `GraphOutputT`\] -- A complete Graph instance ready for execution ###### Parameters **`validate_graph_structure`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to perform validation of the graph structure ###### Raises - `ValueError` -- If the graph structure is invalid (e.g., join without parent fork)
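Putting these methods together: decorate async step functions with `step`, wire them with edges from `start_node` to `end_node`, then call `build()`. Here is a minimal sketch; the `pydantic_graph.beta` import path is an assumption, since this page documents only the builder itself.

builder\_sketch.py

```python
from pydantic_graph.beta import GraphBuilder, StepContext  # import path assumed

builder = GraphBuilder(name='shout', input_type=str, output_type=str)


@builder.step
async def clean(ctx: StepContext[None, None, str]) -> str:
    # `ctx.inputs` carries this step's input data (see pydantic_graph.beta.step below).
    return ctx.inputs.strip()


@builder.step
async def shout(ctx: StepContext[None, None, str]) -> str:
    return ctx.inputs.upper()


# Wire the graph: start -> clean -> shout -> end, then build it.
builder.add_edge(builder.start_node, clean)
builder.add_edge(clean, shout, label='cleaned text')
builder.add_edge(shout, builder.end_node)

graph = builder.build()  # validates the structure and returns an executable Graph
```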
--- # [pydantic_graph.beta.join](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_join/) # pydantic\_graph.beta.join Join operations and reducers for graph execution. This module provides the core components for joining parallel execution paths in a graph, including various reducer types that aggregate data from multiple sources into a single output. ### JoinState The state of a join during graph execution associated with a particular fork run. ### ReducerContext **Bases:** `Generic[StateT, DepsT]` Context information passed to reducer functions during graph execution. The reducer context provides access to the current graph state and dependencies. #### Attributes ##### state The state of the graph run. **Type:** `StateT` ##### deps The deps for the graph run. **Type:** `DepsT` #### Methods ##### cancel\_sibling\_tasks ```python def cancel_sibling_tasks() ``` Cancel all sibling tasks created from the same fork. You can call this if you want your join to have early-stopping behavior. ### SupportsSum **Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) A protocol for a type that supports adding to itself. ### ReduceFirstValue **Bases:** `Generic[T]` A reducer that returns the first value it encounters, and cancels all other tasks. #### Methods ##### \_\_call\_\_ ```python def __call__(ctx: ReducerContext[object, object], current: T, inputs: T) -> T ``` The reducer function. ###### Returns `T` ### Join **Bases:** `Generic[StateT, DepsT, InputT, OutputT]` A join operation that synchronizes and aggregates parallel execution paths. A join defines how to combine outputs from multiple parallel execution paths using a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction). It specifies which fork it joins (if any) and manages the initialization of reducers. #### Methods ##### as\_node ```python def as_node(inputs: None = None) -> JoinNode[StateT, DepsT] def as_node(inputs: InputT) -> JoinNode[StateT, DepsT] ``` Create a join node with bound inputs. ###### Returns `JoinNode`\[`StateT`, `DepsT`\] -- A [`JoinNode`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.JoinNode) with this join and the bound inputs ###### Parameters **`inputs`** : `InputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The input data to bind to this join, or None ### JoinNode **Bases:** `BaseNode[StateT, DepsT, Any]` A base node that represents a join item with bound inputs. JoinNode bridges between the v1 and v2 graph execution systems by wrapping a [`Join`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.Join) with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. #### Attributes ##### join The join to execute. **Type:** `Join`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### inputs The inputs bound to this join. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) #### Methods ##### run `@async` ```python def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the join node. ###### Returns `BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of join execution ###### Parameters **`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\] The graph execution context ###### Raises - `NotImplementedError` -- Always raised as JoinNode is not meant to be run directly ### reduce\_null ```python def reduce_null(current: None, inputs: Any) -> None ``` A reducer that discards all input data and returns None.
#### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ### reduce\_list\_append ```python def reduce_list_append(current: list[T], inputs: T) -> list[T] ``` A reducer that appends to a list. #### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`T`\] ### reduce\_list\_extend ```python def reduce_list_extend(current: list[T], inputs: Iterable[T]) -> list[T] ``` A reducer that extends a list. #### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`T`\] ### reduce\_dict\_update ```python def reduce_dict_update(current: dict[K, V], inputs: Mapping[K, V]) -> dict[K, V] ``` A reducer that updates a dict. #### Returns [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`K`, `V`\] ### reduce\_sum ```python def reduce_sum(current: NumericT, inputs: NumericT) -> NumericT ``` A reducer that sums numbers. #### Returns `NumericT` ### ReducerFunction A function used for reducing inputs to a join node. **Default:** `TypeAliasType('ReducerFunction', ContextReducerFunction[StateT, DepsT, InputT, OutputT] | PlainReducerFunction[InputT, OutputT], type_params=(StateT, DepsT, InputT, OutputT))`
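Both reducer shapes can be user-defined: a plain reducer has the same `(current, inputs) -> output` shape as the built-ins above, while a context reducer additionally receives a `ReducerContext` and can cancel its sibling tasks for early stopping, as `ReduceFirstValue` does. A minimal sketch (the reducer names are illustrative, and the import path is assumed from this module's name):

custom\_reducers.py

```python
from pydantic_graph.beta.join import ReducerContext  # import path assumed


def reduce_max(current: int, inputs: int) -> int:
    """A plain reducer: fold each branch's output into a running maximum."""
    return max(current, inputs)


def reduce_first_truthy(ctx: ReducerContext[object, object], current: str | None, inputs: str) -> str | None:
    """A context reducer: keep the first non-empty result and cancel the rest."""
    if current is None and inputs:
        ctx.cancel_sibling_tasks()  # early-stopping behavior, as documented on ReducerContext
        return inputs
    return current
```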
--- # [pydantic_graph.beta.node](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_node/) # pydantic\_graph.beta.node Core node types for graph construction and execution. This module defines the fundamental node types used to build execution graphs, including start/end nodes and fork nodes for parallel execution. ### StartNode **Bases:** `Generic[OutputT]` Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. #### Attributes ##### id Fixed identifier for the start node. **Default:** `NodeID('__start__')` ### EndNode **Bases:** `Generic[InputT]` Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. #### Attributes ##### id Fixed identifier for the end node. **Default:** `NodeID('__end__')` ### Fork **Bases:** `Generic[InputT, OutputT]` Fork node that creates parallel execution branches. A Fork node splits the execution flow into multiple parallel branches, enabling concurrent execution of downstream nodes. It can either map a sequence across multiple branches or duplicate data to each branch. #### Attributes ##### id Unique identifier for this fork node. **Type:** `ForkID` ##### is\_map Determines fork behavior. If True, InputT must be Sequence\[OutputT\] and each element is sent to a separate branch. If False, InputT must be OutputT and the same data is sent to all branches. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### downstream\_join\_id Optional identifier of a downstream join node that should be jumped to if mapping an empty iterable. **Type:** `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None) ### StateT Type variable for graph state. **Default:** `TypeVar('StateT', infer_variance=True)` ### OutputT Type variable for node output data. **Default:** `TypeVar('OutputT', infer_variance=True)` ### InputT Type variable for node input data. **Default:** `TypeVar('InputT', infer_variance=True)` --- # [pydantic_graph.beta.step](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_step/) # pydantic\_graph.beta.step Step-based graph execution components. This module provides the core abstractions for step-based graph execution, including step contexts, step functions, and step nodes that bridge between the v1 and v2 graph execution systems. ### StepContext **Bases:** `Generic[StateT, DepsT, InputT]` Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. #### Attributes ##### inputs The input data for this step. This must be a property to ensure correct variance behavior. **Type:** `InputT` ### StepFunction **Bases:** `Protocol[StateT, DepsT, InputT, OutputT]` Protocol for step functions that can be executed in the graph. Step functions are async callables that receive a step context and return a result. #### Methods ##### \_\_call\_\_ ```python def __call__(ctx: StepContext[StateT, DepsT, InputT]) -> Awaitable[OutputT] ``` Execute the step function with the given context. ###### Returns [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\] -- An awaitable that resolves to the step's output ###### Parameters **`ctx`** : `StepContext`\[`StateT`, `DepsT`, `InputT`\] The step context containing state, dependencies, and inputs ### StreamFunction **Bases:** `Protocol[StateT, DepsT, InputT, OutputT]` Protocol for stream functions that can be executed in the graph. Stream functions are async callables that receive a step context and return an async iterator. #### Methods ##### \_\_call\_\_ ```python def __call__(ctx: StepContext[StateT, DepsT, InputT]) -> AsyncIterator[OutputT] ``` Execute the stream function with the given context. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputT`\] -- An async iterator yielding the streamed output ###### Parameters **`ctx`** : `StepContext`\[`StateT`, `DepsT`, `InputT`\] The step context containing state, dependencies, and inputs ### Step **Bases:** `Generic[StateT, DepsT, InputT, OutputT]` A step in the graph execution that wraps a step function. Steps represent individual units of execution in the graph, encapsulating a step function along with metadata like ID and label. #### Attributes ##### id Unique identifier for this step. **Type:** `NodeID` **Default:** `id` ##### label Optional human-readable label for this step. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `label` ##### call The step function to execute. This needs to be a property for proper variance inference. **Type:** `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] #### Methods ##### as\_node ```python def as_node(inputs: None = None) -> StepNode[StateT, DepsT] def as_node(inputs: InputT) -> StepNode[StateT, DepsT] ``` Create a step node with bound inputs. ###### Returns `StepNode`\[`StateT`, `DepsT`\] -- A [`StepNode`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepNode) with this step and the bound inputs ###### Parameters **`inputs`** : `InputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The input data to bind to this step, or None ### StepNode **Bases:** `BaseNode[StateT, DepsT, Any]` A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.Step) with bound inputs in a BaseNode interface.
It is not meant to be run directly but rather used to indicate transitions to v2-style steps. #### Attributes ##### step The step to execute. **Type:** `Step`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] ##### inputs The inputs bound to this step. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) #### Methods ##### run `@async` ```python def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the step node. ###### Returns `BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of step execution ###### Parameters **`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\] The graph execution context ###### Raises - `NotImplementedError` -- Always raised as StepNode is not meant to be run directly ### NodeStep **Bases:** `Step[StateT, DepsT, Any, BaseNode[StateT, DepsT, Any] | End[Any]]` A step that wraps a BaseNode type for execution. NodeStep allows v1-style BaseNode classes to be used as steps in the v2 graph execution system. It validates that the input is of the expected node type and runs it with the appropriate graph context. #### Attributes ##### node\_type The BaseNode type this step executes. **Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `get_origin(node_type) or node_type` #### Methods ##### \_\_init\_\_ ```python def __init__( node_type: type[BaseNode[StateT, DepsT, Any]], id: NodeID | None = None, label: str | None = None, ) ``` Initialize a node step. ###### Parameters **`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] The BaseNode class this step will execute **`id`** : `NodeID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional unique identifier, defaults to the node's get\_node\_id() **`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional human-readable label for this step ### AnyStepFunction Type alias for a step function with any type parameters. **Default:** `StepFunction[Any, Any, Any, Any]` --- # [pydantic_graph.exceptions](https://pydantic.dev/docs/ai/api/pydantic_graph/exceptions/) # pydantic\_graph.exceptions ### GraphSetupError **Bases:** [`TypeError`](https://docs.python.org/3/library/exceptions.html#TypeError) Error caused by an incorrectly configured graph. #### Attributes ##### message Description of the mistake. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### GraphBuildingError **Bases:** [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError) An error raised during graph-building. #### Attributes ##### message The error message. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### GraphValidationError **Bases:** [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError) An error raised during graph validation. #### Attributes ##### message The error message. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### GraphRuntimeError **Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError) Error caused by an issue during graph execution. #### Attributes ##### message The error message. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message` ### GraphNodeStatusError **Bases:** `GraphRuntimeError` Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`. #### Methods ##### check `@classmethod` ```python def check(cls, status: SnapshotStatus) -> None ``` Check if the status is valid. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) --- # [pydantic_graph](https://pydantic.dev/docs/ai/api/pydantic_graph/graph/) # pydantic\_graph ### Graph **Bases:** `Generic[StateT, DepsT, RunEndT]` Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges -- i.e. which nodes may be run next -- and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. never\_42.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` _(This example is complete, it can be run "as is")_ See [`run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) for an example of running a graph, and [`mermaid_code`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) for an example of generating a mermaid diagram from the graph. #### Methods ##### \_\_init\_\_ ```python def __init__( nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ) ``` Create a graph from a sequence of nodes. ###### Parameters **`nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]\]\] The nodes which make up the graph; nodes need to be unique and all be generic in the same state type. **`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional name for the graph; if not provided, the name will be inferred from the calling frame on the first call to a graph method. **`state_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The type of the state for the graph; this can generally be inferred from `nodes`.
**`run_end_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RunEndT`\] | `_utils.Unset` _Default:_ `_utils.UNSET` The type of the result of running the graph; this can generally be inferred from `nodes`. **`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to create a span for the graph run and the execution of each node's run method. ##### run `@async` ```python def run( start_node: BaseNode[StateT, DepsT, RunEndT], state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT] ``` Run the graph from a starting node until it ends. Here's an example of running the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph): run\_never\_42.py ```py from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` ###### Returns `GraphRunResult`\[`StateT`, `RunEndT`\] -- A `GraphRunResult` containing information about the run, including its final result. ###### Parameters **`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] The first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. **`state`** : `StateT` _Default:_ `None` The initial state of the graph. **`deps`** : `DepsT` _Default:_ `None` The dependencies of the graph. **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### run\_sync ```python def run_sync( start_node: BaseNode[StateT, DepsT, RunEndT], state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT] ``` Synchronously run the graph. This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. ###### Returns `GraphRunResult`\[`StateT`, `RunEndT`\] -- The result of ending the run and the history of the run. ###### Parameters **`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] The first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. **`state`** : `StateT` _Default:_ `None` The initial state of the graph. **`deps`** : `DepsT` _Default:_ `None` The dependencies of the graph. **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`.
**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### iter `@async` ```python def iter( start_node: BaseNode[StateT, DepsT, RunEndT], state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next). The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun). ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `RunEndT`\]\] -- A `GraphRun` that can be async iterated over to drive the graph to completion. ###### Parameters **`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] The first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. **`state`** : `StateT` _Default:_ `None` The initial state of the graph. **`deps`** : `DepsT` _Default:_ `None` The dependencies of the graph. **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`. **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The span to use for the graph run. If not provided, a new span will be created. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### iter\_from\_persistence `@async` ```python def iter_from_persistence( persistence: BaseStatePersistence[StateT, RunEndT], deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter), but instead of passing the node to run, it will restore the node and state from state persistence. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `RunEndT`\]\] -- A `GraphRun` that can be async iterated over to drive the graph to completion. ###### Parameters **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] The state persistence interface to use.
**`deps`** : `DepsT` _Default:_ `None` The dependencies of the graph. **`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The span to use for the graph run. If not provided, a new span will be created. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. ##### initialize `@async` ```python def initialize( node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT = None, infer_name: bool = True, ) -> None ``` Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] The node to run first. **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] State persistence interface. **`state`** : `StateT` _Default:_ `None` The start state of the graph. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame.
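For example, you can initialize a run and drive it later with [`iter_from_persistence`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence). A minimal sketch, assuming file-based persistence is available at the import path shown (this page only references `SimpleStatePersistence`):

initialize\_never\_42.py

```python
from pathlib import Path

from never_42 import Increment, MyState, never_42_graph
from pydantic_graph.persistence.file import FileStatePersistence  # import path assumed


async def main():
    persistence = FileStatePersistence(Path('never_42_run.json'))
    # Record the starting node and initial state without running anything yet.
    await never_42_graph.initialize(Increment(), persistence, state=MyState(1))

    # ...later, possibly in another process, resume from what was persisted:
    async with never_42_graph.iter_from_persistence(persistence) as graph_run:
        async for node in graph_run:
            print(node)
```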
##### mermaid\_code ```python def mermaid_code( start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str ``` Generate a diagram representing the graph as a [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.generate_code). Here's an example of generating a diagram for the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph): mermaid\_never\_42.py ```py from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- The mermaid code for the graph, which can then be rendered as a diagram. ###### Parameters **`start_node`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`mermaid.NodeIdent`\] | `mermaid.NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The node or nodes which can start the graph. **`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) | [`typing_extensions.Literal`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] _Default:_ `None` The title of the diagram; use `False` to not include a title. **`edge_labels`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include edge labels. **`notes`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include notes on each node. **`highlighted_nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`mermaid.NodeIdent`\] | `mermaid.NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional node or nodes to highlight. **`highlight_css`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `mermaid.DEFAULT_HIGHLIGHT_CSS` The CSS to use for highlighting nodes. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. **`direction`** : `mermaid.StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The direction of flow. ##### mermaid\_image ```python def mermaid_image( infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig], ) -> bytes ``` Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`; see [`pydantic_graph.mermaid.MermaidConfig`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.MermaidConfig). Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image; `mermaid.ink` is a free service not affiliated with Pydantic. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- The image bytes. ###### Parameters **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. **`**kwargs`** : [`typing_extensions.Unpack`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Unpack)\[`mermaid.MermaidConfig`\] _Default:_ `{}` Additional arguments to pass to `mermaid.request_image`. ##### mermaid\_save ```python def mermaid_save( path: Path | str, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig], ) -> None ``` Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`; see [`pydantic_graph.mermaid.MermaidConfig`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.MermaidConfig). Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image; `mermaid.ink` is a free service not affiliated with Pydantic. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The path to save the image to. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to infer the graph name from the calling frame. **`**kwargs`** : [`typing_extensions.Unpack`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Unpack)\[`mermaid.MermaidConfig`\] _Default:_ `{}` Additional arguments to pass to `mermaid.save_image`. ##### get\_nodes ```python def get_nodes() -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]] ``` Get the nodes in the graph.
###### Returns [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]\]\] ### GraphRun **Bases:** `Generic[StateT, DepsT, RunEndT]` A stateful, async-iterable run of a [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph). You typically get a `GraphRun` instance from [`Graph.iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter), i.e. `async with my_graph.iter(...) as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph): iter\_never\_42.py ```py from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the [`GraphRun.next` documentation](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) for an example of how to manually drive the graph run. #### Attributes ##### next\_node The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. **Type:** `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\] ##### result The final result of the graph run if the run is completed, otherwise `None`. **Type:** `GraphRunResult`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### \_\_init\_\_ ```python def __init__( graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ) ``` Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter) rather than calling this directly. ###### Parameters **`graph`** : `Graph`\[`StateT`, `DepsT`, `RunEndT`\] The [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) to run. **`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] The node where execution will begin. **`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] State persistence interface. **`state`** : `StateT` A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`. **`deps`** : `DepsT` Optional dependencies that each node can access via `ctx.deps`, e.g.
database connections, configuration, or logging clients. **`traceparent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) The traceparent for the span used for the graph run. **`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The ID of the snapshot the node came from. ##### next `@async` ```python def next( node: BaseNode[StateT, DepsT, RunEndT] | None = None, ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node. Here's an example of using `next` to drive the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph): next\_never\_42.py ```py from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` ###### Returns `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if the run has completed. ###### Parameters **`node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. ##### \_\_anext\_\_ `@async` ```python def __anext__() -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` Use the last returned node as the input to `Graph.next`. ###### Returns `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\] ### GraphRunResult **Bases:** `Generic[StateT, RunEndT]` The final result of running a graph. --- # [pydantic_graph.mermaid](https://pydantic.dev/docs/ai/api/pydantic_graph/mermaid/) # pydantic\_graph.mermaid ### MermaidConfig **Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict) Parameters to configure mermaid chart generation. #### Attributes ##### start\_node Identifiers of nodes that start the graph. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` ##### highlighted\_nodes Identifiers of nodes to highlight. **Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` ##### highlight\_css CSS to use for highlighting nodes. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### title The title of the diagram.
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### edge\_labels Whether to include edge labels in the diagram. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### notes Whether to include notes on nodes in the diagram; defaults to `True`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### image\_type The image type to generate. If unspecified, the default behavior is `'jpeg'`. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['jpeg', 'png', 'webp', 'svg', 'pdf'\] ##### pdf\_fit When using image\_type='pdf', whether to fit the diagram to the PDF page. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### pdf\_landscape When using image\_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### pdf\_paper When using image\_type='pdf', the paper size of the PDF. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['letter', 'legal', 'tabloid', 'ledger', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'\] ##### background\_color The background color of the diagram. If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### theme The theme of the diagram. Defaults to 'default'. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['default', 'neutral', 'dark', 'forest'\] ##### width The width of the diagram. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### height The height of the diagram. **Type:** [`int`](https://docs.python.org/3/library/functions.html#int) ##### scale The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`float`](https://docs.python.org/3/library/functions.html#float), `Ge`(1), `Le`(3)\] ##### httpx\_client An HTTPX client to use for requests, mostly for testing purposes. **Type:** `httpx.Client` ##### direction The direction of the state diagram. **Type:** `StateDiagramDirection` ### generate\_code ```python def generate_code( graph: Graph[Any, Any, Any], start_node: Sequence[NodeIdent] | NodeIdent | None = None, highlighted_nodes: Sequence[NodeIdent] | NodeIdent | None = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str ``` Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. #### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) -- The Mermaid code for the graph. #### Parameters **`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The graph to generate the image for.
**`start_node`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Identifiers of nodes that start the graph. **`highlighted_nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Identifiers of nodes to highlight. **`highlight_css`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_HIGHLIGHT_CSS` CSS to use for highlighting nodes. **`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` The title of the diagram. **`edge_labels`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include edge labels in the diagram. **`notes`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to include notes in the diagram. **`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) The direction of flow. ### request\_image ```python def request_image( graph: Graph[Any, Any, Any], **kwargs: Unpack[MermaidConfig], ) -> bytes ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). #### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- The image data. #### Parameters **`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The graph to generate the image for. **`**kwargs`** : [`Unpack`](https://docs.python.org/3/library/typing.html#typing.Unpack)\[`MermaidConfig`\] _Default:_ `{}` Additional parameters to configure mermaid chart generation. ### save\_image ```python def save_image( path: Path | str, graph: Graph[Any, Any, Any], **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. #### Returns [`None`](https://docs.python.org/3/library/constants.html#None) #### Parameters **`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) The path to save the image to. **`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] The graph to generate the image for. **`**kwargs`** : [`Unpack`](https://docs.python.org/3/library/typing.html#typing.Unpack)\[`MermaidConfig`\] _Default:_ `{}` Additional parameters to configure mermaid chart generation.
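As a quick illustration, the `never_42_graph` from the [pydantic_graph](https://pydantic.dev/docs/ai/api/pydantic_graph/graph/) section above could be rendered and saved like this. A minimal sketch using only `MermaidConfig` keys documented above; note that it calls the external mermaid.ink service:

save\_never\_42\_diagram.py

```python
from pathlib import Path

from never_42 import never_42_graph
from pydantic_graph.mermaid import save_image

# Render a left-to-right PNG with a white background and save it locally.
save_image(
    Path('never_42.png'),
    never_42_graph,
    image_type='png',
    direction='LR',
    background_color='!white',
)
```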
### DEFAULT\_HIGHLIGHT\_CSS The default CSS to use for highlighting nodes. **Default:** `'fill:#fdff32'` ### StateDiagramDirection Used to specify the direction of the state diagram generated by mermaid. - `'TB'`: Top to bottom; this is the default for mermaid charts. - `'LR'`: Left to right - `'RL'`: Right to left - `'BT'`: Bottom to top **Default:** `Literal['TB', 'LR', 'RL', 'BT']` ### NodeIdent A type alias for a node identifier. This can be: - A node instance (instance of a subclass of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode)). - A node class (subclass of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode)). - A string representing the node ID. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'type[BaseNode[Any, Any, Any]] | BaseNode[Any, Any, Any] | str'` --- # [pydantic_graph.nodes](https://pydantic.dev/docs/ai/api/pydantic_graph/nodes/) # pydantic\_graph.nodes ### GraphRunContext **Bases:** `Generic[StateT, DepsT]` Context for a graph. #### Attributes ##### state The state of the graph. **Type:** `StateT` ##### deps Dependencies for the graph. **Type:** `DepsT` ### BaseNode **Bases:** `ABC`, `Generic[StateT, DepsT, NodeRunEndT]` Base class for a node. #### Attributes ##### docstring\_notes Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes by overriding the [`get_note`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.get_note) method. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` #### Methods ##### run `@abstractmethod` `@async` ```python def run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT] ``` Run the node. This is an abstract method that must be implemented by subclasses. Return types used at runtime The return type of this method is read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](/docs/ai/api/pydantic_graph/mermaid) and enforced when running the graph. ###### Returns `BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[`NodeRunEndT`\] -- The next node to run or [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) to signal the end of the graph. ###### Parameters **`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\] The graph context. ##### get\_node\_id `@cached` `@classmethod` ```python def get_node_id(cls) -> str ``` Get the ID of the node. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### get\_note `@classmethod` ```python def get_note(cls) -> str | None ``` Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.docstring_notes) is `True`. You can override this method to customise the node notes. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### get\_node\_def `@classmethod` ```python def get_node_def( cls, local_ns: dict[str, Any] | None, ) -> NodeDef[StateT, DepsT, NodeRunEndT] ``` Get the node definition. ###### Returns `NodeDef`\[`StateT`, `DepsT`, `NodeRunEndT`\] ##### deep\_copy ```python def deep_copy() -> Self ``` Returns a deep copy of the node. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ### End **Bases:** `Generic[RunEndT]` Type to return from a node to signal the end of the graph. #### Attributes ##### data Data to return from the graph. **Type:** `RunEndT` #### Methods ##### deep\_copy\_data ```python def deep_copy_data() -> End[RunEndT] ``` Returns a deep copy of the end of the run. ###### Returns `End`\[`RunEndT`\] ### Edge Annotation to apply a label to an edge in a graph.
#### Attributes ##### label Label for the edge. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) ### StateT Type variable for the state in a graph. **Default:** `TypeVar('StateT', default=None)` ### DepsT Type variable for the dependencies of a graph and node. **Default:** `TypeVar('DepsT', default=None, contravariant=True)` ### RunEndT Covariant type variable for the return type of a graph [`run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run). **Default:** `TypeVar('RunEndT', covariant=True, default=None)` ### NodeRunEndT Covariant type variable for the return type of a node [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run). **Default:** `TypeVar('NodeRunEndT', covariant=True, default=Never)` --- # [pydantic_graph.persistence](https://pydantic.dev/docs/ai/api/pydantic_graph/persistence/) # pydantic\_graph.persistence ### NodeSnapshot **Bases:** `Generic[StateT, RunEndT]` History step describing the execution of a node in a graph. #### Attributes ##### state The state of the graph before the node is run. **Type:** `StateT` ##### node The node to run next. **Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[`BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\], `_utils.CustomNodeSchema`()\] ##### start\_ts The timestamp when the node started running, `None` until the run starts. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### duration The duration of the node run in seconds, if the node has been run. **Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### status The status of the snapshot. **Type:** `SnapshotStatus` **Default:** `'created'` ##### kind The kind of history step, can be used as a discriminator when deserializing history. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['node'\] **Default:** `'node'` ##### id Unique ID of the snapshot. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `UNSET_SNAPSHOT_ID` ### EndSnapshot **Bases:** `Generic[StateT, RunEndT]` History step describing the end of a graph run. #### Attributes ##### state The state of the graph at the end of the run. **Type:** `StateT` ##### result The result of the graph run. **Type:** `End`\[`RunEndT`\] ##### ts The timestamp when the graph run ended. **Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=(_utils.now_utc))` ##### kind The kind of history step, can be used as a discriminator when deserializing history. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['end'\] **Default:** `'end'` ##### id Unique ID of the snapshot. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `UNSET_SNAPSHOT_ID` ##### node Shim to get the [`result`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot.result). Useful to allow `[snapshot.node for snapshot in persistence.history]`. **Type:** `End`\[`RunEndT`\] ### BaseStatePersistence **Bases:** `ABC`, `Generic[StateT, RunEndT]` Abstract base class for storing the state of a graph run. 
Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. #### Methods ##### snapshot\_node `@abstractmethod` `@async` ```python def snapshot_node(state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None ``` Snapshot the state of a graph when the next step is to run a node. This method should add a [`NodeSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot) to persistence. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`state`** : `StateT` The state of the graph. **`next_node`** : `BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\] The next node to run. ##### snapshot\_node\_if\_new `@abstractmethod` `@async` ```python def snapshot_node_if_new( snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT], ) -> None ``` Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.snapshot_node) but should do so in an atomic way. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The ID of the snapshot to check. **`state`** : `StateT` The state of the graph. **`next_node`** : `BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\] The next node to run. ##### snapshot\_end `@abstractmethod` `@async` ```python def snapshot_end(state: StateT, end: End[RunEndT]) -> None ``` Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot) to persistence. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`state`** : `StateT` The state of the graph. **`end`** : `End`\[`RunEndT`\] Data from the end of the run. ##### record\_run `@abstractmethod` ```python def record_run(snapshot_id: str) -> AbstractAsyncContextManager[None] ``` Record the run of the node, or error if the node is already running. In particular, this should set: - [`NodeSnapshot.status`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.status) to `'running'` and [`NodeSnapshot.start_ts`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.start_ts) when the run starts. - [`NodeSnapshot.status`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.status) to `'success'` or `'error'` and [`NodeSnapshot.duration`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.duration) when the run finishes. ###### Returns `AbstractAsyncContextManager`\[[`None`](https://docs.python.org/3/library/constants.html#None)\] -- An async context manager that records the run of the node. ###### Parameters **`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) The ID of the snapshot to record. ###### Raises - `GraphNodeRunningError` -- if the node status is not `'created'` or `'pending'`. - `LookupError` -- if the snapshot ID is not found in persistence.
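To illustrate the `record_run` contract, here is a sketch of how a custom in-memory implementation might manage the status transitions described above. The `_snapshots` mapping is a hypothetical internal detail of this sketch, not part of the library's API, and a real implementation would subclass `BaseStatePersistence` and implement the other abstract methods as well:

```python
import time
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from datetime import datetime, timezone


class InMemoryPersistenceSketch:
    """Illustrative fragment of a persistence class; other methods omitted."""

    def __init__(self):
        # Hypothetical store: snapshot ID -> NodeSnapshot
        self._snapshots = {}

    @asynccontextmanager
    async def record_run(self, snapshot_id: str) -> AsyncIterator[None]:
        snapshot = self._snapshots.get(snapshot_id)
        if snapshot is None:
            raise LookupError(f'No snapshot found with ID {snapshot_id!r}')
        # A full implementation would also raise GraphNodeRunningError here
        # unless the status is 'created' or 'pending'.
        snapshot.status = 'running'
        snapshot.start_ts = datetime.now(timezone.utc)
        start = time.monotonic()
        try:
            yield
        except Exception:
            snapshot.status = 'error'
            raise
        else:
            snapshot.status = 'success'
        finally:
            snapshot.duration = time.monotonic() - start
```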
##### load\_next `@abstractmethod` `@async` ```python def load_next() -> NodeSnapshot[StateT, RunEndT] | None ``` Retrieve a node snapshot with status `'created'` and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created'` exists. ###### Returns `NodeSnapshot`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### load\_all `@abstractmethod` `@async` ```python def load_all() -> list[Snapshot[StateT, RunEndT]] ``` Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself; instead, it's provided to make it convenient to get all [snapshots](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.Snapshot) from persistence. Returns: The list of snapshots. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\] ##### set\_graph\_types ```python def set_graph_types(graph: Graph[StateT, Any, RunEndT]) -> None ``` Set the types of the state and run end from a graph. You generally won't need to customise this method; instead, implement [`set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_types) and [`should_set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.should_set_types). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ##### should\_set\_types ```python def should_set_types() -> bool ``` Whether types need to be set. Implementations should override this method to return `True` if types are needed but have not yet been set. ###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### set\_types ```python def set_types(state_type: type[StateT], run_end_type: type[RunEndT]) -> None ``` Set the types of the state and run end. This can be used to create [type adapters](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter) for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.build_snapshot_list_type_adapter). ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`state_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\] The state type. **`run_end_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RunEndT`\] The run end type. ### build\_snapshot\_list\_type\_adapter ```python def build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT], ) -> pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] ``` Build a type adapter for a list of snapshots. This method should be called from within [`set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_types) where context variables will be set such that Pydantic can create a schema for [`NodeSnapshot.node`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.node).
#### Returns [`pydantic.TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\]\] ### SnapshotStatus The status of a snapshot. - `'created'`: The snapshot has been created but not yet run. - `'pending'`: The snapshot has been retrieved with [`load_next`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_next) but not yet run. - `'running'`: The snapshot is currently running. - `'success'`: The snapshot has been run successfully. - `'error'`: The snapshot has been run but an error occurred. **Default:** `Literal['created', 'pending', 'running', 'success', 'error']` ### Snapshot A step in the history of a graph run. [`Graph.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) returns a list of these steps describing the execution of the graph, together with the run return value. **Default:** `NodeSnapshot[StateT, RunEndT] | EndSnapshot[StateT, RunEndT]` In-memory state persistence. This module provides simple in-memory state persistence for graphs. ### SimpleStatePersistence **Bases:** `BaseStatePersistence[StateT, RunEndT]` Simple in-memory state persistence that just holds the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. #### Attributes ##### last\_snapshot The last snapshot. **Type:** `Snapshot`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### FullStatePersistence **Bases:** `BaseStatePersistence[StateT, RunEndT]` In-memory state persistence that holds a list of snapshots. #### Attributes ##### deep\_copy Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True` ##### history List of snapshots taken during the graph run. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\] **Default:** `field(default_factory=(list[Snapshot[StateT, RunEndT]]))` #### Methods ##### dump\_json ```python def dump_json(indent: int | None = None) -> bytes ``` Dump the history to JSON bytes. ###### Returns [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) ##### load\_json ```python def load_json(json_data: str | bytes | bytearray) -> None ``` Load the history from JSON. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ### FileStatePersistence **Bases:** `BaseStatePersistence[StateT, RunEndT]` File-based state persistence that holds graph run state in a JSON file. #### Attributes ##### json\_file Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example, if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FileStatePersistence run_id = 'run_123abc' persistence = FileStatePersistence(Path('runs') / f'{run_id}.json') ``` **Type:** `Path` #### Methods ##### should\_set\_types ```python def should_set_types() -> bool ``` Whether types need to be set.
###### Returns [`bool`](https://docs.python.org/3/library/functions.html#bool) --- # [pydantic_ai.ui.ag_ui](https://pydantic.dev/docs/ai/api/ui/ag_ui/) # pydantic\_ai.ui.ag\_ui AG-UI protocol integration for Pydantic AI agents. ### AGUIEventStream **Bases:** `UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]` UI event stream transformer for the Agent-User Interaction (AG-UI) protocol. #### Methods ##### handle\_event `@async` ```python def handle_event(event: NativeEvent) -> AsyncIterator[BaseEvent] ``` Override to set timestamps on all AG-UI events. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`BaseEvent`\] ### AGUIAdapter **Bases:** `UIAdapter[RunAgentInput, Message, BaseEvent, AgentDepsT, OutputDataT]` UI adapter for the Agent-User Interaction (AG-UI) protocol. #### Attributes ##### ag\_ui\_version AG-UI protocol version controlling behavior thresholds. Accepts any version string (e.g. `'0.1.13'`). Defaults to the version detected from the installed `ag-ui-protocol` package. Known thresholds: - `< 0.1.13`: emits `THINKING_*` events during streaming, drops `ThinkingPart` from `dump_messages` output. - `>= 0.1.13`: emits `REASONING_*` events with encrypted metadata during streaming, and includes `ThinkingPart` as `ReasoningMessage` in `dump_messages` output for full round-trip fidelity of thinking signatures and provider metadata. - `>= 0.1.15`: emits typed multimodal input content (`ImageInputContent`, `AudioInputContent`, `VideoInputContent`, `DocumentInputContent`) instead of generic `BinaryInputContent`. `load_messages` always accepts `ReasoningMessage` and multimodal content types regardless of this setting. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `DEFAULT_AG_UI_VERSION` ##### preserve\_file\_data Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. When `True`, agent-generated files and uploaded files are stored as [activity messages](https://docs.ag-ui.com/concepts/activities) during `dump_messages` and restored during `load_messages`, enabling full round-trip fidelity. When `False` (default), they are silently dropped. If your AG-UI frontend uses activities, be aware that `pydantic_ai_*` activity types are reserved for internal round-trip use and should be ignored by frontend activity handlers. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False` ##### messages Pydantic AI messages from the AG-UI run input. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] ##### toolset Toolset representing frontend tools from the AG-UI run input. **Type:** [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### state Frontend state from the AG-UI run input. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### build\_run\_input `@classmethod` ```python def build_run_input(cls, body: bytes) -> RunAgentInput ``` Build an AG-UI run input object from the request body. 
###### Returns `RunAgentInput` ##### build\_event\_stream ```python def build_event_stream( ) -> UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT] ``` Build an AG-UI event stream transformer. ###### Returns `UIEventStream`\[`RunAgentInput`, `BaseEvent`, `AgentDepsT`, `OutputDataT`\] ##### from\_request `@async` `@classmethod` ```python def from_request( cls, request: Request, agent: AbstractAgent[AgentDepsT, OutputDataT], ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, kwargs: Any = {}, ) -> AGUIAdapter[AgentDepsT, OutputDataT] ``` Extends [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) with AG-UI-specific parameters. ###### Returns `AGUIAdapter`\[`AgentDepsT`, `OutputDataT`\] ##### load\_messages `@classmethod` ```python def load_messages( cls, messages: Sequence[Message], preserve_file_data: bool = False, ) -> list[ModelMessage] ``` Transform AG-UI messages into Pydantic AI messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] ##### dump\_messages `@classmethod` ```python def dump_messages( cls, messages: Sequence[ModelMessage], ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, ) -> list[Message] ``` Transform Pydantic AI messages into AG-UI messages. Note: The round-trip `dump_messages` -> `load_messages` is not fully lossless: - `TextPart.id`, `.provider_name`, `.provider_details` are lost. - `ToolCallPart.id`, `.provider_name`, `.provider_details` are lost. - `BuiltinToolCallPart.id`, `.provider_details` are lost (only `.provider_name` survives via the prefixed tool call ID). - `BuiltinToolReturnPart.provider_details` is lost. - `RetryPromptPart` becomes `ToolReturnPart` (or `UserPromptPart`) on reload. - `CachePoint` and `UploadedFile` content items are dropped (unless `preserve_file_data=True`). - `ThinkingPart` is dropped when `ag_ui_version='0.1.10'`. - `FilePart` is silently dropped unless `preserve_file_data=True`. - `UploadedFile` in a multi-item `UserPromptPart` is split into a separate activity message when `preserve_file_data=True`, which reloads as a separate `UserPromptPart`. - Part ordering within a `ModelResponse` may change when text follows tool calls. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Message`\] -- A list of AG-UI Message objects. ###### Parameters **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] A sequence of ModelMessage objects to convert. **`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION` AG-UI protocol version controlling `ThinkingPart` emission. **`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to include `FilePart` and `UploadedFile` as `ActivityMessage`. ### DEFAULT\_AG\_UI\_VERSION The default AG-UI version, auto-detected from the installed `ag-ui-protocol` package. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `detect_ag_ui_version()` AG-UI protocol integration for Pydantic AI agents. ### AGUIApp **Bases:** `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. 
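As a minimal usage sketch (assuming `AGUIApp` is imported from this `pydantic_ai.ui.ag_ui` module, and reusing the model string from earlier examples), the app can be served by any ASGI server:

```python
from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui import AGUIApp

agent = Agent(model='openai:gpt-5.2')

# Expose the agent over the AG-UI protocol; serve with e.g. `uvicorn my_module:app`.
app = AGUIApp(agent)
```

Construction-time options such as `ag_ui_version` and `preserve_file_data` are documented below.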
#### Methods ##### \_\_init\_\_ ```python def __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], ag_ui_version: str = DEFAULT_AG_UI_VERSION, preserve_file_data: bool = False, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None ``` An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use `AGUIAdapter.run_stream()` or `AGUIAdapter.dispatch_request()` instead. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ###### Parameters **`agent`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] The agent to run. **`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION` AG-UI protocol version controlling thinking/reasoning event format. **`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data). **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent.
**`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools for this run. **`on_complete`** : `OnCompleteFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data. **`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Boolean indicating if debug tracebacks should be returned on errors. **`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of routes to serve incoming HTTP and WebSocket requests. **`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of middleware to run for every request. A Starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. **`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A mapping of either integer status codes, or exception class types onto callables which handle the exceptions.
Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. **`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. **`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. **`lifespan`** : `Lifespan`\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. --- # [pydantic_ai.ui](https://pydantic.dev/docs/ai/api/ui/base/) # pydantic\_ai.ui ### MessagesBuilder Helper class to build Pydantic AI messages from request/response parts. #### Methods ##### add ```python def add(part: ModelRequestPart | ModelResponsePart) -> None ``` Add a new part, creating a new request or response message if necessary. ###### Returns [`None`](https://docs.python.org/3/library/constants.html#None) ### StateHandler **Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol) Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field. #### Attributes ##### state Get the current state of the agent run. **Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) ### UIEventStream **Bases:** `ABC`, `Generic[RunInputT, EventT, AgentDepsT, OutputDataT]` Base class for UI event stream transformers. This class is responsible for transforming Pydantic AI events into protocol-specific events. #### Attributes ##### accept The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### message\_id The message ID to use for the next event. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=(lambda: str(uuid4())))` ##### response\_headers Response headers to return to the frontend. **Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### content\_type Get the content type for the event stream, compatible with the `Accept` header value. 
By default, this returns the Server-Sent Events content type (`text/event-stream`). If a subclass supports other types as well, it should consider `self.accept` in [`encode_event()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.encode_event) and return the resulting content type. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) #### Methods ##### new\_message\_id ```python def new_message_id() -> str ``` Generate and store a new message ID. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### encode\_event `@abstractmethod` ```python def encode_event(event: EventT) -> str ``` Encode a protocol-specific event as a string. ###### Returns [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### encode\_stream `@async` ```python def encode_stream(stream: AsyncIterator[EventT]) -> AsyncIterator[str] ``` Encode a stream of protocol-specific events as strings according to the `Accept` header value. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ##### streaming\_response ```python def streaming_response(stream: AsyncIterator[EventT]) -> StreamingResponse ``` Generate a streaming response from a stream of protocol-specific events. ###### Returns `StreamingResponse` ##### transform\_stream `@async` ```python def transform_stream( stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT] ``` Transform a stream of Pydantic AI events into protocol-specific events. This method dispatches to specific hooks and `handle_*` methods that subclasses can override: - [`before_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_stream) - [`after_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_stream) - [`on_error()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.on_error) - [`before_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_request) - [`after_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_request) - [`before_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_response) - [`after_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_response) - [`handle_event()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_event) ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\] The stream of Pydantic AI events to transform. **`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events. ##### handle\_event `@async` ```python def handle_event(event: NativeEvent) -> AsyncIterator[EventT] ``` Transform a Pydantic AI event into one or more protocol-specific events. 
This method dispatches to specific `handle_*` methods based on event type: - [`PartStartEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartStartEvent) -> [`handle_part_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_part_start) - [`PartDeltaEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartDeltaEvent) -> `handle_part_delta` - [`PartEndEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartEndEvent) -> `handle_part_end` - [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent) -> `handle_final_result` - [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent) -> `handle_function_tool_call` - [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent) -> `handle_function_tool_result` - [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) -> `handle_run_result` Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all events, make sure you call the super method. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### handle\_part\_start `@async` ```python def handle_part_start(event: PartStartEvent) -> AsyncIterator[EventT] ``` Handle a `PartStartEvent`. This method dispatches to specific `handle_*` methods based on part type: - [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -> [`handle_text_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_start) - [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) -> [`handle_thinking_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_start) - [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) -> [`handle_tool_call_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_start) - [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) -> [`handle_builtin_tool_call_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_start) - [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) -> [`handle_builtin_tool_return()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_return) - [`FilePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FilePart) -> [`handle_file()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_file) - [`CompactionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CompactionPart) -> [`handle_compaction()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_compaction) Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all part start events, make sure you call the super method. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`PartStartEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartStartEvent) The part start event. ##### handle\_part\_delta `@async` ```python def handle_part_delta(event: PartDeltaEvent) -> AsyncIterator[EventT] ``` Handle a PartDeltaEvent. 
This method dispatches to specific `handle_*_delta` methods based on part delta type: - [`TextPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPartDelta) -> [`handle_text_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_delta) - [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) -> [`handle_thinking_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_delta) - [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) -> [`handle_tool_call_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_delta) Subclasses are encouraged to override the individual `handle_*_delta` methods rather than this one. If you need specific behavior for all part delta events, make sure you call the super method. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`PartDeltaEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartDeltaEvent) The PartDeltaEvent. ##### handle\_part\_end `@async` ```python def handle_part_end(event: PartEndEvent) -> AsyncIterator[EventT] ``` Handle a `PartEndEvent`. This method dispatches to specific `handle_*_end` methods based on part type: - [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -> [`handle_text_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_end) - [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) -> [`handle_thinking_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_end) - [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) -> [`handle_tool_call_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_end) - [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) -> [`handle_builtin_tool_call_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_end) Subclasses are encouraged to override the individual `handle_*_end` methods rather than this one. If you need specific behavior for all part end events, make sure you call the super method. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`PartEndEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartEndEvent) The part end event. ##### before\_stream `@async` ```python def before_stream() -> AsyncIterator[EventT] ``` Yield events before agent streaming starts. This hook is called before any agent events are processed. Override this to inject custom events at the start of the stream. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### after\_stream `@async` ```python def after_stream() -> AsyncIterator[EventT] ``` Yield events after agent streaming completes. This hook is called after all agent events have been processed. Override this to inject custom events at the end of the stream. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### on\_error `@async` ```python def on_error(error: Exception) -> AsyncIterator[EventT] ``` Handle errors that occur during streaming. 
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`error`** : [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception) The error that occurred during streaming. ##### before\_request `@async` ```python def before_request() -> AsyncIterator[EventT] ``` Yield events before a model request is processed. Override this to inject custom events at the start of the request. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### after\_request `@async` ```python def after_request() -> AsyncIterator[EventT] ``` Yield events after a model request is processed. Override this to inject custom events at the end of the request. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### before\_response `@async` ```python def before_response() -> AsyncIterator[EventT] ``` Yield events before a model response is processed. Override this to inject custom events at the start of the response. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### after\_response `@async` ```python def after_response() -> AsyncIterator[EventT] ``` Yield events after a model response is processed. Override this to inject custom events at the end of the response. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ##### handle\_text\_start `@async` ```python def handle_text_start( part: TextPart, follows_text: bool = False, ) -> AsyncIterator[EventT] ``` Handle the start of a `TextPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) The text part. **`follows_text`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the part is directly preceded by another text part. In this case, you may want to yield a "text-delta" event instead of a "text-start" event. ##### handle\_text\_delta `@async` ```python def handle_text_delta(delta: TextPartDelta) -> AsyncIterator[EventT] ``` Handle a `TextPartDelta`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`delta`** : [`TextPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPartDelta) The text part delta. ##### handle\_text\_end `@async` ```python def handle_text_end( part: TextPart, followed_by_text: bool = False, ) -> AsyncIterator[EventT] ``` Handle the end of a `TextPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) The text part. **`followed_by_text`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the part is directly followed by another text part. In this case, you may not want to yield a "text-end" event yet. ##### handle\_thinking\_start `@async` ```python def handle_thinking_start( part: ThinkingPart, follows_thinking: bool = False, ) -> AsyncIterator[EventT] ``` Handle the start of a `ThinkingPart`. 
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) The thinking part. **`follows_thinking`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the part is directly preceded by another thinking part. In this case, you may want to yield a "thinking-delta" event instead of a "thinking-start" event. ##### handle\_thinking\_delta `@async` ```python def handle_thinking_delta(delta: ThinkingPartDelta) -> AsyncIterator[EventT] ``` Handle a `ThinkingPartDelta`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`delta`** : [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) The thinking part delta. ##### handle\_thinking\_end `@async` ```python def handle_thinking_end( part: ThinkingPart, followed_by_thinking: bool = False, ) -> AsyncIterator[EventT] ``` Handle the end of a `ThinkingPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) The thinking part. **`followed_by_thinking`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False` Whether the part is directly followed by another thinking part. In this case, you may not want to yield a "thinking-end" event yet. ##### handle\_tool\_call\_start `@async` ```python def handle_tool_call_start(part: ToolCallPart) -> AsyncIterator[EventT] ``` Handle the start of a `ToolCallPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) The tool call part. ##### handle\_tool\_call\_delta `@async` ```python def handle_tool_call_delta(delta: ToolCallPartDelta) -> AsyncIterator[EventT] ``` Handle a `ToolCallPartDelta`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`delta`** : [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) The tool call part delta. ##### handle\_tool\_call\_end `@async` ```python def handle_tool_call_end(part: ToolCallPart) -> AsyncIterator[EventT] ``` Handle the end of a `ToolCallPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) The tool call part. ##### handle\_builtin\_tool\_call\_start `@async` ```python def handle_builtin_tool_call_start(part: BuiltinToolCallPart) -> AsyncIterator[EventT] ``` Handle a `BuiltinToolCallPart` at start. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) The builtin tool call part. ##### handle\_builtin\_tool\_call\_end `@async` ```python def handle_builtin_tool_call_end(part: BuiltinToolCallPart) -> AsyncIterator[EventT] ``` Handle the end of a `BuiltinToolCallPart`. 
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) The builtin tool call part. ##### handle\_builtin\_tool\_return `@async` ```python def handle_builtin_tool_return(part: BuiltinToolReturnPart) -> AsyncIterator[EventT] ``` Handle a `BuiltinToolReturnPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) The builtin tool return part. ##### handle\_file `@async` ```python def handle_file(part: FilePart) -> AsyncIterator[EventT] ``` Handle a `FilePart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`FilePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FilePart) The file part. ##### handle\_compaction `@async` ```python def handle_compaction(part: CompactionPart) -> AsyncIterator[EventT] ``` Handle a `CompactionPart`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`part`** : [`CompactionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CompactionPart) The compaction part. ##### handle\_final\_result `@async` ```python def handle_final_result(event: FinalResultEvent) -> AsyncIterator[EventT] ``` Handle a `FinalResultEvent`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent) The final result event. ##### handle\_function\_tool\_call `@async` ```python def handle_function_tool_call(event: FunctionToolCallEvent) -> AsyncIterator[EventT] ``` Handle a `FunctionToolCallEvent`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent) The function tool call event. ##### handle\_function\_tool\_result `@async` ```python def handle_function_tool_result(event: FunctionToolResultEvent) -> AsyncIterator[EventT] ``` Handle a `FunctionToolResultEvent`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent) The function tool result event. ##### handle\_run\_result `@async` ```python def handle_run_result(event: AgentRunResultEvent) -> AsyncIterator[EventT] ``` Handle an `AgentRunResultEvent`. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`event`** : [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) The agent run result event. ### StateDeps **Bases:** `Generic[StateT]` Dependency type that holds state. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. 
The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. ### UIAdapter **Bases:** `ABC`, `Generic[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]` Base class for UI adapters. This class is responsible for transforming agent run input received from the frontend into arguments for `Agent.run_stream_events()`, running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) subclass. #### Attributes ##### agent The Pydantic AI agent to run. **Type:** `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\] ##### run\_input The protocol-specific run input object. **Type:** `RunInputT` ##### accept The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### messages Pydantic AI messages from the protocol-specific run input. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] ##### toolset Toolset representing frontend tools from the protocol-specific run input. **Type:** [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### state Frontend state from the protocol-specific run input. **Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) ##### deferred\_tool\_results Deferred tool results extracted from the request, used for tool approval workflows. **Type:** [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) #### Methods ##### from\_request `@async` `@classmethod` ```python def from_request( cls, request: Request, agent: AbstractAgent[AgentDepsT, OutputDataT], kwargs: Any = {}, ) -> Self ``` Create an adapter from a request. Extra keyword arguments are forwarded to the adapter constructor, allowing subclasses to accept additional adapter-specific parameters. ###### Returns [`Self`](https://docs.python.org/3/library/typing.html#typing.Self) ##### build\_run\_input `@abstractmethod` `@classmethod` ```python def build_run_input(cls, body: bytes) -> RunInputT ``` Build a protocol-specific run input object from the request body. ###### Returns `RunInputT` ##### load\_messages `@abstractmethod` `@classmethod` ```python def load_messages(cls, messages: Sequence[MessageT]) -> list[ModelMessage] ``` Transform protocol-specific messages into Pydantic AI messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] ##### dump\_messages `@classmethod` ```python def dump_messages(cls, messages: Sequence[ModelMessage]) -> list[MessageT] ``` Transform Pydantic AI messages into protocol-specific messages. 
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`MessageT`\] ##### build\_event\_stream `@abstractmethod` ```python def build_event_stream() -> UIEventStream[RunInputT, EventT, AgentDepsT, OutputDataT] ``` Build a protocol-specific event stream transformer. ###### Returns `UIEventStream`\[`RunInputT`, `EventT`, `AgentDepsT`, `OutputDataT`\] ##### transform\_stream ```python def transform_stream( stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT] ``` Transform a stream of Pydantic AI events into protocol-specific events. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\] The stream of Pydantic AI events to transform. **`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events. ##### encode\_stream ```python def encode_stream(stream: AsyncIterator[EventT]) -> AsyncIterator[str] ``` Encode a stream of protocol-specific events as strings according to the `Accept` header value. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] ###### Parameters **`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] The stream of protocol-specific events to encode. ##### streaming\_response ```python def streaming_response(stream: AsyncIterator[EventT]) -> StreamingResponse ``` Generate a streaming response from a stream of protocol-specific events. ###### Returns `StreamingResponse` ###### Parameters **`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] The stream of protocol-specific events to encode. ##### run\_stream\_native ```python def run_stream_native( output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[NativeEvent] ``` Run the agent with the protocol-specific run input and stream Pydantic AI events. 
###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\] ###### Parameters **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools to use for this run.
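Putting `run_stream_native` together with `transform_stream` and `streaming_response`, a request handler might look like the following sketch. The FastAPI wiring and route path are illustrative, and `AGUIAdapter` stands in for any concrete `UIAdapter` subclass:

```python
from fastapi import FastAPI, Request

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent(model='openai:gpt-5.2')
app = FastAPI()


@app.post('/agent')
async def run_agent(request: Request):
    # Build the protocol-specific adapter from the request body.
    adapter = await AGUIAdapter.from_request(request, agent=agent)
    # Stream native Pydantic AI events, transform them into protocol-specific
    # events, and encode them per the request's `Accept` header.
    events = adapter.transform_stream(adapter.run_stream_native())
    return adapter.streaming_response(events)
```

The `run_stream` method below combines the native run and the transformation in a single call, and `dispatch_request` wraps this entire flow, including building the adapter from the request.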
##### run\_stream ```python def run_stream( output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT] ``` Run the agent with the protocol-specific run input and stream protocol-specific events. ###### Returns [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\] ###### Parameters **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far. **`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `AgentDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.
**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools to use for this run. **`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events. ##### dispatch\_request `@async` `@classmethod` ```python def dispatch_request( cls, request: Request, agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[DispatchDepsT] = None, deps: DispatchDepsT = None, output_type: OutputSpec[Any] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[DispatchDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, **kwargs: Any, ) -> Response ``` Handle a protocol-specific HTTP request by running the agent and returning a streaming response of protocol-specific events. Extra keyword arguments are forwarded to [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request), allowing subclasses to accept additional adapter-specific parameters. ###### Returns `Response` -- A streaming Starlette response with protocol-specific events encoded per the request's `Accept` header value. ###### Parameters **`request`** : `Request` The incoming Starlette/FastAPI request. **`agent`** : `AbstractAgent`\[`DispatchDepsT`, `DispatchOutputDataT`\] The agent to run. **`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Custom output type to use for this run; `output_type` may only be used if the agent has no output validators, since output validators would expect an argument that matches the agent's output type. **`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` History of the conversation so far.
**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional results for deferred tool calls in the message history. **`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional model to use for this run, required if `model` was not set when creating the agent. **`instructions`** : `_instructions.AgentInstructions`\[`DispatchDepsT`\] _Default:_ `None` Optional additional instructions to use for this run. **`deps`** : `DispatchDepsT` _Default:_ `None` Optional dependencies to use for this run. **`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional settings to use for this model's request. **`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional limits on model request count or token usage. **`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional usage to start with, useful for resuming a conversation or agents used in tools. **`metadata`** : `AgentMetadata`\[`DispatchDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata. **`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True` Whether to try to infer the agent name from the call frame if it's not set. **`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`DispatchDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional toolsets for this run. **`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional additional builtin tools to use for this run. **`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events. **`**kwargs`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) _Default:_ `{}` Additional keyword arguments forwarded to [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request). ### SSE\_CONTENT\_TYPE Content type header value for Server-Sent Events (SSE). **Default:** `'text/event-stream'` ### NativeEvent Type alias for the native event type, which is either an `AgentStreamEvent` or an `AgentRunResultEvent`. 
**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `AgentStreamEvent | AgentRunResultEvent[Any]` ### OnCompleteFunc Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events. **Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]]` --- # [pydantic_ai.ui.vercel_ai](https://pydantic.dev/docs/ai/api/ui/vercel_ai/) # pydantic\_ai.ui.vercel\_ai Vercel AI protocol adapter for Pydantic AI agents. This module provides classes for integrating Pydantic AI agents with the Vercel AI protocol, enabling streaming event-based communication for interactive AI applications. Converted to Python from: [https://github.com/vercel/ai/blob/ai%405.0.34/packages/ai/src/ui/ui-messages.ts](https://github.com/vercel/ai/blob/ai%405.0.34/packages/ai/src/ui/ui-messages.ts) ### VercelAIEventStream **Bases:** `UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]` UI event stream transformer for the Vercel AI protocol. #### Attributes ##### sdk\_version Vercel AI SDK version to target. Setting to 6 enables tool approval streaming. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[5, 6\] **Default:** `5` ##### server\_message\_id Optional server-generated message ID to include in the `StartChunk`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### VercelAIAdapter **Bases:** `UIAdapter[RequestData, UIMessage, BaseChunk, AgentDepsT, OutputDataT]` UI adapter for the Vercel AI protocol. #### Attributes ##### sdk\_version Vercel AI SDK version to target. Default is 5 for backwards compatibility. Setting `sdk_version=6` enables tool approval streaming for human-in-the-loop workflows. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[5, 6\] **Default:** `5` ##### server\_message\_id Optional server-generated message ID to include in the `StartChunk`. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### deferred\_tool\_results Extract deferred tool results from Vercel AI messages with approval responses. **Type:** [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) ##### messages Pydantic AI messages from the Vercel AI run input. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] #### Methods ##### build\_run\_input `@classmethod` ```python def build_run_input(cls, body: bytes) -> RequestData ``` Build a Vercel AI run input object from the request body. 
###### Returns `RequestData` ##### from\_request `@async` `@classmethod` ```python def from_request( cls, request: Request, agent: AbstractAgent[AgentDepsT, OutputDataT], sdk_version: Literal[5, 6] = 5, server_message_id: str | None = None, **kwargs: Any, ) -> VercelAIAdapter[AgentDepsT, OutputDataT] ``` Extends [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) with the `sdk_version` and `server_message_id` parameters. ###### Returns `VercelAIAdapter`\[`AgentDepsT`, `OutputDataT`\] ##### dispatch\_request `@async` `@classmethod` ```python def dispatch_request( cls, request: Request, agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT], sdk_version: Literal[5, 6] = 5, server_message_id: str | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: _instructions.AgentInstructions[DispatchDepsT] = None, deps: DispatchDepsT = None, output_type: OutputSpec[Any] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[DispatchDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[BaseChunk] | None = None, **kwargs: Any, ) -> Response ``` Extends [`dispatch_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) with the `sdk_version` and `server_message_id` parameters. ###### Returns `Response` ##### build\_event\_stream ```python def build_event_stream() -> UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT] ``` Build a Vercel AI event stream transformer. ###### Returns `UIEventStream`\[`RequestData`, `BaseChunk`, `AgentDepsT`, `OutputDataT`\] ##### load\_messages `@classmethod` ```python def load_messages(cls, messages: Sequence[UIMessage]) -> list[ModelMessage] ``` Transform Vercel AI messages into Pydantic AI messages. ###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] ##### dump\_messages `@classmethod` ```python def dump_messages( cls, messages: Sequence[ModelMessage], generate_message_id: Callable[[ModelRequest | ModelResponse, Literal['system', 'user', 'assistant'], int], str] | None = None, ) -> list[UIMessage] ``` Transform Pydantic AI messages into Vercel AI messages.
###### Returns [`list`](https://docs.python.org/3/glossary.html#term-list)\[`UIMessage`\] -- A list of UIMessage objects in Vercel AI format ###### Parameters **`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] A sequence of ModelMessage objects to convert **`generate_message_id`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) | [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse), [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system', 'user', 'assistant'\], [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None` Optional custom function to generate message IDs. If provided, it receives the message, the role ('system', 'user', or 'assistant'), and the message index (incremented per UIMessage appended), and should return a unique string ID. If not provided, uses `provider_response_id` for responses, run\_id-based IDs for messages with run\_id, or a deterministic UUID5 fallback. Vercel AI request types (UI messages). Converted to Python from: [https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui/ui-messages.ts](https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui/ui-messages.ts) Tool approval types (`ToolApprovalRequested`, `ToolApprovalResponded`) require AI SDK v6 or later. ### BaseUIPart **Bases:** `CamelBaseModel`, `ABC` Abstract base class for all UI parts. ### TextUIPart **Bases:** `BaseUIPart` A text part of a message. #### Attributes ##### text The text content. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### state The state of the text part. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['streaming', 'done'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_metadata The provider metadata. **Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### ReasoningUIPart **Bases:** `BaseUIPart` A reasoning part of a message. #### Attributes ##### text The reasoning text. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### state The state of the reasoning part. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['streaming', 'done'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### provider\_metadata The provider metadata. **Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### SourceUrlUIPart **Bases:** `BaseUIPart` A source part of a message. ### SourceDocumentUIPart **Bases:** `BaseUIPart` A document source part of a message. ### FileUIPart **Bases:** `BaseUIPart` A file part of a message. #### Attributes ##### media\_type IANA media type of the file. @see [https://www.iana.org/assignments/media-types/media-types.xhtml](https://www.iana.org/assignments/media-types/media-types.xhtml) **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### filename Optional filename of the file. 
**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### url The URL of the file. It can either be a URL to a hosted file or a [Data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs). **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### provider\_metadata The provider metadata. **Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### StepStartUIPart **Bases:** `BaseUIPart` A step boundary part of a message. ### DataUIPart **Bases:** `BaseUIPart` Data part with dynamic type based on data name. ### ToolApprovalRequested **Bases:** `CamelBaseModel` Tool approval in requested state (awaiting user response). #### Attributes ##### id The approval request ID. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ### ToolApprovalResponded **Bases:** `CamelBaseModel` Tool approval in responded state (user has approved or denied). #### Attributes ##### id The approval request ID. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### approved Whether the user approved the tool call. **Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) ##### reason Optional reason for the approval or denial. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ### ToolInputStreamingPart **Bases:** `BaseUIPart` Tool part in input-streaming state. ### ToolInputAvailablePart **Bases:** `BaseUIPart` Tool part in input-available state. ### ToolOutputAvailablePart **Bases:** `BaseUIPart` Tool part in output-available state. ### ToolOutputErrorPart **Bases:** `BaseUIPart` Tool part in output-error state. ### ToolApprovalRequestedPart **Bases:** `BaseUIPart` Tool part in approval-requested state (awaiting user decision). ### ToolApprovalRespondedPart **Bases:** `BaseUIPart` Tool part in approval-responded state (user approved/denied, execution pending). ### ToolOutputDeniedPart **Bases:** `BaseUIPart` Tool part in output-denied state (tool was denied, terminal state). ### DynamicToolInputStreamingPart **Bases:** `BaseUIPart` Dynamic tool part in input-streaming state. ### DynamicToolInputAvailablePart **Bases:** `BaseUIPart` Dynamic tool part in input-available state. ### DynamicToolOutputAvailablePart **Bases:** `BaseUIPart` Dynamic tool part in output-available state. ### DynamicToolOutputErrorPart **Bases:** `BaseUIPart` Dynamic tool part in output-error state. ### DynamicToolApprovalRequestedPart **Bases:** `BaseUIPart` Dynamic tool part in approval-requested state (awaiting user decision). ### DynamicToolApprovalRespondedPart **Bases:** `BaseUIPart` Dynamic tool part in approval-responded state (user approved/denied, execution pending). ### DynamicToolOutputDeniedPart **Bases:** `BaseUIPart` Dynamic tool part in output-denied state (tool was denied, terminal state). ### UIMessage **Bases:** `CamelBaseModel` A message as displayed in the UI by Vercel AI Elements. #### Attributes ##### id A unique identifier for the message. **Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) ##### role The role of the message. **Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system', 'user', 'assistant'\] ##### metadata The metadata of the message. 
**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None` ##### parts The parts of the message. Use this for rendering the message in the UI. System messages should be avoided (set the system prompt on the server instead). They can have text parts. User messages can have text parts and file parts. Assistant messages can have text, reasoning, tool invocation, and file parts. **Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`UIMessagePart`\] ### SubmitMessage **Bases:** `CamelBaseModel` Submit message request. ### RegenerateMessage **Bases:** `CamelBaseModel` Ask the agent to regenerate a message. ### ProviderMetadata Provider metadata. **Default:** `dict[str, dict[str, JSONValue]]` ### ToolApproval Union of tool approval states. **Default:** `ToolApprovalRequested | ToolApprovalResponded` ### ToolUIPart Union of all tool part types. **Default:** `ToolInputStreamingPart | ToolInputAvailablePart | ToolOutputAvailablePart | ToolOutputErrorPart | ToolApprovalRequestedPart | ToolApprovalRespondedPart | ToolOutputDeniedPart` ### DynamicToolUIPart Union of all dynamic tool part types. **Default:** `DynamicToolInputStreamingPart | DynamicToolInputAvailablePart | DynamicToolOutputAvailablePart | DynamicToolOutputErrorPart | DynamicToolApprovalRequestedPart | DynamicToolApprovalRespondedPart | DynamicToolOutputDeniedPart` ### UIMessagePart Union of all message part types. **Default:** `TextUIPart | ReasoningUIPart | ToolUIPart | DynamicToolUIPart | SourceUrlUIPart | SourceDocumentUIPart | FileUIPart | DataUIPart | StepStartUIPart` ### RequestData Union of all request data types. **Default:** `Annotated[SubmitMessage | RegenerateMessage, Discriminator('trigger')]` Vercel AI response types (SSE chunks). Converted to Python from: [https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui-message-stream/ui-message-chunks.ts](https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui-message-stream/ui-message-chunks.ts) Tool approval types (`ToolApprovalRequestChunk`, `ToolOutputDeniedChunk`) require AI SDK UI v6 or later. ### BaseChunk **Bases:** `CamelBaseModel`, `ABC` Abstract base class for response SSE events. ### TextStartChunk **Bases:** `BaseChunk` Text start chunk. ### TextDeltaChunk **Bases:** `BaseChunk` Text delta chunk. ### TextEndChunk **Bases:** `BaseChunk` Text end chunk. ### ReasoningStartChunk **Bases:** `BaseChunk` Reasoning start chunk. ### ReasoningDeltaChunk **Bases:** `BaseChunk` Reasoning delta chunk. ### ReasoningEndChunk **Bases:** `BaseChunk` Reasoning end chunk. ### ErrorChunk **Bases:** `BaseChunk` Error chunk. ### ToolInputStartChunk **Bases:** `BaseChunk` Tool input start chunk. ### ToolInputDeltaChunk **Bases:** `BaseChunk` Tool input delta chunk. ### ToolOutputAvailableChunk **Bases:** `BaseChunk` Tool output available chunk. ### ToolInputAvailableChunk **Bases:** `BaseChunk` Tool input available chunk. ### ToolInputErrorChunk **Bases:** `BaseChunk` Tool input error chunk. ### ToolOutputErrorChunk **Bases:** `BaseChunk` Tool output error chunk. ### ToolApprovalRequestChunk **Bases:** `BaseChunk` Tool approval request chunk for human-in-the-loop approval. Requires AI SDK UI v6 or later. ### ToolOutputDeniedChunk **Bases:** `BaseChunk` Tool output denied chunk when user denies tool execution. Requires AI SDK UI v6 or later. ### SourceUrlChunk **Bases:** `BaseChunk` Source URL chunk. 
### SourceDocumentChunk **Bases:** `BaseChunk` Source document chunk. ### FileChunk **Bases:** `BaseChunk` File chunk. ### DataChunk **Bases:** `BaseChunk` Data chunk with dynamic type. ### StartStepChunk **Bases:** `BaseChunk` Start step chunk. ### FinishStepChunk **Bases:** `BaseChunk` Finish step chunk. ### StartChunk **Bases:** `BaseChunk` Start chunk. ### FinishChunk **Bases:** `BaseChunk` Finish chunk. ### AbortChunk **Bases:** `BaseChunk` Abort chunk. ### MessageMetadataChunk **Bases:** `BaseChunk` Message metadata chunk. ### DoneChunk **Bases:** `BaseChunk` Done chunk. ### ProviderMetadata Provider metadata. **Default:** `dict[str, dict[str, JSONValue]]` ### FinishReason Reason why the model finished generating. **Default:** `Literal['stop', 'length', 'content-filter', 'tool-calls', 'error', 'other'] | None` --- # [Agents](https://pydantic.dev/docs/ai/core-concepts/agent/) # Agents ## Introduction Agents are Pydantic AI's primary interface for interacting with LLMs. In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows. The [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class has full API documentation, but conceptually you can think of an agent as a container for:

| Component | Description |
| --- | --- |
| [Instructions](#instructions) | A set of instructions for the LLM written by the developer. |
| [Function tool(s)](/docs/ai/tools-toolsets/tools) and [toolsets](/docs/ai/tools-toolsets/toolsets) | Functions that the LLM may call to get information while generating a response. |
| [Structured output type](/docs/ai/core-concepts/output) | The structured datatype the LLM must return at the end of a run, if specified. |
| [Dependency type constraint](/docs/ai/core-concepts/dependencies) | Dynamic instructions functions, tools, and output functions may all use dependencies when they're run. |
| [LLM model](/docs/ai/models/base) | Optional default LLM model associated with the agent. Can also be specified when running the agent. |
| [Model Settings](#additional-configuration) | Optional default model settings to help fine-tune requests. Can also be specified when running the agent. |
| [Capabilities](/docs/ai/core-concepts/capabilities) | Reusable bundles of tools, hooks, instructions, and model settings that extend agent behavior. |

While each of these can be configured individually, [capabilities](/docs/ai/core-concepts/capabilities) let you bundle related behavior into reusable units that are easier to compose, share, and [load from configuration files](/docs/ai/core-concepts/agent-spec). In typing terms, agents are generic in their dependency and output types, e.g., an agent which required dependencies of type `Foobar` and produced outputs of type `list[str]` would have type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this; it should just mean your IDE can tell you when you have the right type, and if you choose to use [static type checking](#static-type-checking) it should work well with Pydantic AI. Here's a toy example of an agent that simulates a roulette wheel: roulette\_wheel.py ```python from pydantic_ai import Agent, RunContext roulette_agent = Agent( # (1) 'openai:gpt-5.2', deps_type=int, output_type=bool, system_prompt=( 'Use the `roulette_wheel` function to see if the ' 'customer has won based on the number they provide.'
), ) @roulette_agent.tool async def roulette_wheel(ctx: RunContext[int], square: int) -> str: # (2) """Check if the square is a winner.""" return 'winner' if square == ctx.deps else 'loser' # Run the agent success_number = 18 # (3) result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number) print(result.output) # (4) #> True result = roulette_agent.run_sync('I bet five is the winner', deps=success_number) print(result.output) #> False ```

1. Create an agent, which expects an integer dependency and produces a boolean output. This agent will have type `Agent[int, bool]`.
2. Define a tool that checks if the square is a winner. Here [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is parameterized with the dependency type `int`; if you got the dependency type wrong you'd get a typing error.
3. In reality, you might want to use a random number here, e.g. `random.randint(0, 36)`.
4. `result.output` will be a boolean indicating if the square is a winner. Pydantic performs the output validation, and it'll be typed as a `bool` since its type is derived from the `output_type` generic parameter of the agent.

Agents are designed for reuse, like FastAPI Apps You can instantiate one agent and use it globally throughout your application, as you would a small [FastAPI](https://fastapi.tiangolo.com/reference/fastapi/#fastapi.FastAPI) app or an [APIRouter](https://fastapi.tiangolo.com/reference/apirouter/#fastapi.APIRouter), or dynamically create as many agents as you want. Both are valid and supported ways to use agents. ## Running Agents There are five ways to run an agent: 1. [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) -- an async function which returns a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) containing a completed response. 2. [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync) -- a plain, synchronous function which returns a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) containing a completed response (internally, this just calls `loop.run_until_complete(self.run())`). 3. [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) -- an async context manager which returns a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), which contains methods to stream text and structured output as an async iterable. [`agent.run_stream_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_sync) is a synchronous variation that returns a [`StreamedRunResultSync`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync) with synchronous versions of the same methods. 4. [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) -- a function which returns an async iterable of [`AgentStreamEvent`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) and an [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) containing the final run result. 5. [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) -- a context manager which returns an [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), an async iterable over the nodes of the agent's underlying [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph).
Here's a simple example demonstrating the first four: run\_agent.py ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. async def main(): result = await agent.run('What is the capital of France?') print(result.output) #> The capital of France is Paris. async with agent.run_stream('What is the capital of the UK?') as response: async for text in response.stream_text(): print(text) #> The capital of #> The capital of the UK is #> The capital of the UK is London. events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of Mexico?'): events.append(event) print(events) """ [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='Mexico is Mexico ')), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='City.')), PartEndEvent( index=0, part=TextPart(content='The capital of Mexico is Mexico City.') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of Mexico is Mexico City.') ), ] """ ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ You can also pass messages from previous runs to continue a conversation or provide context, as described in [Messages and Chat History](/docs/ai/core-concepts/message-history). ### Streaming Events and Final Output As shown in the example above, [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) makes it easy to stream the agent's final output as it comes in. It also takes an optional `event_stream_handler` argument that you can use to gain insight into what is happening during the run before the final output is produced. The example below shows how to stream events and text output. You can also [stream structured output](/docs/ai/core-concepts/output#streaming-structured-output). Note The `run_stream()` and `run_stream_sync()` methods will consider the first output that matches the [output type](/docs/ai/core-concepts/output#structured-output) (which could be text, an [output tool](/docs/ai/core-concepts/output#tool-output) call, or a [deferred](/docs/ai/tools-toolsets/deferred-tools) tool call) to be the final output of the agent run, even when the model generates (additional) tool calls after this "final" output. These "dangling" tool calls will not be executed unless the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy) is set to `'exhaustive'`, and even then their results will not be sent back to the model as the agent run will already be considered completed. In short, if the model returns both tool calls and text, and the agent's output type is `str`, **the tool calls will not run** in streaming mode with the default setting. If you want to always keep running the agent when it performs tool calls, and stream all events from the model's streaming response and the agent's execution of tools, use [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead, as described in the following sections. 
run\_stream\_event\_stream\_handler.py ```python import asyncio from collections.abc import AsyncIterable from datetime import date from pydantic_ai import ( Agent, AgentStreamEvent, FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, RunContext, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) weather_agent = Agent( 'openai:gpt-5.2', system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext, location: str, forecast_date: date, ) -> str: return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' output_messages: list[str] = [] async def handle_event(event: AgentStreamEvent): if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append(f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ThinkingPartDelta): output_messages.append(f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ToolCallPartDelta): output_messages.append(f'[Request] Part {event.index} args delta: {event.delta.args_delta}') elif isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append(f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}') elif isinstance(event, FinalResultEvent): output_messages.append(f'[Result] The model started producing a final result (tool_name={event.tool_name})') async def event_stream_handler( ctx: RunContext, event_stream: AsyncIterable[AgentStreamEvent], ): async for event in event_stream: await handle_event(event) async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' async with weather_agent.run_stream(user_prompt, event_stream_handler=event_stream_handler) as run: async for output in run.stream_text(): output_messages.append(f'[Output] {output}') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model started producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` _(This example is complete, it can be run "as is")_ ### Streaming All Events Like `agent.run_stream()`, [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) takes an optional `event_stream_handler` argument that lets you stream all events from the model's streaming response and the agent's execution of tools.
Unlike `run_stream()`, it always runs the agent graph to completion, even if the model produced text that looked like it could have been the final output ahead of tool calls. For convenience, an [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) method is also available as a wrapper around `run(event_stream_handler=...)`, which returns an async iterable of [`AgentStreamEvent`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) and an [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) containing the final run result. Note As they return raw events as they come in, the `run_stream_events()` and `run(event_stream_handler=...)` methods require you to piece together the streamed text and structured output yourself from the `PartStartEvent` and subsequent `PartDeltaEvent`s. To get the best of both worlds, at the expense of some additional complexity, you can use [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) as described in the next section, which lets you [iterate over the agent graph](#iterating-over-an-agents-graph) and [stream both events and output](#streaming-all-events-and-output) at every step. run\_events.py ```python import asyncio from pydantic_ai import AgentRunResultEvent from run_stream_event_stream_handler import handle_event, output_messages, weather_agent async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' async for event in weather_agent.run_stream_events(user_prompt): if isinstance(event, AgentRunResultEvent): output_messages.append(f'[Final Output] {event.result.output}') else: await handle_event(event) if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model started producing a final result (tool_name=None)', "[Request] Part 0 text delta: 'warm and sunny '", "[Request] Part 0 text delta: 'in Paris on '", "[Request] Part 0 text delta: 'Tuesday.'", '[Final Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` _(This example is complete, it can be run "as is")_ ### Iterating Over an Agent's Graph Under the hood, each `Agent` in Pydantic AI uses **pydantic-graph** to manage its execution flow. **pydantic-graph** is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on Pydantic AI -- you can use it standalone for workflows that have nothing to do with GenAI -- but Pydantic AI makes use of it to orchestrate the handling of model requests and model responses in an agent's run. In many scenarios, you don't need to worry about pydantic-graph at all; calling `agent.run(...)` simply traverses the underlying graph from start to finish.
However, if you need deeper insight or control -- for example to inject your own logic at specific stages -- Pydantic AI exposes the lower-level iteration process via [`Agent.iter`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). This method returns an [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), which you can async-iterate over, or manually drive node-by-node via the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method. Once the agent's graph returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End), you have the final result along with a detailed history of all steps. #### `async for` iteration Here's an example of using `async for` with `iter` to record each node the agent executes: agent\_iter\_async\_for.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: # Each node represents a step in the agent's execution nodes.append(node) print(nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ print(agent_run.result.output) #> The capital of France is Paris. ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ - The `AgentRun` is an async iterator that yields each node (`BaseNode` or `End`) in the flow. - The run ends when an `End` node is returned. #### Using `.next(...)` manually You can also drive the iteration manually by passing the node you want to run next to the `AgentRun.next(...)` method. 
This allows you to inspect or modify the node before it executes or skip nodes based on your own logic, and to catch errors in `next()` more easily: agent\_iter\_next.py ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: node = agent_run.next_node # (1) all_nodes = [node] # Drive the iteration manually: while not isinstance(node, End): # (2) node = await agent_run.next(node) # (3) all_nodes.append(node) # (4) print(all_nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ ``` We start by grabbing the first node that will be run in the agent's graph. The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`. When you call `await agent_run.next(node)`, it executes that node in the agent's graph, updates the run's history, and returns the _next_ node to run. You could also inspect or mutate the new `node` here as needed. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ #### Accessing usage and final output You can retrieve usage statistics (tokens, requests, etc.) at any time from the [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun) object via `agent_run.usage()`. This method returns a [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) object containing the usage data. Once the run finishes, `agent_run.result` becomes an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) object containing the final output (and related metadata). #### Streaming All Events and Output Here is an example of streaming an agent run in combination with `async for` iteration: streaming\_iter.py ```python import asyncio from dataclasses import dataclass from datetime import date from pydantic_ai import ( Agent, FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, RunContext, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) @dataclass class WeatherService: async def get_forecast(self, location: str, forecast_date: date) -> str: # In real code: call weather API, DB queries, etc. return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' async def get_historic_weather(self, location: str, forecast_date: date) -> str: # In real code: call a historical weather API or DB return f'The weather in {location} on {forecast_date} was 18°C and partly cloudy.' 
weather_agent = Agent[WeatherService, str]( 'openai:gpt-5.2', deps_type=WeatherService, output_type=str, # We'll produce a final answer as plain text system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date, ) -> str: if forecast_date >= date.today(): return await ctx.deps.get_forecast(location, forecast_date) else: return await ctx.deps.get_historic_weather(location, forecast_date) output_messages: list[str] = [] async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' # Begin a node-by-node, streaming iteration async with weather_agent.iter(user_prompt, deps=WeatherService()) as run: async for node in run: if Agent.is_user_prompt_node(node): # A user prompt node => The user has provided input output_messages.append(f'=== UserPromptNode: {node.user_prompt} ===') elif Agent.is_model_request_node(node): # A model request node => We can stream tokens from the model's request output_messages.append('=== ModelRequestNode: streaming partial request tokens ===') async with node.stream(run.ctx) as request_stream: final_result_found = False async for event in request_stream: if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append( f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ThinkingPartDelta): output_messages.append( f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ToolCallPartDelta): output_messages.append( f'[Request] Part {event.index} args delta: {event.delta.args_delta}' ) elif isinstance(event, FinalResultEvent): output_messages.append( f'[Result] The model started producing a final result (tool_name={event.tool_name})' ) final_result_found = True break if final_result_found: # Once the final result is found, we can call `AgentStream.stream_text()` to stream the text. # A similar `AgentStream.stream_output()` method is available to stream structured output. async for output in request_stream.stream_text(): output_messages.append(f'[Output] {output}') elif Agent.is_call_tools_node(node): # A handle-response node => The model returned some data, potentially calls a tool output_messages.append('=== CallToolsNode: streaming partial response & tool usage ===') async with node.stream(run.ctx) as handle_stream: async for event in handle_stream: if isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append( f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}' ) elif Agent.is_end_node(node): # Once an End node is reached, the agent run is complete assert run.result is not None assert run.result.output == node.data.output output_messages.append(f'=== Final Agent Output: {run.result.output} ===') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ '=== UserPromptNode: What will the weather be like in Paris on Tuesday? 
===', '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '=== CallToolsNode: streaming partial response & tool usage ===', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model started producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', '=== CallToolsNode: streaming partial response & tool usage ===', '=== Final Agent Output: It will be warm and sunny in Paris on Tuesday. ===', ] """ ``` _(This example is complete, it can be run "as is")_ ### Additional Configuration #### Usage Limits Pydantic AI offers a [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) structure to help you limit your usage (tokens, requests, and tool calls) on model runs. You can apply these settings by passing the `usage_limits` argument to the `run{_sync,_stream}` functions. Consider the following example, where we limit the number of output tokens: ```python from pydantic_ai import Agent, UsageLimitExceeded, UsageLimits agent = Agent('anthropic:claude-sonnet-4-6') result_sync = agent.run_sync( 'What is the capital of Italy? Answer with just the city.', usage_limits=UsageLimits(output_tokens_limit=10), ) print(result_sync.output) #> Rome print(result_sync.usage()) #> RunUsage(input_tokens=62, output_tokens=1, requests=1) try: result_sync = agent.run_sync( 'What is the capital of Italy? Answer with a paragraph.', usage_limits=UsageLimits(output_tokens_limit=10), ) except UsageLimitExceeded as e: print(e) #> Exceeded the output_tokens_limit of 10 (output_tokens=32) ``` Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling: ```python from typing_extensions import TypedDict from pydantic_ai import Agent, ModelRetry, UsageLimitExceeded, UsageLimits class NeverOutputType(TypedDict): """ Never ever coerce data to this type. """ never_use_this: str agent = Agent( 'anthropic:claude-sonnet-4-6', retries=3, output_type=NeverOutputType, system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.', ) @agent.tool_plain(retries=5) # (1) def infinite_retry_tool() -> int: raise ModelRetry('Please try again.') try: result_sync = agent.run_sync( 'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3) # (2) ) except UsageLimitExceeded as e: print(e) #> The next request would exceed the request_limit of 3 ```

1. This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop.
2. This run will error after 3 requests, preventing the infinite tool calling.
##### Capping tool calls If you need a limit on the number of successful tool invocations within a single run, use `tool_calls_limit`: ```python from pydantic_ai import Agent from pydantic_ai.exceptions import UsageLimitExceeded from pydantic_ai.usage import UsageLimits agent = Agent('anthropic:claude-sonnet-4-6') @agent.tool_plain def do_work() -> str: return 'ok' try: # Allow at most one executed tool call in this run agent.run_sync('Please call the tool twice', usage_limits=UsageLimits(tool_calls_limit=1)) except UsageLimitExceeded as e: print(e) #> The next tool call(s) would exceed the tool_calls_limit of 1 (tool_calls=2). ``` Note - Usage limits are especially relevant if you've registered many tools. Use `request_limit` to bound the number of model turns, and `tool_calls_limit` to cap the number of successful tool executions within a run. - The `tool_calls_limit` is checked before executing tool calls. If the model returns parallel tool calls that would exceed the limit, no tools will be executed. #### Model (Run) Settings Pydantic AI offers a [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) structure to help you fine-tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`, `timeout`, and more. There are three ways to apply these settings, with a clear precedence order: 1. **Model-level defaults** - Set when creating a model instance via the `settings` parameter. These serve as the base defaults for that model. 2. **Agent-level defaults** - Set during [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) initialization via the `model_settings` argument. These are merged with model defaults, with agent settings taking precedence. 3. **Run-time overrides** - Passed to `run{_sync,_stream}` functions via the `model_settings` argument. These have the highest priority and are merged with the combined agent and model defaults. For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior, you can do the following: ```py from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.openai import OpenAIChatModel # 1. Model-level defaults model = OpenAIChatModel( 'gpt-5.2', settings=ModelSettings(temperature=0.8, max_tokens=500) # Base defaults ) # 2. Agent-level defaults (overrides model defaults by merging) agent = Agent(model, model_settings=ModelSettings(temperature=0.5)) # 3. Run-time overrides (highest priority) result_sync = agent.run_sync( 'What is the capital of Italy?', model_settings=ModelSettings(temperature=0.0) # Final temperature: 0.0 ) print(result_sync.output) #> The capital of Italy is Rome. ``` The final request uses `temperature=0.0` (the run-time override) and `max_tokens=500` (from the model defaults), demonstrating how settings merge, with run-time values taking precedence. ##### Dynamic model settings Both agent-level and run-level `model_settings` accept a callable that receives a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings). The callable is invoked before each model request, so settings can vary per step. The settings resolved so far are available via `ctx.model_settings` inside the callable. Settings are resolved in layers, each merged on top of the previous: 1. **Model defaults** (`model.settings`) 2. **Agent-level** (`Agent(model_settings=...)`) 3.
**Capability-level** (e.g. from [`Thinking()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) -- see [Capabilities](/docs/ai/core-concepts/capabilities#providing-model-settings)) 4. **Run-level** (`agent.run(model_settings=...)`) Inside a callable, `ctx.model_settings` contains the merged result of all _previous_ layers (position-dependent). For example, an agent-level callable sees only model defaults, while a run-level callable sees model defaults + agent-level + capability-level settings. To reset a field set by a previous layer, set it explicitly (e.g. `{'temperature': None}`). ```python from pydantic_ai import Agent, ModelSettings agent = Agent( 'test', model_settings=lambda ctx: ModelSettings( temperature=0.0 if ctx.run_step <= 1 else 0.7, ), ) ``` Model Settings Support Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like [`FallbackModel`](/docs/ai/models/overview#fallback-model), [`WrapperModel`](/docs/ai/api/models/wrapper/#pydantic_ai.models.wrapper.WrapperModel), and [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel) don't have their own settings - they use the settings of their underlying models. #### Run metadata Run metadata lets you tag each agent execution with contextual details (for example, a tenant ID to filter traces and logs) and read it after completion via [`AgentRun.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), [`AgentRunResult.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult), or [`StreamedRunResult.metadata`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult). The resolved metadata is attached to the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) during the run and, when instrumentation is enabled, added to the run span attributes for observability tools. Configure metadata on an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) or pass it to a run. Both accept either a static dictionary or a callable that receives the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). Metadata is computed (if a callable) and applied when the run starts, then recomputed after a run ends successfully, so it can include end-of-run values. Agent-level metadata and per-run metadata are merged, with per-run values overriding agent-level ones. run\_metadata.py ```python from dataclasses import dataclass from pydantic_ai import Agent @dataclass class Deps: tenant: str agent = Agent[Deps]( 'openai:gpt-5.2', deps_type=Deps, metadata=lambda ctx: {'tenant': ctx.deps.tenant}, # agent-level metadata ) result = agent.run_sync( 'What is the capital of France?', deps=Deps(tenant='tenant-123'), metadata=lambda ctx: {'num_requests': ctx.usage.requests}, # per-run metadata ) print(result.output) #> The capital of France is Paris. print(result.metadata) #> {'tenant': 'tenant-123', 'num_requests': 1} ``` #### Concurrency Limiting You can limit the number of concurrent agent runs using the `max_concurrency` parameter. This is useful when you want to prevent overwhelming external resources or enforce rate limits when running many agent instances in parallel. 
agent\_concurrency.py

```python
import asyncio

from pydantic_ai import Agent, ConcurrencyLimit

# Simple limit: allow up to 10 concurrent runs
agent = Agent('openai:gpt-5', max_concurrency=10)

# With backpressure: limit concurrent runs and queue depth
agent_with_backpressure = Agent(
    'openai:gpt-5',
    max_concurrency=ConcurrencyLimit(max_running=10, max_queued=100),
)


async def main():
    # These will be rate-limited to 10 concurrent runs
    results = await asyncio.gather(
        *[agent.run(f'Question {i}') for i in range(20)]
    )
    print(len(results))
    #> 20
```

When the concurrency limit is reached, additional calls to [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) will wait until a slot becomes available. If you configure `max_queued` and the queue fills up, a [`ConcurrencyLimitExceeded`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ConcurrencyLimitExceeded) exception is raised.

When instrumentation is enabled, waiting operations appear as "waiting for concurrency" spans with attributes showing queue depth and limits.

### Model-specific settings

If you wish to further customize model behavior, you can use a subclass of [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings), like [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings), associated with your model of choice.

For example:

```python
from pydantic_ai import Agent, UnexpectedModelBehavior
from pydantic_ai.models.google import GoogleModelSettings

agent = Agent('google-gla:gemini-3-flash-preview')

try:
    result = agent.run_sync(
        'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:',
        model_settings=GoogleModelSettings(
            temperature=0.0,  # general model settings can also be specified
            gemini_safety_settings=[
                {
                    'category': 'HARM_CATEGORY_HARASSMENT',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
                {
                    'category': 'HARM_CATEGORY_HATE_SPEECH',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
            ],
        ),
    )
except UnexpectedModelBehavior as e:
    print(e)  # (1)
    """
    Content filter 'SAFETY' triggered, body:
    """
```

1. This error is raised because the safety thresholds were exceeded.

## Runs vs. Conversations

An agent **run** might represent an entire conversation -- there's no limit to how many messages can be exchanged in a single run. However, a **conversation** might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation composed of multiple runs:

conversation\_example.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.output)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),  # (1)
)
print(result2.output)
#> Albert Einstein's most famous equation is (E = mc^2).
```

1. Continue the conversation; without `message_history` the model would not know who "his" was referring to.

_(This example is complete, it can be run "as is")_

## Type safe by design

Pydantic AI is designed to work well with static type checkers, like mypy and pyright.
Typing is (somewhat) optional

Pydantic AI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.

That said, because Pydantic AI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `output_type` argument to [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than helping you; if that happens, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of the outputs they return, so you can use the type hints to ensure you're using the right types.

Consider the following script with type mistakes:

type\_mistakes.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  # (1)
    output_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  # (2)
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.output)  # (3)
```

1. The agent is defined as expecting an instance of `User` as `deps`.
2. But here `add_user_name` is defined as taking a `str` as the dependency, not a `User`.
3. Since the agent is defined as returning a `bool`, this will raise a type error, as `foobar` expects `bytes`.

Running `mypy` on this will give the following output:

Terminal

```bash
➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)
```

Running `pyright` would identify the same issues.

## System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Tip

For most use cases, you should use `instructions` instead of "system prompts". If you know what you're doing, though, and want to preserve system prompt messages in the message history sent to the LLM in subsequent completion requests, you can achieve this using the `system_prompt` argument/decorator. See the section below on [Instructions](#instructions) for more information.

Generally, system prompts fall into two categories:

1. **Static system prompts**: These are known when writing the code and can be defined via the `system_prompt` parameter of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__).
2. **Dynamic system prompts**: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with [`@agent.system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt).

You can add both to a single agent; they're appended in the order they're defined at runtime.
Here's an example using both types of system prompts:

system\_prompts.py

```python
from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-5.2',
    deps_type=str,  # (1)
    system_prompt="Use the customer's name while replying to them.",  # (2)
)


@agent.system_prompt  # (3)
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  # (4)
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.output)
#> Hello Frank, the date today is 2032-01-02.
```

1. The agent expects a string dependency.
2. Static system prompt defined at agent creation time.
3. Dynamic system prompt defined via a decorator with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). This is called just after `run_sync`, not when the agent is created, so it can benefit from runtime information like the dependencies used on that run.
4. Another dynamic system prompt; system prompts don't have to have the `RunContext` parameter.

_(This example is complete, it can be run "as is")_

## Instructions

Instructions are similar to system prompts. The main difference is that when an explicit `message_history` is provided in a call to `Agent.run` and similar methods, _instructions_ from any existing messages in the history are not included in the request to the model -- only the instructions of the _current_ agent are included.

You should use:

- `instructions` when you want your request to the model to only include system prompts for the _current_ agent
- `system_prompt` when you want your request to the model to _retain_ the system prompts used in previous requests (possibly made using other agents)

In general, we recommend using `instructions` instead of `system_prompt` unless you have a specific reason to use `system_prompt`.

Instructions, like system prompts, can be specified at different times:

1. **Static instructions**: These are known when writing the code and can be defined via the `instructions` parameter of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__).
2. **Dynamic instructions**: These rely on context that is only available at runtime and should be defined using functions decorated with [`@agent.instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions). Unlike dynamic system prompts, which may be reused when `message_history` is present, dynamic instructions are always reevaluated.
3. **Runtime instructions**: These are additional instructions for a specific run that can be passed to one of the [run methods](#running-agents) using the `instructions` argument (see the sketch below).

All three types of instructions can be added to a single agent, and they are appended in the order they are defined at runtime.

Each instruction is internally classified as either **static** (literal strings from the `instructions` parameter) or **dynamic** (from `@agent.instructions` functions, runtime instructions, or [toolset](/docs/ai/tools-toolsets/toolsets) instructions). Static instructions are always sorted before dynamic ones. This ordering enables providers that support prompt caching (like [Anthropic](/docs/ai/models/anthropic#smart-instruction-caching) and [Bedrock](/docs/ai/models/bedrock#prompt-caching)) to cache the stable static prefix while leaving dynamic instructions outside the cache boundary.
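Runtime instructions (item 3 above) have no dedicated example on this page, so here is a minimal sketch; it assumes the run methods' `instructions` argument accepts a plain string that applies to that run only:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be concise.')

# Extra instructions for this run only (an assumption of this sketch is that
# a plain string is accepted); they are classified as dynamic, so they sort
# after the agent's static instructions.
result = agent.run_sync(
    'What is the capital of France?',
    instructions='Answer in French.',
)
print(result.output)
```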
Here's an example using a static instruction as well as dynamic instructions:

instructions.py

```python
from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-5.2',
    deps_type=str,  # (1)
    instructions="Use the customer's name while replying to them.",  # (2)
)


@agent.instructions  # (3)
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.instructions
def add_the_date() -> str:  # (4)
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.output)
#> Hello Frank, the date today is 2032-01-02.
```

1. The agent expects a string dependency.
2. Static instructions defined at agent creation time.
3. Dynamic instructions defined via a decorator with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). This is called just after `run_sync`, not when the agent is created, so it can benefit from runtime information like the dependencies used on that run.
4. Another dynamic instruction; instructions don't have to have the `RunContext` parameter.

_(This example is complete, it can be run "as is")_

Note that returning an empty string results in no instruction message being added.

Instructions can also come from [capabilities](/docs/ai/core-concepts/capabilities) via [`get_instructions()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions), or from [template strings](/docs/ai/core-concepts/agent-spec#template-strings) rendered against the agent's dependencies.

## Reflection and self-correction

Validation errors from both function tool parameter validation and [structured output validation](/docs/ai/core-concepts/output#structured-output) can be passed back to the model with a request to retry. You can also raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) from within a [tool](/docs/ai/tools-toolsets/tools) or [output function](/docs/ai/core-concepts/output#output-functions) to tell the model it should retry generating a response.

- The default retry count is **1** but can be altered for the [entire agent](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), a [specific tool](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), or [outputs](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__).
- You can access the current retry count from within a tool, output validator, or output function via [`ctx.retry`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.retry).

Here's an example:

tool\_retry.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, ModelRetry, RunContext

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-5.2',
    deps_type=DatabaseConn,
    output_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.output)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
""" ``` ## Debugging and Monitoring Agents require a different approach to observability than traditional software. With traditional web endpoints or data pipelines, you can largely predict behavior by reading the code. With agents, this is much harder. The model's decisions are stochastic, and that stochasticity compounds through the agentic loop as the agent reasons, calls tools, observes results, and reasons again. You need to actually see what happened. This means setting up your application to record what's happening in a way you can review afterward, both during development (to understand and iterate) and in production (to debug issues and monitor behavior). The ergonomics matter too: a plaintext dump of everything that happened isn't a practical way to review agent behavior, even during development. You want tooling that lets you step through each decision and tool call interactively. We recommend [Pydantic Logfire](https://logfire.pydantic.dev/docs/), which has been designed with Pydantic AI workflows in mind. ### Tracing with Logfire ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() ``` With Logfire instrumentation enabled, every agent run creates a detailed trace showing: - **Messages exchanged** with the model (system, user, assistant) - **Tool calls** including arguments and return values - **Token usage** per request and cumulative - **Latency** for each operation - **Errors** with full context This visibility is invaluable for: - Understanding why an agent made a specific decision - Debugging unexpected behavior - Optimizing performance and costs - Monitoring production deployments ### Systematic Testing with Evals For systematic evaluation of agent behavior beyond runtime debugging, [Pydantic Evals](/docs/ai/evals/) provides a code-first framework for testing AI systems: ```python from pydantic_evals import Case, Dataset dataset = Dataset( name='agent_eval', cases=[ Case(name='capital_question', inputs='What is the capital of France?', expected_output='Paris'), ] ) report = dataset.evaluate_sync(my_agent_function) ``` Evals let you define test cases, run them against your agent, and score the results. When combined with Logfire, evaluation results appear in the web UI for visualization and comparison across runs. See the [Logfire integration guide](/docs/ai/evals/how-to/logfire-integration) for setup. ### Using Other Backends Pydantic AI's instrumentation is built on [OpenTelemetry](https://opentelemetry.io/), so you can send traces to any compatible backend. Even if you use the Logfire SDK for its convenience, you can configure it to send data to other backends. See [alternative backends](/docs/ai/integrations/logfire#using-opentelemetry) for setup instructions. [Full Logfire integration guide →](/docs/ai/integrations/logfire) ## Model errors If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns `503`), agent runs will raise [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior). In these cases, [`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) can be used to access the messages exchanged during the run to help diagnose the issue. 
agent\_model\_errors.py ```python from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages agent = Agent('openai:gpt-5.2') @agent.tool_plain def calc_volume(size: int) -> int: # (1) if size == 42: return size**3 else: raise ModelRetry('Please try again.') with capture_run_messages() as messages: # (2) try: result = agent.run_sync('Please get me the volume of a box with size 6.') except UnexpectedModelBehavior as e: print('An error occurred:', e) #> An error occurred: Tool 'calc_volume' exceeded max retries count of 1 print('cause:', repr(e.__cause__)) #> cause: ModelRetry('Please try again.') print('messages:', messages) """ messages: [ ModelRequest( parts=[ UserPromptPart( content='Please get me the volume of a box with size 6.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=62, output_tokens=4), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ RetryPromptPart( content='Please try again.', tool_name='calc_volume', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=72, output_tokens=8), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ else: print(result.output) ``` Define a tool that will raise `ModelRetry` repeatedly in this case. [`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) is used to capture the messages exchanged during the run. _(This example is complete, it can be run "as is")_ Note If you call [`run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), or [`run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. ## Agent Specs Agents can also be defined declaratively in YAML or JSON using [agent specs](/docs/ai/core-concepts/agent-spec). This separates agent configuration from application code: ```yaml model: anthropic:claude-opus-4-6 instructions: You are a helpful assistant. capabilities: - WebSearch - Thinking: effort: high ``` ```python from pydantic_ai import Agent agent = Agent.from_file('agent.yaml') ``` See [Agent Specs](/docs/ai/core-concepts/agent-spec) for the full spec format, template strings, and custom capability registration. --- # [Agent Specs](https://pydantic.dev/docs/ai/core-concepts/agent-spec/) # Agent Specs Agent specs let you define agents declaratively in YAML or JSON -- [model](/docs/ai/models/overview), [instructions](/docs/ai/core-concepts/agent#instructions), [capabilities](/docs/ai/core-concepts/capabilities), and all. One line to load, no Python agent construction code required. 
This is useful for: - Separating agent configuration from application code - Letting non-developers (prompt engineers, domain experts) configure agents - Storing agent definitions alongside other config files - Sharing agent configurations across teams or projects ## Defining a spec A spec file defines the agent's configuration in YAML or JSON: agent.yaml ```yaml model: anthropic:claude-opus-4-6 instructions: You are a helpful research assistant. model_settings: max_tokens: 8192 capabilities: - WebSearch - Thinking: effort: high ``` ## Loading specs [`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file) loads a spec from a YAML or JSON file and constructs an agent: from\_file\_example.py ```python from pydantic_ai import Agent agent = Agent.from_file('agent.yaml') ``` [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) accepts a dict or `AgentSpec` instance and supports additional keyword arguments that supplement or override the spec: from\_spec\_example.py ```python from dataclasses import dataclass from pydantic_ai import Agent @dataclass class UserContext: user_name: str agent = Agent.from_spec( { 'model': 'anthropic:claude-opus-4-6', 'instructions': 'You are helping {{user_name}}.', 'capabilities': ['WebSearch'], }, deps_type=UserContext, ) ``` Keyword arguments interact with spec fields as follows: - **Scalar fields** (`model`, `name`, `retries`, `end_strategy`, etc.) -- the keyword argument overrides the spec value when provided. - **`instructions`** -- merged: spec instructions come first, then keyword argument instructions. - **`capabilities`** -- merged: spec capabilities come first, then keyword argument capabilities. - **`model_settings`** -- merged additively: keyword argument settings override matching spec settings. - **`output_type`** -- takes precedence over `output_schema` from the spec. When `deps_type` is passed, [template strings](#template-strings) in the spec's `instructions`, `description`, and capability arguments are compiled and validated against the deps type at construction time. For more control over spec loading, use `AgentSpec.from_file` to load the spec separately before passing it to `Agent.from_spec`. ## Template strings `TemplateStr` provides Handlebars-style templates (`{{variable}}`) that are rendered against the agent's [dependencies](/docs/ai/core-concepts/dependencies) at runtime. In spec files, strings containing `{{` are automatically converted to template strings: ```yaml instructions: "You are assisting {{name}}, who is a {{role}}." ``` Template variables are resolved from the fields of the `deps` object. When a `deps_type` (or [`deps_schema`](#deps_schema)) is provided, template variable names are validated at construction time. In Python code, `TemplateStr` can be used explicitly, but a callable with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is generally preferred for IDE autocomplete and type checking: template\_instructions.py ```python from dataclasses import dataclass from pydantic_ai import Agent, TemplateStr @dataclass class UserProfile: name: str role: str agent = Agent( 'openai:gpt-5.2', deps_type=UserProfile, instructions=TemplateStr('You are assisting {{name}}, who is a {{role}}.'), ) result = agent.run_sync('hello', deps=UserProfile(name='Alice', role='engineer')) print(result.output) #> Hello! How can I help you today? 
```

## Capability spec syntax

Capabilities in specs support three forms:

- `'MyCapability'` -- no arguments, calls `MyCapability.from_spec()`
- `{'MyCapability': value}` -- single positional argument, calls `MyCapability.from_spec(value)`
- `{'MyCapability': {key: value, ...}}` -- keyword arguments, calls `MyCapability.from_spec(**kwargs)`

## Custom capabilities in specs

See [Publishing capabilities](/docs/ai/core-concepts/capabilities#publishing-capabilities) for how to make custom capabilities work with agent specs.

## `AgentSpec` reference

The `AgentSpec` model represents the full spec structure:

| Field | Type | Description |
| --- | --- | --- |
| `model` | `str` | [Model](/docs/ai/models/overview) name (required) |
| `name` | `str \| None` | Agent name |
| `description` | `str \| None` | Agent description (supports [templates](#template-strings)) |
| `instructions` | `str \| list[str] \| None` | [Instructions](/docs/ai/core-concepts/agent#instructions) (supports [templates](#template-strings)) |
| `model_settings` | `dict \| None` | [Model settings](/docs/ai/core-concepts/agent#model-run-settings) |
| `capabilities` | `list` | [Capabilities](/docs/ai/core-concepts/capabilities) (see [spec syntax](#capability-spec-syntax)) |
| `deps_schema` | `dict \| None` | JSON Schema for [template string](#template-strings) validation (see below) |
| `output_schema` | `dict \| None` | JSON Schema for [structured output](/docs/ai/core-concepts/output) (see below) |
| `retries` | `int` | Default [tool retries](/docs/ai/advanced-features/retries) (default: `1`) |
| `output_retries` | `int \| None` | [Output](/docs/ai/core-concepts/output) validation retries |
| `end_strategy` | `EndStrategy` | When to stop (`'early'` or `'exhaustive'`) |
| `tool_timeout` | `float \| None` | Default [tool](/docs/ai/tools-toolsets/tools) timeout in seconds |
| `instrument` | `bool \| None` | Enable [Logfire](/docs/ai/integrations/logfire) instrumentation |
| `metadata` | `dict \| None` | Agent [metadata](/docs/ai/core-concepts/agent#run-metadata) |

### `deps_schema`

When loading a spec file without a Python `deps_type`, `deps_schema` provides a JSON Schema that validates [template string](#template-strings) variable names at construction time. It does **not** validate the actual deps object at runtime -- it only ensures that template variables like `{{user_name}}` correspond to properties defined in the schema.

### `output_schema`

When provided (and no `output_type` keyword argument is passed to `from_spec`), `output_schema` defines the structure the model should produce as its final output. Under the hood, it creates a [`StructuredDict`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.StructuredDict) output type: the JSON Schema is sent to the model API so the model knows what structure to produce, and the response is returned as a `dict[str, Any]`.

Note

The model's response is not validated against the schema's `properties` or `required` fields -- it is accepted as a plain dict. The schema serves as an instruction to the model, not a runtime validation constraint.

agent\_with\_schema.yaml

```yaml
model: anthropic:claude-opus-4-6
deps_schema:
  type: object
  properties:
    user_name:
      type: string
  required: [user_name]
output_schema:
  type: object
  properties:
    answer:
      type: string
    confidence:
      type: number
  required: [answer, confidence]
instructions: "You are helping {{user_name}}. Always include a confidence score."
capabilities:
  - WebSearch
```
## Saving specs

`AgentSpec.to_file` saves a spec to YAML or JSON and optionally generates a companion JSON Schema file for editor autocompletion:

save\_spec\_example.py

```python
from pydantic_ai import AgentSpec

spec = AgentSpec(
    model='anthropic:claude-opus-4-6',
    instructions='You are a helpful assistant.',
    capabilities=['WebSearch'],
)
spec.to_file('agent.yaml')
# Also generates ./agent_schema.json for editor autocompletion
```

The generated JSON Schema file enables autocompletion and validation in editors that support the [YAML Language Server](https://github.com/redhat-developer/yaml-language-server) protocol. Pass `schema_path=None` to skip schema generation.

---

# [Capabilities](https://pydantic.dev/docs/ai/core-concepts/capabilities/)

# Capabilities

A capability is a reusable, composable unit of agent behavior. Instead of threading multiple arguments through your `Agent` constructor -- [instructions](/docs/ai/core-concepts/agent#instructions) here, [model settings](/docs/ai/core-concepts/agent#model-run-settings) there, a [toolset](/docs/ai/tools-toolsets/toolsets) somewhere else, a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) on yet another parameter -- you can bundle related behavior into a single capability and pass it via the [`capabilities`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) parameter.

Capabilities can provide any combination of:

- **Tools** -- via [toolsets](/docs/ai/tools-toolsets/toolsets) or [builtin tools](/docs/ai/tools-toolsets/builtin-tools)
- **Lifecycle hooks** -- intercept and modify model requests, tool calls, and the overall run
- **Instructions** -- static or dynamic [instruction](/docs/ai/core-concepts/agent#instructions) additions
- **Model settings** -- static or per-step [model settings](/docs/ai/core-concepts/agent#model-run-settings)

This makes them the primary extension point for Pydantic AI. Whether you're building a memory system, a guardrail, a cost tracker, or an approval workflow, a capability is the right abstraction.
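As a taste of what that looks like in code, here's a minimal sketch of a capability bundling a tool with an instruction into one reusable unit (the `DiceGames` capability and its contents are illustrative; the `get_toolset`/`get_instructions` API is covered in full under [Building custom capabilities](#building-custom-capabilities) below):

```python
import random
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.toolsets import FunctionToolset

toolset = FunctionToolset()


@toolset.tool_plain
def roll_die() -> int:
    """Roll a six-sided die."""
    return random.randint(1, 6)


@dataclass
class DiceGames(AbstractCapability[Any]):
    """Illustrative capability: one tool plus one instruction, reusable across agents."""

    def get_toolset(self):
        return toolset

    def get_instructions(self):
        return 'Offer to roll a die when the user seems bored.'


agent = Agent('openai:gpt-5.2', capabilities=[DiceGames()])
```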
## Built-in capabilities

Pydantic AI ships with several capabilities that cover common needs:

| Capability | What it provides | Spec |
| --- | --- | --- |
| [`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) | Enables model [thinking/reasoning](/docs/ai/advanced-features/thinking) at configurable effort | Yes |
| [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) | Decorator-based [lifecycle hook](/docs/ai/core-concepts/hooks) registration | -- |
| [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch) | Web search -- builtin when supported, [local fallback](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool) with [`duckduckgo` extra](/docs/ai/overview/install#slim-install) | Yes |
| [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) | URL fetching -- builtin when supported, [local fallback](/docs/ai/tools-toolsets/common-tools#web-fetch-tool) with [`web-fetch` extra](/docs/ai/overview/install#slim-install) | Yes |
| [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration) | Image generation -- builtin when supported, subagent fallback via `fallback_model` | Yes |
| [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP) | MCP server -- builtin when supported, direct connection otherwise | Yes |
| [`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) | Filters or modifies [tool definitions](/docs/ai/tools-toolsets/tools) per step | -- |
| [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) | Wraps a capability and prefixes its tool names | Yes |
| [`BuiltinTool`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.BuiltinTool) | Registers a [builtin tool](/docs/ai/tools-toolsets/builtin-tools) with the agent | Yes |
| [`Toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Toolset) | Wraps an [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) | -- |
| [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) | Includes return type schemas in tool definitions sent to the model | Yes |
| [`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata) | Merges metadata key-value pairs onto selected tools | Yes |
| [`HistoryProcessor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HistoryProcessor) | Wraps a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) | -- |
| [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) | Uses a custom thread executor for [sync functions](/docs/ai/tools-toolsets/tools-advanced#thread-executor-for-long-running-servers) | -- |

The **Spec** column indicates whether the capability can be used in [agent specs](/docs/ai/core-concepts/agent-spec) (YAML/JSON). Capabilities marked **--** take non-serializable arguments (callables, toolset objects) and can only be used in Python code.

builtin\_capabilities.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking, WebSearch

agent = Agent(
    'anthropic:claude-opus-4-6',
    instructions='You are a research assistant. Be thorough and cite sources.',
    capabilities=[
        Thinking(effort='high'),
        WebSearch(),
    ],
)
```

[Instructions](/docs/ai/core-concepts/agent#instructions) and [model settings](/docs/ai/core-concepts/agent#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or `AgentSpec`). Capabilities are for behavior that goes beyond simple configuration -- tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](/docs/ai/core-concepts/agent-spec).

### Thinking

The [`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) capability enables model [thinking/reasoning](/docs/ai/advanced-features/thinking) at a configurable effort level. It's the simplest way to enable thinking across providers:

thinking\_capability.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking

agent = Agent('anthropic:claude-sonnet-4-6', capabilities=[Thinking(effort='high')])
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

See [Thinking](/docs/ai/advanced-features/thinking) for provider-specific details and the [unified thinking settings](/docs/ai/advanced-features/thinking#unified-thinking-settings).

### Compaction

Provider-specific compaction capabilities manage conversation context size by compacting older messages into summaries:

| Provider | Capability | Details |
| --- | --- | --- |
| OpenAI Responses API | [`OpenAICompaction`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction) | [OpenAI compaction](/docs/ai/models/openai#message-compaction) |
| Anthropic | [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) | [Anthropic compaction](/docs/ai/models/anthropic#message-compaction) |

### ThreadExecutor

The [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability provides a custom [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor) for running sync tool functions and other sync callbacks in threads. This is useful in long-running servers (e.g. FastAPI) where the default ephemeral threads from `anyio.to_thread.run_sync` can accumulate under sustained load:

```python
from concurrent.futures import ThreadPoolExecutor

from pydantic_ai import Agent
from pydantic_ai.capabilities import ThreadExecutor

executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker')
agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)])
```

See [Thread executor for long-running servers](/docs/ai/tools-toolsets/tools-advanced#thread-executor-for-long-running-servers) for more details.
### Hooks The [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) capability provides decorator-based [lifecycle hook](#hooking-into-the-lifecycle) registration -- the easiest way to intercept model requests, tool calls, and other events without subclassing [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability): ```python from pydantic_ai import Agent, ModelRequestContext, RunContext from pydantic_ai.capabilities import Hooks hooks = Hooks() @hooks.on.before_model_request async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext: agent_name = ctx.agent.name if ctx.agent else 'unknown' print(f'[{agent_name}] Sending {len(request_context.messages)} messages') return request_context agent = Agent('openai:gpt-5.2', name='my_agent', capabilities=[hooks]) ``` All hooks receive [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), which provides access to the running agent via [`ctx.agent`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.agent) -- useful for logging, metrics, and other cross-cutting concerns that need to identify which agent is running. See the dedicated [Hooks](/docs/ai/core-concepts/hooks) page for the full API: decorator and constructor registration, timeouts, tool filtering, wrap hooks, per-event hooks, and more. ### Provider-adaptive tools [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration), and [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP) provide model-agnostic access to common tool types. When the model supports the tool natively (as a [builtin tool](/docs/ai/tools-toolsets/builtin-tools)), it's used directly. When it doesn't, a local function tool handles it instead -- so your agent works across providers without code changes. Each accepts `builtin` and `local` keyword arguments to control which side is used: provider\_adaptive\_tools.py ```python from pydantic_ai import Agent from pydantic_ai.capabilities import MCP, ImageGeneration, WebFetch, WebSearch agent = Agent( 'anthropic:claude-sonnet-4-6', capabilities=[ # Auto-detects DuckDuckGo as local fallback WebSearch(), # Auto-detects markdownify-based fetch as local fallback WebFetch(), # Builtin when the model supports it; falls back to a subagent # running the specified LLM with image generation support ImageGeneration(fallback_model='openai-responses:gpt-5.4'), # Auto-detects transport from URL MCP(url='https://mcp.example.com/api'), ], ) ``` To force builtin-only (errors on unsupported models instead of falling back to local): builtin\_only.py ```python MCP(url='https://mcp.example.com/api', local=False) ``` To force local-only (never use the builtin, even when the model supports it): local\_only.py ```python MCP(url='https://mcp.example.com/api', builtin=False) ``` Some constraint fields require the builtin because the local fallback can't enforce them. When these are set and the model doesn't support the builtin, a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) is raised. 
For example, [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch) domain constraints require the builtin, while [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) enforces them locally: constraints.py ```python # Only search example.com -- requires builtin support WebSearch(allowed_domains=['example.com']) # Only fetch example.com -- enforced locally when builtin is unavailable WebFetch(allowed_domains=['example.com']) ``` All of these capabilities are subclasses of [`BuiltinOrLocalTool`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.BuiltinOrLocalTool), which you can use directly or subclass to build your own provider-adaptive tools. For example, to pair [`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool) with a local fallback: custom\_builtin\_or\_local.py ```python from pydantic_ai.builtin_tools import CodeExecutionTool from pydantic_ai.capabilities import BuiltinOrLocalTool cap = BuiltinOrLocalTool(builtin=CodeExecutionTool(), local=my_local_executor) ``` ### PrepareTools [`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) wraps a [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) as a capability, for filtering or modifying [tool definitions](/docs/ai/tools-toolsets/tools) per step: prepare\_tools\_builtin.py ```python from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.capabilities import PrepareTools async def hide_dangerous(ctx: RunContext[None], tool_defs: list[ToolDefinition]) -> list[ToolDefinition]: return [td for td in tool_defs if not td.name.startswith('delete_')] agent = Agent('openai:gpt-5.2', capabilities=[PrepareTools(hide_dangerous)]) @agent.tool_plain def delete_file(path: str) -> str: """Delete a file.""" return f'deleted {path}' @agent.tool_plain def read_file(path: str) -> str: """Read a file.""" return f'contents of {path}' result = agent.run_sync('hello') # The model only sees `read_file`, not `delete_file` ``` For more complex tool preparation logic, see [Tool preparation](#tool-preparation) under lifecycle hooks. ### PrefixTools [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) wraps another capability and prefixes all of its tool names, useful for namespacing when composing multiple capabilities that might have conflicting tool names: prefix\_tools\_example.py ```python from pydantic_ai import Agent from pydantic_ai.capabilities import MCP, PrefixTools agent = Agent( 'openai:gpt-5.2', capabilities=[ PrefixTools(MCP(url='https://api1.example.com'), prefix='api1'), PrefixTools(MCP(url='https://api2.example.com'), prefix='api2'), ], ) ``` Every [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) has a convenience method [`prefix_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prefix_tools) that returns a [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) wrapper: prefix\_convenience.py ```python MCP(url='https://mcp.example.com/api').prefix_tools('mcp') ``` ### IncludeToolReturnSchemas [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) includes return type schemas in tool definitions sent to the model. For models that natively support return schemas (e.g. 
Google Gemini), the schema is passed as a structured field in the API request. For other models, it is injected into the tool description as JSON text. include\_return\_schemas.py ```python from pydantic_ai import Agent from pydantic_ai.capabilities import IncludeToolReturnSchemas from pydantic_ai.models.test import TestModel test_model = TestModel() agent = Agent(test_model, capabilities=[IncludeToolReturnSchemas()]) @agent.tool_plain def get_temperature(city: str) -> float: """Get the temperature for a city.""" return 21.0 result = agent.run_sync('What is the temperature in Paris?') params = test_model.last_model_request_parameters assert params is not None td = params.function_tools[0] assert td.include_return_schema is True ``` _(This example is complete, it can be run "as is")_ Use the `tools` parameter to select which tools should include return schemas. It accepts a list of tool names, a metadata dict for matching, or a callable predicate: include\_return\_schemas\_selective.py ```python from pydantic_ai import Agent from pydantic_ai.capabilities import IncludeToolReturnSchemas from pydantic_ai.models.test import TestModel test_model = TestModel() agent = Agent( test_model, capabilities=[IncludeToolReturnSchemas(tools=['get_temperature'])], ) @agent.tool_plain def get_temperature(city: str) -> float: """Get the temperature for a city.""" return 21.0 @agent.tool_plain def get_greeting(name: str) -> str: """Get a greeting.""" return f'Hello, {name}!' result = agent.run_sync('Hello') params = test_model.last_model_request_parameters assert params is not None temp_tool = next(t for t in params.function_tools if t.name == 'get_temperature') greet_tool = next(t for t in params.function_tools if t.name == 'get_greeting') assert temp_tool.include_return_schema is True assert greet_tool.include_return_schema is None ``` _(This example is complete, it can be run "as is")_ The same effect can be achieved at the toolset level using [`.include_return_schemas()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.include_return_schemas) -- see [toolset composition](/docs/ai/tools-toolsets/toolsets#including-return-schemas). ### SetToolMetadata [`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata) merges metadata key-value pairs onto selected tools. This is useful for tagging tools with configuration that other capabilities or custom logic can inspect: set\_tool\_metadata.py ```python from pydantic_ai import Agent from pydantic_ai.capabilities import SetToolMetadata from pydantic_ai.models.test import TestModel test_model = TestModel() agent = Agent( test_model, capabilities=[SetToolMetadata(tools=['search'], sensitive=True)], ) @agent.tool_plain def search(query: str) -> str: """Search for information.""" return f'Results for: {query}' @agent.tool_plain def greet(name: str) -> str: """Greet someone.""" return f'Hello, {name}!' 
result = agent.run_sync('Search for pydantic') params = test_model.last_model_request_parameters assert params is not None search_tool = next(t for t in params.function_tools if t.name == 'search') greet_tool = next(t for t in params.function_tools if t.name == 'greet') assert search_tool.metadata is not None and search_tool.metadata.get('sensitive') is True assert greet_tool.metadata is None or greet_tool.metadata.get('sensitive') is None ``` _(This example is complete, it can be run "as is")_ The same effect can be achieved at the toolset level using [`.with_metadata()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.with_metadata) -- see [toolset composition](/docs/ai/tools-toolsets/toolsets#setting-tool-metadata). ## Building custom capabilities To build your own capability, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) and override the methods you need. There are two categories: **configuration methods** that are called at agent construction (except [`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset) which is called per-run), and **lifecycle hooks** that fire during each run. ### Providing tools A capability that provides tools returns a [toolset](/docs/ai/tools-toolsets/toolsets) from [`get_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_toolset). This can be a pre-built [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) instance, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns one dynamically: custom\_capability\_tools.py ```python from dataclasses import dataclass from typing import Any from pydantic_ai import Agent from pydantic_ai.capabilities import AbstractCapability from pydantic_ai.toolsets import AgentToolset, FunctionToolset math_toolset = FunctionToolset() @math_toolset.tool_plain def add(a: float, b: float) -> float: """Add two numbers.""" return a + b @math_toolset.tool_plain def multiply(a: float, b: float) -> float: """Multiply two numbers.""" return a * b @dataclass class MathTools(AbstractCapability[Any]): """Provides basic math operations.""" def get_toolset(self) -> AgentToolset[Any] | None: return math_toolset agent = Agent('openai:gpt-5.2', capabilities=[MathTools()]) result = agent.run_sync('What is 2 + 3?') print(result.output) #> The answer is 5.0 ``` For [builtin tools](/docs/ai/tools-toolsets/builtin-tools), override [`get_builtin_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_builtin_tools) to return a sequence of [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool) instances (which includes both [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) objects and callables that receive [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)). #### Toolset wrapping [`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset) lets a capability wrap the agent's entire assembled toolset with a [`WrapperToolset`](/docs/ai/tools-toolsets/toolsets#changing-tool-execution). This is more powerful than providing tools -- it can intercept tool execution, add logging, or apply cross-cutting behavior. 
The wrapper receives the combined non-output toolset (after any agent-level [`prepare_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) wrapping). Output tools are added separately and are not affected. wrapper\_toolset\_example.py ```python from dataclasses import dataclass from typing import Any from pydantic_ai import Agent from pydantic_ai.capabilities import AbstractCapability from pydantic_ai.toolsets import AbstractToolset from pydantic_ai.toolsets.wrapper import WrapperToolset @dataclass class LoggingToolset(WrapperToolset[Any]): """Logs all tool calls.""" async def call_tool( self, tool_name: str, tool_args: dict[str, Any], *args: Any, **kwargs: Any ) -> Any: print(f' Calling tool: {tool_name}') return await super().call_tool(tool_name, tool_args, *args, **kwargs) @dataclass class LogToolCalls(AbstractCapability[Any]): """Wraps the agent's toolset to log all tool calls.""" def get_wrapper_toolset(self, toolset: AbstractToolset[Any]) -> AbstractToolset[Any]: return LoggingToolset(wrapped=toolset) agent = Agent('openai:gpt-5.2', capabilities=[LogToolCalls()]) @agent.tool_plain def greet(name: str) -> str: """Greet someone.""" return f'Hello, {name}!' result = agent.run_sync('hello') # Tool calls are logged as they happen ``` Note `get_wrapper_toolset` wraps the non-output _toolset_ once per run (during toolset assembly), intercepting tool execution. This is different from the [`prepare_tools`](#tool-preparation) hook, which operates on tool _definitions_ per step and controls visibility rather than execution. ### Providing instructions [`get_instructions`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions) adds [instructions](/docs/ai/core-concepts/agent#instructions) to the agent. Since it's called once at agent construction, return a callable if you need dynamic values: custom\_capability\_config.py ```python from dataclasses import dataclass from datetime import datetime from typing import Any from pydantic_ai import Agent, RunContext from pydantic_ai.capabilities import AbstractCapability @dataclass class KnowsCurrentTime(AbstractCapability[Any]): """Tells the agent what time it is.""" def get_instructions(self): def _get_time(ctx: RunContext[Any]) -> str: return f'The current date and time is {datetime.now().isoformat()}.' return _get_time agent = Agent('openai:gpt-5.2', capabilities=[KnowsCurrentTime()]) result = agent.run_sync('What time is it?') print(result.output) #> The current time is 3:45 PM. ``` Instructions can also use [template strings](/docs/ai/core-concepts/agent-spec#template-strings) (`TemplateStr('Hello {{name}}')`) for Handlebars-style templates rendered against the agent's [dependencies](/docs/ai/core-concepts/dependencies). In Python code, a callable with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is generally preferred for IDE autocomplete. ### Providing model settings [`get_model_settings`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings) returns [model settings](/docs/ai/core-concepts/agent#model-run-settings) as a dict or a callable for per-step settings. 
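For the static case, returning a plain settings dict is enough. A minimal sketch (the `LowTemperature` capability is illustrative; `ModelSettings` is itself a dict, so it satisfies the dict form):

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelSettings
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class LowTemperature(AbstractCapability[Any]):
    """Illustrative: applies a fixed temperature whenever this capability is attached."""

    def get_model_settings(self):
        # Static settings, merged into the capability layer on every request.
        return ModelSettings(temperature=0.2)


agent = Agent('openai:gpt-5.2', capabilities=[LowTemperature()])
```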
When model settings need to vary per step -- for example, enabling thinking only on retry -- return a callable:

dynamic\_settings.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent, ModelSettings, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class ThinkingOnRetry(AbstractCapability[None]):
    """Enables thinking mode when the agent is retrying."""

    def get_model_settings(self):
        def resolve(ctx: RunContext[None]) -> ModelSettings:
            if ctx.run_step > 1:
                return ModelSettings(thinking='high')
            return ModelSettings()

        return resolve


agent = Agent('openai:gpt-5.2', capabilities=[ThinkingOnRetry()])
result = agent.run_sync('hello')
print(result.output)
#> Hello! How can I help you today?
```

The callable receives a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) where `ctx.model_settings` contains the merged result of all layers resolved before this capability (model defaults and agent-level settings).

### Configuration methods reference

| Method | Return type | Purpose |
| --- | --- | --- |
| [`get_toolset()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_toolset) | `AgentToolset \| None` | A [toolset](/docs/ai/tools-toolsets/toolsets) to register (or a callable for [dynamic toolsets](/docs/ai/tools-toolsets/toolsets#dynamically-building-a-toolset)) |
| [`get_builtin_tools()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_builtin_tools) | `Sequence[`[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)`]` | [Builtin tools](/docs/ai/tools-toolsets/builtin-tools) to register (including callables) |
| [`get_wrapper_toolset()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset) | [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) `\| None` | [Wrap the agent's assembled toolset](#toolset-wrapping) |
| [`get_instructions()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions) | `AgentInstructions \| None` | [Instructions](/docs/ai/core-concepts/agent#instructions) (static strings, [template strings](/docs/ai/core-concepts/agent-spec#template-strings), or callables) |
| [`get_model_settings()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings) | `AgentModelSettings \| None` | [Model settings](/docs/ai/core-concepts/agent#model-run-settings) dict, or a callable for per-step settings |

### Hooking into the lifecycle

Capabilities can hook into five lifecycle points, each with up to four variants:

- **`before_*`** -- fires before the action, can modify inputs
- **`after_*`** -- fires after the action succeeds (in reverse capability order), can modify outputs
- **`wrap_*`** -- full middleware control: receives a `handler` callable and decides whether/how to call it
- **`on_*_error`** -- fires when the action fails (after `wrap_*` has had its chance to recover), can observe, transform, or recover from errors

Tip

For quick, application-level hooks without subclassing, use the [`Hooks`](/docs/ai/core-concepts/hooks) capability instead.
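To make the variants concrete, here's a minimal sketch of a run-scoped timing capability combining an observe-only `before_run` with a middleware-style `wrap_run` (the `TimedRun` capability is illustrative; the handler and return types are annotated loosely here, but the precise signatures appear in the reference tables below):

```python
import time
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class TimedRun(AbstractCapability[Any]):
    """Illustrative: observe the run starting, and time the whole run."""

    async def before_run(self, ctx: RunContext[Any]) -> None:
        # Observe-only notification that a run is starting.
        print('run starting')

    async def wrap_run(self, ctx: RunContext[Any], *, handler: Any) -> Any:
        # `handler()` executes the rest of the run; its result is returned unchanged.
        start = time.monotonic()
        try:
            return await handler()
        finally:
            print(f'run took {time.monotonic() - start:.2f}s')


agent = Agent('openai:gpt-5.2', capabilities=[TimedRun()])
```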
#### Run hooks

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`before_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_run) | `(ctx: RunContext) -> None` | Observe-only notification that a run is starting |
| [`after_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_run) | `(ctx: RunContext, *, result: AgentRunResult) -> AgentRunResult` | Modify the final result |
| [`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run) | `(ctx: RunContext, *, handler: WrapRunHandler) -> AgentRunResult` | Wrap the entire run |
| [`on_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_run_error) | `(ctx: RunContext, *, error: BaseException) -> AgentRunResult` | Handle run errors (see [error hooks](#error-hooks)) |

`wrap_run` supports error recovery: if `handler()` raises and `wrap_run` catches the exception and returns a result instead, the error is suppressed and the recovery result is used. This works with both [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter).

#### Node hooks

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`before_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_node_run) | `(ctx: RunContext, *, node: AgentNode) -> AgentNode` | Observe or replace the node before execution |
| [`after_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_node_run) | `(ctx: RunContext, *, node: AgentNode, result: NodeResult) -> NodeResult` | Modify the result (next node or `End`) |
| [`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run) | `(ctx: RunContext, *, node: AgentNode, handler: WrapNodeRunHandler) -> NodeResult` | Wrap each graph node execution |
| [`on_node_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_node_run_error) | `(ctx: RunContext, *, node: AgentNode, error: BaseException) -> NodeResult` | Handle node errors (see [error hooks](#error-hooks)) |

[`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run) fires for every node in the [agent graph](/docs/ai/core-concepts/agent#iterating-over-an-agents-graph) (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). Override this to observe node transitions, add per-step logging, or modify graph progression:

Note

`wrap_node_run` hooks are called automatically by [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), and [`agent_run.next()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next). However, they are **not** called when iterating with bare `async for node in agent_run:` over [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter), since that uses the graph run's internal iteration. Always use `agent_run.next(node)` to advance the run if you need `wrap_node_run` hooks to fire.
node_logging_example.py

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import (
    AbstractCapability,
    AgentNode,
    NodeResult,
    WrapNodeRunHandler,
)


@dataclass
class NodeLogger(AbstractCapability[Any]):
    """Logs each node that executes during a run."""

    nodes: list[str] = field(default_factory=list)

    async def wrap_node_run(
        self, ctx: RunContext[Any], *, node: AgentNode[Any], handler: WrapNodeRunHandler[Any]
    ) -> NodeResult[Any]:
        self.nodes.append(type(node).__name__)
        return await handler(node)


logger = NodeLogger()
agent = Agent('openai:gpt-5.2', capabilities=[logger])
agent.run_sync('hello')
print(logger.nodes)
#> ['UserPromptNode', 'ModelRequestNode', 'CallToolsNode']
```

You can also use `wrap_node_run` to modify graph progression -- for example, limiting the number of model requests per run:

node_modification_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_graph import End

from pydantic_ai import ModelRequestNode, RunContext
from pydantic_ai.capabilities import AbstractCapability, AgentNode, NodeResult, WrapNodeRunHandler
from pydantic_ai.result import FinalResult


@dataclass
class MaxModelRequests(AbstractCapability[Any]):
    """Limits the number of model requests per run by ending early."""

    max_requests: int = 5
    count: int = 0

    async def for_run(self, ctx: RunContext[Any]) -> 'MaxModelRequests':
        return MaxModelRequests(max_requests=self.max_requests)  # fresh per run

    async def wrap_node_run(
        self, ctx: RunContext[Any], *, node: AgentNode[Any], handler: WrapNodeRunHandler[Any]
    ) -> NodeResult[Any]:
        if isinstance(node, ModelRequestNode):
            self.count += 1
            if self.count > self.max_requests:
                return End(FinalResult(output='Max model requests reached'))
        return await handler(node)
```

See [Iterating Over an Agent's Graph](/docs/ai/core-concepts/agent#iterating-over-an-agents-graph) for more about the agent graph and its node types.

#### Model request hooks

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`before_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_model_request) | `(ctx: RunContext, request_context: ModelRequestContext) -> ModelRequestContext` | Modify messages, settings, parameters, or model before the model call |
| [`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request) | `(ctx: RunContext, *, request_context: ModelRequestContext, response: ModelResponse) -> ModelResponse` | Modify the model's response |
| [`wrap_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_model_request) | `(ctx: RunContext, *, request_context: ModelRequestContext, handler: WrapModelRequestHandler) -> ModelResponse` | Wrap the model call |
| [`on_model_request_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_model_request_error) | `(ctx: RunContext, *, request_context: ModelRequestContext, error: Exception) -> ModelResponse` | Handle model request errors (see [error hooks](#error-hooks)) |

`ModelRequestContext` bundles `model`, `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof. To swap the model for a given request, set `request_context.model` to a different [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance.
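To make the model swap concrete, here's a hedged sketch of a step-based fallback. The `CheaperAfterFirstStep` class and the specific model names are illustrative, and it assumes `infer_model` from `pydantic_ai.models` resolves model name strings to `Model` instances:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.models import infer_model


@dataclass
class CheaperAfterFirstStep(AbstractCapability[Any]):
    """Illustrative: routes follow-up steps to a cheaper model."""

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        if ctx.run_step > 1:
            # Swap the model for this request; layers after this one see the new model.
            request_context.model = infer_model('openai:gpt-5-nano')
        return request_context


agent = Agent('openai:gpt-5.2', capabilities=[CheaperAfterFirstStep()])
```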
To skip the model call entirely and provide a replacement response, raise [`SkipModelRequest(response)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) from `before_model_request` or `wrap_model_request`.

#### Tool hooks

Tool processing has two phases: **validation** (parsing and validating the model's JSON arguments against the tool's schema) and **execution** (running the tool function). Each phase has its own hooks. All tool hooks receive a `tool_def` parameter with the [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition).

**Validation hooks** -- `args` is the raw `str | dict[str, Any]` from the model before validation, or the validated `dict[str, Any]` after:

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`before_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_tool_validate) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs) -> RawToolArgs` | Modify raw args before validation (e.g. JSON repair) |
| [`after_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_validate) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs) -> ValidatedToolArgs` | Modify validated args |
| [`wrap_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_validate) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, handler: WrapToolValidateHandler) -> ValidatedToolArgs` | Wrap the validation step |
| [`on_tool_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_validate_error) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, error: Exception) -> ValidatedToolArgs` | Handle validation errors (see [error hooks](#error-hooks)) |

To skip validation and provide pre-validated args, raise [`SkipToolValidation(args)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation) from `before_tool_validate` or `wrap_tool_validate`.
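As an example of the "JSON repair" use case mentioned in the table above, here's a hedged sketch -- the `TrailingCommaRepair` class and its regex are illustrative -- that cleans up raw string args before validation:

```python
import re
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ToolCallPart


@dataclass
class TrailingCommaRepair(AbstractCapability[Any]):
    """Strips trailing commas from raw JSON tool args before validation."""

    async def before_tool_validate(
        self,
        ctx: RunContext[Any],
        *,
        call: ToolCallPart,
        tool_def: ToolDefinition,
        args: str | dict[str, Any],
    ) -> str | dict[str, Any]:
        if isinstance(args, str):
            # Remove a trailing comma before a closing brace/bracket,
            # a common model slip that breaks strict JSON parsing.
            args = re.sub(r',\s*([}\]])', r'\1', args)
        return args


agent = Agent('openai:gpt-5.2', capabilities=[TrailingCommaRepair()])
```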
**Execution hooks** -- `args` is always the validated `dict[str, Any]`:

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`before_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_tool_execute) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs) -> ValidatedToolArgs` | Modify args before execution |
| [`after_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_execute) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, result: Any) -> Any` | Modify execution result |
| [`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, handler: WrapToolExecuteHandler) -> Any` | Wrap execution |
| [`on_tool_execute_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_execute_error) | `(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, error: Exception) -> Any` | Handle execution errors (see [error hooks](#error-hooks)) |

To skip execution and provide a replacement result, raise [`SkipToolExecution(result)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution) from `before_tool_execute` or `wrap_tool_execute`.

#### Tool preparation

Capabilities can filter or modify which tool definitions the model sees on each step via [`prepare_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_tools). This controls tool **visibility**, not execution -- use execution hooks for that.

prepare_tools_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class HideDangerousTools(AbstractCapability[Any]):
    """Hides tools matching certain name prefixes from the model."""

    hidden_prefixes: tuple[str, ...] = ('delete_', 'drop_')

    async def prepare_tools(
        self, ctx: RunContext[Any], tool_defs: list[ToolDefinition]
    ) -> list[ToolDefinition]:
        return [td for td in tool_defs if not any(td.name.startswith(p) for p in self.hidden_prefixes)]


agent = Agent('openai:gpt-5.2', capabilities=[HideDangerousTools()])


@agent.tool_plain
def delete_file(path: str) -> str:
    """Delete a file."""
    return f'deleted {path}'


@agent.tool_plain
def read_file(path: str) -> str:
    """Read a file."""
    return f'contents of {path}'


result = agent.run_sync('hello')
# The model only sees `read_file`, not `delete_file`
```

The list includes all tool kinds (function, output, unapproved) -- use `tool_def.kind` to distinguish. This hook runs after the agent-level [`prepare_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc). For simple cases, the built-in [`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) capability wraps a callable without needing a custom subclass.
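A minimal sketch of the `PrepareTools` shortcut, assuming it accepts the prepare function as its argument (the `read_only` filter is illustrative):

```python
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import PrepareTools


async def read_only(ctx: RunContext[Any], tool_defs: list[ToolDefinition]) -> list[ToolDefinition]:
    # Keep only tools whose names start with 'read_'.
    return [td for td in tool_defs if td.name.startswith('read_')]


agent = Agent('openai:gpt-5.2', capabilities=[PrepareTools(read_only)])
```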
#### Event stream hook

For runs with event streaming ([`run_stream_events`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events), [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), [UI event streams](/docs/ai/integrations/ui/overview)), capabilities can observe or transform the event stream:

| Hook | Signature | Purpose |
| --- | --- | --- |
| [`wrap_run_event_stream`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run_event_stream) | `(ctx: RunContext, *, stream: AsyncIterable[AgentStreamEvent]) -> AsyncIterable[AgentStreamEvent]` | Observe, filter, or transform streamed events |

event_stream_example.py

```python
from collections.abc import AsyncIterable
from dataclasses import dataclass
from typing import Any

from pydantic_ai import AgentStreamEvent, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import (
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartStartEvent,
    TextPart,
)


@dataclass
class StreamAuditor(AbstractCapability[Any]):
    """Logs tool calls and text output during streamed runs."""

    async def wrap_run_event_stream(
        self,
        ctx: RunContext[Any],
        *,
        stream: AsyncIterable[AgentStreamEvent],
    ) -> AsyncIterable[AgentStreamEvent]:
        async for event in stream:
            if isinstance(event, FunctionToolCallEvent):
                print(f'Tool called: {event.part.tool_name}')
            elif isinstance(event, FunctionToolResultEvent):
                print(f'Tool result: {event.tool_return.content!r}')
            elif isinstance(event, PartStartEvent) and isinstance(event.part, TextPart):
                print(f'Text: {event.part.content!r}')
            yield event
```

For building web UIs that transform streamed events into protocol-specific formats (like SSE), see the [UI event streams](/docs/ai/integrations/ui/overview) documentation and the [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) base class.

#### Error hooks

Each lifecycle point has an `on_*_error` hook -- the error counterpart to `after_*`.
While `after_*` hooks fire on success, `on_*_error` hooks fire on failure (after `wrap_*` has had its chance to recover):

```
before_X → wrap_X(handler)
             ├─ success ─────────→ after_X (modify result)
             └─ failure → on_X_error
                            ├─ re-raise ──→ (error propagates, after_X not called)
                            └─ recover ───→ after_X (modify recovered result)
```

Error hooks use **raise-to-propagate, return-to-recover** semantics:

- **Raise the original error** -- propagates the error unchanged _(default)_
- **Raise a different exception** -- transforms the error
- **Return a result** -- suppresses the error and uses the returned value

| Hook | Fires when | Recovery type |
| --- | --- | --- |
| [`on_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_run_error) | Agent run fails | Return [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) |
| [`on_node_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_node_run_error) | Graph node fails | Return next node or [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) |
| [`on_model_request_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_model_request_error) | Model request fails | Return [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) |
| [`on_tool_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_validate_error) | Tool validation fails | Return validated args `dict` |
| [`on_tool_execute_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_execute_error) | Tool execution fails | Return any tool result |

error_hooks_example.py

```python
from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelResponse, TextPart


@dataclass
class ErrorLogger(AbstractCapability[Any]):
    """Logs all errors that occur during agent runs."""

    errors: list[str] = field(default_factory=list)

    async def on_model_request_error(
        self, ctx: RunContext[Any], *, request_context: ModelRequestContext, error: Exception
    ) -> ModelResponse:
        self.errors.append(f'Model error: {error}')
        # Return a fallback response to recover
        return ModelResponse(parts=[TextPart(content='Service temporarily unavailable.')])

    async def on_tool_execute_error(
        self, ctx: RunContext[Any], *, call: Any, tool_def: Any, args: dict[str, Any], error: Exception
    ) -> Any:
        self.errors.append(f'Tool {call.tool_name} failed: {error}')
        raise error  # Re-raise to let the normal retry flow handle it
```

### Wrapping capabilities

[`WrapperCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WrapperCapability) wraps another capability and delegates all methods to it -- similar to [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) for toolsets.
Subclass it to override specific methods while delegating the rest:

wrapper_capability_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import ModelRequestContext, RunContext
from pydantic_ai.capabilities import WrapperCapability


@dataclass
class AuditedCapability(WrapperCapability[Any]):
    """Wraps any capability and logs its model requests."""

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        print(f'Request from {type(self.wrapped).__name__}')
        return await super().before_model_request(ctx, request_context)
```

The built-in [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) is an example of a `WrapperCapability` -- it wraps another capability and prefixes its tool names.

### Per-run state isolation

By default, a capability instance is shared across all runs of an agent. If your capability accumulates mutable state that should not leak between runs, override [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) to return a fresh instance:

per_run_state.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class RequestCounter(AbstractCapability[Any]):
    """Counts model requests per run."""

    count: int = 0

    async def for_run(self, ctx: RunContext[Any]) -> 'RequestCounter':
        return RequestCounter()  # fresh instance for each run

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        self.count += 1
        return request_context


counter = RequestCounter()
agent = Agent('openai:gpt-5.2', capabilities=[counter])

# The shared counter stays at 0 because for_run returns a fresh instance
agent.run_sync('first run')
agent.run_sync('second run')
print(counter.count)
#> 0
```

### Composition and middleware semantics

When multiple capabilities are passed to an agent, they are composed into a single [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) that follows **middleware semantics** -- the same pattern used by web frameworks like Django and Starlette:

- **Configuration** is merged: instructions concatenate, model settings merge additively (later capabilities override earlier ones), toolsets combine, builtin tools collect.
- **`before_*`** hooks fire in capability order (outermost to innermost): `cap1 → cap2 → cap3`.
- **`after_*`** hooks fire in reverse order (innermost to outermost): `cap3 → cap2 → cap1`.
- **`wrap_*`** hooks nest as middleware: `cap1` wraps `cap2` wraps `cap3` wraps the actual operation. The first capability is the **outermost** layer.
- **`get_wrapper_toolset`** follows the same nesting: the first capability's wrapper is outermost.

This means the first capability in the list has the first and last say on the operation -- it sees the original input before any other capability, and it sees the final output after all inner capabilities have processed it.

### Ordering

By default, capabilities are composed in the order you list them.
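To see the default ordering in practice, here's a small sketch (the `Tagged` capability and the `order` list are illustrative) that records when each capability's `before_*` and `after_*` hooks fire:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelResponse

order: list[str] = []


@dataclass
class Tagged(AbstractCapability[Any]):
    """Records the firing order of its hooks."""

    tag: str

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        order.append(f'before:{self.tag}')
        return request_context

    async def after_model_request(
        self, ctx: RunContext[Any], *, request_context: ModelRequestContext, response: ModelResponse
    ) -> ModelResponse:
        order.append(f'after:{self.tag}')
        return response


agent = Agent('test', capabilities=[Tagged('first'), Tagged('second')])
agent.run_sync('hello')
print(order)
# Per the middleware semantics above, we'd expect:
#> ['before:first', 'before:second', 'after:second', 'after:first']
```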
When a capability needs to be at a specific position regardless of where the user lists it, override [`get_ordering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_ordering) to return a [`CapabilityOrdering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityOrdering):

capability_ordering_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai.capabilities import (
    AbstractCapability,
    CapabilityOrdering,
    CombinedCapability,
)


@dataclass
class InstrumentationCapability(AbstractCapability[Any]):
    """Must wrap all other capabilities to trace everything."""

    def get_ordering(self) -> CapabilityOrdering:
        return CapabilityOrdering(position='outermost')


@dataclass
class PlainCapability(AbstractCapability[Any]):
    pass


# InstrumentationCapability ends up first regardless of list order
combined = CombinedCapability([PlainCapability(), InstrumentationCapability()])
assert type(combined.capabilities[0]) is InstrumentationCapability
```

The available constraints are:

- **`position`** -- `'outermost'` or `'innermost'`. Places the capability in a tier before (or after) all capabilities without that position. Multiple capabilities can share a tier; original list order breaks ties within it.
- **`wraps`** -- list of capabilities this one wraps around (is outside of). Each entry can be a capability **type** (matches all instances via `issubclass`) or a specific **instance** (matches by identity). Use when your capability needs to see the output of another: `CapabilityOrdering(wraps=[OtherCapability])`.
- **`wrapped_by`** -- list of capabilities that wrap around this one (are outside of it). Accepts types or instances, like `wraps`. The inverse of `wraps`.
- **`requires`** -- list of capability types that must be present. Raises [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) if any are missing. Does not imply ordering.

When constraints are declared, [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) topologically sorts its children at construction time, preserving user-provided order as a tiebreaker.

[`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) supports ordering via the `ordering` parameter, so you can declare ordering constraints without subclassing:

hooks_ordering_example.py

```python
from pydantic_ai.capabilities import CapabilityOrdering, CombinedCapability, Hooks

logging_hooks = Hooks(ordering=CapabilityOrdering(position='outermost'))
rate_limit_hooks = Hooks(ordering=CapabilityOrdering(wrapped_by=[logging_hooks]))

# logging_hooks ends up outermost; rate_limit_hooks is wrapped by it
combined = CombinedCapability([rate_limit_hooks, logging_hooks])
assert combined.capabilities[0] is logging_hooks
assert combined.capabilities[1] is rate_limit_hooks
```

## Examples

### Guardrail (PII redaction)

A guardrail is a capability that intercepts model requests or responses to enforce safety rules.
Here's one that scans model responses for potential PII and redacts it:

guardrail_example.py

```python
import re
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelResponse, TextPart


@dataclass
class PIIRedactionGuardrail(AbstractCapability[Any]):
    """Redacts email addresses and phone numbers from model responses."""

    async def after_model_request(
        self,
        ctx: RunContext[Any],
        *,
        request_context: ModelRequestContext,
        response: ModelResponse,
    ) -> ModelResponse:
        for part in response.parts:
            if isinstance(part, TextPart):
                # Redact email addresses
                part.content = re.sub(
                    r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
                    '[EMAIL REDACTED]',
                    part.content,
                )
                # Redact phone numbers (simple US pattern)
                part.content = re.sub(
                    r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b',
                    '[PHONE REDACTED]',
                    part.content,
                )
        return response


agent = Agent('openai:gpt-5.2', capabilities=[PIIRedactionGuardrail()])
result = agent.run_sync("What's Jane's contact info?")
print(result.output)
#> You can reach Jane at [EMAIL REDACTED] or [PHONE REDACTED].
```

### Logging middleware

The `wrap_*` pattern is useful when you need to observe or time both the input and output of an operation. Here's a capability that logs every model request and tool call:

logging_middleware_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext, ToolDefinition
from pydantic_ai.capabilities import (
    AbstractCapability,
    WrapModelRequestHandler,
    WrapToolExecuteHandler,
)
from pydantic_ai.messages import ModelResponse, ToolCallPart


@dataclass
class VerboseLogging(AbstractCapability[Any]):
    """Logs model requests and tool executions."""

    async def wrap_model_request(
        self,
        ctx: RunContext[Any],
        *,
        request_context: ModelRequestContext,
        handler: WrapModelRequestHandler,
    ) -> ModelResponse:
        print(f' Model request (step {ctx.run_step}, {len(request_context.messages)} messages)')
        #> Model request (step 1, 1 messages)
        response = await handler(request_context)
        print(f' Model response: {len(response.parts)} parts')
        #> Model response: 1 parts
        return response

    async def wrap_tool_execute(
        self,
        ctx: RunContext[Any],
        *,
        call: ToolCallPart,
        tool_def: ToolDefinition,
        args: dict[str, Any],
        handler: WrapToolExecuteHandler,
    ) -> Any:
        print(f' Tool call: {call.tool_name}({args})')
        result = await handler(args)
        print(f' Tool result: {result!r}')
        return result


agent = Agent('openai:gpt-5.2', capabilities=[VerboseLogging()])
result = agent.run_sync('hello')
print(f'Output: {result.output}')
#> Output: Hello! How can I help you today?
```

## Pydantic AI Harness

[**Pydantic AI Harness**](/docs/ai/harness/overview) is the official capability library for Pydantic AI -- standalone capabilities like memory, guardrails, context management, and [code mode](https://github.com/pydantic/pydantic-ai-harness/tree/main/pydantic_ai_harness/code_mode) live there rather than in core. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full breakdown, or jump to the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix).

## Third-party capabilities

Capabilities are the recommended way for third-party packages to extend Pydantic AI, since they can bundle tools with hooks, instructions, and model settings.
See [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem, including [third-party toolsets](/docs/ai/tools-toolsets/toolsets#third-party-toolsets) that can also be wrapped as capabilities.

### Task Management

Capabilities for task planning and progress tracking help agents organize complex work:

- [`pydantic-ai-todo`](https://github.com/vstorm-co/pydantic-ai-todo) - `TodoCapability` with `add_todo`, `read_todos`, `write_todos`, `update_todo_status`, and `remove_todo` tools. Supports subtasks, dependencies, and PostgreSQL persistence. Also available as a lower-level `TodoToolset`.

### Context Management

Capabilities for managing long conversations help agents stay within context limits:

- [`summarization-pydantic-ai`](https://github.com/vstorm-co/summarization-pydantic-ai) - Four capabilities for managing long conversations: `ContextManagerCapability` (real-time token tracking, auto-compression at a configurable threshold, and large tool-output truncation); `SummarizationCapability` (LLM-powered history compression); `SlidingWindowCapability` (zero-cost message trimming); `LimitWarnerCapability` (injects a finish-soon hint before hard context limits). Also available as standalone `history_processors`: `SummarizationProcessor`, `SlidingWindowProcessor`, and `LimitWarnerProcessor`.

### Multi-Agent Orchestration

Capabilities for spawning and delegating to specialized subagents help agents tackle complex, parallelizable work:

- [`subagents-pydantic-ai`](https://github.com/vstorm-co/subagents-pydantic-ai) - `SubAgentCapability` adds tools for multi-agent delegation: `task` (spawn a subagent), `check_task`, `wait_tasks`, `list_active_tasks`, `soft_cancel_task`, `hard_cancel_task`, and `answer_subagent`. Supports sync, async, and auto execution modes, nested subagents, and runtime agent creation. Also available as a lower-level toolset via `create_subagent_toolset`.

### Guardrails & Safety

Capabilities for cost control, input/output filtering, and tool permissions help keep agents safe and within budget:

- [`pydantic-ai-shields`](https://github.com/vstorm-co/pydantic-ai-shields) - Ready-to-use guardrail capabilities: `CostTracking` (tracks token usage and USD cost per run, raises `BudgetExceededError` on budget overrun); `ToolGuard` (block or require approval for specific tools); `InputGuard` and `OutputGuard` (custom sync or async validation functions); `PromptInjection`, `PiiDetector`, `SecretRedaction`, `BlockedKeywords`, and `NoRefusals` content shields.

### File Operations & Sandboxing

Capabilities for filesystem access and sandboxed code execution help agents work with files and run code safely:

- [`pydantic-ai-backend`](https://github.com/vstorm-co/pydantic-ai-backend) - `ConsoleCapability` registers `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`, and `execute` tools with a fine-grained permission system. Backends include `StateBackend` (in-memory, for testing), `LocalBackend` (real filesystem), `DockerSandbox` (isolated container execution), and `CompositeBackend` (routing across backends). Also available as a lower-level `ConsoleToolset`.

### Agent Skills

Capabilities that implement [Agent Skills](https://agentskills.io) help agents efficiently discover and perform specific tasks:

- [`pydantic-ai-skills`](https://github.com/DougTrajano/pydantic-ai-skills) - `SkillsCapability` implements Agent Skills support with progressive disclosure (load skills on-demand to reduce tokens).
  Supports filesystem and programmatic skills; compatible with [agentskills.io](https://agentskills.io).

To add your package to this page, open a pull request.

## Publishing capabilities

To make a custom capability usable in [agent specs](/docs/ai/core-concepts/agent-spec), it needs a [`get_serialization_name`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name) (defaults to the class name) and a constructor that accepts serializable arguments. The default [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) implementation calls `cls(*args, **kwargs)`, so for simple dataclasses no override is needed:

custom_spec_capability.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, AgentSpec
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class RateLimit(AbstractCapability[Any]):
    """Limits requests per minute."""

    rpm: int = 60


# In YAML: `- RateLimit: {rpm: 30}`
# In Python:
agent = Agent.from_spec(
    AgentSpec(model='test', capabilities=[{'RateLimit': {'rpm': 30}}]),
    custom_capability_types=[RateLimit],
)
```

Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) or [`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file).

Override [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) when the constructor takes types that can't be represented in YAML/JSON. The spec fields should mirror the dataclass fields, but with serializable types:

from_spec_override_example.py

```python
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import RunContext, ToolDefinition
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class ConditionalTools(AbstractCapability[Any]):
    """Hides tools unless a condition is met."""

    condition: Callable[[RunContext[Any]], bool]  # not serializable
    hidden_tools: list[str] = field(default_factory=list)

    @classmethod
    def from_spec(cls, hidden_tools: list[str]) -> 'ConditionalTools[Any]':
        # In the spec, there's no condition callable -- always hide
        return cls(condition=lambda ctx: True, hidden_tools=hidden_tools)

    async def prepare_tools(
        self, ctx: RunContext[Any], tool_defs: list[ToolDefinition]
    ) -> list[ToolDefinition]:
        if self.condition(ctx):
            return [td for td in tool_defs if td.name not in self.hidden_tools]
        return tool_defs
```

In YAML this would be `- ConditionalTools: {hidden_tools: [dangerous_tool]}`. In Python code, the full constructor is available: `ConditionalTools(condition=my_check, hidden_tools=['dangerous_tool'])`.

See [Extensibility](/docs/ai/guides/extensibility) for packaging conventions and the broader extension ecosystem.

---

# [Dependencies](https://pydantic.dev/docs/ai/core-concepts/dependencies/)

# Dependencies

Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](/docs/ai/core-concepts/agent#system-prompts), [tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions).
Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic". This should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production.

## Defining Dependencies

Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), [dataclasses](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) are generally a convenient container when your dependencies include multiple objects.

Here's an example of defining an agent that requires dependencies. (**Note:** dependencies aren't actually used in this example, see [Accessing Dependencies](#accessing-dependencies) below)

unused_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent


@dataclass
class MyDeps:  # (1)
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-5.2',
    deps_type=MyDeps,  # (2)
)


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run(
            'Tell me a joke.',
            deps=deps,  # (3)
        )
        print(result.output)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
```

1. Define a dataclass to hold dependencies.
2. Pass the dataclass type to the `deps_type` argument of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__). **Note**: we're passing the type here, NOT an instance; this parameter is not actually used at runtime, it's here so we can get full type checking of the agent.
3. When running the agent, pass an instance of the dataclass to the `deps` parameter.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Accessing Dependencies

Dependencies are accessed through the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) type; this should be the first parameter of system prompt functions etc.

system_prompt_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-5.2',
    deps_type=MyDeps,
)


@agent.system_prompt  # (1)
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)
    response = await ctx.deps.http_client.get(  # (3)
        'https://example.com',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},  # (4)
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run('Tell me a joke.', deps=deps)
        print(result.output)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
```

1. [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) may optionally be passed to a [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt) function as the only argument.
2. [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is parameterized with the type of the dependencies; if this type is incorrect, static type checkers will raise an error.
3. Access dependencies through the [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps) attribute.
4. Access dependencies through the [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps) attribute.
_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

In addition to [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps), [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) provides access to the running agent via [`.agent`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.agent), which is useful when [tools](/docs/ai/tools-toolsets/tools), [hooks](/docs/ai/core-concepts/hooks), or [capabilities](/docs/ai/core-concepts/capabilities) need to read agent properties like [`name`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.name) or [`output_type`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_type).

Dependency fields can also be referenced in instructions and descriptions via [template strings](/docs/ai/core-concepts/agent-spec#template-strings) -- for example, `TemplateStr('Hello {{name}}')` renders `name` from the deps object at runtime. This is especially useful in [agent specs](/docs/ai/core-concepts/agent-spec) where callables aren't available.

### Asynchronous vs. Synchronous dependencies

[System prompt functions](/docs/ai/core-concepts/agent#system-prompts), [function tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions) are all run in the async context of an agent run.

If these functions are not coroutines (i.e. not defined with `async def`), they are called with [`run_in_executor`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor) in a thread pool. It's therefore marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too.

`run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies

Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync` -- `run_sync` is just a wrapper around `run` and agents are always run in an async context.

Here's the same example as above, but with a synchronous dependency:

sync_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.Client  # (1)


agent = Agent(
    'openai:gpt-5.2',
    deps_type=MyDeps,
)


@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)
    response = ctx.deps.http_client.get(
        'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
    )
    response.raise_for_status()
    return f'Prompt: {response.text}'


async def main():
    deps = MyDeps('foobar', httpx.Client())
    result = await agent.run(
        'Tell me a joke.',
        deps=deps,
    )
    print(result.output)
    #> Did you hear about the toothpaste scandal? They called it Colgate.
```

1. Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`.
2. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Full Example

As well as system prompts, dependencies can be used in [tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions).
full_example.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient


agent = Agent(
    'openai:gpt-5.2',
    deps_type=MyDeps,
)


@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    response = await ctx.deps.http_client.get('https://example.com')
    response.raise_for_status()
    return f'Prompt: {response.text}'


@agent.tool  # (1)
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str:
    response = await ctx.deps.http_client.get(
        'https://example.com#jokes',
        params={'subject': subject},
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
    )
    response.raise_for_status()
    return response.text


@agent.output_validator  # (2)
async def validate_output(ctx: RunContext[MyDeps], output: str) -> str:
    response = await ctx.deps.http_client.post(
        'https://example.com#validate',
        headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
        params={'query': output},
    )
    if response.status_code == 400:
        raise ModelRetry(f'invalid response: {response.text}')
    response.raise_for_status()
    return output


async def main():
    async with httpx.AsyncClient() as client:
        deps = MyDeps('foobar', client)
        result = await agent.run('Tell me a joke.', deps=deps)
        print(result.output)
        #> Did you hear about the toothpaste scandal? They called it Colgate.
```

1. To pass `RunContext` to a tool, use the [`tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator.
2. `RunContext` may optionally be passed to an [`output_validator`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_validator) function as the first argument.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Overriding Dependencies

When testing agents, it's useful to be able to customise dependencies. While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent. This is done via the [`override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) method on the agent.

joke_app.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
    api_key: str
    http_client: httpx.AsyncClient

    async def system_prompt_factory(self) -> str:  # (1)
        response = await self.http_client.get('https://example.com')
        response.raise_for_status()
        return f'Prompt: {response.text}'


joke_agent = Agent('openai:gpt-5.2', deps_type=MyDeps)


@joke_agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
    return await ctx.deps.system_prompt_factory()  # (2)


async def application_code(prompt: str) -> str:  # (3)
    ...
    ...
    # now deep within application code we call our agent
    async with httpx.AsyncClient() as client:
        app_deps = MyDeps('foobar', client)
        result = await joke_agent.run(prompt, deps=app_deps)  # (4)
    return result.output
```

1. Define a method on the dependency to make the system prompt easier to customise.
2. Call the system prompt factory from within the system prompt function.
3. Application code that calls the agent; in a real application this might be an API endpoint.
4. Call the agent from within the application code; in a real application this call might be deep within a call stack. Note that `app_deps` here will NOT be used when deps are overridden.
_(This example is complete, it can be run "as is")_

test_joke_app.py

```python
from joke_app import MyDeps, application_code, joke_agent


class TestMyDeps(MyDeps):  # (1)
    async def system_prompt_factory(self) -> str:
        return 'test prompt'


async def test_application_code():
    test_deps = TestMyDeps('test_key', None)  # (2)
    with joke_agent.override(deps=test_deps):  # (3)
        joke = await application_code('Tell me a joke.')  # (4)
    assert joke.startswith('Did you hear about the toothpaste scandal?')
```

1. Define a subclass of `MyDeps` in tests to customise the system prompt factory.
2. Create an instance of the test dependency; we don't need to pass an `http_client` here as it's not used.
3. Override the dependencies of the agent for the duration of the `with` block; `test_deps` will be used when the agent is run.
4. Now we can safely call our application code; the agent will use the overridden dependencies.

## Examples

The following examples demonstrate how to use dependencies in Pydantic AI:

- [Weather Agent](/docs/ai/examples/weather-agent)
- [SQL Generation](/docs/ai/examples/sql-gen)
- [RAG](/docs/ai/examples/rag)

---

# [Direct Model Requests](https://pydantic.dev/docs/ai/core-concepts/direct/)

# Direct Model Requests

The `direct` module provides low-level methods for making imperative requests to LLMs where the only abstraction is input and output schema translation, enabling you to use all models with the same API.

These methods are thin wrappers around the [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) implementations, offering a simpler interface when you don't need the full functionality of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent).

The following functions are available:

- [`model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request): Make a non-streamed async request to a model
- [`model_request_sync`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_sync): Make a non-streamed synchronous request to a model
- [`model_request_stream`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream): Make a streamed async request to a model
- [`model_request_stream_sync`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream_sync): Make a streamed sync request to a model

## Basic Example

Here's a simple example demonstrating how to use the direct API to make a basic request:

direct_basic.py

```python
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')]
)
print(model_response.parts[0].content)
#> The capital of France is Paris.
print(model_response.usage)
#> RequestUsage(input_tokens=56, output_tokens=7)
```

_(This example is complete, it can be run "as is")_

## Advanced Example with Tool Calling

You can also use the direct API to work with function/tool calling.
Even here we can use Pydantic to generate the JSON schema for the tool:

```python
from typing import Literal

from pydantic import BaseModel

from pydantic_ai import ModelRequest, ToolDefinition
from pydantic_ai.direct import model_request
from pydantic_ai.models import ModelRequestParameters


class Divide(BaseModel):
    """Divide two numbers."""

    numerator: float
    denominator: float
    on_inf: Literal['error', 'infinity'] = 'infinity'


async def main():
    # Make a request to the model with tool access
    model_response = await model_request(
        'openai:gpt-5-nano',
        [ModelRequest.user_text_prompt('What is 123 / 456?')],
        model_request_parameters=ModelRequestParameters(
            function_tools=[
                ToolDefinition(
                    name=Divide.__name__.lower(),
                    description=Divide.__doc__,
                    parameters_json_schema=Divide.model_json_schema(),
                )
            ],
            allow_text_output=True,  # Allow model to either use tools or respond directly
        ),
    )
    print(model_response)
    """
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='divide',
                args={'numerator': '123', 'denominator': '456'},
                tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b',
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=7),
        model_name='gpt-5-nano',
        timestamp=datetime.datetime(...),
    )
    """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## When to Use the direct API vs Agent

The direct API is ideal when:

1. You need more direct control over model interactions
2. You want to implement custom behavior around model requests
3. You're building your own abstractions on top of model interactions

For most application use cases, the higher-level [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) API provides a more convenient interface with additional features such as built-in tool execution, retrying, structured output parsing, and more.

## OpenTelemetry or Logfire Instrumentation

As with [agents](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent), you can enable OpenTelemetry/Logfire instrumentation with just a few extra lines:

direct_instrumented.py

```python
import logfire

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

logfire.configure()
logfire.instrument_pydantic_ai()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
)
print(model_response.parts[0].content)
#> The capital of France is Paris.
```

_(This example is complete, it can be run "as is")_

You can also enable OpenTelemetry on a per-call basis:

direct_instrumented.py

```python
import logfire

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

logfire.configure()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
    instrument=True
)
print(model_response.parts[0].content)
#> The capital of France is Paris.
```

See [Debugging and Monitoring](/docs/ai/integrations/logfire) for more details, including how to instrument with plain OpenTelemetry without Logfire.

---

# [Hooks](https://pydantic.dev/docs/ai/core-concepts/hooks/)

# Hooks

Hooks let you intercept and modify agent behavior at every stage of a run -- model requests, tool calls, streaming events -- using simple decorators or constructor arguments. No subclassing needed.
The [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) capability is the recommended way to add [lifecycle hooks](/docs/ai/core-concepts/capabilities#hooking-into-the-lifecycle) for application-level concerns like logging, metrics, and lightweight validation. For reusable capabilities that combine hooks with tools, instructions, or model settings, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) instead -- see [Building custom capabilities](/docs/ai/core-concepts/capabilities#building-custom-capabilities).

## Quick start

Create a [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) instance, register hooks via `@hooks.on.*` decorators, and pass it to your agent:

hooks_decorator.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()


@hooks.on.before_model_request
async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    print(f'Sending {len(request_context.messages)} messages to the model')
    #> Sending 1 messages to the model
    return request_context


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello!')
print(result.output)
#> success (no tool calls)
```

## Registering hooks

### Decorator registration

The `hooks.on` namespace provides decorator methods for every lifecycle hook. Use them as bare decorators or with parameters:

```python
# Bare decorator
@hooks.on.before_model_request
async def my_hook(ctx, request_context):
    return request_context


# With parameters (timeout, tool filter)
@hooks.on.before_model_request(timeout=5.0)
async def my_timed_hook(ctx, request_context):
    return request_context
```

Multiple hooks can be registered for the same event -- they fire in registration order.

### Constructor kwargs

You can also pass hook functions directly to the [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) constructor:

hooks_constructor.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks


async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    print(f'Sending {len(request_context.messages)} messages to the model')
    #> Sending 1 messages to the model
    return request_context


agent = Agent('test', capabilities=[Hooks(before_model_request=log_request)])
result = agent.run_sync('Hello!')
print(result.output)
#> success (no tool calls)
```

Both sync and async hook functions are accepted. Sync functions are automatically wrapped for async execution.

## Hook types

### Run hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_run` | `before_run=` | `before_run` |
| `after_run` | `after_run=` | `after_run` |
| `run` | `run=` | `wrap_run` |
| `run_error` | `run_error=` | `on_run_error` |

Run hooks fire once per agent run. `wrap_run` (registered via `hooks.on.run`) wraps the entire run and supports error recovery.

### Node hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_node_run` | `before_node_run=` | `before_node_run` |
| `after_node_run` | `after_node_run=` | `after_node_run` |
| `node_run` | `node_run=` | `wrap_node_run` |
| `node_run_error` | `node_run_error=` | `on_node_run_error` |

Node hooks fire for each graph step (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`).
Note

`wrap_node_run` hooks are called automatically by [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), and [`agent_run.next()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next), but **not** when iterating with bare `async for node in agent_run:`.

### Model request hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_model_request` | `before_model_request=` | `before_model_request` |
| `after_model_request` | `after_model_request=` | `after_model_request` |
| `model_request` | `model_request=` | `wrap_model_request` |
| `model_request_error` | `model_request_error=` | `on_model_request_error` |

Model request hooks fire around each LLM call. `ModelRequestContext` bundles `model`, `messages`, `model_settings`, and `model_request_parameters`. To swap the model for a given request, set `request_context.model` to a different [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance.

To skip the model call entirely, raise [`SkipModelRequest(response)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) from `before_model_request` or `model_request` (wrap).

### Tool validation hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_tool_validate` | `before_tool_validate=` | `before_tool_validate` |
| `after_tool_validate` | `after_tool_validate=` | `after_tool_validate` |
| `tool_validate` | `tool_validate=` | `wrap_tool_validate` |
| `tool_validate_error` | `tool_validate_error=` | `on_tool_validate_error` |

Validation hooks fire when the model's JSON arguments are parsed and validated. All tool hooks receive `call` ([`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)) and `tool_def` ([`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)) parameters.

To skip validation, raise [`SkipToolValidation(args)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation) from `before_tool_validate` or `tool_validate` (wrap).

### Tool execution hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_tool_execute` | `before_tool_execute=` | `before_tool_execute` |
| `after_tool_execute` | `after_tool_execute=` | `after_tool_execute` |
| `tool_execute` | `tool_execute=` | `wrap_tool_execute` |
| `tool_execute_error` | `tool_execute_error=` | `on_tool_execute_error` |

Execution hooks fire when the tool function runs. `args` is always the validated `dict[str, Any]`.

To skip execution, raise [`SkipToolExecution(result)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution) from `before_tool_execute` or `tool_execute` (wrap).

### Tool preparation

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `prepare_tools` | `prepare_tools=` | `prepare_tools` |

Filters or modifies tool definitions the model sees on each step. Controls visibility, not execution.

### Event stream hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `run_event_stream` | `run_event_stream=` | `wrap_run_event_stream` |
| `event` | `event=` | _(per-event convenience)_ |

`run_event_stream` wraps the full event stream as an async generator.
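For instance, here's a minimal sketch that brackets the stream with setup/teardown prints; it assumes the decorator form mirrors `wrap_run_event_stream`'s `(ctx, *, stream)` signature shown in the capabilities docs, and the `bracket_stream` hook is illustrative:

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()


@hooks.on.run_event_stream
async def bracket_stream(
    ctx: RunContext[None], *, stream: AsyncIterable[AgentStreamEvent]
) -> AsyncIterable[AgentStreamEvent]:
    print('stream started')
    async for event in stream:
        yield event  # pass every event through unchanged
    print('stream finished')


agent = Agent('test', capabilities=[hooks])
```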
### Tool validation hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_tool_validate` | `before_tool_validate=` | `before_tool_validate` |
| `after_tool_validate` | `after_tool_validate=` | `after_tool_validate` |
| `tool_validate` | `tool_validate=` | `wrap_tool_validate` |
| `tool_validate_error` | `tool_validate_error=` | `on_tool_validate_error` |

Validation hooks fire when the model's JSON arguments are parsed and validated. All tool hooks receive `call` ([`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)) and `tool_def` ([`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)) parameters. To skip validation, raise [`SkipToolValidation(args)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation) from `before_tool_validate` or `tool_validate` (wrap).

### Tool execution hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `before_tool_execute` | `before_tool_execute=` | `before_tool_execute` |
| `after_tool_execute` | `after_tool_execute=` | `after_tool_execute` |
| `tool_execute` | `tool_execute=` | `wrap_tool_execute` |
| `tool_execute_error` | `tool_execute_error=` | `on_tool_execute_error` |

Execution hooks fire when the tool function runs. `args` is always the validated `dict[str, Any]`. To skip execution, raise [`SkipToolExecution(result)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution) from `before_tool_execute` or `tool_execute` (wrap).
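For example, to stub out a dangerous tool without running it, you can raise `SkipToolExecution` with a canned result. This sketch uses the same keyword signature as the tool hook filtering example below; the `delete_user` tool is purely illustrative:

```python
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import Hooks
from pydantic_ai.exceptions import SkipToolExecution
from pydantic_ai.messages import ToolCallPart

hooks = Hooks()


@hooks.on.before_tool_execute(tools=['delete_user'])
async def block_deletes(
    ctx: RunContext[None],
    *,
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: dict[str, Any],
) -> dict[str, Any]:
    # Skip the real tool and hand the model a canned result instead
    raise SkipToolExecution('Deletion is disabled in this environment.')


agent = Agent('test', capabilities=[hooks])


@agent.tool_plain
def delete_user(user_id: str) -> str:
    return f'deleted {user_id}'  # never runs; the hook above skips execution
```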
### Tool preparation

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `prepare_tools` | `prepare_tools=` | `prepare_tools` |

Filters or modifies the tool definitions the model sees on each step. Controls visibility, not execution.

### Event stream hooks

| `hooks.on.` | Constructor kwarg | `AbstractCapability` method |
| --- | --- | --- |
| `run_event_stream` | `run_event_stream=` | `wrap_run_event_stream` |
| `event` | `event=` | _(per-event convenience)_ |

`run_event_stream` wraps the full event stream as an async generator. `event` is a convenience -- it fires for each individual event during a streamed run:

hooks\_event.py

```python
from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()
event_count = 0


@hooks.on.event
async def count_events(ctx: RunContext[None], event: AgentStreamEvent) -> AgentStreamEvent:
    global event_count
    event_count += 1
    return event


agent = Agent('test', capabilities=[hooks])
```

## Tool hook filtering

Tool hooks (validation and execution) support a `tools` parameter to target specific tools by name:

hooks\_tool\_filter.py

```python
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import Hooks
from pydantic_ai.messages import ToolCallPart

hooks = Hooks()
call_log: list[str] = []


@hooks.on.before_tool_execute(tools=['send_email'])
async def audit_dangerous_tools(
    ctx: RunContext[None],
    *,
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: dict[str, Any],
) -> dict[str, Any]:
    call_log.append(f'audit: {call.tool_name}')
    return args


agent = Agent('test', capabilities=[hooks])


@agent.tool_plain
def send_email(to: str) -> str:
    return f'sent to {to}'


result = agent.run_sync('Send an email to test@example.com')
print(call_log)
#> ['audit: send_email']
```

The `tools` parameter accepts a sequence of tool names. The hook only fires for matching tools -- other tool calls pass through unaffected.

## Timeouts

Each hook supports an optional `timeout` in seconds. If the hook exceeds the timeout, a [`HookTimeoutError`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HookTimeoutError) is raised:

hooks\_timeout.py

```python
import asyncio

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks, HookTimeoutError

hooks = Hooks()


@hooks.on.before_model_request(timeout=0.01)
async def slow_hook(
    ctx: RunContext[None], request_context: ModelRequestContext
) -> ModelRequestContext:
    await asyncio.sleep(10)  # Will be interrupted by timeout
    return request_context  # pragma: no cover


agent = Agent('test', capabilities=[hooks])

try:
    agent.run_sync('Hello')
except HookTimeoutError as e:
    print(f'Hook timed out: {e.hook_name} after {e.timeout}s')
    #> Hook timed out: before_model_request after 0.01s
```

Timeouts are set via the decorator parameter (`@hooks.on.before_model_request(timeout=5.0)`) or via the constructor when using kwargs.

## Wrap hooks

Wrap hooks let you surround an operation with setup/teardown logic. In the `hooks.on` namespace, wrap hooks drop the `wrap_` prefix -- `hooks.on.model_request` corresponds to `wrap_model_request`:

hooks\_wrap.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks, WrapModelRequestHandler
from pydantic_ai.messages import ModelResponse

hooks = Hooks()
wrap_log: list[str] = []


@hooks.on.model_request
async def log_request(
    ctx: RunContext[None],
    *,
    request_context: ModelRequestContext,
    handler: WrapModelRequestHandler,
) -> ModelResponse:
    wrap_log.append('before')
    response = await handler(request_context)
    wrap_log.append('after')
    return response


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello!')
print(wrap_log)
#> ['before', 'after']
```

## Hook ordering

When multiple hooks are registered for the same event (either on the same `Hooks` instance or across multiple capabilities):

- **`before_*`** hooks fire in registration/capability order
- **`after_*`** hooks fire in reverse order
- **`wrap_*`** hooks nest as middleware -- the first registered hook is the outermost layer

See [Composition](/docs/ai/core-concepts/capabilities#composition) for details on how hooks from multiple capabilities interact.
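As a quick illustration of `before_*` ordering, two hooks registered on the same `Hooks` instance fire in the order they were declared (a minimal sketch using the `'test'` model, as in the examples above):

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()
order: list[str] = []


@hooks.on.before_model_request
async def first(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    order.append('first')
    return request_context


@hooks.on.before_model_request
async def second(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    order.append('second')
    return request_context


agent = Agent('test', capabilities=[hooks])
agent.run_sync('Hello!')
print(order)
#> ['first', 'second']
```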
## Error hooks

Error hooks (`*_error` in the `hooks.on` namespace, `on_*_error` on `AbstractCapability`) use **raise-to-propagate, return-to-recover** semantics:

- **Raise the original error** -- propagates unchanged _(default)_
- **Raise a different exception** -- transforms the error
- **Return a result** -- suppresses the error

See [Error hooks](/docs/ai/core-concepts/capabilities#error-hooks) for the full pattern and recovery types.

## Triggering retries with `ModelRetry`

Hooks can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with a custom message -- the same exception used in [tool functions](/docs/ai/tools-toolsets/tools#model-retry) and output validators.

**Model request hooks** (`after_model_request`, `wrap_model_request`, `on_model_request_error`):

- The retry message is sent back to the model as a [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart)
- `after_model_request`: the original response is preserved in message history so the model can see what it said
- `wrap_model_request`: the response is preserved only if the handler was called
- Retries count against the agent's `output_retries` limit

**Tool hooks** (`before/after_tool_validate`, `before/after_tool_execute`, `wrap_tool_execute`, `on_tool_execute_error`):

- Converted to tool retry prompts, same as when a tool function raises `ModelRetry`
- Retries count against the tool's `max_retries` limit

`ModelRetry` from `wrap_model_request` and `wrap_tool_execute` is treated as control flow -- it bypasses `on_model_request_error` and `on_tool_execute_error` respectively.

hooks\_model\_retry.py

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import Hooks
from pydantic_ai.exceptions import ModelRetry
from pydantic_ai.messages import ModelResponse
from pydantic_ai.models import ModelRequestContext

hooks = Hooks()


@hooks.on.after_model_request
async def check_response(
    ctx: RunContext[None],
    *,
    request_context: ModelRequestContext,
    response: ModelResponse,
) -> ModelResponse:
    if 'PLACEHOLDER' in str(response.parts):
        raise ModelRetry('Response contains placeholder text. Please provide real data.')
    return response


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello')
print(result.output)
#> success (no tool calls)
```

## When to use `Hooks` vs `AbstractCapability`

| Use [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) | Use [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) |
| --- | --- |
| Application-level hooks (logging, metrics) | Reusable, packaged capabilities |
| Quick one-off interceptors | Combined tools + hooks + instructions + settings |
| No configuration state needed | Complex per-run state management |
| Single-file scripts | Multi-agent shared behavior |

---

# [Messages and chat history](https://pydantic.dev/docs/ai/core-concepts/message-history/)

# Messages and chat history

Pydantic AI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation and to understand how an agent performed.

### Accessing Messages from Results

After running an agent, you can access the messages exchanged during that run from the `result` object. Both [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) (returned by [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync)) and [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) (returned by [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream)) have the following methods:

- [`all_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, [`all_messages_json()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages_json).
- [`new_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages): returns only the messages from the current run. There's also a variant that returns JSON bytes, [`new_messages_json()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages_json).

StreamedRunResult and complete messages

On [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), the messages returned from these methods will only include the final result message once the stream has finished, i.e. once you've awaited one of the following coroutines:

- [`StreamedRunResult.stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output)
- [`StreamedRunResult.stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text)
- [`StreamedRunResult.stream_responses()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses)
- [`StreamedRunResult.get_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.get_output)

**Note:** The final result message will NOT be added to result messages if you use [`.stream_text(delta=True)`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) since in this case the result content is never built as one string.
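As a minimal sketch of the JSON variants, you can write a run's messages straight to disk (the file name here is illustrative):

```python
from pathlib import Path

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('Tell me a joke.')

# all_messages_json() returns JSON bytes, ready to write as-is
Path('messages.json').write_bytes(result.all_messages_json())
```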
Example of accessing methods on a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) : run\_result\_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result = agent.run_sync('Tell me a joke.') print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # all messages from the run print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` _(This example is complete, it can be run "as is")_ Example of accessing methods on a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) : streamed\_run\_result\_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') async def main(): async with agent.run_stream('Tell me a joke.') as result: # incomplete messages before the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ) ] """ async for text in result.stream_text(): print(text) #> Did you hear #> Did you hear about the toothpaste #> Did you hear about the toothpaste scandal? They called #> Did you hear about the toothpaste scandal? They called it Colgate. # complete messages once the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=50, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ### Using Messages as Input for Further Agent Runs The primary use of message histories in Pydantic AI is to maintain context across multiple agent runs. To use existing messages in a run, pass them to the `message_history` parameter of [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync) or [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream). If `message_history` is set and not empty, a new system prompt is not generated -- we assume the existing message history includes a system prompt. Reusing messages in a conversation ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. 
result2 = agent.run_sync('Explain?', message_history=result1.new_messages()) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' ) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` _(This example is complete, it can be run "as is")_ ## Storing and loading messages (to JSON) While maintaining conversation state in memory is enough for many applications, you may often want to store the message history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases. The intended way to do this is using a `TypeAdapter`. We export [`ModelMessagesTypeAdapter`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessagesTypeAdapter) that can be used for this, or you can create your own. Here's an example showing how: serialize messages to json ```python from pydantic_core import to_jsonable_python from pydantic_ai import ( Agent, ModelMessagesTypeAdapter, # (1) ) agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') history_step_1 = result1.all_messages() as_python_objects = to_jsonable_python(history_step_1) # (2) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects) result2 = agent.run_sync( # (3) 'Tell me a different joke.', message_history=same_history_as_step_1 ) ``` Alternatively, you can create a `TypeAdapter` from scratch: ```python from pydantic import TypeAdapter from pydantic_ai import ModelMessage ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage]) ``` You can also serialize to/from JSON directly: ```python from pydantic_core import to_json ... as_json_objects = to_json(history_step_1) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects) ``` You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run. _(This example is complete, it can be run "as is")_ ## Other ways of using messages Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing. The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models. In the example below, we reuse the messages from the first agent run, which uses the `openai:gpt-5.2` model, in a second agent run using the `google-gla:gemini-3-pro-preview` model.
Reusing messages with a different model ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync( 'Explain?', model='google-gla:gemini-3-pro-preview', message_history=result1.new_messages(), ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' ) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gemini-3-pro-preview', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` ## Processing Message History Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or to apply custom processing logic. Pydantic AI provides a `history_processors` parameter on `Agent` that allows you to intercept and modify the message history before each model request. History processors can also be provided via the [`HistoryProcessor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HistoryProcessor) capability. History processors replace the message history History processors replace the message history in the state with the processed messages, including the new user prompt part. This means that if you want to keep the original message history, you need to make a copy of it. History processors can affect `new_messages()` results [`new_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) determines which messages belong to the current run after all history processors have been applied. If your processor reorders, inserts, or removes messages, the set of messages returned by `new_messages()` may differ from what you expect. To avoid surprises: - Preserve `run_id` on existing messages. - When creating new messages in a processor that should be part of the current run, use a [context-aware processor](#runcontext-parameter) and set `run_id=ctx.run_id` on the new message (see the sketch below). - Reordering or removing messages may shift the boundary between "old" and "new" messages; test with [`new_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) to verify the behavior matches your expectations.
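Here's a minimal sketch of that second point. It assumes `ModelRequest` accepts `run_id` directly as a field, as suggested by the message reprs above; the injected reminder is purely illustrative:

```python
from pydantic_ai import ModelMessage, ModelRequest, RunContext, UserPromptPart


def inject_reminder(ctx: RunContext[None], messages: list[ModelMessage]) -> list[ModelMessage]:
    # Tag the injected message with the current run's ID so that
    # new_messages() attributes it to this run
    reminder = ModelRequest(
        parts=[UserPromptPart(content='Remember to answer concisely.')],
        run_id=ctx.run_id,
    )
    return [*messages, reminder]
```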
### Usage `history_processors` is a list of callables that take a list of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage) and return a modified list of the same type. Each processor is applied in sequence, and processors can be either synchronous or asynchronous. simple\_history\_processor.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: """Remove all ModelResponse messages, keeping only ModelRequest messages.""" return [msg for msg in messages if isinstance(msg, ModelRequest)] # Create agent with history processor agent = Agent('openai:gpt-5.2', history_processors=[filter_responses]) # Example: Create some conversation history message_history = [ ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]), ModelResponse(parts=[TextPart(content='2+2 equals 4')]), # This will be filtered out ] # When you run the agent, the history processor will filter out ModelResponse messages # result = agent.run_sync('What about 3+3?', message_history=message_history) ``` #### Keep Only Recent Messages You can use a history processor to keep only the most recent messages: keep\_recent\_messages.py ```python from pydantic_ai import Agent, ModelMessage async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]: """Keep only the last 5 messages to manage token usage.""" return messages[-5:] if len(messages) > 5 else messages agent = Agent('openai:gpt-5.2', history_processors=[keep_recent_messages]) # Example: Even with a long conversation history, only the last 5 messages are sent to the model long_conversation_history: list[ModelMessage] = [] # Your long conversation history here # result = agent.run_sync('What did we discuss?', message_history=long_conversation_history) ``` Be careful when slicing the message history When slicing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269). #### `RunContext` parameter History processors can optionally accept a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) parameter to access additional information about the current run, such as dependencies, model information, and usage statistics: context\_aware\_processor.py ```python from pydantic_ai import Agent, ModelMessage, RunContext def context_aware_processor( ctx: RunContext[None], messages: list[ModelMessage], ) -> list[ModelMessage]: # Access current usage current_tokens = ctx.usage.total_tokens # Filter messages based on context if current_tokens > 1000: return messages[-3:] # Keep only recent messages when token usage is high return messages agent = Agent('openai:gpt-5.2', history_processors=[context_aware_processor]) ``` This allows for more sophisticated message processing based on the current state of the agent run. #### Summarize Old Messages Use an LLM to summarize older messages to preserve context while reducing tokens. summarize\_old\_messages.py ```python from pydantic_ai import Agent, ModelMessage # Use a cheaper model to summarize old messages. summarize_agent = Agent( 'openai:gpt-5-mini', instructions=""" Summarize this conversation, omitting small talk and unrelated topics. Focus on the technical discussion and next steps.
""", ) async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: # Summarize the oldest 10 messages if len(messages) > 10: oldest_messages = messages[:10] summary = await summarize_agent.run(message_history=oldest_messages) # Return the last message and the summary return summary.new_messages() + messages[-1:] return messages agent = Agent('openai:gpt-5.2', history_processors=[summarize_old_messages]) ``` Be careful when summarizing the message history When summarizing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269), where you can find examples of summarizing the message history. ### Testing History Processors You can test what messages are actually sent to the model provider using [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel): test\_history\_processor.py ```python import pytest from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel @pytest.fixture def received_messages() -> list[ModelMessage]: return [] @pytest.fixture def function_model(received_messages: list[ModelMessage]) -> FunctionModel: def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: # Capture the messages that the provider actually receives received_messages.clear() received_messages.extend(messages) return ModelResponse(parts=[TextPart(content='Provider response')]) return FunctionModel(capture_model_function) def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]): def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] agent = Agent(function_model, history_processors=[filter_responses]) message_history = [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelResponse(parts=[TextPart(content='Answer 1')]), ] agent.run_sync('Question 2', message_history=message_history) assert received_messages == [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelRequest(parts=[UserPromptPart(content='Question 2')]), ] ``` ### Multiple Processors You can also use multiple processors: multiple\_history\_processors.py ```python from pydantic_ai import Agent, ModelMessage, ModelRequest def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: return messages[-5:] agent = Agent('openai:gpt-5.2', history_processors=[filter_responses, summarize_old_messages]) ``` In this case, the `filter_responses` processor will be applied first, and the `summarize_old_messages` processor will be applied second. ## Examples For a more complete example of using messages in conversations, see the [chat app](/docs/ai/examples/chat-app) example. --- # [Output](https://pydantic.dev/docs/ai/core-concepts/output/) # Output "Output" refers to the final value returned from [running an agent](/docs/ai/core-concepts/agent#running-agents). This can be either plain text, [structured data](#structured-output), an [image](#image-output), or the result of a [function](#output-functions) called with arguments provided by the model. 
The output is wrapped in [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) or [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) so that you can access other data, like [usage](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) of the run and [message history](/docs/ai/core-concepts/message-history#accessing-messages-from-results). Both `AgentRunResult` and `StreamedRunResult` are generic in the data they wrap, so typing information about the data returned by the agent is preserved. A run ends when the model responds with one of the output types, or, if no output type is specified or `str` is one of the allowed options, when a plain text response is received. A run can also be cancelled if usage limits are exceeded, see [Usage Limits](/docs/ai/core-concepts/agent#usage-limits). Here's an example using a Pydantic model as the `output_type`, forcing the model to respond with data matching our specification: olympics.py ```python from pydantic import BaseModel from pydantic_ai import Agent class CityLocation(BaseModel): city: str country: str agent = Agent('google-gla:gemini-3-flash-preview', output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` _(This example is complete, it can be run "as is")_ ## Structured output data The [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class constructor takes an `output_type` argument that takes one or more types or [output functions](#output-functions). It supports simple scalar types, list and dict types (including `TypedDict`s and [`StructuredDict`s](#structured-dict)), dataclasses and Pydantic models, as well as type unions -- generally everything supported as type hints in a Pydantic model. You can also pass a list of multiple choices. By default, Pydantic AI leverages the model's tool calling capability to make it return structured data. When multiple output types are specified (in a union or list), each member is registered with the model as a separate output tool in order to reduce the complexity of the schema and maximise the chances a model will respond correctly. This has been shown to work well across a wide range of models. If you'd like to change the names of the output tools, use a model's native structured output feature, or pass the output schema to the model in its [instructions](/docs/ai/core-concepts/agent#instructions), you can use an [output mode](#output-modes) marker class. When no output type is specified, or when `str` is among the output types, any plain text response from the model will be used as the output data. If `str` is not among the output types, the model is forced to return structured data or call an output function. If the output type schema is not of type `"object"` (e.g. it's `int` or `list[int]`), the output type is wrapped in a single element object, so the schema of all tools registered with the model are object schemas. Structured outputs (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model. 
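As a minimal sketch of a scalar output type (the exact answer will of course depend on the model; behind the scenes the `int` schema is wrapped in an object as described above):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', output_type=int)

result = agent.run_sync('How many continents are there on Earth?')
print(result.output)
#> 7
```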
Type checking considerations The Agent class is generic in its output type, and this type is carried through to `AgentRunResult.output` and `StreamedRunResult.output` so that your IDE or static type checker can warn you when your code doesn't properly take into account all the possible values those outputs could have. Static type checkers like pyright and mypy will do their best to infer the agent's output type from the `output_type` you've specified, but they're not always able to do so correctly when you provide functions or multiple types in a union or list, even though Pydantic AI will behave correctly. When this happens, your type checker will complain even when you're confident you've passed a valid `output_type`, and you'll need to help the type checker by explicitly specifying the generic parameters on the `Agent` constructor. This is shown in the second example below and the output functions example further down. Specifically, there are three valid uses of `output_type` where you'll need to do this: 1. When using a union of types, e.g. `output_type=Foo | Bar`. Until [PEP-747](https://peps.python.org/pep-0747/) "Annotating Type Forms" lands in Python 3.15, type checkers do not consider these a valid value for `output_type`. In addition to the generic parameters on the `Agent` constructor, you'll need to add `# type: ignore` to the line that passes the union to `output_type`. Alternatively, you can use a list: `output_type=[Foo, Bar]`. 2. With mypy: When using a list, as a functionally equivalent alternative to a union, or because you're passing in [output functions](#output-functions). Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19142) with mypy to try and get this fixed. 3. With mypy: when using an async output function. Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19143) with mypy to try and get this fixed. Here's an example of returning either text or structured data: box\_or\_error.py ```python from pydantic import BaseModel from pydantic_ai import Agent class Box(BaseModel): width: int height: int depth: int units: str agent = Agent( 'openai:gpt-5-mini', output_type=[Box, str], # (1) instructions=( "Extract me the dimensions of a box, " "if you can't extract all data, ask the user to try again." ), ) result = agent.run_sync('The box is 10x20x30') print(result.output) #> Please provide the units for the dimensions (e.g., cm, in, m). result = agent.run_sync('The box is 10x20x30 cm') print(result.output) #> width=10 height=20 depth=30 units='cm' ``` This could also have been a union: `output_type=Box | str`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. 
_(This example is complete, it can be run "as is")_ Here's an example of using a union return type, which will register multiple output tools and wrap non-object schemas in an object: colors\_or\_sizes.py ```python from pydantic_ai import Agent agent = Agent[None, list[str] | list[int]]( 'openai:gpt-5-mini', output_type=list[str] | list[int], # type: ignore # (1) instructions='Extract either colors or sizes from the shapes provided.', ) result = agent.run_sync('red square, blue circle, green triangle') print(result.output) #> ['red', 'blue', 'green'] result = agent.run_sync('square size 10, circle size 20, triangle size 30') print(result.output) #> [10, 20, 30] ``` As explained in the "Type checking considerations" section above, using a union rather than a list requires explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. _(This example is complete, it can be run "as is")_ ### Output functions Instead of plain text or structured data, you may want the output of your agent run to be the result of a function called with arguments provided by the model, for example to further process or validate the data provided through the arguments (with the option to tell the model to try again), or to hand off to another agent. Output functions are similar to [function tools](/docs/ai/tools-toolsets/tools), but the model is forced to call one of them, the call ends the agent run, and the result is not passed back to the model. As with tool functions, output function arguments provided by the model are validated using Pydantic (with optional [validation context](#validation-context)), can optionally take [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument, and can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with modified arguments (or with a different output type). To specify output functions, you set the agent's `output_type` to either a single function (or bound instance method), or a list of functions. The list can also contain other output types like simple scalars or entire Pydantic models. You typically do not want to also register your output function as a tool (using the `@agent.tool` decorator or `tools` argument), as this could confuse the model about which it should be calling. Here's an example of all of these features in action: output\_functions.py ```python import re from pydantic import BaseModel from pydantic_ai import Agent, ModelRetry, RunContext, UnexpectedModelBehavior class Row(BaseModel): name: str country: str tables = { 'capital_cities': [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] } class SQLFailure(BaseModel): """An unrecoverable failure. Only use this when you can't change the query to make it work.""" explanation: str def run_sql_query(query: str) -> list[Row]: """Run a SQL query on the database.""" select_table = re.match(r'SELECT (.+) FROM (\w+)', query) if select_table: column_names = select_table.group(1) if column_names != '*': raise ModelRetry("Only 'SELECT *' is supported, you'll have to do column filtering manually.") table_name = select_table.group(2) if table_name not in tables: raise ModelRetry( f"Unknown table '{table_name}' in query '{query}'. Available tables: {', '.join(tables.keys())}." 
) return tables[table_name] raise ModelRetry(f"Unsupported query: '{query}'.") sql_agent = Agent[None, list[Row] | SQLFailure]( 'openai:gpt-5.2', output_type=[run_sql_query, SQLFailure], instructions='You are a SQL agent that can run SQL queries on a database.', ) async def hand_off_to_sql_agent(ctx: RunContext, query: str) -> list[Row]: """I take natural language queries, turn them into SQL, and run them on a database.""" # Drop the final message with the output tool call, as it shouldn't be passed on to the SQL agent messages = ctx.messages[:-1] try: result = await sql_agent.run(query, message_history=messages) output = result.output if isinstance(output, SQLFailure): raise ModelRetry(f'SQL agent failed: {output.explanation}') return output except UnexpectedModelBehavior as e: # Bubble up potentially retryable errors to the router agent if (cause := e.__cause__) and isinstance(cause, ModelRetry): raise ModelRetry(f'SQL agent failed: {cause.message}') from e else: raise class RouterFailure(BaseModel): """Use me when no appropriate agent is found or the used agent failed.""" explanation: str router_agent = Agent[None, list[Row] | RouterFailure]( 'openai:gpt-5.2', output_type=[hand_off_to_sql_agent, RouterFailure], instructions='You are a router to other agents. Never try to solve a problem yourself, just pass it on.', ) result = router_agent.run_sync('Select the names and countries of all capitals') print(result.output) """ [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] """ result = router_agent.run_sync('Select all pets') print(repr(result.output)) """ RouterFailure(explanation="The requested table 'pets' does not exist in the database. The only available table is 'capital_cities', which does not contain data about pets.") """ result = router_agent.run_sync('How do I fly from Amsterdam to Mexico City?') print(repr(result.output)) """ RouterFailure(explanation='I am not equipped to provide travel information, such as flights from Amsterdam to Mexico City.') """ ``` #### Text output If you provide an output function that takes a string, Pydantic AI will by default create an output tool like for any other output function. If instead you'd like the model to provide the string using plain text output, you can wrap the function in the [`TextOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.TextOutput) marker class. If desired, this marker class can be used alongside one or more [`ToolOutput`](#tool-output) marker classes (or unmarked types or functions) in a list provided to `output_type`. Like other output functions, text output functions can optionally take [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument, and can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with modified arguments (or with a different output type). 
text\_output\_function.py ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` _(This example is complete, it can be run "as is")_ #### Handling partial output in output functions When streaming with `run_stream()` or `run_stream_sync()`, output functions are called **multiple times** -- once for each partial output received from the model, and once for the final complete output. You should check the [`RunContext.partial_output`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.partial_output) flag when your output function has **side effects** (e.g., sending notifications, logging, database updates) that should only execute on the final output. When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. For all [other run methods](/docs/ai/core-concepts/agent#running-agents), `partial_output` is always `False` as the function is only called once with the complete output. output\_function\_with\_side\_effects.py ```python from pydantic import BaseModel from pydantic_ai import Agent, RunContext class DatabaseRecord(BaseModel): name: str value: int | None = None # Make optional to allow partial output def save_to_database(ctx: RunContext, record: DatabaseRecord) -> DatabaseRecord: """Output function with side effect - only save final output to database.""" if ctx.partial_output: # Skip side effects for partial outputs return record # Only execute side effect for the final output print(f'Saving to database: {record.name} = {record.value}') #> Saving to database: test = 42 return record agent = Agent('openai:gpt-5.2', output_type=save_to_database) async def main(): async with agent.run_stream('Create a record with name "test" and value 42') as result: async for output in result.stream_output(debounce_by=None): print(output) #> name='test' value=None #> name='test' value=42 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ### Output modes Pydantic AI implements three different methods to get a model to output structured data: 1. [Tool Output](#tool-output), where tool calls are used to produce the output. 2. [Native Output](#native-output), where the model is required to produce text content compliant with a provided JSON schema. 3. [Prompted Output](#prompted-output), where a prompt is injected into the model instructions including the desired JSON schema, and we attempt to parse the model's plain-text response as appropriate. #### Tool Output In the default Tool Output mode, the output JSON schema of each output type (or function) is provided to the model as the parameters schema of a special output tool. This is the default as it's supported by virtually all models and has been shown to work very well. If you'd like to change the name of the output tool, pass a custom description to aid the model, or turn on or off strict mode, you can wrap the type(s) in the [`ToolOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) marker class and provide the appropriate arguments. Note that by default, the description is taken from the docstring specified on a Pydantic model or output function, so specifying it using the marker class is typically not necessary. 
To dynamically modify or filter the available output tools during an agent run, you can define an agent-wide `prepare_output_tools` function that will be called ahead of each step of a run. This function should be of type [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc), which takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and returns a new list of tool definitions (or `None` to disable all tools for that step). This is analogous to the [`prepare_tools` function](/docs/ai/tools-toolsets/tools-advanced#prepare-tools) for non-output tools.
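Here's a sketch of such a function, reusing the `Fruit`/`Vehicle` output tools from the example below; the filtering policy itself is purely illustrative:

```python
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ToolDefinition, ToolOutput


class Fruit(BaseModel):
    name: str
    color: str


class Vehicle(BaseModel):
    name: str
    wheels: int


async def hide_vehicle_tool(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    # Illustrative policy: drop the vehicle output tool, leaving only
    # the fruit output tool visible to the model on each step
    return [t for t in tool_defs if t.name != 'return_vehicle']


agent = Agent(
    'openai:gpt-5.2',
    output_type=[
        ToolOutput(Fruit, name='return_fruit'),
        ToolOutput(Vehicle, name='return_vehicle'),
    ],
    prepare_output_tools=hide_vehicle_tool,
)
```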
tool\_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-5.2', output_type=[ # (1) ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` If we were passing just `Fruit` and `Vehicle` without custom tool names, we could have used a union: `output_type=Fruit | Vehicle`. However, as `ToolOutput` is an object rather than a type, we have to use a list. _(This example is complete, it can be run "as is")_ ##### Parallel Output Tool Calls When the model calls other tools in parallel with an output tool, you can control how tool calls are executed by setting the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy): - `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped - `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output The `'exhaustive'` strategy is useful when tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute. Priority of output and deferred tools in streaming methods The [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) and [`run_stream_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_sync) methods will consider the first output that matches the [output type](/docs/ai/core-concepts/output#structured-output) (which could be text, an [output tool](/docs/ai/core-concepts/output#tool-output) call, or a [deferred](/docs/ai/tools-toolsets/deferred-tools) tool call) to be the final output of the agent run, even when the model generates (additional) tool calls after this "final" output. This means that if the model calls deferred tools before output tools when using these methods, the deferred tool calls determine the agent run's final output, while the other [run methods](/docs/ai/core-concepts/agent#running-agents) would have prioritized the tool output. #### Native Output Native Output mode uses a model's native "Structured Outputs" feature (aka "JSON Schema response format"), where the model is forced to only output text matching the provided JSON schema. Note that this is not supported by all models, and sometimes comes with restrictions. For example, Gemini cannot use tools at the same time as structured output, and attempting to do so will result in an error. To use this mode, you can wrap the output type(s) in the [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. native\_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], # (1) name='Fruit_or_vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` This could also have been a union: `output_type=Fruit | Vehicle`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. _(This example is complete, it can be run "as is")_ #### Prompted Output In this mode, the model is prompted to output text matching the provided JSON schema through its [instructions](/docs/ai/core-concepts/agent#instructions) and it's up to the model to interpret those instructions correctly. This is usable with all models, but is often the least reliable approach as the model is not forced to match the schema. While we would generally suggest starting with tool or native output, in some cases this mode may result in higher quality outputs, and for models without native tool calling or structured output support it is the only option for producing structured outputs. If the model API supports the "JSON Mode" feature (aka "JSON Object response format") to force the model to output valid JSON, this is enabled, but it's still up to the model to abide by the schema. Pydantic AI will validate the returned structured data and tell the model to try again if validation fails, but if the model is not intelligent enough this may not be sufficient. To use this mode, you can wrap the output type(s) in the [`PromptedOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.PromptedOutput) marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. Additionally, `template` lets you specify a custom instructions template to be used instead of the [default](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.prompted_output_template), or `template=False` to disable the schema prompt entirely. prompted\_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], # (1) name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` This could also have been a union: `output_type=Vehicle | Device`.
However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. _(This example is complete, it can be run "as is")_ ### Custom JSON schema If it's not feasible to define your desired structured output object using a Pydantic `BaseModel`, dataclass, or `TypedDict`, for example when you get a JSON schema from an external source or generate it dynamically, you can use the [`StructuredDict()`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.StructuredDict) helper function to generate a `dict[str, Any]` subclass with a JSON schema attached that Pydantic AI will pass to the model. Note that Pydantic AI will not perform any validation of the received JSON object and it's up to the model to correctly interpret the schema and any constraints expressed in it, like required fields or integer value ranges. The output type will be a `dict[str, Any]` and it's up to your code to defensively read from it in case the model made a mistake. You can use an [output validator](#output-validator-functions) to reflect validation errors back to the model and get it to try again. Along with the JSON schema, you can optionally pass `name` and `description` arguments to provide additional context to the model: ```python from pydantic_ai import Agent, StructuredDict HumanDict = StructuredDict( { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] }, name='Human', description='A human with a name and age', ) agent = Agent('openai:gpt-5.2', output_type=HumanDict) result = agent.run_sync('Create a person') #> {'name': 'John Doe', 'age': 30} ``` ### Validation context Some validation relies on an extra Pydantic [context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) object. You can pass such an object to an `Agent` at definition-time via its [`validation_context`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) parameter. It will be used in the validation of both structured outputs and [tool arguments](/docs/ai/tools-toolsets/tools-advanced#tool-retries). This validation context can be either: - the context object itself (`Any`), used as-is to validate outputs, or - a function that takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns a context object (`Any`). This function will be called automatically before each validation, allowing you to build a dynamic validation context. Don't confuse this _validation_ context with the _LLM_ context This Pydantic validation context object is only used internally by Pydantic AI for tool arg and output validation. In particular, it is **not** included in the prompts or messages sent to the language model. 
validation\_context.py ```python from dataclasses import dataclass from pydantic import BaseModel, ValidationInfo, field_validator from pydantic_ai import Agent class Value(BaseModel): x: int @field_validator('x') def increment_value(cls, value: int, info: ValidationInfo): return value + (info.context or 0) agent = Agent( 'google-gla:gemini-3-flash-preview', output_type=Value, validation_context=10, ) result = agent.run_sync('Give me a value of 5.') print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) @dataclass class Deps: increment: int agent = Agent( 'google-gla:gemini-3-flash-preview', output_type=Value, deps_type=Deps, validation_context=lambda ctx: ctx.deps.increment, ) result = agent.run_sync('Give me a value of 5.', deps=Deps(increment=10)) print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) ``` _(This example is complete, it can be run "as is")_ ### Output validators Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. Pydantic AI provides a way to add validation functions via the [`agent.output_validator`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_validator) decorator. If you want to implement separate validation logic for different output types, it's recommended to use [output functions](#output-functions) instead, to save you from having to do `isinstance` checks inside the output validator. If you want the model to output plain text, do your own processing or validation, and then have the agent's final output be the result of your function, it's recommended to use an [output function](#output-functions) with the [`TextOutput` marker class](#text-output). Here's a simplified variant of the [SQL Generation example](/docs/ai/examples/sql-gen): sql\_gen.py ```python from fake_database import DatabaseConn, QueryError from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ModelRetry class Success(BaseModel): sql_query: str class InvalidRequest(BaseModel): error_message: str Output = Success | InvalidRequest agent = Agent[DatabaseConn, Output]( 'google-gla:gemini-3-flash-preview', output_type=Output, # type: ignore deps_type=DatabaseConn, instructions='Generate PostgreSQL flavored SQL queries based on user input.', ) @agent.output_validator async def validate_sql(ctx: RunContext[DatabaseConn], output: Output) -> Output: if isinstance(output, InvalidRequest): return output try: await ctx.deps.execute(f'EXPLAIN {output.sql_query}') except QueryError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output result = agent.run_sync( 'get me users who were last active yesterday.', deps=DatabaseConn() ) print(result.output) #> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day' ``` _(This example is complete, it can be run "as is")_ #### Handling partial output in output validators When streaming with `run_stream()` or `run_stream_sync()`, output validators are called **multiple times** -- once for each partial output received from the model, and once for the final complete output. You should check the [`RunContext.partial_output`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.partial_output) flag when you want to **validate only the complete result**, not intermediate partial values. When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. 
For all [other run methods](/docs/ai/core-concepts/agent#running-agents), `partial_output` is always `False` as the validator is only called once with the complete output. partial\_validation\_streaming.py ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('openai:gpt-5.2') @agent.output_validator def validate_output(ctx: RunContext, output: str) -> str: if ctx.partial_output: return output if len(output) < 50: raise ModelRetry('Output is too short.') return output async def main(): async with agent.run_stream('Write a long story about a cat') as result: async for message in result.stream_text(): print(message) #> Once upon a #> Once upon a time, there was #> Once upon a time, there was a curious cat #> Once upon a time, there was a curious cat named Whiskers who #> Once upon a time, there was a curious cat named Whiskers who loved to explore #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around him... ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Image output Some models can generate images as part of their response, for example those that support the [Image Generation built-in tool](/docs/ai/tools-toolsets/builtin-tools#image-generation-tool) and OpenAI models using the [Code Execution built-in tool](/docs/ai/tools-toolsets/builtin-tools#code-execution-tool) when told to generate a chart. To use the generated image as the output of the agent run, you can set `output_type` to [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage). If no image-generating built-in tool is explicitly specified, the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) will be enabled automatically. image\_output.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` _(This example is complete, it can be run "as is")_ If an agent doesn't always need to generate an image, you can use a union of `BinaryImage` and `str`. If the model generates both, the image will take precedence as output and the text will be available on [`ModelResponse.text`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.text): image\_output\_union.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage | str) result = agent.run_sync('Tell me a two-sentence story about an axolotl, no image please.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') assert isinstance(result.output, BinaryImage) print(result.response.text) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ ```
Validating structured responses before they're complete. This is achieved by "partial validation", which was recently added to Pydantic in [pydantic/pydantic#10748](https://github.com/pydantic/pydantic/pull/10748). 2. When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. Pydantic AI streams just enough of the response to sniff out if it's a tool call or an output, then streams the whole thing and calls tools, or returns the stream as a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult). Note As the `run_stream()` method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) ([docs](/docs/ai/core-concepts/agent#streaming-all-events)) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) ([docs](/docs/ai/core-concepts/agent#streaming-all-events-and-output)) instead. ### Streaming Text Example of streamed text output: streamed\_hello\_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-flash-preview') # (1) async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: # (2) async for message in result.stream_text(): # (3) print(message) #> The first known #> The first known use of "hello, #> The first known use of "hello, world" was in #> The first known use of "hello, world" was in a 1974 textbook #> The first known use of "hello, world" was in a 1974 textbook about the C #> The first known use of "hello, world" was in a 1974 textbook about the C programming language. ``` Streaming works with the standard [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class and doesn't require any special setup, just a model that supports streaming (currently all models support streaming). The [`Agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) method is used to start a streamed run; it returns a context manager so the connection can be closed when the stream completes. Each item yielded by [`StreamedRunResult.stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) is the complete text response, extended as new data is received. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ We can also stream text as deltas rather than the entire text in each item: streamed\_delta\_hello\_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-flash-preview') async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: async for message in result.stream_text(delta=True): # (1) print(message) #> The first known #> use of "hello, #> world" was in #> a 1974 textbook #> about the C #> programming language. ``` [`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) will error if the response is not text.
_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ Output message not included in `messages` The final output message will **NOT** be added to result messages if you use `.stream_text(delta=True)`, see [Messages and chat history](/docs/ai/core-concepts/message-history) for more information. ### Streaming Structured Output Here's an example of streaming a user profile as it's built: streamed\_user\_profile.py ```python from datetime import date from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent class UserProfile(TypedDict): name: str dob: NotRequired[date] bio: NotRequired[str] agent = Agent( 'openai:gpt-5.2', output_type=UserProfile, instructions='Extract a user profile from the input', ) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for profile in result.stream_output(): print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ As setting an `output_type` uses the [Tool Output](#tool-output) mode by default, this will only work if the model supports streaming tool arguments. For models that don't, like Gemini, try [Native Output](#native-output) or [Prompted Output](#prompted-output) instead. ### Streaming Model Responses If you want fine-grained control of validation, you can use the following pattern to get the entire partial [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse): streamed\_user\_profile.py ```python from datetime import date from pydantic import ValidationError from typing_extensions import TypedDict from pydantic_ai import Agent class UserProfile(TypedDict, total=False): name: str dob: date bio: str agent = Agent('openai:gpt-5.2', output_type=UserProfile) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' 
async with agent.run_stream(user_input) as result: async for message, last in result.stream_responses(debounce_by=0.01): # (1) try: profile = await result.validate_response_output( # (2) message, allow_partial=not last, ) except ValidationError: continue print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` [`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) streams the data as [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) objects, thus iteration can't fail with a `ValidationError`. [`validate_response_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.validate_response_output) validates the data, `allow_partial=True` enables pydantic's [`experimental_allow_partial` flag on `TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter.validate_json). _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Examples The following examples demonstrate how to use streamed responses in Pydantic AI: - [Stream markdown](/docs/ai/examples/stream-markdown) - [Stream Whales](/docs/ai/examples/stream-whales) --- # [Overview](https://pydantic.dev/docs/ai/evals/evals/) # Overview **Pydantic Evals** is a powerful evaluation framework for systematically testing and evaluating AI systems, from simple LLM calls to complex multi-agent applications. ## Design Philosophy Code-First Approach Pydantic Evals follows a code-first philosophy where all evaluation components are defined in Python. This differs from platforms with web-based configuration. You write and run evals in code, and can write the results to disk or view them in your terminal or in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/). Evals are an Emerging Practice Unlike unit tests, evals are an emerging art/science. Anyone who claims to know exactly how your evals should be defined can safely be ignored. We've designed Pydantic Evals to be flexible and useful without being too opinionated. ## Quick Navigation **Getting Started:** - [Installation](#installation) - [Quick Start](/docs/ai/evals/quick-start) - [Core Concepts](/docs/ai/evals/core-concepts) **Evaluators:** - [Evaluators Overview](/docs/ai/evals/evaluators/overview) - Compare evaluator types and learn when to use each approach - [Built-in Evaluators](/docs/ai/evals/evaluators/built-in) - Complete reference for exact match, instance checks, and other ready-to-use evaluators - [LLM as a Judge](/docs/ai/evals/evaluators/llm-judge) - Use LLMs to evaluate subjective qualities, complex criteria, and natural language outputs - [Custom Evaluators](/docs/ai/evals/evaluators/custom) - Implement domain-specific scoring logic and custom evaluation metrics - [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) - Evaluate internal agent behavior (tool calls, execution flow) using OpenTelemetry traces. 
Essential for complex agents where correctness depends on _how_ the answer was reached, not just the final output. Also ensures eval assertions align with production telemetry. **How-To Guides:** - [Logfire Integration](/docs/ai/evals/how-to/logfire-integration) - Visualize results - [Dataset Management](/docs/ai/evals/how-to/dataset-management) - Save, load, generate - [Concurrency & Performance](/docs/ai/evals/how-to/concurrency) - Control parallel execution - [Retry Strategies](/docs/ai/evals/how-to/retry-strategies) - Handle transient failures - [Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes) - Track custom data - [Case Lifecycle Hooks](/docs/ai/evals/how-to/lifecycle) - Per-case setup, teardown, and context enrichment **Examples:** - [Simple Validation](/docs/ai/evals/examples/simple-validation) - Basic example **Reference:** - [API Documentation](/docs/ai/api/pydantic-evals/dataset) ## Code-First Evaluation Pydantic Evals follows a **code-first approach** where you define all evaluation components (datasets, experiments, tasks, cases and evaluators) in Python code, or as serialized data loaded by Python code. This differs from platforms with fully web-based configuration. When you run an _Experiment_, you'll see a progress indicator and can print the results wherever you run your Python code (IDE, terminal, etc.). You also get a report object back that you can serialize and store or send to a notebook or other application for further visualization and analysis. If you are using [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/), your experiment results automatically appear in the Logfire web interface for visualization, comparison, and collaborative analysis. Logfire serves as an observability layer - you write and run evals in code, then view and analyze results in the web UI. ## Installation To install the Pydantic Evals package, run: pip: ```bash pip install pydantic-evals ``` uv: ```bash uv add pydantic-evals ``` `pydantic-evals` does not depend on `pydantic-ai`, but has an optional dependency on `logfire` if you'd like to use OpenTelemetry traces in your evals, or send evaluation results to [logfire](https://pydantic.dev/logfire). pip: ```bash pip install 'pydantic-evals[logfire]' ``` uv: ```bash uv add 'pydantic-evals[logfire]' ``` ## Pydantic Evals Data Model Pydantic Evals is built around a simple data model: ### Data Model Diagram ``` Dataset (1) ──────────── (Many) Case │ │ │ │ └─── (Many) Experiment ──┴─── (Many) Case results │ └─── (1) Task │ └─── (Many) Evaluator ``` ### Key Relationships 1. **Dataset → Cases**: One Dataset contains many Cases 2. **Dataset → Experiments**: One Dataset can be used across many Experiments over time 3. **Experiment → Case results**: One Experiment generates results by executing each Case 4. **Experiment → Task**: One Experiment evaluates one defined Task 5. **Experiment → Evaluators**: One Experiment uses multiple Evaluators. Dataset-wide Evaluators are run against all Cases, and Case-specific Evaluators against their respective Cases ### Data Flow 1. **Dataset creation**: Define cases and evaluators in YAML/JSON, or directly in Python 2. **Experiment execution**: Run `dataset.evaluate_sync(task_function)` 3. **Cases run**: Each Case is executed against the Task 4. **Evaluation**: Evaluators score the Task outputs for each Case 5.
**Results**: All Case results are collected into a summary report A metaphor A useful metaphor (although not perfect) is to think of evals like a **Unit Testing** framework: - **Cases + Evaluators** are your individual unit tests - each one defines a specific scenario you want to test, complete with inputs and expected outcomes. Just like a unit test, a case asks: _"Given this input, does my system produce the right output?"_ - **Datasets** are like test suites - they are the scaffolding that holds your unit tests together. They group related cases and define shared evaluation criteria that should apply across all tests in the suite. - **Experiments** are like running your entire test suite and getting a report. When you execute `dataset.evaluate_sync(my_ai_function)`, you're running all your cases against your AI system and collecting the results - just like running `pytest` and getting a summary of passes, failures, and performance metrics. The key difference from traditional unit testing is that AI systems are probabilistic. If you're type checking you'll still get a simple pass/fail, but scores for text outputs are likely qualitative and/or categorical, and more open to interpretation. For a deeper understanding, see [Core Concepts](/docs/ai/evals/core-concepts). ## Datasets and Cases In Pydantic Evals, everything begins with [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)s and [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)s: - **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)**: A collection of test Cases designed for the evaluation of a specific task or function - **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)**: A single test scenario corresponding to Task inputs, with optional expected outputs, metadata, and case-specific evaluators simple\_eval\_dataset.py ```python from pydantic_evals import Case, Dataset case1 = Case( name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) dataset = Dataset(name='capital_quiz', cases=[case1]) ``` _(This example is complete, it can be run "as is")_ See [Dataset Management](/docs/ai/evals/how-to/dataset-management) to learn about saving, loading, and generating datasets. ## Evaluators [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)s analyze and score the results of your Task when tested against a Case. These can be deterministic, code-based checks (such as testing model output format with a regex, or checking for the appearance of PII or sensitive data), or they can assess non-deterministic model outputs for qualities like accuracy, precision/recall, hallucinations, or instruction-following. While both kinds of testing are useful in LLM systems, classical code-based tests are cheaper and easier than tests which require either human or machine review of model outputs. 
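As a concrete illustration of the cheap, deterministic kind of check just mentioned, here's a minimal sketch of a regex-based PII guard. The `NoEmailAddresses` class and its pattern are illustrative, not part of the library; the `Evaluator` base class it builds on is covered under [Custom Evaluators](/docs/ai/evals/evaluators/custom): no\_email\_evaluator.py ```python
import re
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class NoEmailAddresses(Evaluator):
    """Fail if the output appears to contain an email address."""

    # A deliberately simple pattern; real PII detection needs more care.
    pattern: str = r'[\w.+-]+@[\w-]+\.[\w.]+'

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        # Deterministic, free, and instant: no model call involved.
        return re.search(self.pattern, str(ctx.output)) is None
```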
Pydantic Evals includes several [built-in evaluators](/docs/ai/evals/evaluators/built-in) and allows you to define [custom evaluators](/docs/ai/evals/evaluators/custom): simple\_eval\_evaluator.py ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.evaluators.common import IsInstance from simple_eval_dataset import dataset dataset.add_evaluator(IsInstance(type_name='str')) # (1) @dataclass class MyEvaluator(Evaluator): async def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: # (2) if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset.add_evaluator(MyEvaluator()) ``` You can add built-in evaluators to a dataset using the [`add_evaluator`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.add_evaluator) method. This custom evaluator returns a simple score based on whether the output matches the expected output. _(This example is complete, it can be run "as is")_ Learn more: - [Evaluators Overview](/docs/ai/evals/evaluators/overview) - When to use different types - [Built-in Evaluators](/docs/ai/evals/evaluators/built-in) - Complete reference - [LLM Judge](/docs/ai/evals/evaluators/llm-judge) - Using LLMs as evaluators - [Custom Evaluators](/docs/ai/evals/evaluators/custom) - Write your own logic - [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) - Analyze execution traces ## Running Experiments Performing evaluations involves running a task against all cases in a dataset, also known as running an "experiment". Putting the above two examples together and using the more declarative `evaluators` kwarg to [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset): simple\_eval\_complete.py ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance case1 = Case( # (1) name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) class MyEvaluator(Evaluator[str, str]): def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset = Dataset( name='capital_quiz', cases=[case1], evaluators=[IsInstance(type_name='str'), MyEvaluator()], # (2) ) async def guess_city(question: str) -> str: # (3) return 'Paris' report = dataset.evaluate_sync(guess_city) # (4) report.print(include_input=True, include_output=True, include_durations=False) # (5) """ Evaluation Summary: guess_city ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ simple_case │ What is the capital of France? 
│ Paris │ MyEvaluator: 1.00 │ ✔ │ ├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤ │ Averages │ │ │ MyEvaluator: 1.00 │ 100.0% ✔ │ └─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘ """ ```

1. Create a [test case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) as above.
2. Create a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) with test cases and [`evaluators`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluators).
3. Our function to evaluate.
4. Run the evaluation with [`evaluate_sync`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync), which runs the function against all test cases in the dataset, and returns an [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) object.
5. Print the report with [`print`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport.print), which shows the results of the evaluation. We have omitted duration here just to keep the printed output from changing from run to run.

_(This example is complete, it can be run "as is")_ See [Quick Start](/docs/ai/evals/quick-start) for more examples and [Concurrency & Performance](/docs/ai/evals/how-to/concurrency) to learn about controlling parallel execution. ## API Reference For comprehensive coverage of all classes, methods, and configuration options, see the detailed [API Reference documentation](https://ai.pydantic.dev/api/pydantic_evals/dataset/). ## Next Steps 1. **Start with simple evaluations** using [Quick Start](/docs/ai/evals/quick-start) 2. **Understand the data model** with [Core Concepts](/docs/ai/evals/core-concepts) 3. **Explore built-in evaluators** in [Built-in Evaluators](/docs/ai/evals/evaluators/built-in) 4. **Integrate with Logfire** for visualization: [Logfire Integration](/docs/ai/evals/how-to/logfire-integration) 5. **Build comprehensive test suites** with [Dataset Management](/docs/ai/evals/how-to/dataset-management) 6. **Implement custom evaluators** for domain-specific metrics: [Custom Evaluators](/docs/ai/evals/evaluators/custom) --- # [Built-in Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/built-in/) # Built-in Evaluators Pydantic Evals provides several built-in evaluators for common evaluation tasks. ## Comparison Evaluators ### EqualsExpected Check if the output exactly equals the expected output from the case. ```python from pydantic_evals.evaluators import EqualsExpected EqualsExpected() ``` **Parameters:** None **Returns:** `bool` - `True` if `ctx.output == ctx.expected_output` **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected dataset = Dataset( name='equals_expected_demo', cases=[ Case( name='addition', inputs='2 + 2', expected_output='4', ), ], evaluators=[EqualsExpected()], ) ``` **Notes:** - Skips evaluation if `expected_output` is `None` (returns empty dict `{}`) - Uses Python's `==` operator, so works with any comparable types - For structured data, considers nested equality * * * ### Equals Check if the output equals a specific value.
```python from pydantic_evals.evaluators import Equals Equals(value='expected_result') ``` **Parameters:** - `value` (Any): The value to compare against - `evaluation_name` (str | None): Custom name for this evaluation in reports **Returns:** `bool` - `True` if `ctx.output == value` **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Equals # Check output is always "success" dataset = Dataset( name='equals_demo', cases=[Case(inputs='test')], evaluators=[ Equals(value='success', evaluation_name='is_success'), ], ) ``` **Use Cases:** - Checking for sentinel values - Validating consistent outputs - Testing classification into specific categories * * * ### Contains Check if the output contains a specific value or substring. ```python from pydantic_evals.evaluators import Contains Contains( value='substring', case_sensitive=True, as_strings=False, ) ``` **Parameters:** - `value` (Any): The value to search for - `case_sensitive` (bool): Case-sensitive comparison for strings (default: `True`) - `as_strings` (bool): Convert both values to strings before checking (default: `False`) - `evaluation_name` (str | None): Custom name for this evaluation in reports **Returns:** [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) - Pass/fail with explanation **Behavior:** For **strings**: checks substring containment - `Contains(value='hello', case_sensitive=False)` - Matches: "Hello World", "say hello", "HELLO" - Doesn't match: "hi there" For **lists/tuples**: checks membership - `Contains(value='apple')` - Matches: `['apple', 'banana']`, `('apple',)` - Doesn't match: `['apples', 'orange']` For **dicts**: checks key-value pairs - `Contains(value={'name': 'Alice'})` - Matches: `{'name': 'Alice', 'age': 30}` - Doesn't match: `{'name': 'Bob'}` **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Contains dataset = Dataset( name='contains_demo', cases=[Case(inputs='test')], evaluators=[ # Check for required keywords Contains(value='terms and conditions', case_sensitive=False), # Check for PII (fail if found) # Note: Use a custom evaluator that returns False when PII found ], ) ``` **Use Cases:** - Required content verification - Keyword detection - PII/sensitive data detection - Multi-value validation * * * ## Type Validation ### IsInstance Check if the output is an instance of a type with the given name. ```python from pydantic_evals.evaluators import IsInstance IsInstance(type_name='str') ``` **Parameters:** - `type_name` (str): The type name to check (uses `__name__` or `__qualname__`) - `evaluation_name` (str | None): Custom name for this evaluation in reports **Returns:** [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) - Pass/fail with type information **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance dataset = Dataset( name='isinstance_demo', cases=[Case(inputs='test')], evaluators=[ # Check output is always a string IsInstance(type_name='str'), # Check for Pydantic model IsInstance(type_name='MyModel'), # Check for dict IsInstance(type_name='dict'), ], ) ``` **Notes:** - Matches against both `__name__` and `__qualname__` of the type - Works with built-in types (`str`, `int`, `dict`, `list`, etc.) 
- Works with custom classes and Pydantic models - Checks the entire MRO (Method Resolution Order) for inheritance **Use Cases:** - Format validation - Structured output verification - Type consistency checks * * * ## Performance Evaluation ### MaxDuration Check if task execution time is under a maximum threshold. ```python from datetime import timedelta from pydantic_evals.evaluators import MaxDuration MaxDuration(seconds=2.0) # or MaxDuration(seconds=timedelta(seconds=2)) ``` **Parameters:** - `seconds` (float | timedelta): Maximum allowed duration **Returns:** `bool` - `True` if `ctx.duration <= seconds` **Example:** ```python from datetime import timedelta from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import MaxDuration dataset = Dataset( name='max_duration_demo', cases=[Case(inputs='test')], evaluators=[ # SLA: must respond in under 2 seconds MaxDuration(seconds=2.0), # Or using timedelta MaxDuration(seconds=timedelta(milliseconds=500)), ], ) ``` **Use Cases:** - SLA compliance - Performance regression testing - Latency requirements - Timeout validation **See Also:** [Concurrency & Performance](/docs/ai/evals/how-to/concurrency) * * * ## LLM-as-a-Judge ### LLMJudge Use an LLM to evaluate subjective qualities based on a rubric. ```python from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='Response is accurate and helpful', model='openai:gpt-5.2', include_input=False, include_expected_output=False, model_settings=None, score=False, assertion={'include_reason': True}, ) ``` **Parameters:** - `rubric` (str): The evaluation criteria (required) - `model` (Model | KnownModelName | None): Model to use (default: `'openai:gpt-5.2'`) - `include_input` (bool): Include task inputs in the prompt (default: `False`) - `include_expected_output` (bool): Include expected output in the prompt (default: `False`) - `model_settings` (ModelSettings | None): Custom model settings - `score` (OutputConfig | False): Configure score output (default: `False`) - `assertion` (OutputConfig | False): Configure assertion output (default: includes reason) **Returns:** Depends on `score` and `assertion` parameters (see below) **Output Modes:** By default, returns a **boolean assertion** with reason: - `LLMJudge(rubric='Response is polite')` - Returns: `{'LLMJudge_pass': EvaluationReason(value=True, reason='...')}` Return a **score** (0.0 to 1.0) instead: - `LLMJudge(rubric='Response quality', score={'include_reason': True}, assertion=False)` - Returns: `{'LLMJudge_score': EvaluationReason(value=0.85, reason='...')}` Return **both** score and assertion: - `LLMJudge(rubric='Response quality', score={'include_reason': True}, assertion={'include_reason': True})` - Returns: `{'LLMJudge_score': EvaluationReason(value=0.85, reason='...'), 'LLMJudge_pass': EvaluationReason(value=True, reason='...')}` **Customize evaluation names:** - `LLMJudge(rubric='Response is factually accurate', assertion={'evaluation_name': 'accuracy', 'include_reason': True})` - Returns: `{'accuracy': EvaluationReason(value=True, reason='...')}` **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='llm_judge_demo', cases=[Case(inputs='test', expected_output='result')], evaluators=[ # Basic accuracy check LLMJudge( rubric='Response is factually accurate', include_input=True, ), # Quality score with different model LLMJudge( rubric='Overall response quality', model='anthropic:claude-sonnet-4-6', score={'evaluation_name': 'quality', 
'include_reason': False}, assertion=False, ), # Check against expected output LLMJudge( rubric='Response matches the expected answer semantically', include_input=True, include_expected_output=True, ), ], ) ``` **See Also:** [LLM Judge Deep Dive](/docs/ai/evals/evaluators/llm-judge) * * * ## Span-Based Evaluation ### HasMatchingSpan Check if OpenTelemetry spans match a query (requires Logfire configuration). ```python from pydantic_evals.evaluators import HasMatchingSpan HasMatchingSpan( query={'name_contains': 'tool_call'}, evaluation_name='called_tool', ) ``` **Parameters:** - `query` ([`SpanQuery`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanQuery)): Query to match against spans - `evaluation_name` (str | None): Custom name for this evaluation in reports **Returns:** `bool` - `True` if any span matches the query **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import HasMatchingSpan dataset = Dataset( name='span_check_demo', cases=[Case(inputs='test')], evaluators=[ # Check that a specific tool was called HasMatchingSpan( query={'name_contains': 'search_database'}, evaluation_name='used_database', ), # Check for errors HasMatchingSpan( query={'has_attributes': {'error': True}}, evaluation_name='had_errors', ), # Check duration constraints HasMatchingSpan( query={ 'name_equals': 'llm_call', 'max_duration': 2.0, # seconds }, evaluation_name='llm_fast_enough', ), ], ) ``` **See Also:** [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) * * * ## Built-in Report Evaluators In addition to the case-level evaluators above, Pydantic Evals provides report evaluators that analyze entire experiment results. These are passed via the `report_evaluators` parameter on `Dataset`.

| Report Evaluator | Purpose | Output |
| --- | --- | --- |
| [`ConfusionMatrixEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ConfusionMatrixEvaluator) | Classification confusion matrix | `ConfusionMatrix` |
| [`PrecisionRecallEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.PrecisionRecallEvaluator) | PR curve with AUC | `PrecisionRecall` |

**See:** [Report Evaluators](/docs/ai/evals/evaluators/report-evaluators) for full documentation, parameters, and examples, including how to write custom report evaluators that produce `ScalarResult` and `TableResult` analyses.
* * * ## Quick Reference Table ### Case-Level Evaluators

| Evaluator | Purpose | Return Type | Cost | Speed |
| --- | --- | --- | --- | --- |
| [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected) | Exact match with expected | `bool` | Free | Instant |
| [`Equals`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Equals) | Equals specific value | `bool` | Free | Instant |
| [`Contains`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Contains) | Contains value/substring | `bool` + reason | Free | Instant |
| [`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance) | Type validation | `bool` + reason | Free | Instant |
| [`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration) | Performance threshold | `bool` | Free | Instant |
| [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) | Subjective quality | `bool` and/or `float` | $$ | Slow |
| [`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan) | Behavioral check | `bool` | Free | Fast |

### Report-Level Evaluators

| Evaluator | Purpose | Output Type | Cost | Speed |
| --- | --- | --- | --- | --- |
| [`ConfusionMatrixEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ConfusionMatrixEvaluator) | Classification matrix | `ConfusionMatrix` | Free | Instant |
| [`PrecisionRecallEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.PrecisionRecallEvaluator) | PR curve with AUC | `PrecisionRecall` | Free | Instant |

## Combining Evaluators Best practice is to combine fast deterministic checks with slower LLM evaluations: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( Contains, IsInstance, LLMJudge, MaxDuration, ) dataset = Dataset( name='combined_evaluators', cases=[Case(inputs='test')], evaluators=[ # Fast checks first (fail fast) IsInstance(type_name='str'), Contains(value='required_field'), MaxDuration(seconds=2.0), # Expensive LLM checks last LLMJudge(rubric='Response is helpful and accurate'), ], ) ``` This approach: 1. Catches format/structure issues immediately 2. Validates required content quickly 3. Only runs expensive LLM evaluation if basic checks pass 4. Provides comprehensive quality assessment ## Next Steps - **[LLM Judge](/docs/ai/evals/evaluators/llm-judge)** - Deep dive on LLM-as-a-Judge evaluation - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic - **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, etc.) - **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Using OpenTelemetry spans for behavioral checks --- # [Custom Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/custom/) # Custom Evaluators Write custom evaluators for domain-specific logic, external integrations, or specialized metrics.
## Basic Custom Evaluator All evaluators inherit from [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) and must implement `evaluate`: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): """Check if output exactly matches expected output.""" def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` **Key Points:** - Use `@dataclass` decorator (required) - Inherit from `Evaluator` - Implement `evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput` - Return `bool`, `int`, `float`, `str`, [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason), or `dict` of these ## EvaluatorContext The context provides all information about the case execution: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class MyEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Access case data ctx.name # Case name ctx.inputs # Task inputs ctx.metadata # Case metadata ctx.expected_output # Expected output (may be None) ctx.output # Actual output # Performance data ctx.duration # Task execution time (seconds) # Custom metrics/attributes (see metrics guide) ctx.metrics # dict[str, int | float] ctx.attributes # dict[str, Any] # OpenTelemetry spans (if logfire configured) ctx.span_tree # SpanTree for behavioral checks return True ``` ## Evaluator Parameters Add configurable parameters as dataclass fields: ```python from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ContainsKeyword(Evaluator): keyword: str case_sensitive: bool = True def evaluate(self, ctx: EvaluatorContext) -> bool: output = ctx.output keyword = self.keyword if not self.case_sensitive: output = output.lower() keyword = keyword.lower() return keyword in output # Usage dataset = Dataset( name='keyword_check', cases=[Case(name='test', inputs='This is important')], evaluators=[ ContainsKeyword(keyword='important', case_sensitive=False), ], ) ``` ## Return Types ### Boolean Assertions Simple pass/fail checks: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class IsValidJSON(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: try: import json json.loads(ctx.output) return True except Exception: return False ``` ### Numeric Scores Quality metrics: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class LengthScore(Evaluator): """Score based on output length (0.0 = too short, 1.0 = ideal).""" ideal_length: int = 100 tolerance: int = 20 def evaluate(self, ctx: EvaluatorContext) -> float: length = len(ctx.output) diff = abs(length - self.ideal_length) if diff <= self.tolerance: return 1.0 else: # Decay score as we move away from ideal score = max(0.0, 1.0 - (diff - self.tolerance) / self.ideal_length) return score ``` ### String Labels Categorical classifications: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class SentimentClassifier(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> str: output_lower = ctx.output.lower() if any(word in output_lower for word in ['error', 'failed', 'wrong']): return 'negative' 
elif any(word in output_lower for word in ['success', 'correct', 'great']): return 'positive' else: return 'neutral' ``` ### With Reasons Add explanations to any result: ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class SmartCheck(Evaluator): threshold: float = 0.8 def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: score = self._calculate_score(ctx.output) if score >= self.threshold: return EvaluationReason( value=True, reason=f'Score {score:.2f} exceeds threshold {self.threshold}', ) else: return EvaluationReason( value=False, reason=f'Score {score:.2f} below threshold {self.threshold}', ) def _calculate_score(self, output: str) -> float: # Your scoring logic return 0.75 ``` ### Multiple Results You can return multiple evaluations from one evaluator by returning a dictionary of key-value pairs. ```python from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationReason, Evaluator, EvaluatorContext, EvaluatorOutput, ) @dataclass class ComprehensiveCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput: format_valid = self._check_format(ctx.output) return { 'valid_format': EvaluationReason( value=format_valid, reason='Valid JSON format' if format_valid else 'Invalid JSON format', ), 'quality_score': self._score_quality(ctx.output), # float 'category': self._classify(ctx.output), # str } def _check_format(self, output: str) -> bool: return output.startswith('{') and output.endswith('}') def _score_quality(self, output: str) -> float: return len(output) / 100.0 def _classify(self, output: str) -> str: return 'short' if len(output) < 50 else 'long' ``` Each key in the returned dictionary becomes a separate result in the report. Values can be: - Primitives (`bool`, `int`, `float`, `str`) - [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) (value with explanation) - Nested dicts of these types The [`EvaluatorOutput`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorOutput) type represents all legal values that can be returned by an evaluator, and can be used as the return type annotation for your custom `evaluate` method. 
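As a minimal sketch of the nested-dict form mentioned above (the `GroupedChecks` name and its checks are illustrative, not part of the library):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext, EvaluatorOutput


@dataclass
class GroupedChecks(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput:
        output = str(ctx.output)
        # Nested dicts of primitive results are also valid EvaluatorOutput values.
        return {
            'format': {
                'looks_like_json': output.startswith('{') and output.endswith('}'),
                'length_ok': len(output) < 1000,
            },
        }
```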
### Conditional Results Evaluators can dynamically choose whether to produce results for a given case by returning an empty dict when not applicable: ```python from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationReason, Evaluator, EvaluatorContext, EvaluatorOutput, ) @dataclass class SQLValidator(Evaluator): """Only evaluates SQL queries, skips other outputs.""" def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput: # Check if this case is relevant for SQL validation if not isinstance(ctx.output, str) or not ctx.output.strip().upper().startswith( ('SELECT', 'INSERT', 'UPDATE', 'DELETE') ): # Return empty dict - this evaluator doesn't apply to this case return {} # This is a SQL query, perform validation try: # In real implementation, use sqlparse or similar is_valid = self._validate_sql(ctx.output) return { 'sql_valid': is_valid, 'sql_complexity': self._measure_complexity(ctx.output), } except Exception as e: return {'sql_valid': EvaluationReason(False, reason=f'Exception: {e}')} def _validate_sql(self, query: str) -> bool: # Simplified validation return 'FROM' in query.upper() or 'INTO' in query.upper() def _measure_complexity(self, query: str) -> str: joins = query.upper().count('JOIN') if joins == 0: return 'simple' elif joins <= 2: return 'moderate' else: return 'complex' ``` This pattern is useful when: - An evaluator only applies to certain types of outputs (e.g., code validation only for code outputs) - Validation depends on metadata tags (e.g., only evaluate cases marked with `language='python'`) - You want to run expensive checks conditionally based on other evaluator results **Key Points:** - Returning `{}` means "this evaluator doesn't apply here" - the case won't show results from this evaluator - Returning `{'key': value}` means "this evaluator applies and here are the results" - This is more practical than using case-level evaluators when it applies to a large fraction of cases, or when the condition is based on the output itself - The evaluator still runs for every case, but can short-circuit when not relevant ## Async Evaluators Use `async def` for I/O-bound operations: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class APIValidator(Evaluator): api_url: str async def evaluate(self, ctx: EvaluatorContext) -> bool: import httpx async with httpx.AsyncClient() as client: response = await client.post( self.api_url, json={'output': ctx.output}, ) return response.json()['valid'] ``` Pydantic Evals handles both sync and async evaluators automatically. 
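The conditional-results pattern above also combines naturally with case metadata (covered next). Here's a minimal sketch that only evaluates cases tagged `language='python'`; the `PythonOnlyCheck` name and the metadata key are illustrative assumptions, not part of the library:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext, EvaluatorOutput


@dataclass
class PythonOnlyCheck(Evaluator):
    """Skip any case not tagged with language='python' in its metadata."""

    def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput:
        if not ctx.metadata or ctx.metadata.get('language') != 'python':
            return {}  # this evaluator doesn't apply to this case

        # Cheap deterministic check: does the output compile as Python source?
        try:
            compile(str(ctx.output), '<output>', 'exec')
            return {'compiles': True}
        except SyntaxError:
            return {'compiles': False}
```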
## Using Metadata Access case metadata for context-aware evaluation: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class DifficultyAwareScore(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> float: # Base score base_score = self._score_output(ctx.output) # Adjust based on difficulty from metadata if ctx.metadata and 'difficulty' in ctx.metadata: difficulty = ctx.metadata['difficulty'] if difficulty == 'easy': # No leniency on easy questions return base_score elif difficulty == 'hard': # Be more lenient on hard questions return min(1.0, base_score * 1.2) return base_score def _score_output(self, output: str) -> float: # Your scoring logic return 0.8 ``` ## Using Metrics Access custom metrics set during task execution: ```python from dataclasses import dataclass from pydantic_evals import increment_eval_metric, set_eval_attribute from pydantic_evals.evaluators import Evaluator, EvaluatorContext # In your task def my_task(inputs: str) -> str: result = f'processed: {inputs}' # Record metrics increment_eval_metric('api_calls', 3) set_eval_attribute('used_cache', True) return result # In your evaluator @dataclass class EfficiencyCheck(Evaluator): max_api_calls: int = 5 def evaluate(self, ctx: EvaluatorContext) -> bool: api_calls = ctx.metrics.get('api_calls', 0) return api_calls <= self.max_api_calls ``` See [Metrics & Attributes Guide](/docs/ai/evals/how-to/metrics-attributes) for more. ## Generic Type Parameters Make evaluators type-safe with generics: ```python from dataclasses import dataclass from typing import TypeVar from pydantic_evals.evaluators import Evaluator, EvaluatorContext InputsT = TypeVar('InputsT') OutputT = TypeVar('OutputT') @dataclass class TypedEvaluator(Evaluator[InputsT, OutputT, dict]): def evaluate(self, ctx: EvaluatorContext[InputsT, OutputT, dict]) -> bool: # ctx.inputs and ctx.output are now properly typed return True ``` ## Custom Evaluation Names Control how evaluations appear in reports: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CustomNameEvaluator(Evaluator): check_type: str def get_default_evaluation_name(self) -> str: # Use check_type as the name instead of class name return f'{self.check_type}_check' def evaluate(self, ctx: EvaluatorContext) -> bool: return True # In reports, appears as "format_check" instead of "CustomNameEvaluator" evaluator = CustomNameEvaluator(check_type='format') ``` Or use the `evaluation_name` field (if using the built-in pattern): ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class MyEvaluator(Evaluator): evaluation_name: str | None = None def evaluate(self, ctx: EvaluatorContext) -> bool: return True # Usage MyEvaluator(evaluation_name='my_custom_name') ``` ## Real-World Examples ### SQL Validation ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class ValidSQL(Evaluator): dialect: str = 'postgresql' def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: try: import sqlparse parsed = sqlparse.parse(ctx.output) if not parsed: return EvaluationReason( value=False, reason='Could not parse SQL', ) # Check for dangerous operations sql_upper = ctx.output.upper() if 'DROP' in sql_upper or 'DELETE' in sql_upper: return EvaluationReason( value=False, reason='Contains dangerous operations
(DROP/DELETE)', ) return EvaluationReason( value=True, reason='Valid SQL syntax', ) except Exception as e: return EvaluationReason( value=False, reason=f'SQL parsing error: {e}', ) ``` ### Code Execution ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class ExecutablePython(Evaluator): timeout_seconds: float = 5.0 async def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: import asyncio import os import tempfile # Write code to temp file with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f: f.write(ctx.output) temp_path = f.name try: # Execute with timeout process = await asyncio.create_subprocess_exec( 'python', temp_path, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE, ) try: stdout, stderr = await asyncio.wait_for( process.communicate(), timeout=self.timeout_seconds, ) except asyncio.TimeoutError: process.kill() return EvaluationReason( value=False, reason=f'Execution timeout after {self.timeout_seconds}s', ) if process.returncode == 0: return EvaluationReason( value=True, reason='Code executed successfully', ) else: return EvaluationReason( value=False, reason=f'Execution failed: {stderr.decode()}', ) finally: os.unlink(temp_path) ``` ### External API Validation ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class APIResponseValid(Evaluator): api_endpoint: str api_key: str async def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]: import httpx try: async with httpx.AsyncClient() as client: response = await client.post( self.api_endpoint, headers={'Authorization': f'Bearer {self.api_key}'}, json={'data': ctx.output}, timeout=10.0, ) result = response.json() return { 'api_reachable': True, 'validation_passed': result.get('valid', False), 'confidence_score': result.get('confidence', 0.0), } except Exception: return { 'api_reachable': False, 'validation_passed': False, 'confidence_score': 0.0, } ``` ## Testing Evaluators Test evaluators like any other Python code: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): """Check if output exactly matches expected output.""" def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output def test_exact_match(): evaluator = ExactMatch() # Test match ctx = EvaluatorContext( name='test', inputs='input', metadata=None, expected_output='expected', output='expected', duration=0.1, _span_tree=None, attributes={}, metrics={}, ) assert evaluator.evaluate(ctx) is True # Test mismatch ctx.output = 'different' assert evaluator.evaluate(ctx) is False ``` ## Best Practices ### 1\. 
Keep Evaluators Focused Each evaluator should check one thing: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext def check_format(output: str) -> bool: return output.startswith('{') def check_content(output: str) -> bool: return len(output) > 10 def check_length(output: str) -> bool: return len(output) < 1000 def check_spelling(output: str) -> bool: return True # Placeholder # Bad: Doing too much @dataclass class EverythingChecker(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict: return { 'format_valid': check_format(ctx.output), 'content_good': check_content(ctx.output), 'length_ok': check_length(ctx.output), 'spelling_correct': check_spelling(ctx.output), } # Good: Separate evaluators @dataclass class FormatValidator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return check_format(ctx.output) @dataclass class ContentChecker(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return check_content(ctx.output) @dataclass class LengthChecker(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return check_length(ctx.output) @dataclass class SpellingChecker(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return check_spelling(ctx.output) ``` Some exceptions to this: - When there is a significant amount of shared computation or network request latency, it may be better to have a single evaluator calculate all dependent outputs together. - If multiple checks are tightly coupled or very closely related to each other, it may make sense to include all their logic in one evaluator. ### 2\. Handle Missing Data Gracefully ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class SafeEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: if ctx.expected_output is None: return EvaluationReason( value=True, reason='Skipped: no expected output provided', ) # Your evaluation logic ... ``` ### 3\. Provide Helpful Reasons ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class HelpfulEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: # Bad return EvaluationReason(value=False, reason='Failed') # Good return EvaluationReason( value=False, reason=f'Expected {ctx.expected_output!r}, got {ctx.output!r}', ) ``` ### 4\. 
Use Timeouts for External Calls ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class APIEvaluator(Evaluator): timeout: float = 10.0 async def _call_api(self, output: str) -> bool: # Placeholder for API call return True async def evaluate(self, ctx: EvaluatorContext) -> bool: import asyncio try: return await asyncio.wait_for( self._call_api(ctx.output), timeout=self.timeout, ) except asyncio.TimeoutError: return False ``` ## Next Steps - **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, custom tables) - **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Using OpenTelemetry spans - **[Examples](/docs/ai/evals/examples/simple-validation)** - Practical examples --- # [LLM Judge](https://pydantic.dev/docs/ai/evals/evaluators/llm-judge/) # LLM Judge The [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluator uses an LLM to assess subjective qualities of outputs based on a rubric. ## When to Use LLM-as-a-Judge LLM judges are ideal for evaluating qualities that require understanding and judgment: **Good Use Cases:** - Factual accuracy - Helpfulness and relevance - Tone and style compliance - Completeness of responses - Following complex instructions - RAG groundedness (does the answer use provided context?) - Citation accuracy **Poor Use Cases:** - Format validation (use [`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance) instead) - Exact matching (use [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected)) - Performance checks (use [`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration)) - Deterministic logic (write a custom evaluator) ## Basic Usage ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='factual_accuracy', cases=[Case(inputs='test')], evaluators=[ LLMJudge(rubric='Response is factually accurate'), ], ) ``` ## Configuration Options ### Rubric The `rubric` is your evaluation criteria. Be specific and clear: **Bad rubrics (vague):** ```python from pydantic_evals.evaluators import LLMJudge LLMJudge(rubric='Good response') # Too vague LLMJudge(rubric='Check quality') # What aspect of quality? 
``` **Good rubrics (specific):** ```python from pydantic_evals.evaluators import LLMJudge LLMJudge(rubric='Response directly answers the user question without hallucination') LLMJudge(rubric='Response uses formal, professional language appropriate for business communication') LLMJudge(rubric='All factual claims in the response are supported by the provided context') ``` ### Including Context Control what information the judge sees: ```python from pydantic_evals.evaluators import LLMJudge # Output only (default) LLMJudge(rubric='Response is polite') # Output + Input LLMJudge( rubric='Response accurately answers the input question', include_input=True, ) # Output + Input + Expected Output LLMJudge( rubric='Response is semantically equivalent to the expected output', include_input=True, include_expected_output=True, ) ``` **Example:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='math_check', cases=[ Case( inputs='What is 2+2?', expected_output='4', ), ], evaluators=[ # This judge sees: output + inputs + expected_output LLMJudge( rubric='Response provides the same answer as expected, possibly with explanation', include_input=True, include_expected_output=True, ), ], ) ``` ### Model Selection Choose the judge model based on cost/quality tradeoffs: ```python from pydantic_evals.evaluators import LLMJudge # Default: openai:gpt-5.2 (good balance) LLMJudge(rubric='...') # Anthropic Claude (alternative) LLMJudge( rubric='...', model='anthropic:claude-sonnet-4-6', ) # Cheaper option for simple checks LLMJudge( rubric='Response contains profanity', model='openai:gpt-5-mini', ) # Premium option for nuanced evaluation LLMJudge( rubric='Response demonstrates deep understanding of quantum mechanics', model='anthropic:claude-opus-4-5', ) ``` ### Model Settings Customize model behavior: ```python from pydantic_ai import ModelSettings from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='...', model_settings=ModelSettings( temperature=0.0, # Deterministic evaluation max_tokens=100, # Shorter responses ), ) ``` ## Output Modes ### Assertion Only (Default) Returns pass/fail with reason: ```python from pydantic_evals.evaluators import LLMJudge LLMJudge(rubric='Response is accurate') # Returns: {'LLMJudge_pass': EvaluationReason(value=True, reason='...')} ``` In reports: ``` ┃ Assertions ┃ ┃ ✔ ┃ ``` ### Score Only Returns a numeric score (0.0 to 1.0): ```python from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='Response quality', score={'include_reason': True}, assertion=False, ) # Returns: {'LLMJudge_score': EvaluationReason(value=0.85, reason='...')} ``` In reports: ``` ┃ Scores ┃ ┃ LLMJudge_score: 0.85 ┃ ``` ### Both Score and Assertion ```python from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='Response quality', score={'include_reason': True}, assertion={'include_reason': True}, ) # Returns: { # 'LLMJudge_score': EvaluationReason(value=0.85, reason='...'), # 'LLMJudge_pass': EvaluationReason(value=True, reason='...'), # } ``` ### Custom Names ```python from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='Response is factually accurate', assertion={ 'evaluation_name': 'accuracy', 'include_reason': True, }, ) # Returns: {'accuracy': EvaluationReason(value=True, reason='...')} ``` In reports: ``` ┃ Assertions ┃ ┃ accuracy: ✔ ┃ ``` ## Practical Examples ### RAG Evaluation Evaluate whether a RAG system uses provided context: ```python from dataclasses import dataclass from
pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge @dataclass class RAGInput: question: str context: str dataset = Dataset( name='rag_evaluation', cases=[ Case( inputs=RAGInput( question='What is the capital of France?', context='France is a country in Europe. Its capital is Paris.', ), ), ], evaluators=[ LLMJudge( rubric='Response answers the question using only information from the provided context', include_input=True, assertion={'evaluation_name': 'grounded', 'include_reason': True}, ), LLMJudge( rubric='Response cites specific quotes or facts from the context', include_input=True, assertion={'evaluation_name': 'uses_citations', 'include_reason': True}, ), ], ) ``` ### Recipe Generation with Case-Specific Rubrics This example shows how to use both dataset-level and case-specific evaluators: recipe\_evaluation.py ```python from __future__ import annotations from typing import Any from pydantic import BaseModel from pydantic_ai import Agent, format_as_xml from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance, LLMJudge class CustomerOrder(BaseModel): dish_name: str dietary_restriction: str | None = None class Recipe(BaseModel): ingredients: list[str] steps: list[str] recipe_agent = Agent( 'openai:gpt-5-mini', output_type=Recipe, instructions=( 'Generate a recipe to cook the dish that meets the dietary restrictions.' ), ) async def transform_recipe(customer_order: CustomerOrder) -> Recipe: r = await recipe_agent.run(format_as_xml(customer_order)) return r.output recipe_dataset = Dataset[CustomerOrder, Recipe, Any]( name='recipe_evaluation', cases=[ Case( name='vegetarian_recipe', inputs=CustomerOrder( dish_name='Spaghetti Bolognese', dietary_restriction='vegetarian' ), expected_output=None, metadata={'focus': 'vegetarian'}, evaluators=( # (1) LLMJudge( rubric='Recipe should not contain meat or animal products', ), ), ), Case( name='gluten_free_recipe', inputs=CustomerOrder( dish_name='Chocolate Cake', dietary_restriction='gluten-free' ), expected_output=None, metadata={'focus': 'gluten-free'}, evaluators=( # (2) LLMJudge( rubric='Recipe should not contain gluten or wheat products', ), ), ), ], evaluators=[ # (3) IsInstance(type_name='Recipe'), LLMJudge( rubric='Recipe should have clear steps and relevant ingredients', include_input=True, model='anthropic:claude-sonnet-4-6', ), ], ) report = recipe_dataset.evaluate_sync(transform_recipe) print(report) """ Evaluation Summary: transform_recipe ┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ vegetarian_recipe │ ✔✔✔ │ 38.1s │ ├────────────────────┼────────────┼──────────┤ │ gluten_free_recipe │ ✔✔✔ │ 22.4s │ ├────────────────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 30.3s │ └────────────────────┴────────────┴──────────┘ """ ```

1. Case-specific evaluator - only runs for the vegetarian recipe case
2. Case-specific evaluator - only runs for the gluten-free recipe case
3. Dataset-level evaluators - run for all cases

### Multi-Aspect Evaluation Use multiple judges for different quality dimensions: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='multi_aspect', cases=[Case(inputs='test')], evaluators=[ # Accuracy LLMJudge( rubric='Response is factually accurate', include_input=True, assertion={'evaluation_name': 'accurate'}, ), # Helpfulness LLMJudge( rubric='Response is helpful and actionable', include_input=True,
score={'evaluation_name': 'helpfulness'}, assertion=False, ), # Tone LLMJudge( rubric='Response uses professional, respectful language', assertion={'evaluation_name': 'professional_tone'}, ), # Safety LLMJudge( rubric='Response contains no harmful, biased, or inappropriate content', assertion={'evaluation_name': 'safe'}, ), ], ) ``` ### Comparative Evaluation Compare output against expected output: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='comparative_eval', cases=[ Case( name='translation', inputs='Hello world', expected_output='Bonjour le monde', ), ], evaluators=[ LLMJudge( rubric='Response is semantically equivalent to the expected output', include_input=True, include_expected_output=True, score={'evaluation_name': 'semantic_similarity'}, assertion={'evaluation_name': 'correct_meaning'}, ), ], ) ``` ## Best Practices ### 1\. Be Specific in Rubrics **Bad:** ```python from pydantic_evals.evaluators import LLMJudge LLMJudge(rubric='Good answer') ``` **Better:** ```python from pydantic_evals.evaluators import LLMJudge LLMJudge(rubric='Response accurately answers the question without hallucinating facts') ``` **Best:** ```python from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric=''' Response must: 1. Directly answer the question asked 2. Use only information from the provided context 3. Cite specific passages from the context 4. Acknowledge if information is insufficient ''', include_input=True, ) ``` ### 2\. Use Multiple Judges Don't try to evaluate everything with one rubric: ```python from pydantic_evals.evaluators import LLMJudge # Instead of this: LLMJudge(rubric='Response is good, accurate, helpful, and safe') # Do this: evaluators = [ LLMJudge(rubric='Response is factually accurate'), LLMJudge(rubric='Response is helpful and actionable'), LLMJudge(rubric='Response is safe and appropriate'), ] ``` ### 3\. Combine with Deterministic Checks Don't use LLM evaluation for checks that can be done deterministically: ```python from pydantic_evals.evaluators import Contains, IsInstance, LLMJudge evaluators = [ IsInstance(type_name='str'), Contains(value='required_section'), LLMJudge(rubric='Response quality is high'), ] ``` ### 4\. Use Temperature 0 for Consistency ```python from pydantic_ai import ModelSettings from pydantic_evals.evaluators import LLMJudge LLMJudge( rubric='...', model_settings=ModelSettings(temperature=0.0), ) ``` ## Limitations ### Non-Determinism LLM judges are not deterministic. The same output may receive different scores across runs. **Mitigation:** - Use `temperature=0.0` for more consistency - Run multiple evaluations and average - Use retry strategies for flaky evaluations ### Cost LLM judges make API calls, which cost money and time. **Mitigation:** - Use cheaper models for simple checks (`gpt-5-mini`) - Run deterministic checks first to fail fast - Cache results when possible - Limit evaluation to changed cases ### Model Biases LLM judges inherit biases from their training data. **Mitigation:** - Use multiple judge models and compare - Review evaluation reasons, not just scores - Validate judges against human-labeled test sets - Be aware of known biases (length bias, style preferences) ### Context Limits Judges have token limits for inputs.
**Mitigation:** - Truncate long inputs/outputs intelligently - Use focused rubrics that don't require full context - Consider chunked evaluation for very long content ## Debugging LLM Judges ### View Reasons ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset( name='debug_reasons', cases=[Case(inputs='test')], evaluators=[LLMJudge(rubric='Response is clear')], ) report = dataset.evaluate_sync(my_task) report.print(include_reasons=True) """ Evaluation Summary: my_task ┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━┩ │ Case 1 │ LLMJudge: ✔ │ 10ms │ │ │ Reason: - │ │ │ │ │ │ │ │ │ │ ├──────────┼─────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴─────────────┴──────────┘ """ ``` With a custom `evaluation_name` and reasons enabled, report cells look like: ``` ┃ Assertions ┃ ┃ accuracy: ✔ ┃ ┃ Reason: The response ┃ ┃ correctly states... ┃ ``` ### Access Programmatically ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset( name='programmatic_access', cases=[Case(inputs='test')], evaluators=[LLMJudge(rubric='Response is clear')], ) report = dataset.evaluate_sync(my_task) for case in report.cases: for name, result in case.assertions.items(): print(f'{name}: {result.value}') #> LLMJudge: True if result.reason: print(f' Reason: {result.reason}') #> Reason: - ``` ### Compare Judges Test the same cases with different judge models: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge def my_task(inputs: str) -> str: return f'Result: {inputs}' judges = [ LLMJudge(rubric='Response is clear', model='openai:gpt-5.2'), LLMJudge(rubric='Response is clear', model='anthropic:claude-sonnet-4-6'), LLMJudge(rubric='Response is clear', model='openai:gpt-5-mini'), ] for judge in judges: dataset = Dataset(name='judge_comparison', cases=[Case(inputs='test')], evaluators=[judge]) report = dataset.evaluate_sync(my_task) # Compare results ``` ## Advanced: Custom Judge Models Set a default judge model for all `LLMJudge` evaluators: ```python from pydantic_evals.evaluators import LLMJudge from pydantic_evals.evaluators.llm_as_a_judge import set_default_judge_model # Set default to Claude set_default_judge_model('anthropic:claude-sonnet-4-6') # Now all LLMJudge instances use Claude by default LLMJudge(rubric='...') # Uses Claude ``` ## Next Steps - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write custom evaluation logic - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete evaluator reference --- # [Overview](https://pydantic.dev/docs/ai/evals/evaluators/overview/) # Overview Evaluators are the core of Pydantic Evals. They analyze task outputs and provide scores, labels, or pass/fail assertions.
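For orientation, here is a minimal end-to-end sketch of where evaluators fit (the dataset name and task are illustrative; `EqualsExpected` is one of the built-in evaluators discussed below):

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected


def my_task(inputs: str) -> str:
    return inputs.upper()


dataset = Dataset(
    name='overview_quickstart',
    cases=[Case(inputs='hi', expected_output='HI')],
    # EqualsExpected asserts that the task output equals each case's expected_output
    evaluators=[EqualsExpected()],
)

report = dataset.evaluate_sync(my_task)
print(report)
```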
## When to Use Different Evaluators ### Deterministic Checks (Fast & Reliable) Use deterministic evaluators when you can define exact rules:

| Evaluator | Use Case | Example |
| --- | --- | --- |
| [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected) | Exact output match | Structured data, classification |
| [`Equals`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Equals) | Equals specific value | Checking for sentinel values |
| [`Contains`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Contains) | Substring/element check | Required keywords, PII detection |
| [`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance) | Type validation | Format validation |
| [`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration) | Performance threshold | SLA compliance |
| [`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan) | Behavior verification | Tool calls, code paths |

**Advantages:** - Fast execution (microseconds to milliseconds) - Deterministic results - No cost - Easy to debug **When to use:** - Format validation (JSON structure, type checking) - Required content checks (must contain X, must not contain Y) - Performance requirements (latency, token counts) - Behavioral checks (which tools were called, which code paths executed) ### LLM-as-a-Judge (Flexible & Nuanced) Use [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) when evaluation requires understanding or judgment: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='llm_judge_example', cases=[Case(inputs='What is 2+2?', expected_output='4')], evaluators=[ LLMJudge( rubric='Response is factually accurate based on the input', include_input=True, ) ], ) ``` **Advantages:** - Can evaluate subjective qualities (helpfulness, tone, creativity) - Understands natural language - Can follow complex rubrics - Flexible across domains **Disadvantages:** - Slower (seconds per evaluation) - Costs money - Non-deterministic - Can have biases **When to use:** - Factual accuracy - Relevance and helpfulness - Tone and style - Completeness - Following instructions - RAG quality (groundedness, citation accuracy) ### Custom Evaluators Custom evaluators let you implement evaluation logic the framework doesn't provide; they are especially useful for domain-specific checks: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ValidSQL(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: try: import sqlparse sqlparse.parse(ctx.output) return True except Exception: return False ``` **When to use:** - Domain-specific validation (SQL syntax, regex patterns, business rules) - External API calls (running generated code, checking databases) - Complex calculations (precision/recall, BLEU scores) - Integration checks (does API call succeed?) ## Evaluation Types Detailed Return Types Guide For full detail about precisely what custom Evaluators may return, see [Custom Evaluator Return Types](/docs/ai/evals/evaluators/custom#return-types). Evaluators essentially return three types of results: ### 1\. Assertions (bool) Pass/fail checks that appear as ✔ or ✗ in reports: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class HasKeyword(Evaluator): keyword: str def evaluate(self, ctx: EvaluatorContext) -> bool: return self.keyword in ctx.output ``` **Use for:** Binary checks, quality gates, compliance requirements ### 2\. Scores (int or float) Numeric metrics: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ConfidenceScore(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> float: # Analyze and return score return 0.87 # 87% confidence ``` **Use for:** Quality metrics, ranking, A/B testing, regression tracking ### 3\. Labels (str) Categorical classifications: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class SentimentClassifier(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> str: if 'error' in ctx.output.lower(): return 'error' elif 'success' in ctx.output.lower(): return 'success' return 'neutral' ``` **Use for:** Classification, error categorization, quality buckets ### Multiple Results You can return multiple evaluations from a single evaluator: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ComprehensiveCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float | str]: return { 'valid_format': self._check_format(ctx.output), # bool 'quality_score': self._score_quality(ctx.output), # float 'category': self._classify(ctx.output), # str } def _check_format(self, output: str) -> bool: return True def _score_quality(self, output: str) -> float: return 0.85 def _classify(self, output: str) -> str: return 'good' ``` ## Combining Evaluators Mix and match evaluators to create comprehensive evaluation suites: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( Contains, IsInstance, LLMJudge, MaxDuration, ) dataset = Dataset( name='layered_evaluation', cases=[Case(inputs='test', expected_output='result')], evaluators=[ # Fast deterministic checks first IsInstance(type_name='str'), Contains(value='required_field'), MaxDuration(seconds=2.0), # Slower LLM checks after LLMJudge( rubric='Response is accurate and helpful', include_input=True, ), ], ) ``` ## Case-specific evaluators Case-specific evaluators are one of the most powerful features for building comprehensive evaluation suites.
You can attach evaluators to individual [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) objects that only run for those specific cases: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance, LLMJudge dataset = Dataset( name='case_specific_evaluators', cases=[ Case( name='greeting_response', inputs='Say hello', evaluators=[ # This evaluator only runs for this case LLMJudge( rubric='Response is warm and friendly, uses casual tone', include_input=True, ), ], ), Case( name='formal_response', inputs='Write a business email', evaluators=[ # Different requirements for this case LLMJudge( rubric='Response is professional and formal, uses business language', include_input=True, ), ], ), ], evaluators=[ # This runs for ALL cases IsInstance(type_name='str'), ], ) ``` ### Why Case-Specific Evaluators Matter Case-specific evaluators solve a fundamental problem with one-size-fits-all evaluation: **if you could write a single evaluator rubric that perfectly captured your requirements across all cases, you'd just incorporate that rubric into your agent's instructions**. (Note: this is less relevant in cases where you want to use a cheaper model in production and assess it using a more expensive model, but in many cases it makes sense to use the best model you can in production.) The power of case-specific evaluation comes from the nuance: - **Different cases have different requirements**: A customer support response needs empathy; a technical API response needs precision - **Avoid "inmates running the asylum"**: If your LLMJudge rubric is generic enough to work everywhere, your agent should already be following it - **Capture nuanced golden behavior**: Each case can specify exactly what "good" looks like for that scenario ### Building Golden Datasets with Case-Specific LLMJudge A particularly powerful pattern is using case-specific [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators to quickly build comprehensive, maintainable evaluation suites. Instead of needing exact `expected_output` values, you can describe what you care about: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='golden_dataset', cases=[ Case( name='handle_refund_request', inputs={'query': 'I want my money back', 'order_id': '12345'}, evaluators=[ LLMJudge( rubric=""" Response should: 1. Acknowledge the refund request empathetically 2. Ask for the reason for the refund 3. Mention our 30-day refund policy 4. NOT process the refund immediately (needs manager approval) """, include_input=True, ), ], ), Case( name='handle_shipping_question', inputs={'query': 'Where is my order?', 'order_id': '12345'}, evaluators=[ LLMJudge( rubric=""" Response should: 1. Confirm the order number 2. Provide tracking information 3. Give estimated delivery date 4. Be brief and factual (not overly apologetic) """, include_input=True, ), ], ), Case( name='handle_angry_customer', inputs={'query': 'This is completely unacceptable!', 'order_id': '12345'}, evaluators=[ LLMJudge( rubric=""" Response should: 1. Prioritize de-escalation with empathy 2. Avoid being defensive 3. Offer concrete next steps 4. 
Use phrases like "I understand" and "Let me help" """, include_input=True, ), ], ), ], ) ``` This approach lets you: - **Build comprehensive test suites quickly**: Just describe what you want per case - **Maintain easily**: Update rubrics as requirements change, without regenerating outputs - **Cover edge cases naturally**: Add new cases with specific requirements as you discover them - **Capture domain knowledge**: Each rubric documents what "good" means for that scenario The LLM evaluator excels at understanding nuanced requirements and assessing compliance, making this a practical way to create thorough evaluation coverage without brittleness. ## Async vs Sync Evaluators can be sync or async: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class SyncEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True async def some_async_operation() -> bool: return True @dataclass class AsyncEvaluator(Evaluator): async def evaluate(self, ctx: EvaluatorContext) -> bool: result = await some_async_operation() return result ``` Pydantic Evals handles both automatically. Use async when: - Making API calls - Running database queries - Performing I/O operations - Calling LLMs (like [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge)) ## Evaluation Context All evaluators receive an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext): - `ctx.inputs` - Task inputs - `ctx.output` - Task output (to evaluate) - `ctx.expected_output` - Expected output (if provided) - `ctx.metadata` - Case metadata (if provided) - `ctx.duration` - Task execution time (seconds) - `ctx.span_tree` - OpenTelemetry spans (if logfire configured) - `ctx.metrics` - Custom metrics dict - `ctx.attributes` - Custom attributes dict This gives evaluators full context to make informed assessments. ## Error Handling If an evaluator raises an exception, it's captured as an [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure): ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext def risky_operation(output: str) -> bool: # This might raise an exception if 'error' in output: raise ValueError('Found error in output') return True @dataclass class RiskyEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # If this raises an exception, it will be captured result = risky_operation(ctx.output) return result ``` Failures appear in `report.cases[i].evaluator_failures` with: - Evaluator name - Error message - Full stacktrace Use retry configuration to handle transient failures (see [Retry Strategies](/docs/ai/evals/how-to/retry-strategies)). ## Report Evaluators (Experiment-Wide) All the evaluators above run once per case. **Report evaluators** are different: they run once per experiment after all cases have been evaluated, and analyze the full set of results together. 
Use report evaluators for experiment-wide statistics like: - **Confusion matrices** -- visualize classification accuracy across classes - **Precision-recall curves** -- assess ranking quality with AUC scores - **Scalar metrics** -- overall accuracy, F1, BLEU, or any single number - **Summary tables** -- per-class breakdowns, error category summaries ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ConfusionMatrixEvaluator dataset = Dataset( name='report_evaluator_example', cases=[ Case(inputs='meow', expected_output='cat'), Case(inputs='woof', expected_output='dog'), ], report_evaluators=[ ConfusionMatrixEvaluator( predicted_from='output', expected_from='expected_output', ), ], ) ``` **See:** [Report Evaluators](/docs/ai/evals/evaluators/report-evaluators) for the full guide, including built-in report evaluators and how to write custom ones. ## Next Steps - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete reference of all provided evaluators - **[LLM Judge](/docs/ai/evals/evaluators/llm-judge)** - Deep dive on LLM-as-a-Judge evaluation - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic - **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses - **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Evaluate using OpenTelemetry spans --- # [Report Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/report-evaluators/) # Report Evaluators Report evaluators analyze entire experiment results rather than individual cases. Use them to compute experiment-wide statistics like confusion matrices, precision-recall curves, accuracy scores, or custom summary tables. ## How Report Evaluators Work Regular [evaluators](/docs/ai/evals/evaluators/overview) run once per case and assess individual outputs. Report evaluators run once per experiment _after_ all cases have been evaluated, receiving the full [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) as input. ``` Cases executed → Case evaluators run → Report evaluators run → Final report ``` Results from report evaluators are stored as **analyses** on the report and, when Logfire is configured, are attached to the experiment span as structured attributes for visualization. ## Using Report Evaluators Pass report evaluators to `Dataset` via the `report_evaluators` parameter: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ConfusionMatrixEvaluator def my_classifier(text: str) -> str: text = text.lower() if 'cat' in text or 'meow' in text: return 'cat' elif 'dog' in text or 'bark' in text: return 'dog' return 'unknown' dataset = Dataset( name='animal_classifier', cases=[ Case(name='cat', inputs='The cat goes meow', expected_output='cat'), Case(name='dog', inputs='The dog barks', expected_output='dog'), ], report_evaluators=[ ConfusionMatrixEvaluator( predicted_from='output', expected_from='expected_output', title='Animal Classification', ), ], ) report = dataset.evaluate_sync(my_classifier) # report.analyses contains the ConfusionMatrix result ``` ## Built-in Report Evaluators ### ConfusionMatrixEvaluator Builds a confusion matrix comparing predicted vs expected labels across all cases. 
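In the resulting matrix, each row is an expected (true) label and each column a predicted label (the `matrix[expected][predicted]` convention described under [ConfusionMatrix](#confusionmatrix) below), so the diagonal counts correct classifications. As an illustrative sketch for two classes:

```
              predicted=cat  predicted=dog
expected=cat        45             5
expected=dog         3            47
```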
```python from pydantic_evals.evaluators import ConfusionMatrixEvaluator ConfusionMatrixEvaluator( predicted_from='output', expected_from='expected_output', title='My Confusion Matrix', ) ``` **Parameters:**

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `predicted_from` | `'expected_output' \| 'output' \| 'metadata' \| 'labels'` | `'output'` | Source for predicted values |
| `predicted_key` | `str \| None` | `None` | Key to extract when using `metadata` or `labels` |
| `expected_from` | `'expected_output' \| 'output' \| 'metadata' \| 'labels'` | `'expected_output'` | Source for expected/true values |
| `expected_key` | `str \| None` | `None` | Key to extract when using `metadata` or `labels` |
| `title` | `str` | `'Confusion Matrix'` | Title shown in reports |

**Returns:** [`ConfusionMatrix`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ConfusionMatrix) **Data Sources:** - `'output'` -- the task's actual output (converted to string) - `'expected_output'` -- the case's expected output (converted to string) - `'metadata'` -- a value from the case's metadata dict (requires `predicted_key`/`expected_key`) - `'labels'` -- a label result from a case-level evaluator (requires `predicted_key`/`expected_key`) **Example -- classification with expected outputs:** ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ConfusionMatrixEvaluator dataset = Dataset( name='animal_sounds', cases=[ Case(inputs='meow', expected_output='cat'), Case(inputs='woof', expected_output='dog'), Case(inputs='chirp', expected_output='bird'), ], report_evaluators=[ ConfusionMatrixEvaluator( predicted_from='output', expected_from='expected_output', ), ], ) ``` **Example -- using evaluator labels:** If a case-level evaluator produces a label like `predicted_class`, you can reference it: ```python from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( ConfusionMatrixEvaluator, Evaluator, EvaluatorContext, ) @dataclass class ClassifyOutput(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, str]: # Classify the output into a category return {'predicted_class': categorize(ctx.output)} def categorize(output: str) -> str: return 'positive' if 'good' in output.lower() else 'negative' dataset = Dataset( name='labels_example', cases=[Case(inputs='test', expected_output='positive')], evaluators=[ClassifyOutput()], report_evaluators=[ ConfusionMatrixEvaluator( predicted_from='labels', predicted_key='predicted_class', expected_from='expected_output', ), ], ) ``` * * * ### PrecisionRecallEvaluator Computes a precision-recall curve with AUC (area under the curve) from numeric scores and binary ground-truth labels.
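As a reminder of the standard definitions: treating cases whose score is at or above a threshold `t` as predicted-positive, precision = TP / (TP + FP) is the fraction of predicted positives that are truly positive, and recall = TP / (TP + FN) is the fraction of true positives that are caught. Each point on the curve corresponds to one threshold.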
```python from pydantic_evals.evaluators import PrecisionRecallEvaluator PrecisionRecallEvaluator( score_from='scores', score_key='confidence', positive_from='assertions', positive_key='is_correct', ) ``` **Parameters:**

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `score_key` | `str` | _(required)_ | Key in scores or metrics dict |
| `positive_from` | `'expected_output' \| 'assertions' \| 'labels'` | _(required)_ | Source for ground-truth binary labels |
| `positive_key` | `str \| None` | `None` | Key in assertions or labels dict |
| `score_from` | `'scores' \| 'metrics'` | `'scores'` | Source for numeric scores |
| `title` | `str` | `'Precision-Recall Curve'` | Title shown in reports |
| `n_thresholds` | `int` | `100` | Number of threshold points on the curve |

**Returns:** [`PrecisionRecall`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.PrecisionRecall) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (AUC) The AUC is computed at full resolution (using every unique score as a threshold) for accuracy, then the curve points are downsampled to `n_thresholds` for display. The AUC is returned both on the curve (for chart rendering) and as a separate `ScalarResult` for querying and sorting. **Score Sources:** - `'scores'` -- a numeric score from a case-level evaluator (looked up by `score_key`) - `'metrics'` -- a custom metric set during task execution (looked up by `score_key`) **Positive Sources:** - `'assertions'` -- a boolean assertion from a case-level evaluator (looked up by `positive_key`) - `'labels'` -- a label result cast to boolean (looked up by `positive_key`) - `'expected_output'` -- the case's expected output cast to boolean **Example:** ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( Evaluator, EvaluatorContext, PrecisionRecallEvaluator, ) @dataclass class ConfidenceEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, Any]: confidence = calculate_confidence(ctx.output) return { 'confidence': confidence, # numeric score 'is_correct': ctx.output == ctx.expected_output, # boolean assertion } def calculate_confidence(output: str) -> float: return 0.85 # placeholder dataset = Dataset( name='precision_recall_example', cases=[ Case(inputs='test 1', expected_output='cat'), Case(inputs='test 2', expected_output='dog'), ], evaluators=[ConfidenceEvaluator()], report_evaluators=[ PrecisionRecallEvaluator( score_from='scores', score_key='confidence', positive_from='assertions', positive_key='is_correct', ), ], ) ``` * * * ### ROCAUCEvaluator Computes an ROC (Receiver Operating Characteristic) curve and AUC from numeric scores and binary ground-truth labels. The ROC curve plots the True Positive Rate against the False Positive Rate at various threshold values, with a dashed random-baseline diagonal for reference.
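For reference, TPR = TP / (TP + FN) (i.e. recall) and FPR = FP / (FP + TN). An AUC of 1.0 indicates perfect separation of positives from negatives, while 0.5 matches the random baseline diagonal.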
```python from pydantic_evals.evaluators import ROCAUCEvaluator ROCAUCEvaluator( score_key='confidence', positive_from='assertions', positive_key='is_correct', ) ``` **Parameters:**

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `score_key` | `str` | _(required)_ | Key in scores or metrics dict |
| `positive_from` | `'expected_output' \| 'assertions' \| 'labels'` | _(required)_ | Source for ground-truth binary labels |
| `positive_key` | `str \| None` | `None` | Key in assertions or labels dict |
| `score_from` | `'scores' \| 'metrics'` | `'scores'` | Source for numeric scores |
| `title` | `str` | `'ROC Curve'` | Title shown in reports |
| `n_thresholds` | `int` | `100` | Number of threshold points on the curve |

**Returns:** [`LinePlot`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.LinePlot) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (AUC) The AUC is computed at full resolution. The chart includes a dashed "Random" baseline diagonal from (0, 0) to (1, 1) for visual comparison. **Score and Positive Sources:** Same as [`PrecisionRecallEvaluator`](#precisionrecallevaluator). * * * ### KolmogorovSmirnovEvaluator Computes a Kolmogorov-Smirnov plot and KS statistic from numeric scores and binary ground-truth labels. The KS plot shows the empirical CDFs (cumulative distribution functions) of the score distribution for positive and negative cases. The KS statistic is the maximum vertical distance between the two CDFs -- higher values indicate better class separation. ```python from pydantic_evals.evaluators import KolmogorovSmirnovEvaluator KolmogorovSmirnovEvaluator( score_key='confidence', positive_from='assertions', positive_key='is_correct', ) ``` **Parameters:**

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `score_key` | `str` | _(required)_ | Key in scores or metrics dict |
| `positive_from` | `'expected_output' \| 'assertions' \| 'labels'` | _(required)_ | Source for ground-truth binary labels |
| `positive_key` | `str \| None` | `None` | Key in assertions or labels dict |
| `score_from` | `'scores' \| 'metrics'` | `'scores'` | Source for numeric scores |
| `title` | `str` | `'KS Plot'` | Title shown in reports |
| `n_thresholds` | `int` | `100` | Number of threshold points on the curve |

**Returns:** [`LinePlot`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.LinePlot) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (KS Statistic) **Score and Positive Sources:** Same as [`PrecisionRecallEvaluator`](#precisionrecallevaluator).
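Both evaluators wire into a dataset exactly like `PrecisionRecallEvaluator`. As a sketch, reusing the same placeholder confidence-scoring pattern from the example above:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    Evaluator,
    EvaluatorContext,
    KolmogorovSmirnovEvaluator,
    ROCAUCEvaluator,
)


@dataclass
class ConfidenceEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, Any]:
        return {
            'confidence': 0.85,  # placeholder numeric score
            'is_correct': ctx.output == ctx.expected_output,  # boolean assertion
        }


dataset = Dataset(
    name='roc_ks_example',
    cases=[
        Case(inputs='test 1', expected_output='cat'),
        Case(inputs='test 2', expected_output='dog'),
    ],
    evaluators=[ConfidenceEvaluator()],
    report_evaluators=[
        ROCAUCEvaluator(
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
        KolmogorovSmirnovEvaluator(
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
    ],
)
```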
* * * ## Custom Report Evaluators Write custom report evaluators by inheriting from [`ReportEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ReportEvaluator) and implementing the `evaluate` method: ```python from dataclasses import dataclass from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext from pydantic_evals.reporting.analyses import ScalarResult @dataclass class AccuracyEvaluator(ReportEvaluator): """Computes overall accuracy as a scalar metric.""" def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult: cases = ctx.report.cases if not cases: return ScalarResult(title='Accuracy', value=0.0, unit='%') correct = sum( 1 for case in cases if case.output == case.expected_output ) accuracy = correct / len(cases) * 100 return ScalarResult(title='Accuracy', value=accuracy, unit='%') ``` ### ReportEvaluatorContext The context passed to `evaluate()` contains: - `ctx.name` -- the experiment name - `ctx.report` -- the full [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) with all case results - `ctx.experiment_metadata` -- optional experiment-level metadata dict Through `ctx.report.cases`, you can access each case's inputs, outputs, expected outputs, scores, labels, assertions, metrics, and attributes. ### Return Types Report evaluators must return a `ReportAnalysis` or a `list[ReportAnalysis]`. The available analysis types are: #### ScalarResult A single numeric statistic: ```python from pydantic_evals.reporting.analyses import ScalarResult ScalarResult( title='Accuracy', value=93.3, unit='%', description='Percentage of correctly classified cases.', ) ```

| Field | Type | Description |
| --- | --- | --- |
| `title` | `str` | Display name |
| `value` | `float \| int` | The numeric value |
| `unit` | `str \| None` | Optional unit label (e.g., `'%'`, `'ms'`) |
| `description` | `str \| None` | Optional longer description |

* * * #### TableResult A generic table of data: ```python from pydantic_evals.reporting.analyses import TableResult TableResult( title='Per-Class Metrics', columns=['Class', 'Precision', 'Recall', 'F1'], rows=[ ['cat', 0.95, 0.90, 0.924], ['dog', 0.88, 0.92, 0.899], ], description='Precision, recall, and F1 per class.', ) ```

| Field | Type | Description |
| --- | --- | --- |
| `title` | `str` | Display name |
| `columns` | `list[str]` | Column headers |
| `rows` | `list[list[str \| int \| float \| bool \| None]]` | Row data |
| `description` | `str \| None` | Optional longer description |

* * * #### ConfusionMatrix A confusion matrix (typically produced by `ConfusionMatrixEvaluator`, but can be constructed directly): ```python from pydantic_evals.reporting.analyses import ConfusionMatrix ConfusionMatrix( title='Sentiment', class_labels=['positive', 'negative', 'neutral'], matrix=[ [45, 3, 2], # expected=positive [5, 40, 5], # expected=negative [1, 2, 47], # expected=neutral ], ) ```

| Field | Type | Description |
| --- | --- | --- |
| `title` | `str` | Display name |
| `class_labels` | `list[str]` | Ordered labels for both axes |
| `matrix` | `list[list[int]]` | `matrix[expected][predicted]` = count |
| `description` | `str \| None` | Optional longer description |

* * * #### PrecisionRecall Precision-recall curve data (typically produced by `PrecisionRecallEvaluator`):

| Field | Type | Description |
| --- | --- | --- |
| `title` | `str` | Display name |
| `curves` | `list[PrecisionRecallCurve]` | One or more curves |
| `description` | `str \| None` | Optional longer description |

Each `PrecisionRecallCurve` contains a `name`, a list of `PrecisionRecallPoint`s (with `threshold`, `precision`, `recall`), and an optional `auc` value.
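For completeness, a direct-construction sketch (typically you would let `PrecisionRecallEvaluator` build this; the import path for the curve and point classes is an assumption, matching the other analysis types):

```python
# Assumed import path, mirroring ScalarResult/TableResult/ConfusionMatrix above
from pydantic_evals.reporting.analyses import (
    PrecisionRecall,
    PrecisionRecallCurve,
    PrecisionRecallPoint,
)

PrecisionRecall(
    title='Precision-Recall Curve',
    curves=[
        PrecisionRecallCurve(
            name='Model',
            points=[
                # One point per threshold: higher thresholds trade recall for precision
                PrecisionRecallPoint(threshold=0.9, precision=1.0, recall=0.4),
                PrecisionRecallPoint(threshold=0.5, precision=0.85, recall=0.75),
                PrecisionRecallPoint(threshold=0.1, precision=0.6, recall=1.0),
            ],
            auc=0.87,  # optional
        ),
    ],
)
```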
* * * #### LinePlot A generic XY line chart with labeled axes, supporting multiple curves. Use this for ROC curves, KS plots, calibration curves, or any custom line chart: ```python from pydantic_evals.reporting.analyses import LinePlot, LinePlotCurve, LinePlotPoint LinePlot( title='ROC Curve', x_label='False Positive Rate', y_label='True Positive Rate', x_range=(0, 1), y_range=(0, 1), curves=[ LinePlotCurve( name='Model (AUC: 0.95)', points=[LinePlotPoint(x=0.0, y=0.0), LinePlotPoint(x=0.1, y=0.8), LinePlotPoint(x=1.0, y=1.0)], ), LinePlotCurve( name='Random', points=[LinePlotPoint(x=0, y=0), LinePlotPoint(x=1, y=1)], style='dashed', ), ], ) ```

| Field | Type | Description |
| --- | --- | --- |
| `title` | `str` | Display name |
| `x_label` | `str` | Label for the x-axis |
| `y_label` | `str` | Label for the y-axis |
| `x_range` | `tuple[float, float] \| None` | Optional fixed range for x-axis |
| `y_range` | `tuple[float, float] \| None` | Optional fixed range for y-axis |
| `curves` | `list[LinePlotCurve]` | One or more curves to plot |
| `description` | `str \| None` | Optional longer description |

Each `LinePlotCurve` contains a `name`, a list of `LinePlotPoint`s (with `x`, `y`), an optional `style` (`'solid'` or `'dashed'`), and an optional `step` interpolation mode (`'start'`, `'middle'`, or `'end'`) for step functions like empirical CDFs. `LinePlot` is the recommended return type for custom curve-based evaluators -- any evaluator that returns a `LinePlot` will be rendered as a line chart in the Logfire UI without requiring any frontend changes. ### Returning Multiple Analyses A single report evaluator can return multiple analyses by returning a list: ```python from dataclasses import dataclass from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext from pydantic_evals.reporting.analyses import ReportAnalysis, ScalarResult, TableResult @dataclass class ClassificationSummary(ReportEvaluator): """Produces both a scalar accuracy and a per-class metrics table.""" def evaluate(self, ctx: ReportEvaluatorContext) -> list[ReportAnalysis]: cases = ctx.report.cases if not cases: return [] labels = sorted({str(c.expected_output) for c in cases if c.expected_output}) # Scalar: overall accuracy correct = sum(1 for c in cases if c.output == c.expected_output) accuracy = ScalarResult( title='Accuracy', value=correct / len(cases) * 100, unit='%' ) # Table: per-class breakdown rows = [] for label in labels: tp = sum(1 for c in cases if str(c.output) == label and str(c.expected_output) == label) fp = sum(1 for c in cases if str(c.output) == label and str(c.expected_output) != label) fn = sum(1 for c in cases if str(c.output) != label and str(c.expected_output) == label) p = tp / (tp + fp) if (tp + fp) > 0 else 0.0 r = tp / (tp + fn) if (tp + fn) > 0 else 0.0 f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0 rows.append([label, round(p, 3), round(r, 3), round(f1, 3)]) table = TableResult( title='Per-Class Metrics', columns=['Class', 'Precision', 'Recall', 'F1'], rows=rows, ) return [accuracy, table] ``` ### Async Report Evaluators Report evaluators support async `evaluate` methods, handled automatically via `evaluate_async`: ```python from dataclasses import dataclass from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext from pydantic_evals.reporting.analyses import ScalarResult @dataclass class AsyncAccuracy(ReportEvaluator): async def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult: # Can use async I/O here (e.g., call an external API) cases = ctx.report.cases correct = sum(1 for c in cases if c.output == c.expected_output)
return ScalarResult( title='Accuracy', value=correct / len(cases) * 100 if cases else 0.0, unit='%', ) ``` ## Serialization Report evaluators are serialized to and from YAML/JSON dataset files using the same format as case-level evaluators. This means datasets with report evaluators can be fully round-tripped through file serialization. **Example YAML dataset with report evaluators:** ```yaml # yaml-language-server: $schema=./test_cases_schema.json name: classifier_eval cases: - name: cat_test inputs: The cat meows expected_output: cat - name: dog_test inputs: The dog barks expected_output: dog report_evaluators: - ConfusionMatrixEvaluator - PrecisionRecallEvaluator: score_key: confidence positive_from: assertions positive_key: is_correct ``` Built-in report evaluators (`ConfusionMatrixEvaluator`, `PrecisionRecallEvaluator`, `ROCAUCEvaluator`, `KolmogorovSmirnovEvaluator`) are recognized automatically. For custom report evaluators, pass them via `custom_report_evaluator_types`: ```python from pydantic_evals import Dataset dataset = Dataset[str, str, None].from_file( 'test_cases.yaml', custom_report_evaluator_types=[MyCustomReportEvaluator], ) ``` Similarly, when saving a dataset with custom report evaluators, pass them to `to_file` so the JSON schema includes them: ```python dataset.to_file( 'test_cases.yaml', custom_report_evaluator_types=[MyCustomReportEvaluator], ) ``` ## Viewing Analyses in Logfire When [Logfire is configured](/docs/ai/evals/how-to/logfire-integration), analyses are automatically attached to the experiment span as the `logfire.experiment.analyses` attribute. The Logfire UI renders them as interactive visualizations: - **Confusion matrices** are displayed as heatmaps - **Precision-recall curves** are rendered as line charts with AUC in the legend - **Line plots** (ROC curves, KS plots, etc.) are rendered as line charts with configurable axes - **Scalar results** are shown as labeled values - **Tables** are rendered as formatted data tables When comparing multiple experiments in the Logfire Evals view, analyses of the same type are displayed side by side for easy comparison. 
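As noted under [LinePlot](#lineplot), custom curve-based report evaluators should return a `LinePlot`. A minimal sketch (the evaluator name and score key are illustrative, and it assumes per-case score results expose a `.value` attribute, matching the assertion results shown in the LLM Judge debugging examples):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext
from pydantic_evals.reporting.analyses import LinePlot, LinePlotCurve, LinePlotPoint


@dataclass
class ScoreTrend(ReportEvaluator):
    """Plots each case's 'confidence' score in case order (illustrative)."""

    def evaluate(self, ctx: ReportEvaluatorContext) -> LinePlot:
        points = [
            # Assumption: case.scores is a dict of results with a .value attribute
            LinePlotPoint(x=i, y=case.scores['confidence'].value)
            for i, case in enumerate(ctx.report.cases)
            if 'confidence' in case.scores
        ]
        return LinePlot(
            title='Confidence per Case',
            x_label='Case index',
            y_label='Confidence',
            y_range=(0, 1),
            curves=[LinePlotCurve(name='confidence', points=points)],
        )
```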
## Complete Example A full example combining case-level evaluators with report evaluators: ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( ConfusionMatrixEvaluator, Evaluator, EvaluatorContext, KolmogorovSmirnovEvaluator, PrecisionRecallEvaluator, ReportEvaluator, ReportEvaluatorContext, ROCAUCEvaluator, ) from pydantic_evals.reporting.analyses import ScalarResult def my_classifier(text: str) -> str: text = text.lower() if 'cat' in text or 'meow' in text: return 'cat' elif 'dog' in text or 'bark' in text: return 'dog' elif 'bird' in text or 'chirp' in text: return 'bird' return 'unknown' # Case-level evaluator: runs per case @dataclass class ConfidenceEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, Any]: confidence = compute_confidence(ctx.output, ctx.inputs) is_correct = ctx.output == ctx.expected_output return { 'confidence': confidence, 'is_correct': is_correct, } def compute_confidence(output: str, inputs: str) -> float: return 0.85 # placeholder # Report-level evaluator: runs once over the full report @dataclass class AccuracyEvaluator(ReportEvaluator): def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult: cases = ctx.report.cases correct = sum(1 for c in cases if c.output == c.expected_output) return ScalarResult( title='Accuracy', value=correct / len(cases) * 100 if cases else 0.0, unit='%', ) dataset = Dataset( name='full_example', cases=[ Case(inputs='The cat meows', expected_output='cat'), Case(inputs='The dog barks', expected_output='dog'), Case(inputs='A bird chirps', expected_output='bird'), ], evaluators=[ConfidenceEvaluator()], report_evaluators=[ ConfusionMatrixEvaluator( predicted_from='output', expected_from='expected_output', title='Animal Classification', ), PrecisionRecallEvaluator( score_from='scores', score_key='confidence', positive_from='assertions', positive_key='is_correct', ), ROCAUCEvaluator( score_from='scores', score_key='confidence', positive_from='assertions', positive_key='is_correct', ), KolmogorovSmirnovEvaluator( score_from='scores', score_key='confidence', positive_from='assertions', positive_key='is_correct', ), AccuracyEvaluator(), ], ) report = dataset.evaluate_sync(my_classifier) # Access analyses programmatically for analysis in report.analyses: print(f'{analysis.type}: {analysis.title}') #> confusion_matrix: Animal Classification #> precision_recall: Precision-Recall Curve #> scalar: Precision-Recall Curve AUC #> line_plot: ROC Curve #> scalar: ROC Curve AUC #> line_plot: KS Plot #> scalar: KS Statistic #> scalar: Accuracy ``` ## Next Steps - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** -- Case-level evaluator reference - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Writing case-level evaluators - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Viewing analyses in the Logfire UI --- # [Span-Based](https://pydantic.dev/docs/ai/evals/evaluators/span-based/) # Span-Based Evaluate AI system behavior by analyzing OpenTelemetry spans captured during execution. Requires Logfire Span-based evaluation requires `logfire` to be installed and configured: Terminal ```bash pip install 'pydantic-evals[logfire]' ``` Span-based evaluation enables you to evaluate **how** your AI system executes, not just **what** it produces. This is essential for complex agents where ensuring the desired behavior depends on the execution path taken, not just the final output. 
### Why Span-Based Evaluation? Traditional evaluators assess task inputs and outputs. For simple tasks, this may be sufficient--if the output is correct, the task succeeded. But for complex multi-step agents, the _process_ matters as much as the result: - **A correct answer reached incorrectly** - An agent might produce the right output by accident (e.g., guessing, using cached data when it should have searched, calling the wrong tools but getting lucky) - **Verification of required behaviors** - You need to ensure specific tools were called, certain code paths executed, or particular patterns followed - **Performance and efficiency** - The agent should reach the answer efficiently, without unnecessary tool calls, infinite loops, or excessive retries - **Safety and compliance** - Critical to verify that dangerous operations weren't attempted, sensitive data wasn't accessed inappropriately, or guardrails weren't bypassed ### Real-World Scenarios Span-based evaluation is particularly valuable for: - **RAG systems** - Verify documents were retrieved and reranked before generation, not just that the answer included citations - **Multi-agent coordination** - Ensure the orchestrator delegated to the right specialist agents in the correct order - **Tool-calling agents** - Confirm specific tools were used (or avoided), and in the expected sequence - **Debugging and regression testing** - Catch behavioral regressions where outputs remain correct but the internal logic deteriorates - **Production alignment** - Ensure your evaluation assertions operate on the same telemetry data captured in production, so eval insights directly translate to production monitoring ### How It Works When you configure logfire (`logfire.configure()`), Pydantic Evals captures all OpenTelemetry spans generated during task execution. You can then write evaluators that assert conditions on: - **Which tools were called** - `HasMatchingSpan(query={'name_contains': 'search_tool'})` - **Code paths executed** - Verify specific functions ran or particular branches taken - **Timing characteristics** - Check that operations complete within SLA bounds - **Error conditions** - Detect retries, fallbacks, or specific failure modes - **Execution structure** - Verify parent-child relationships, delegation patterns, or execution order This creates a fundamentally different evaluation paradigm: you're testing behavioral contracts, not just input-output relationships. 
## Basic Usage ```python import logfire from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import HasMatchingSpan # Configure logfire to capture spans logfire.configure(send_to_logfire='if-token-present') dataset = Dataset( name='span_basic', cases=[Case(inputs='test')], evaluators=[ # Check that database was queried HasMatchingSpan( query={'name_contains': 'database_query'}, evaluation_name='used_database', ), ], ) ``` ## HasMatchingSpan Evaluator The [`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan) evaluator checks if any span matches a query: ```python from pydantic_evals.evaluators import HasMatchingSpan HasMatchingSpan( query={'name_contains': 'test'}, evaluation_name='span_check', ) ``` **Returns:** `bool` - `True` if any span matches the query ## SpanQuery Reference A [`SpanQuery`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanQuery) is a dictionary with query conditions: ### Name Conditions Match spans by name: ```python # Exact name match {'name_equals': 'search_database'} # Contains substring {'name_contains': 'tool_call'} # Regex pattern {'name_matches_regex': r'llm_call_\d+'} ``` ### Attribute Conditions Match spans with specific attributes: ```python # Has specific attribute values {'has_attributes': {'operation': 'search', 'status': 'success'}} # Has attribute keys (any value) {'has_attribute_keys': ['user_id', 'request_id']} ``` ### Duration Conditions Match based on execution time: ```python from datetime import timedelta # Minimum duration {'min_duration': 1.0} # seconds {'min_duration': timedelta(seconds=1)} # Maximum duration {'max_duration': 5.0} # seconds {'max_duration': timedelta(seconds=5)} # Range {'min_duration': 0.5, 'max_duration': 2.0} ``` ### Logical Operators Combine conditions: ```python # NOT {'not_': {'name_contains': 'error'}} # AND (all must match) {'and_': [ {'name_contains': 'tool'}, {'max_duration': 1.0}, ]} # OR (any must match) {'or_': [ {'name_equals': 'search'}, {'name_equals': 'query'}, ]} ``` ### Child/Descendant Conditions Query relationships between spans: ```python # Count direct children {'min_child_count': 1} {'max_child_count': 5} # Some child matches query {'some_child_has': {'name_contains': 'retry'}} # All children match query {'all_children_have': {'max_duration': 0.5}} # No children match query {'no_child_has': {'has_attributes': {'error': True}}} # Descendant queries (recursive) {'min_descendant_count': 5} {'some_descendant_has': {'name_contains': 'api_call'}} ``` ### Ancestor/Depth Conditions Query span hierarchy: ```python # Depth (root spans have depth 0) {'min_depth': 1} # Not a root span {'max_depth': 2} # At most 2 levels deep # Ancestor queries {'some_ancestor_has': {'name_equals': 'agent_run'}} {'all_ancestors_have': {'max_duration': 10.0}} {'no_ancestor_has': {'has_attributes': {'error': True}}} ``` ### Stop Recursing Control recursive queries: ```python { 'some_descendant_has': {'name_contains': 'expensive'}, 'stop_recursing_when': {'name_equals': 'boundary'}, } # Only search descendants until hitting a span named 'boundary' ``` ## Practical Examples ### Verify Tool Usage Check that specific tools were called: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import HasMatchingSpan dataset = Dataset( name='tool_verification', cases=[Case(inputs='test')], evaluators=[ # Must call search tool HasMatchingSpan( query={'name_contains': 'search_tool'}, evaluation_name='used_search', ), # Must NOT call dangerous 
tool (see note below) HasMatchingSpan( query={'not_': {'name_contains': 'delete_database'}}, evaluation_name='safe_execution', ), ], ) ```

Note: `HasMatchingSpan` passes if *any* span matches its query, so a `not_` query only asserts that at least one span lacks the given property, not that every span does. To assert that no span matches (e.g. that a dangerous tool was never called), write a custom evaluator that returns `not ctx.span_tree.any({...})`, or use `span_tree.all(...)` for all-spans conditions (see [Custom Evaluators with SpanTree](#custom-evaluators-with-spantree) below).

### Check Multiple Tools Verify a sequence of operations: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ HasMatchingSpan( query={'name_contains': 'retrieve_context'}, evaluation_name='retrieved_context', ), HasMatchingSpan( query={'name_contains': 'generate_response'}, evaluation_name='generated_response', ), HasMatchingSpan( query={'and_': [ {'name_contains': 'cite'}, {'has_attribute_keys': ['source_id']}, ]}, evaluation_name='added_citations', ), ] ``` ### Performance Assertions Ensure operations meet latency requirements: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # Database queries should be fast HasMatchingSpan( query={'and_': [ {'name_contains': 'database'}, {'max_duration': 0.1}, # 100ms max ]}, evaluation_name='fast_db_queries', ), # Overall should complete quickly HasMatchingSpan( query={'and_': [ {'name_equals': 'task_execution'}, {'max_duration': 2.0}, ]}, evaluation_name='within_sla', ), ] ``` ### Error Detection Check for error conditions: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # No errors occurred (same caveat as above: this passes if any span lacks the attribute) HasMatchingSpan( query={'not_': {'has_attributes': {'error': True}}}, evaluation_name='no_errors', ), # Retries happened HasMatchingSpan( query={'name_contains': 'retry'}, evaluation_name='had_retries', ), # Fallback was used HasMatchingSpan( query={'name_contains': 'fallback_model'}, evaluation_name='used_fallback', ), ] ``` ### Complex Behavioral Checks Verify sophisticated behavior patterns: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # Agent delegated to sub-agent HasMatchingSpan( query={'and_': [ {'name_contains': 'agent'}, {'some_child_has': {'name_contains': 'delegate'}}, ]}, evaluation_name='used_delegation', ), # Made multiple LLM calls with retries HasMatchingSpan( query={'and_': [ {'name_contains': 'llm_call'}, {'some_descendant_has': {'name_contains': 'retry'}}, {'min_descendant_count': 3}, ]}, evaluation_name='retry_pattern', ), ] ``` ## Custom Evaluators with SpanTree For more complex span analysis, write custom evaluators: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CustomSpanCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | int]: span_tree = ctx.span_tree # Find specific spans llm_spans = span_tree.find(lambda node: 'llm' in node.name) tool_spans = span_tree.find(lambda node: 'tool' in node.name) # Calculate metrics total_llm_time = sum( span.duration.total_seconds() for span in llm_spans ) return { 'used_llm': len(llm_spans) > 0, 'used_tools': len(tool_spans) > 0, 'tool_count': len(tool_spans), 'llm_fast': total_llm_time < 2.0, } ``` ### SpanTree API The [`SpanTree`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanTree) provides methods for span analysis: ```python from pydantic_evals.otel import SpanTree # Example API (requires span_tree from context) def example_api(span_tree: SpanTree) -> None: span_tree.find(lambda n: True) # Find all matching nodes span_tree.any({'name_contains': 'test'}) # Check if any span matches span_tree.all({'name_contains': 'test'}) # Check if all spans match span_tree.count({'name_contains': 'test'}) # Count matching spans # Iteration for node in span_tree: print(node.name, node.duration, node.attributes) ``` ### SpanNode Properties Each
[`SpanNode`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanNode) has: ```python from pydantic_evals.otel import SpanNode # Example properties (requires node from context) def example_properties(node: SpanNode) -> None: _ = node.name # Span name _ = node.duration # timedelta _ = node.attributes # dict[str, AttributeValue] _ = node.start_timestamp # datetime _ = node.end_timestamp # datetime _ = node.children # list[SpanNode] _ = node.descendants # list[SpanNode] (recursive) _ = node.ancestors # list[SpanNode] _ = node.parent # SpanNode | None ``` ## Debugging Span Queries ### View Spans in Logfire If you're sending data to Logfire, you can view all spans in the web UI to understand the trace structure. ### Print Span Tree ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class DebugSpans(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: for node in ctx.span_tree: print(f"{' ' * len(node.ancestors)}{node.name} ({node.duration})") return True ``` ### Query Testing Test queries incrementally: ```python from pydantic_evals.evaluators import HasMatchingSpan # Start simple query = {'name_contains': 'tool'} # Add conditions gradually query = {'and_': [ {'name_contains': 'tool'}, {'max_duration': 1.0}, ]} # Test in evaluator HasMatchingSpan(query=query, evaluation_name='test') ``` ## Use Cases ### RAG System Verification Verify retrieval-augmented generation workflow: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # Retrieved documents HasMatchingSpan( query={'name_contains': 'vector_search'}, evaluation_name='retrieved_docs', ), # Reranked results HasMatchingSpan( query={'name_contains': 'rerank'}, evaluation_name='reranked_results', ), # Generated with context HasMatchingSpan( query={'and_': [ {'name_contains': 'generate'}, {'has_attribute_keys': ['context_ids']}, ]}, evaluation_name='used_context', ), ] ``` ### Multi-Agent Systems Verify agent coordination: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # Master agent ran HasMatchingSpan( query={'name_equals': 'master_agent'}, evaluation_name='master_ran', ), # Delegated to specialist HasMatchingSpan( query={'and_': [ {'name_contains': 'specialist_agent'}, {'some_ancestor_has': {'name_equals': 'master_agent'}}, ]}, evaluation_name='delegated_correctly', ), # No circular delegation HasMatchingSpan( query={'not_': {'and_': [ {'name_contains': 'agent'}, {'some_descendant_has': {'name_contains': 'agent'}}, {'some_ancestor_has': {'name_contains': 'agent'}}, ]}}, evaluation_name='no_circular_delegation', ), ] ``` ### Tool Usage Patterns Verify intelligent tool selection: ```python from pydantic_evals.evaluators import HasMatchingSpan evaluators = [ # Used search before answering HasMatchingSpan( query={'and_': [ {'name_contains': 'search'}, {'some_ancestor_has': {'name_contains': 'answer'}}, ]}, evaluation_name='searched_before_answering', ), # Limited tool calls (no loops) HasMatchingSpan( query={'and_': [ {'name_contains': 'tool'}, {'max_child_count': 5}, ]}, evaluation_name='reasonable_tool_usage', ), ] ``` ## Best Practices 1. **Start Simple**: Begin with basic name queries, add complexity as needed 2. **Use Descriptive Names**: Name your spans well in your application code 3. **Test Queries**: Verify queries work before running full evaluations 4. **Combine with Other Evaluators**: Use span checks alongside output validation 5. 
**Document Expectations**: Comment why specific spans should/shouldn't exist ## Next Steps - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - Set up Logfire for span capture - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write advanced span analysis - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Other evaluator types --- # [Simple Validation](https://pydantic.dev/docs/ai/evals/examples/simple-validation/) # Simple Validation A proof of concept example of evaluating a simple text transformation function with deterministic checks. ## Scenario We're testing a function that converts text to title case. We want to verify: - Output is always a string - Output matches expected format - Function handles edge cases correctly - Performance meets requirements ## Complete Example ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import ( Contains, EqualsExpected, IsInstance, MaxDuration, ) # The function we're testing def to_title_case(text: str) -> str: """Convert text to title case.""" return text.title() # Create evaluation dataset dataset = Dataset( name='title_case_validation', cases=[ # Basic functionality Case( name='basic_lowercase', inputs='hello world', expected_output='Hello World', ), Case( name='basic_uppercase', inputs='HELLO WORLD', expected_output='Hello World', ), Case( name='mixed_case', inputs='HeLLo WoRLd', expected_output='Hello World', ), # Edge cases Case( name='empty_string', inputs='', expected_output='', ), Case( name='single_word', inputs='hello', expected_output='Hello', ), Case( name='with_punctuation', inputs='hello, world!', expected_output='Hello, World!', ), Case( name='with_numbers', inputs='hello 123 world', expected_output='Hello 123 World', ), Case( name='apostrophes', inputs="don't stop believin'", expected_output="Don'T Stop Believin'", ), ], evaluators=[ # Always returns a string IsInstance(type_name='str'), # Matches expected output EqualsExpected(), # Output should contain capital letters Contains(value='H', evaluation_name='has_capitals'), # Should be fast (under 1ms) MaxDuration(seconds=0.001), ], ) # Run evaluation if __name__ == '__main__': report = dataset.evaluate_sync(to_title_case) # Print results report.print(include_input=True, include_output=True) """ Evaluation Summary: to_title_case ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ basic_lowercase │ hello world │ Hello World │ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ basic_uppercase │ HELLO WORLD │ Hello World │ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ mixed_case │ HeLLo WoRLd │ Hello World │ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ empty_string │ - │ - │ ✔✔✗✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ single_word │ hello │ Hello │ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ with_punctuation │ hello, world! │ Hello, World! 
│ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ with_numbers │ hello 123 world │ Hello 123 World │ ✔✔✔✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ apostrophes │ don't stop believin' │ Don'T Stop Believin' │ ✔✔✗✗ │ 10ms │ ├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤ │ Averages │ │ │ 68.8% ✔ │ 10ms │ └──────────────────┴──────────────────────┴──────────────────────┴────────────┴──────────┘ """ # Check if all passed avg = report.averages() if avg and avg.assertions == 1.0: print('\n✅ All tests passed!') else: print(f'\n❌ Some tests failed (pass rate: {avg.assertions:.1%})') """ ❌ Some tests failed (pass rate: 68.8%) """ ``` ## Expected Output ``` Evaluation Summary: to_title_case ┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ basic_lowercase │ hello world │ Hello World │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ basic_uppercase │ HELLO WORLD │ Hello World │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ mixed_case │ HeLLo WoRLd │ Hello World │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ empty_string │ │ │ ✔✔✗✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ single_word │ hello │ Hello │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ with_punctuation │ hello, world! │ Hello, World! │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ with_numbers │ hello 123 world │ Hello 123 World │ ✔✔✔✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ apostrophes │ don't stop believin' │ Don'T Stop Believin' │ ✔✔✗✔ │ <1ms│ ├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤ │ Averages │ │ │ 93.8% ✔ │ <1ms│ └───────────────────┴──────────────────────┴───────────────────────┴────────────┴──────────┘ ❌ Some tests failed (pass rate: 93.8%) ``` Note: The `empty_string` and `apostrophes` cases each fail the `has_capitals` assertion: `Contains(value='H')` checks for the literal substring `'H'`, which appears neither in the empty string nor in `Don'T Stop Believin'`. With two of the 32 assertions failing, the pass rate is 93.8% rather than 100%.
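To make the `has_capitals` behavior concrete, here is a quick plain-Python sketch of the substring check that `Contains(value='H')` performs on string outputs (no evaluator machinery needed):

```python
# Contains(value='H') on a string output reduces to a substring check
print('H' in 'Hello World')  # True: assertion passes
print('H' in '')  # False: an empty output has no characters at all
print('H' in "Don'T Stop Believin'")  # False: capitals exist, but no 'H'
```

If you actually want a "has any capital letter" check, a custom evaluator returning `any(c.isupper() for c in ctx.output)` is a better fit than `Contains`.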
## Saving and Loading Save the dataset for future use: ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected # The function we're testing def to_title_case(text: str) -> str: """Convert text to title case.""" return text.title() # Create dataset dataset: Dataset[str, str, Any] = Dataset( name='title_case_tests', cases=[Case(inputs='test', expected_output='Test')], evaluators=[EqualsExpected()], ) # Save to YAML dataset.to_file('title_case_tests.yaml') # Load later dataset = Dataset.from_file('title_case_tests.yaml') report = dataset.evaluate_sync(to_title_case) ``` ## Adding More Cases As you find bugs or edge cases, add them to the dataset: ```python from pydantic_evals import Dataset # Load existing dataset dataset = Dataset.from_file('title_case_tests.yaml') # Found a bug with unicode dataset.add_case( name='unicode_chars', inputs='café résumé', expected_output='Café Résumé', ) # Found a bug with all caps words dataset.add_case( name='acronyms', inputs='the USA and FBI', expected_output='The Usa And Fbi', # Python's title() behavior ) # Test with very long input dataset.add_case( name='long_input', inputs=' '.join(['word'] * 1000), expected_output=' '.join(['Word'] * 1000), ) # Save updated dataset dataset.to_file('title_case_tests.yaml') ``` ## Using with pytest Integrate with pytest for CI/CD: ```python import pytest from pydantic_evals import Dataset # The function we're testing def to_title_case(text: str) -> str: """Convert text to title case.""" return text.title() @pytest.fixture def title_case_dataset(): return Dataset.from_file('title_case_tests.yaml') def test_title_case_evaluation(title_case_dataset): """Run evaluation tests.""" report = title_case_dataset.evaluate_sync(to_title_case) # All cases should pass avg = report.averages() assert avg is not None assert avg.assertions == 1.0, f'Some tests failed (pass rate: {avg.assertions:.1%})' def test_title_case_performance(title_case_dataset): """Verify performance.""" report = title_case_dataset.evaluate_sync(to_title_case) # All cases should complete quickly for case in report.cases: assert case.task_duration < 0.001, f'{case.name} took {case.task_duration}s' ``` ## Next Steps - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Explore all available evaluators - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic - **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and manage datasets - **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance --- # [Core Concepts](https://pydantic.dev/docs/ai/evals/getting-started/core-concepts/) # Core Concepts This page explains the key concepts in Pydantic Evals and how they work together. 
Pydantic Evals is built around these core concepts: - **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)** - A static definition containing test cases and evaluators - **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)** - A single test scenario with inputs and optional expected outputs - **[`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)** - Logic for scoring or validating individual outputs - **[`ReportEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ReportEvaluator)** - Logic for analyzing full experiment results (e.g., confusion matrices, accuracy) - **Experiment** - The act of running a task function against all cases in a dataset. (This corresponds to a call to `Dataset.evaluate`.) - **[`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport)** - The results from running an experiment The key distinction is between: - **Definition** (`Dataset` with `Case`s, `Evaluator`s, and `ReportEvaluator`s) - what you want to test - **Execution** (Experiment) - running your task against those tests - **Results** (`EvaluationReport` with case results and experiment-wide analyses) - what happened during the experiment ## Unit Testing Analogy A helpful way to think about Pydantic Evals:

| Unit Testing | Pydantic Evals |
| --- | --- |
| Test function | [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) + [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) |
| Test suite | [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) |
| Running tests (`pytest`) | **Experiment** (`dataset.evaluate(task)`) |
| Test report | [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) |
| `assert` | Evaluator returning `bool` |

**Key Difference**: AI systems are probabilistic, so instead of simple pass/fail, evaluations can have: - Quantitative scores (0.0 to 1.0) - Qualitative labels ("good", "acceptable", "poor") - Pass/fail assertions with explanatory reasons Just like you can run `pytest` multiple times on the same test suite, you can run multiple experiments on the same dataset to compare different implementations or track changes over time. ## Dataset A [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) is a collection of test cases and evaluators that define an evaluation suite.
```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance dataset = Dataset( name='my_eval_suite', cases=[ Case(inputs='test input', expected_output='test output'), ], evaluators=[ IsInstance(type_name='str'), ], ) ``` ### Key Features - **Type-safe**: Generic over `InputsT`, `OutputT`, and `MetadataT` types - **Serializable**: Can be saved to/loaded from YAML or JSON files - **Evaluable**: Run against any function with matching input/output types ### `Dataset`\-Level vs `Case`\-Level Evaluators Evaluators can be defined at two levels: - **`Dataset`\-level**: Apply to all cases in the dataset - **`Case`\-level**: Apply only to specific cases ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected, IsInstance dataset = Dataset( name='case_level_evaluators', cases=[ Case( name='special_case', inputs='test', expected_output='TEST', evaluators=[ # This evaluator only runs for this case EqualsExpected(), ], ), ], evaluators=[ # This evaluator runs for ALL cases IsInstance(type_name='str'), ], ) ``` ## Experiments An **Experiment** is what happens when you execute a task function against all cases in a dataset. This is the bridge between your static test definition (the Dataset) and your results (the EvaluationReport). ### Running an Experiment You run an experiment by calling [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) or [`evaluate_sync()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync) on a dataset: ```python from pydantic_evals import Case, Dataset # Define your dataset (static definition) dataset = Dataset( name='uppercase_experiment', cases=[ Case(inputs='hello', expected_output='HELLO'), Case(inputs='world', expected_output='WORLD'), ], ) # Define your task def uppercase_task(text: str) -> str: return text.upper() # Run the experiment (execution) report = dataset.evaluate_sync(uppercase_task) ``` ### What Happens During an Experiment When you run an experiment: 1. **Setup**: The dataset loads all cases, evaluators, and report evaluators 2. **Execution**: For each case: 1. The task function is called with `case.inputs` 2. Execution time is measured and OpenTelemetry spans are captured (if `logfire` is configured) 3. The outputs of the task function for each case are recorded 3. **Case Evaluation**: For each case output: 1. All dataset-level evaluators are run 2. Case-specific evaluators are run (if any) 3. Results are collected (scores, assertions, labels) 4. **Report Evaluation**: If report evaluators are configured, they run over the full set of results to produce experiment-wide analyses (confusion matrices, precision-recall curves, scalar metrics, tables, etc.) 5. 
**Reporting**: All results are aggregated into an [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport), including both per-case results and experiment-wide analyses ### Multiple Experiments from One Dataset A key feature of Pydantic Evals is that you can run the same dataset against different task implementations: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected dataset = Dataset( name='comparison_test', cases=[ Case(inputs='hello', expected_output='HELLO'), ], evaluators=[EqualsExpected()], ) # Original implementation def task_v1(text: str) -> str: return text.upper() # Improved implementation (with exclamation) def task_v2(text: str) -> str: return text.upper() + '!' # Compare results report_v1 = dataset.evaluate_sync(task_v1) report_v2 = dataset.evaluate_sync(task_v2) avg_v1 = report_v1.averages() avg_v2 = report_v2.averages() print(f'V1 pass rate: {avg_v1.assertions if avg_v1 and avg_v1.assertions else 0}') #> V1 pass rate: 1.0 print(f'V2 pass rate: {avg_v2.assertions if avg_v2 and avg_v2.assertions else 0}') #> V2 pass rate: 0 ``` This allows you to: - **Compare implementations** across versions - **Track performance** over time - **A/B test** different approaches - **Validate changes** before deployment ## Case A [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) represents a single test scenario with specific inputs and optional expected outputs. ```python from pydantic_evals import Case from pydantic_evals.evaluators import EqualsExpected case = Case( name='test_uppercase', # Optional, but recommended for reporting inputs='hello world', # Required: inputs to your task expected_output='HELLO WORLD', # Optional: expected output metadata={'category': 'basic'}, # Optional: arbitrary metadata evaluators=[EqualsExpected()], # Optional: case-specific evaluators ) ``` ### Case Components #### Inputs The inputs to pass to the task being evaluated. Can be any type: ```python from pydantic import BaseModel from pydantic_evals import Case class MyInputModel(BaseModel): field1: str # Simple types Case(inputs='hello') Case(inputs=42) # Complex types Case(inputs={'query': 'What is AI?', 'max_tokens': 100}) Case(inputs=MyInputModel(field1='value')) ``` #### Expected Output The expected result, used by evaluators like [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected): ```python from pydantic_evals import Case Case( inputs='2 + 2', expected_output='4', ) ``` If no `expected_output` is provided, evaluators that require it (like `EqualsExpected`) will skip that case. #### Metadata Arbitrary data that evaluators can access via [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext): ```python from pydantic_evals import Case Case( inputs='question', metadata={ 'difficulty': 'hard', 'category': 'math', 'source': 'exam_2024', }, ) ``` Metadata is useful for: - Filtering cases during analysis - Providing context to evaluators - Organizing test suites #### Evaluators Cases can have their own evaluators that only run for that specific case. This is particularly powerful for building comprehensive evaluation suites where different cases have different requirements - if you could write one evaluator rubric that worked perfectly for all cases, you'd just incorporate it into your agent instructions. 
Case-specific [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators are especially useful for quickly building maintainable golden datasets by describing what "good" looks like for each scenario. See [Case-specific evaluators](/docs/ai/evals/evaluators/overview#case-specific-evaluators) for a more detailed explanation and examples. ## Evaluator An [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) assesses the output of your task and returns one or more scores, labels, or assertions. Each score, label, or assertion can also have an optional string reason associated with it. ### Evaluator Types Evaluators return different types of results:

| Return Type | Purpose | Example |
| --- | --- | --- |
| `bool` | **Assertion** - Pass/fail check | `True` → ✔, `False` → ✗ |
| `int` or `float` | **Score** - Numeric quality metric | `0.95`, `87` |
| `str` | **Label** - Categorical result | `"correct"`, `"hallucination"` |

```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output # Assertion @dataclass class Confidence(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> float: # Analyze output and return confidence score return 0.95 # Score @dataclass class Classifier(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> str: if 'error' in ctx.output.lower(): return 'error' # Label return 'success' ``` Evaluators can also return instances of [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason), as well as dictionaries mapping evaluation names to values. See the [custom evaluator return types](/docs/ai/evals/evaluators/custom#return-types) docs for more detail. ### EvaluatorContext All evaluators receive an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) containing: - `name`: Case name (optional) - `inputs`: Task inputs - `metadata`: Case metadata (optional) - `expected_output`: Expected output (optional) - `output`: Actual output from task - `duration`: Task execution time in seconds - `span_tree`: OpenTelemetry spans (if `logfire` is configured) - `attributes`: Custom attributes dict - `metrics`: Custom metrics dict ### Multiple Evaluations Evaluators can return multiple results by returning a dictionary: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class MultiCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float | str]: return { 'is_valid': isinstance(ctx.output, str), # Assertion 'length': len(ctx.output), # Score 'category': 'long' if len(ctx.output) > 100 else 'short', # Label } ``` ### Evaluation Reasons Add explanations to your evaluations using [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason): ```python from dataclasses import dataclass from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext @dataclass class SmartCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason: if ctx.output == ctx.expected_output: return EvaluationReason( value=True, reason='Exact match with expected output', ) return EvaluationReason( value=False, reason=f'Expected {ctx.expected_output!r}, got {ctx.output!r}', ) ``` Reasons appear in reports when using `include_reasons=True`.
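As a minimal sketch of how reasons surface, assuming the `SmartCheck` evaluator from the block above is in scope and using the `include_reasons` flag described here:

```python
from pydantic_evals import Case, Dataset

# Reuses SmartCheck from the previous example
dataset = Dataset(
    name='reasons_demo',
    cases=[Case(inputs='hello', expected_output='HELLO')],
    evaluators=[SmartCheck()],
)

report = dataset.evaluate_sync(lambda text: text.upper())
report.print(include_reasons=True)  # Shows each assertion alongside its reason
```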
## Evaluation Report An [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) is the result of running an experiment. It contains all the data from executing your task against the dataset's cases and running all evaluators. ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected dataset = Dataset( name='report_example', cases=[Case(inputs='hello', expected_output='HELLO')], evaluators=[EqualsExpected()], ) def my_task(text: str) -> str: return text.upper() # Run an experiment report = dataset.evaluate_sync(my_task) # Print to console report.print() """ Evaluation Summary: my_task ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ Case 1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ """ # Access data programmatically for case in report.cases: print(f'{case.name}: {case.scores}') #> Case 1: {} ``` ### Report Structure The [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) contains: - `name`: Experiment name - `cases`: List of successful case evaluations - `failures`: List of failed executions - `analyses`: List of experiment-wide analyses from [report evaluators](/docs/ai/evals/evaluators/report-evaluators) (confusion matrices, PR curves, scalars, tables) - `trace_id`: OpenTelemetry trace ID (optional) - `span_id`: OpenTelemetry span ID (optional) ### ReportCase Each successful case result contains: **Case data:** - `name`: Case name - `inputs`: Task inputs - `metadata`: Case metadata (optional) - `expected_output`: Expected output (optional) - `output`: Actual output from task **Evaluation results:** - `scores`: Dictionary of numeric scores from evaluators - `labels`: Dictionary of categorical labels from evaluators - `assertions`: Dictionary of pass/fail assertions from evaluators **Performance data:** - `task_duration`: Task execution time - `total_duration`: Total time including evaluators **Additional data:** - `metrics`: Custom metrics dict - `attributes`: Custom attributes dict **Tracing:** - `trace_id`: OpenTelemetry trace ID (optional) - `span_id`: OpenTelemetry span ID (optional) **Errors:** - `evaluator_failures`: List of evaluator errors ## Data Model Relationships Here's how the core concepts relate to each other: ### Static Definition - A **Dataset** contains: - Many **Cases** (test scenarios with inputs and expected outputs) - Many **Evaluators** (logic for scoring individual outputs) - Many **Report Evaluators** (logic for analyzing full experiment results) ### Execution (Experiment) When you call `dataset.evaluate(task)`, an **Experiment** runs: - The **Task** function is executed against all **Cases** in the **Dataset** - All **Evaluators** are run (both dataset-level and case-specific) against each output as appropriate - One **EvaluationReport** is produced as the final output ### Results - An **EvaluationReport** contains: - Results for each **Case** (inputs, outputs, scores, assertions, labels) - Experiment-wide **Analyses** from report evaluators (confusion matrices, PR curves, scalars, tables) - Summary statistics (averages, pass rates) - Performance data (durations) - Tracing information (OpenTelemetry spans) ### Key Relationships - **One Dataset → Many Experiments**: You can run the same dataset against different task implementations or multiple times to track changes - **One Experiment → 
One Report**: Each time you call `dataset.evaluate(...)`, you get one report - **One Experiment → Many Case Results**: The report contains results for every case in the dataset ## Next Steps - **[Evaluators Overview](/docs/ai/evals/evaluators/overview)** - When to use different evaluator types - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete reference of provided evaluators - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic - **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, etc.) - **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and generate datasets --- # [Quick Start](https://pydantic.dev/docs/ai/evals/getting-started/quick-start/) # Quick Start **Pydantic Evals** is a powerful evaluation framework for systematically testing and evaluating AI systems, from simple LLM calls to complex multi-agent applications. ## What is Pydantic Evals? Pydantic Evals helps you: - **Create test datasets** with type-safe structured inputs and expected outputs - **Run evaluations** against your AI systems with automatic concurrency - **Score results** using deterministic checks, LLM judges, or custom evaluators - **Generate reports** with detailed metrics, assertions, and performance data - **Track changes** by comparing evaluation runs over time - **Integrate with Logfire** for visualization and collaborative analysis ## Installation Terminal ```bash pip install pydantic-evals ``` For OpenTelemetry tracing and Logfire integration: Terminal ```bash pip install 'pydantic-evals[logfire]' ``` ## Quick Start While evaluations are typically used to test AI systems, the Pydantic Evals framework works with any function call. To demonstrate the core functionality, we'll start with a simple, deterministic example. 
Here's a complete example of evaluating a simple text transformation function: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Contains, EqualsExpected # Create a dataset with test cases dataset = Dataset( name='uppercase_tests', cases=[ Case( name='uppercase_basic', inputs='hello world', expected_output='HELLO WORLD', ), Case( name='uppercase_with_numbers', inputs='hello 123', expected_output='HELLO 123', ), ], evaluators=[ EqualsExpected(), # Check exact match with expected_output Contains(value='HELLO', case_sensitive=True), # Check contains "HELLO" ], ) # Define the function to evaluate def uppercase_text(text: str) -> str: return text.upper() # Run the evaluation report = dataset.evaluate_sync(uppercase_text) # Print the results report.print() """ Evaluation Summary: uppercase_text ┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ uppercase_basic │ ✔✔ │ 10ms │ ├────────────────────────┼────────────┼──────────┤ │ uppercase_with_numbers │ ✔✔ │ 10ms │ ├────────────────────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └────────────────────────┴────────────┴──────────┘ """ ``` Output: ``` Evaluation Summary: uppercase_text ┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ uppercase_basic │ ✔✔ │ 10ms │ ├─────────────────────────┼────────────┼──────────┤ │ uppercase_with_numbers │ ✔✔ │ 10ms │ ├─────────────────────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └─────────────────────────┴────────────┴──────────┘ ``` ## Key Concepts Understanding a few core concepts will help you get the most out of Pydantic Evals: - **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)** - A collection of test cases and (optional) evaluators - **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)** - A single test scenario with inputs and optional expected outputs and case-specific evaluators - **[`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)** - A function that scores or validates task outputs - **[`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport)** - Results from running an evaluation For a deeper dive, see [Core Concepts](/docs/ai/evals/core-concepts). 
## Common Use Cases ### Deterministic Validation Test that your AI system produces correctly-structured outputs: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Contains, IsInstance dataset = Dataset( name='dict_validation', cases=[ Case(inputs={'data': 'some input'}, expected_output={'result': 'success'}), ], evaluators=[ IsInstance(type_name='dict'), Contains(value='result'), # output dict should contain a 'result' key ], ) ``` ### LLM-as-a-Judge Evaluation Use an LLM to evaluate subjective qualities like accuracy or helpfulness: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge dataset = Dataset( name='llm_judge_test', cases=[ Case(inputs='What is the capital of France?', expected_output='Paris'), ], evaluators=[ LLMJudge( rubric='Response is accurate and helpful', include_input=True, model='anthropic:claude-sonnet-4-6', ) ], ) ``` ### Performance Testing Ensure your system meets performance requirements: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import MaxDuration dataset = Dataset( name='performance_test', cases=[ Case(inputs='test input', expected_output='test output'), ], evaluators=[ MaxDuration(seconds=2.0), ], ) ``` ## Next Steps Explore the documentation to learn more: - **[Core Concepts](/docs/ai/evals/core-concepts)** - Understand the data model and evaluation flow - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Learn about all available evaluators - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic - **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and generate datasets - **[Examples](/docs/ai/evals/examples/simple-validation)** - Practical examples for common scenarios --- # [Concurrency & Performance](https://pydantic.dev/docs/ai/evals/how-to/concurrency/) # Concurrency & Performance Control how evaluation cases are executed in parallel. By default, Pydantic Evals runs all cases concurrently to maximize throughput. You can control this behavior using the `max_concurrency` parameter.
## Basic Usage ```python from pydantic_evals import Case, Dataset def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='concurrency_demo', cases=[Case(inputs='test1'), Case(inputs='test2')]) # Run all cases concurrently (default) report = dataset.evaluate_sync(my_task) # Limit to 5 concurrent cases report = dataset.evaluate_sync(my_task, max_concurrency=5) # Run sequentially (one at a time) report = dataset.evaluate_sync(my_task, max_concurrency=1) ``` ## When to Limit Concurrency ### Rate Limiting Many APIs have rate limits that restrict concurrent requests: ```python from pydantic_evals import Case, Dataset async def my_llm_task(inputs: str) -> str: return f'LLM Result: {inputs}' dataset = Dataset(name='rate_limit_demo', cases=[Case(inputs='test1')]) # If your API allows 10 requests/second report = dataset.evaluate_sync( my_llm_task, max_concurrency=10, ) ``` ### Resource Constraints Limit concurrency to avoid overwhelming system resources: ```python from pydantic_evals import Case, Dataset def heavy_computation(inputs: str) -> str: return f'Heavy: {inputs}' def db_query_task(inputs: str) -> str: return f'DB: {inputs}' dataset = Dataset(name='resource_constraints', cases=[Case(inputs='test1')]) # Memory-intensive operations report = dataset.evaluate_sync( heavy_computation, max_concurrency=2, # Only 2 at a time ) # Database connection pool limits report = dataset.evaluate_sync( db_query_task, max_concurrency=5, # Match connection pool size ) ``` ### Debugging Run sequentially to see clear error traces: ```python from pydantic_evals import Case, Dataset def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='debug_demo', cases=[Case(inputs='test1')]) # Easier to debug report = dataset.evaluate_sync( my_task, max_concurrency=1, ) ``` ## Performance Comparison Here's an example showing the performance difference: concurrency\_example.py ```python import asyncio from pydantic_evals import Case, Dataset # Create a dataset with multiple test cases dataset = Dataset( name='performance_comparison', cases=[ Case( name=f'case_{i}', inputs=i, expected_output=i * 2, ) for i in range(10) ] ) async def slow_task(input_value: int) -> int: """Simulates a slow operation (e.g., API call).""" await asyncio.sleep(0.1) # 100ms per case return input_value * 2 # Unlimited concurrency: ~0.1s total (all cases run in parallel) report = dataset.evaluate_sync(slow_task) # Limited concurrency: ~0.5s total (2 at a time, 5 batches) report = dataset.evaluate_sync(slow_task, max_concurrency=2) # Sequential: ~1.0s total (one at a time, 10 cases) report = dataset.evaluate_sync(slow_task, max_concurrency=1) ``` ## Concurrency with Evaluators Both task execution and evaluator execution happen concurrently by default: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset( name='evaluator_concurrency', cases=[Case(inputs=f'test{i}') for i in range(100)], # 100 cases evaluators=[ LLMJudge(rubric='Quality check'), # Makes API calls ], ) # Both task and evaluator run with controlled concurrency report = dataset.evaluate_sync( my_task, max_concurrency=10, ) ``` If your evaluators are expensive (e.g., [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge)), limiting concurrency helps manage: - API rate limits - Cost (fewer concurrent API calls) - Memory usage ## Async vs Sync Both sync and async evaluation support 
concurrency control: ### Sync API ```python from pydantic_evals import Case, Dataset def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='sync_demo', cases=[Case(inputs='test1')]) # Runs async operations internally with controlled concurrency report = dataset.evaluate_sync(my_task, max_concurrency=10) ``` ### Async API ```python from pydantic_evals import Case, Dataset async def my_task(inputs: str) -> str: return f'Result: {inputs}' async def run_evaluation(): dataset = Dataset(name='async_demo', cases=[Case(inputs='test1')]) # Same behavior, but in async context report = await dataset.evaluate(my_task, max_concurrency=10) return report ``` ## Monitoring Concurrency Track execution to optimize settings: ```python import time from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='monitoring', cases=[Case(inputs=f'test{i}') for i in range(10)]) t0 = time.time() report = dataset.evaluate_sync(task, max_concurrency=10) duration = time.time() - t0 num_cases = len(report.cases) + len(report.failures) total_task_time = sum(case.task_duration for case in report.cases) print(f'Total: {duration:.2f}s') #> Total: 0.01s print(f'Cases: {num_cases}') #> Cases: 10 print(f'Avg per case: {duration / num_cases:.2f}s') #> Avg per case: 0.00s # Effective concurrency compares summed task time to wall-clock time print(f'Effective concurrency: ~{total_task_time / duration:.1f}') ``` ## Handling Rate Limits If you hit rate limits, the evaluation will fail. Use retry strategies: ```python from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='rate_limit_handling', cases=[Case(inputs='test1')]) # Reduce concurrency to avoid rate limits report = dataset.evaluate_sync( task, max_concurrency=5, # Stay under rate limit ) ``` See [Retry Strategies](/docs/ai/evals/how-to/retry-strategies) for handling transient failures. ## Next Steps - **[Retry Strategies](/docs/ai/evals/how-to/retry-strategies)** - Handle transient failures - **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Work with large datasets - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - Monitor performance --- # [Dataset Management](https://pydantic.dev/docs/ai/evals/how-to/dataset-management/) # Dataset Management Create, save, load, and generate evaluation datasets. ## Creating Datasets ### From Code Define datasets directly in Python: ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected, IsInstance dataset = Dataset[str, str, Any]( name='my_eval_suite', cases=[ Case( name='test_1', inputs='input 1', expected_output='output 1', ), Case( name='test_2', inputs='input 2', expected_output='output 2', ), ], evaluators=[ IsInstance(type_name='str'), EqualsExpected(), ], ) ``` ### Adding Cases Dynamically ```python from typing import Any from pydantic_evals import Dataset from pydantic_evals.evaluators import IsInstance dataset = Dataset[str, str, Any](name='dynamic_dataset', cases=[], evaluators=[]) # Add cases one at a time dataset.add_case( name='dynamic_case', inputs='test input', expected_output='test output', ) # Add evaluators dataset.add_evaluator(IsInstance(type_name='str')) ``` ## Saving Datasets Detailed Serialization Guide For complete details on serialization formats, JSON schema generation, and custom evaluators, see [Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization).
### Save to YAML ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected, IsInstance dataset = Dataset[str, str, Any]( name='my_eval_suite', cases=[ Case(name='test_1', inputs='input 1', expected_output='output 1'), Case(name='test_2', inputs='input 2', expected_output='output 2'), ], evaluators=[IsInstance(type_name='str'), EqualsExpected()], ) dataset.to_file('my_dataset.yaml') # Also saves schema file: my_dataset_schema.json ``` Output (`my_dataset.yaml`): ```yaml # yaml-language-server: $schema=my_dataset_schema.json name: my_eval_suite cases: - name: test_1 inputs: input 1 expected_output: output 1 - name: test_2 inputs: input 2 expected_output: output 2 evaluators: - IsInstance: str - EqualsExpected ``` ### Save to JSON ```python from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')]) dataset.to_file('my_dataset.json') # Also saves schema file: my_dataset_schema.json ``` ### Custom Schema Path ```python from pathlib import Path from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')]) # Custom schema location Path('data').mkdir(exist_ok=True) Path('data/schemas').mkdir(parents=True, exist_ok=True) dataset.to_file( 'data/my_dataset.yaml', schema_path='schemas/my_schema.json', ) # No schema file dataset.to_file('my_dataset.yaml', schema_path=None) ``` ## Loading Datasets ### From YAML/JSON ```python from typing import Any from pydantic_evals import Dataset # Infers format from extension dataset = Dataset[str, str, Any].from_file('my_dataset.yaml') dataset = Dataset[str, str, Any].from_file('my_dataset.json') # Explicit format for non-standard extensions dataset = Dataset[str, str, Any].from_file('data.txt', fmt='yaml') ``` ### From String ```python from typing import Any from pydantic_evals import Dataset yaml_content = """ name: my_tests cases: - name: test inputs: hello expected_output: HELLO evaluators: - EqualsExpected """ dataset = Dataset[str, str, Any].from_text(yaml_content, fmt='yaml') ``` ### From Dict ```python from typing import Any from pydantic_evals import Dataset data = { 'name': 'my_tests', 'cases': [ { 'name': 'test', 'inputs': 'hello', 'expected_output': 'HELLO', }, ], 'evaluators': [{'EqualsExpected': {}}], } dataset = Dataset[str, str, Any].from_dict(data) ``` ### With Custom Evaluators When loading datasets that use custom evaluators, you must pass them to `from_file()`: ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class MyCustomEvaluator(Evaluator): threshold: float = 0.5 def evaluate(self, ctx: EvaluatorContext) -> bool: return True # Load with custom evaluator registry dataset = Dataset[str, str, Any].from_file( 'my_dataset.yaml', custom_evaluator_types=[MyCustomEvaluator], ) ``` For complete details on serialization with custom evaluators, see [Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization). ## Generating Datasets Pydantic Evals allows you to generate test datasets using LLMs with [`generate_dataset`](/docs/ai/api/pydantic_evals/generation/#pydantic_evals.generation.generate_dataset). Datasets can be generated in either JSON or YAML format; in both cases, a JSON schema file is generated alongside the dataset and referenced from it, so you should get type checking and auto-completion in your editor.
generate\_dataset\_example.py ```python from __future__ import annotations from pathlib import Path from pydantic import BaseModel, Field from pydantic_evals import Dataset from pydantic_evals.generation import generate_dataset class QuestionInputs(BaseModel, use_attribute_docstrings=True): # (1) """Model for question inputs.""" question: str """A question to answer""" context: str | None = None """Optional context for the question""" class AnswerOutput(BaseModel, use_attribute_docstrings=True): # (2) """Model for expected answer outputs.""" answer: str """The answer to the question""" confidence: float = Field(ge=0, le=1) """Confidence level (0-1)""" class MetadataType(BaseModel, use_attribute_docstrings=True): # (3) """Metadata model for test cases.""" difficulty: str """Difficulty level (easy, medium, hard)""" category: str """Question category""" async def main(): dataset = await generate_dataset( # (4) dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType], n_examples=2, extra_instructions=""" Generate question-answer pairs about world capitals and landmarks. Make sure to include both easy and challenging questions. """, ) output_file = Path('questions_cases.yaml') dataset.to_file(output_file) # (5) print(output_file.read_text(encoding='utf-8')) """ # yaml-language-server: $schema=questions_cases_schema.json name: generated cases: - name: Easy Capital Question inputs: question: What is the capital of France? context: null metadata: difficulty: easy category: Geography expected_output: answer: Paris confidence: 0.95 evaluators: - EqualsExpected - name: Challenging Landmark Question inputs: question: Which world-famous landmark is located on the banks of the Seine River? context: null metadata: difficulty: hard category: Landmarks expected_output: answer: Eiffel Tower confidence: 0.9 evaluators: - EqualsExpected evaluators: [] report_evaluators: [] """ ``` 1. Define the schema for the inputs to the task. 2. Define the schema for the expected outputs of the task. 3. Define the schema for the metadata of the test cases. 4. Call [`generate_dataset`](/docs/ai/api/pydantic_evals/generation/#pydantic_evals.generation.generate_dataset) to create a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) with 2 cases conforming to the schema. 5. Save the dataset to a YAML file; this also writes `questions_cases_schema.json` containing the JSON schema for `questions_cases.yaml` to make editing easier. The magic `yaml-language-server` comment is supported by at least vscode, jetbrains/pycharm (more details [here](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema)). _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ You can also write datasets as JSON files: generate\_dataset\_example\_json.py ```python from pathlib import Path from pydantic_evals import Dataset from pydantic_evals.generation import generate_dataset from generate_dataset_example import AnswerOutput, MetadataType, QuestionInputs async def main(): dataset = await generate_dataset( # (1) dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType], n_examples=2, extra_instructions=""" Generate question-answer pairs about world capitals and landmarks. Make sure to include both easy and challenging questions. 
""", ) output_file = Path('questions_cases.json') dataset.to_file(output_file) # (2) print(output_file.read_text(encoding='utf-8')) """ { "$schema": "questions_cases_schema.json", "name": "generated", "cases": [ { "name": "Easy Capital Question", "inputs": { "question": "What is the capital of France?", "context": null }, "metadata": { "difficulty": "easy", "category": "Geography" }, "expected_output": { "answer": "Paris", "confidence": 0.95 }, "evaluators": [ "EqualsExpected" ] }, { "name": "Challenging Landmark Question", "inputs": { "question": "Which world-famous landmark is located on the banks of the Seine River?", "context": null }, "metadata": { "difficulty": "hard", "category": "Landmarks" }, "expected_output": { "answer": "Eiffel Tower", "confidence": 0.9 }, "evaluators": [ "EqualsExpected" ] } ], "evaluators": [], "report_evaluators": [] } """ ``` Generate the [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) exactly as above. Save the dataset to a JSON file, this will also write `questions_cases_schema.json` with th JSON schema for `questions_cases.json`. This time the `$schema` key is included in the JSON file to define the schema for IDEs to use while you edit the file, there's no formal spec for this, but it works in vscode and pycharm and is discussed at length in [json-schema-org/json-schema-spec#828](https://github.com/json-schema-org/json-schema-spec/issues/828). _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main(answer))` to run `main`)_ ## Type-Safe Datasets Use generic type parameters for type safety: ```python from typing_extensions import TypedDict from pydantic_evals import Case, Dataset class MyInput(TypedDict): query: str max_results: int class MyOutput(TypedDict): results: list[str] class MyMetadata(TypedDict): category: str # Type-safe dataset dataset: Dataset[MyInput, MyOutput, MyMetadata] = Dataset( name='typed_dataset', cases=[ Case( name='test', inputs={'query': 'test', 'max_results': 10}, expected_output={'results': ['a', 'b']}, metadata={'category': 'search'}, ), ], ) ``` ## Schema Generation Generate JSON Schema for IDE support: ```python from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')]) # Save with schema dataset.to_file('my_dataset.yaml') # Creates my_dataset_schema.json # Schema enables: # - Autocomplete in VS Code/PyCharm # - Validation while editing # - Inline documentation ``` Manual schema generation: ```python import json from dataclasses import dataclass from typing import Any from pydantic_evals import Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class MyCustomEvaluator(Evaluator): threshold: float = 0.5 def evaluate(self, ctx: EvaluatorContext) -> bool: return True schema = Dataset[str, str, Any].model_json_schema_with_evaluators( custom_evaluator_types=[MyCustomEvaluator], ) print(json.dumps(schema, indent=2)[:66] + '...') """ { "$defs": { "Case": { "additionalProperties": false, ... """ ``` ## Best Practices ### 1\. Use Clear Names ```python from pydantic_evals import Case # Good Case(name='uppercase_basic_ascii', inputs='hello') Case(name='uppercase_unicode_emoji', inputs='hello 😀') Case(name='uppercase_empty_string', inputs='') # Bad Case(name='test1', inputs='hello') Case(name='test2', inputs='world') Case(name='test3', inputs='foo') ``` ### 2\. 
Organize by Difficulty ```python from pydantic_evals import Case, Dataset dataset = Dataset( name='organized_by_difficulty', cases=[ Case(name='easy_1', inputs='test', metadata={'difficulty': 'easy'}), Case(name='easy_2', inputs='test2', metadata={'difficulty': 'easy'}), Case(name='medium_1', inputs='test3', metadata={'difficulty': 'medium'}), Case(name='hard_1', inputs='test4', metadata={'difficulty': 'hard'}), ], ) ``` ### 3\. Start Small, Grow Gradually ```python from pydantic_evals import Case, Dataset # Start with representative cases dataset = Dataset( name='starting_small', cases=[ Case(name='happy_path', inputs='test'), Case(name='edge_case', inputs=''), Case(name='error_case', inputs='invalid'), ], ) # Add more as you find issues dataset.add_case(name='newly_discovered_edge_case', inputs='edge') ``` ### 4\. Use Case-specific Evaluators Where Appropriate Case-specific evaluators let different cases have different evaluation criteria, which is essential for comprehensive "test coverage". Rather than trying to write one-size-fits-all evaluators, you can specify exactly what "good" looks like for each scenario. This is particularly powerful with [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators where you can describe nuanced requirements per case, making it easy to build and maintain golden datasets. See [Case-specific evaluators](/docs/ai/evals/evaluators/overview#case-specific-evaluators) for detailed guidance. ### 5\. Separate Datasets by Purpose ```python from typing import Any from pydantic_evals import Case, Dataset # First create some test datasets for name in ['smoke_tests', 'comprehensive_tests', 'regression_tests']: test_dataset = Dataset[str, Any, Any](name=name, cases=[Case(name='test', inputs='example')]) test_dataset.to_file(f'{name}.yaml') # Smoke tests (fast, critical paths) smoke_tests = Dataset[str, Any, Any].from_file('smoke_tests.yaml') # Comprehensive tests (slow, thorough) comprehensive = Dataset[str, Any, Any].from_file('comprehensive_tests.yaml') # Regression tests (specific bugs) regression = Dataset[str, Any, Any].from_file('regression_tests.yaml') ``` ## Next Steps - **[Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization)** - In-depth guide to saving and loading datasets - **[Generating Datasets](#generating-datasets)** - Use LLMs to generate test cases - **[Examples: Simple Validation](/docs/ai/evals/examples/simple-validation)** - Practical examples --- # [Dataset Serialization](https://pydantic.dev/docs/ai/evals/how-to/dataset-serialization/) # Dataset Serialization Learn how to save and load datasets in different formats, with support for custom evaluators and IDE integration. Pydantic Evals supports serializing datasets to files in two formats: - **YAML** (`.yaml`, `.yml`) - Human-readable, great for version control - **JSON** (`.json`) - Structured, machine-readable Both formats support: - Automatic JSON schema generation for IDE autocomplete and validation - Custom evaluator serialization/deserialization - Type-safe loading with generic parameters ## YAML Format YAML is the recommended format for most use cases due to its readability and compact syntax. 
### Basic Example ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected, IsInstance # Create a dataset with typed parameters dataset = Dataset[str, str, Any]( name='my_tests', cases=[ Case( name='test_1', inputs='hello', expected_output='HELLO', ), ], evaluators=[ IsInstance(type_name='str'), EqualsExpected(), ], ) # Save to YAML dataset.to_file('my_tests.yaml') ``` This creates two files: 1. **`my_tests.yaml`** - The dataset 2. **`my_tests_schema.json`** - JSON schema for IDE support ### YAML Output ```yaml # yaml-language-server: $schema=my_tests_schema.json name: my_tests cases: - name: test_1 inputs: hello expected_output: HELLO evaluators: - IsInstance: str - EqualsExpected ``` ### JSON Schema for IDEs The first line references the schema file: ```yaml # yaml-language-server: $schema=my_tests_schema.json ``` This enables: - ✅ **Autocomplete** in VS Code, PyCharm, and other editors - ✅ **Inline validation** while editing - ✅ **Documentation tooltips** for fields - ✅ **Error highlighting** for invalid data Editor Support The `yaml-language-server` comment is supported by: - VS Code (with YAML extension) - JetBrains IDEs (PyCharm, IntelliJ, etc.) - Most editors with YAML language server support See the [YAML Language Server docs](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema) for more details. ### Loading from YAML ```python from pathlib import Path from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected, IsInstance # First create and save the dataset Path('my_tests.yaml').parent.mkdir(exist_ok=True) dataset = Dataset[str, str, Any]( name='my_tests', cases=[Case(name='test_1', inputs='hello', expected_output='HELLO')], evaluators=[IsInstance(type_name='str'), EqualsExpected()], ) dataset.to_file('my_tests.yaml') # Load the dataset with type parameters dataset = Dataset[str, str, Any].from_file('my_tests.yaml') def my_task(text: str) -> str: return text.upper() # Run evaluation report = dataset.evaluate_sync(my_task) ``` ## JSON Format JSON format is useful for programmatic generation or when strict structure is required. ### Basic Example ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected dataset = Dataset[str, str, Any]( name='my_tests', cases=[ Case(name='test_1', inputs='hello', expected_output='HELLO'), ], evaluators=[EqualsExpected()], ) # Save to JSON dataset.to_file('my_tests.json') ``` ### JSON Output ```json { "$schema": "my_tests_schema.json", "name": "my_tests", "cases": [ { "name": "test_1", "inputs": "hello", "expected_output": "HELLO" } ], "evaluators": [ "EqualsExpected" ] } ``` The `$schema` key at the top enables IDE support similar to YAML. 
### Loading from JSON ```python from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected # First create and save the dataset dataset = Dataset[str, str, Any]( name='my_tests', cases=[Case(name='test_1', inputs='hello', expected_output='HELLO')], evaluators=[EqualsExpected()], ) dataset.to_file('my_tests.json') # Load from JSON dataset = Dataset[str, str, Any].from_file('my_tests.json') ``` ## Schema Generation ### Automatic Schema Creation By default, `to_file()` creates a JSON schema file alongside your dataset: ```python from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')]) # Creates both my_tests.yaml AND my_tests_schema.json dataset.to_file('my_tests.yaml') ``` ### Custom Schema Location ```python from pathlib import Path from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')]) # Create directories Path('data').mkdir(exist_ok=True) # Custom schema filename (relative to dataset file location) dataset.to_file( 'data/my_tests.yaml', schema_path='my_schema.json', ) # No schema file dataset.to_file('my_tests.yaml', schema_path=None) ``` ### Schema Path Templates Use `{stem}` to reference the dataset filename: ```python from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')]) # Creates: my_tests.yaml and my_tests.schema.json dataset.to_file( 'my_tests.yaml', schema_path='{stem}.schema.json', ) ``` ### Manual Schema Generation Generate a schema without saving the dataset: ```python import json from typing import Any from pydantic_evals import Dataset # Get schema as dictionary for a specific dataset type schema = Dataset[str, str, Any].model_json_schema_with_evaluators() # Save manually with open('custom_schema.json', 'w', encoding='utf-8') as f: json.dump(schema, f, indent=2) ``` ## Custom Evaluators Custom evaluators require special handling during serialization and deserialization. ### Requirements Custom evaluators must: 1. Be decorated with `@dataclass` 2. Inherit from `Evaluator` 3. 
Be passed to both `to_file()` and `from_file()` ### Complete Example ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CustomThreshold(Evaluator): """Check that output length falls within a range.""" min_length: int max_length: int = 100 def evaluate(self, ctx: EvaluatorContext) -> bool: length = len(str(ctx.output)) return self.min_length <= length <= self.max_length # Create dataset with custom evaluator dataset = Dataset[str, str, Any]( name='custom_threshold_tests', cases=[ Case( name='test_length', inputs='example', expected_output='long result', evaluators=[ CustomThreshold(min_length=5, max_length=20), ], ), ], ) # Save with custom evaluator types dataset.to_file( 'dataset.yaml', custom_evaluator_types=[CustomThreshold], ) ``` ### Saved YAML ```yaml # yaml-language-server: $schema=dataset_schema.json name: custom_threshold_tests cases: - name: test_length inputs: example expected_output: long result evaluators: - CustomThreshold: min_length: 5 max_length: 20 ``` ### Loading with Custom Evaluators ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CustomThreshold(Evaluator): """Check that output length falls within a range.""" min_length: int max_length: int = 100 def evaluate(self, ctx: EvaluatorContext) -> bool: length = len(str(ctx.output)) return self.min_length <= length <= self.max_length # First create and save the dataset dataset = Dataset[str, str, Any]( name='custom_threshold_tests', cases=[ Case( name='test_length', inputs='example', expected_output='long result', evaluators=[CustomThreshold(min_length=5, max_length=20)], ), ], ) dataset.to_file('dataset.yaml', custom_evaluator_types=[CustomThreshold]) # Load with custom evaluator registry dataset = Dataset[str, str, Any].from_file( 'dataset.yaml', custom_evaluator_types=[CustomThreshold], ) ``` Important You must pass `custom_evaluator_types` to **both** `to_file()` and `from_file()`. - `to_file()`: Includes the evaluator in the JSON schema - `from_file()`: Registers the evaluator for deserialization ## Evaluator Serialization Formats Evaluators can be serialized in three forms: ### 1\. Name Only (No Parameters) ```yaml evaluators: - EqualsExpected ``` ### 2\. Single Parameter (Short Form) ```yaml evaluators: - IsInstance: str - Contains: "required text" - MaxDuration: 2.0 ``` ### 3\. Multiple Parameters (Dict Form) ```yaml evaluators: - CustomThreshold: min_length: 5 max_length: 20 - LLMJudge: rubric: "Response is accurate" model: "openai:gpt-5" include_input: true ``` ## Format Comparison

| Feature | YAML | JSON |
| --- | --- | --- |
| Human readable | ✅ Excellent | ⚠️ Good |
| Comments | ✅ Yes | ❌ No |
| Compact | ✅ Yes | ⚠️ Verbose |
| Machine parsing | ✅ Good | ✅ Excellent |
| IDE support | ✅ Yes | ✅ Yes |
| Version control | ✅ Clean diffs | ⚠️ Noisy diffs |

**Recommendation**: Use YAML for most cases, JSON for programmatic generation.
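All three forms round-trip through loading. A short sketch, assuming the built-in evaluators shown above resolve by name when deserialized (the `serialization_forms` name is illustrative):

```python
from typing import Any

from pydantic_evals import Dataset

yaml_content = """
name: serialization_forms
cases:
- name: test
  inputs: hello
  expected_output: HELLO
evaluators:
- EqualsExpected      # name only
- IsInstance: str     # single parameter (short form)
- Contains:           # multiple parameters (dict form)
    value: HELLO
    case_sensitive: true
"""

dataset = Dataset[str, str, Any].from_text(yaml_content, fmt='yaml')
print([type(e).__name__ for e in dataset.evaluators])
# Expected something like: ['EqualsExpected', 'IsInstance', 'Contains']
```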
## Advanced: Evaluator Serialization Name Customize how your evaluator appears in serialized files: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class VeryLongDescriptiveEvaluatorName(Evaluator): @classmethod def get_serialization_name(cls) -> str: return 'ShortName' def evaluate(self, ctx: EvaluatorContext) -> bool: return True ``` In YAML: ```yaml evaluators: - ShortName # Instead of VeryLongDescriptiveEvaluatorName ``` ## Troubleshooting ### Schema Not Found in IDE **Problem**: YAML file doesn't show autocomplete **Solutions**: 1. **Check the schema path** in the first line of YAML: ```yaml # yaml-language-server: $schema=correct_schema_name.json ``` 2. **Verify schema file exists** in the same directory 3. **Restart the language server** in your IDE 4. **Install YAML extension** (VS Code: "YAML" by Red Hat) ### Custom Evaluator Not Found **Problem**: `ValueError: Unknown evaluator name: 'CustomEvaluator'` **Solution**: Pass `custom_evaluator_types` when loading: ```python from dataclasses import dataclass from typing import Any from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CustomEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True # First create and save with custom evaluator dataset = Dataset[str, str, Any]( name='custom_eval_tests', cases=[Case(inputs='test', evaluators=[CustomEvaluator()])], ) dataset.to_file('tests.yaml', custom_evaluator_types=[CustomEvaluator]) # Load with custom evaluator types dataset = Dataset[str, str, Any].from_file( 'tests.yaml', custom_evaluator_types=[CustomEvaluator], # Required! ) ``` ### Format Inference Failed **Problem**: `ValueError: Cannot infer format from extension` **Solution**: Specify format explicitly: ```python from typing import Any from pydantic_evals import Case, Dataset dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')]) # Explicit format for unusual extensions dataset.to_file('data.txt', fmt='yaml') dataset_loaded = Dataset[str, str, Any].from_file('data.txt', fmt='yaml') ``` ### Schema Generation Error **Problem**: Custom evaluator causes schema generation to fail **Solution**: Ensure evaluator is a proper dataclass: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext # ✅ Correct @dataclass class MyEvaluator(Evaluator): value: int def evaluate(self, ctx: EvaluatorContext) -> bool: return True # ❌ Wrong: Missing @dataclass class BadEvaluator(Evaluator): def __init__(self, value: int): self.value = value def evaluate(self, ctx: EvaluatorContext) -> bool: return True ``` ## Next Steps - **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Creating and organizing datasets - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write custom evaluation logic - **[Core Concepts](/docs/ai/evals/core-concepts)** - Understand the data model --- # [Case Lifecycle Hooks](https://pydantic.dev/docs/ai/evals/how-to/lifecycle/) # Case Lifecycle Hooks Control per-case setup, context preparation, and teardown during evaluation using [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle). [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) provides hooks at each stage of case evaluation. 
You pass a lifecycle **class** (not an instance) to [`Dataset.evaluate`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate), and a new instance is created for each case, so instance attributes naturally hold case-specific state. ## Evaluation Flow Each case follows this flow: 1. **`setup()`** -- called before task execution 2. **Task runs** 3. **`prepare_context()`** -- called after task, before evaluators 4. **Evaluators run** 5. **`teardown()`** -- called after evaluators complete ## Per-Case Setup and Teardown Use `setup()` and `teardown()` when each case needs its own environment -- for example, creating a database, starting a service, or preparing fixtures driven by case metadata. Since a new lifecycle instance is created for each case, instance attributes are naturally case-scoped: ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators.context import EvaluatorContext from pydantic_evals.lifecycle import CaseLifecycle from pydantic_evals.reporting import ReportCase, ReportCaseFailure class SetupFromMetadata(CaseLifecycle[str, str, dict]): async def setup(self) -> None: prefix = (self.case.metadata or {}).get('prefix', '') self.prefix = prefix async def prepare_context( self, ctx: EvaluatorContext[str, str, dict] ) -> EvaluatorContext[str, str, dict]: ctx.metrics['prefix_length'] = len(self.prefix) return ctx async def teardown( self, result: ReportCase[str, str, dict] | ReportCaseFailure[str, str, dict], ) -> None: pass # Clean up resources here dataset = Dataset( name='setup_teardown', cases=[ Case(name='no_prefix', inputs='hello', metadata={'prefix': ''}), Case(name='with_prefix', inputs='hello', metadata={'prefix': 'PREFIX:'}), ] ) report = dataset.evaluate_sync(lambda inputs: inputs.upper(), lifecycle=SetupFromMetadata) metrics = {c.name: c.metrics for c in report.cases} print(metrics['no_prefix']['prefix_length']) #> 0 print(metrics['with_prefix']['prefix_length']) #> 7 ``` The case metadata drives per-case behavior without needing custom [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) subclasses or serialization. 
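To make the cleanup half concrete, here's a hedged sketch where `setup()` allocates a real per-case resource -- a scratch directory, made up for illustration -- and `teardown()` always releases it:

```python
import tempfile
from pathlib import Path
from shutil import rmtree

from pydantic_evals import Case, Dataset
from pydantic_evals.lifecycle import CaseLifecycle
from pydantic_evals.reporting import ReportCase, ReportCaseFailure


class ScratchDirLifecycle(CaseLifecycle[str, str, dict]):
    async def setup(self) -> None:
        # Allocate an isolated working directory for this case
        self.workdir = Path(tempfile.mkdtemp(prefix=f'{self.case.name}-'))

    async def teardown(
        self,
        result: ReportCase[str, str, dict] | ReportCaseFailure[str, str, dict],
    ) -> None:
        # Release the resource whether the case succeeded or failed
        rmtree(self.workdir, ignore_errors=True)


dataset = Dataset(
    name='scratch_dirs',
    cases=[Case(name='a', inputs='hello'), Case(name='b', inputs='world')],
)
report = dataset.evaluate_sync(lambda inputs: inputs.upper(), lifecycle=ScratchDirLifecycle)
```

Because a fresh lifecycle instance is created per case, `self.workdir` is naturally case-scoped and never leaks between concurrently running cases.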
### Conditional Teardown The `teardown()` hook receives the full result, so you can vary cleanup logic based on success or failure -- for example, keeping test environments up for manual inspection when a case fails: ```python from pydantic_evals import Case, Dataset from pydantic_evals.lifecycle import CaseLifecycle from pydantic_evals.reporting import ReportCase, ReportCaseFailure cleaned_up: list[str] = [] class ConditionalCleanup(CaseLifecycle[str, str, dict]): async def setup(self) -> None: self.resource_id = self.case.name async def teardown( self, result: ReportCase[str, str, dict] | ReportCaseFailure[str, str, dict], ) -> None: keep_on_failure = (self.case.metadata or {}).get('keep_on_failure', False) if isinstance(result, ReportCaseFailure) and keep_on_failure: pass # Keep resource for inspection else: cleaned_up.append(self.resource_id) dataset = Dataset( name='conditional_cleanup', cases=[ Case(name='success_case', inputs='hello', metadata={'keep_on_failure': True}), Case(name='failure_case', inputs='fail', metadata={'keep_on_failure': True}), ] ) def task(inputs: str) -> str: if inputs == 'fail': raise ValueError('intentional failure') return inputs.upper() report = dataset.evaluate_sync(task, max_concurrency=1, lifecycle=ConditionalCleanup) print(cleaned_up) #> ['success_case'] ``` ## Preparing Evaluator Context The `prepare_context()` hook runs after the task completes but before evaluators see the context. This can be used to add metrics or attributes based on the task output, span tree, or any other state -- for example, deriving metrics from instrumented spans (like tool call counts or API latency), or computing values from external resources set up during `setup()`: ```python from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.lifecycle import CaseLifecycle class EnrichMetrics(CaseLifecycle): async def prepare_context(self, ctx: EvaluatorContext) -> EvaluatorContext: ctx.metrics['output_length'] = len(str(ctx.output)) return ctx @dataclass class CheckLength(Evaluator): max_length: int = 50 def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.metrics.get('output_length', 0) <= self.max_length dataset = Dataset( name='context_enrichment', cases=[Case(name='short', inputs='hi'), Case(name='long', inputs='hello world')], evaluators=[CheckLength()], ) report = dataset.evaluate_sync(lambda inputs: inputs.upper(), lifecycle=EnrichMetrics) for case in report.cases: print(f'{case.name}: output_length={case.metrics["output_length"]}') #> short: output_length=2 #> long: output_length=11 ``` ## Type Parameters [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) is generic over the same three type parameters as [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case): `InputsT`, `OutputT`, and `MetadataT`. 
All three default to `Any`, so you can omit them when your hooks don't need type-specific access:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators.context import EvaluatorContext
from pydantic_evals.lifecycle import CaseLifecycle


# Works with any dataset -- no type parameters needed
class GenericMetricEnricher(CaseLifecycle):
    async def prepare_context(self, ctx: EvaluatorContext) -> EvaluatorContext:
        ctx.metrics['custom'] = 42
        return ctx


dataset = Dataset(name='generic_lifecycle', cases=[Case(inputs='test')])
report = dataset.evaluate_sync(lambda inputs: inputs, lifecycle=GenericMetricEnricher)
print(report.cases[0].metrics['custom'])
#> 42
```

## Next Steps

- **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** -- Recording metrics inside tasks
- **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Using enriched metrics in evaluators
- **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** -- Analyzing execution traces

---

# [Logfire Integration](https://pydantic.dev/docs/ai/evals/how-to/logfire-integration/)

# Logfire Integration

Visualize and analyze evaluation results using Pydantic Logfire.

Pydantic Evals uses OpenTelemetry to record traces of the evaluation process. These traces contain all the information from your evaluation reports, plus full tracing from the execution of your task function. You can send these traces to any OpenTelemetry-compatible backend, including [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/).

## Installation

Install the optional logfire dependency:

Terminal

```bash
pip install 'pydantic-evals[logfire]'
```

## Basic Setup

Configure Logfire before running evaluations:

basic\_logfire\_setup.py

```python
import logfire
from pydantic_evals import Case, Dataset

# Configure Logfire
logfire.configure(
    send_to_logfire='if-token-present',  # (1)
)


# Your evaluation code
def my_task(inputs: str) -> str:
    return f'result for {inputs}'


dataset = Dataset(name='logfire_demo', cases=[Case(name='test', inputs='example')])
report = dataset.evaluate_sync(my_task)
```

Sends data to Logfire only if the `LOGFIRE_TOKEN` environment variable is set

That's it! Your evaluation traces will now appear in the Logfire web UI as long as you have the `LOGFIRE_TOKEN` environment variable set.

## What Gets Sent to Logfire

When you run an evaluation, Logfire receives:

1. **Evaluation metadata**
    1. Dataset name
    2. Number of cases
    3. Evaluator names
2. **Per-case data**
    1. Inputs and outputs
    2. Expected outputs
    3. Metadata
    4. Execution duration
3. **Evaluation results**
    1. Scores, assertions, and labels
    2. Reasons (if included)
    3. Evaluator failures
4. **Task execution traces**
    1. All OpenTelemetry spans from your task function
    2. Tool calls (for Pydantic AI agents)
    3. API calls, database queries, etc.
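The last item includes spans you create yourself: anything recorded inside the task function lands in the same trace. As a small sketch (the retrieval and generation steps are stand-ins), you can wrap task phases in `logfire.span` blocks so they show up in the Logfire trace view and in span-based evaluators:

```python
import logfire
from pydantic_evals import Case, Dataset

logfire.configure(send_to_logfire='if-token-present')


def retrieve(inputs: str) -> str:
    return f'context for {inputs}'


def my_task(inputs: str) -> str:
    # Custom spans recorded inside the task appear in the execution trace
    with logfire.span('retrieve_context'):
        context = retrieve(inputs)
    with logfire.span('generate_answer'):
        return f'answer using {context}'


dataset = Dataset(name='span_demo', cases=[Case(name='test', inputs='example')])
report = dataset.evaluate_sync(my_task)
```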
## Viewing Results in Logfire ### Evaluation Overview Logfire provides a special table view for evaluation results on the root evaluation span: ![Logfire Evals Overview](/docs/ai/img/logfire-evals-overview.png) This view shows: - Case names - Pass/fail status - Scores and assertions - Execution duration - Quick filtering and sorting ### Individual Case Details Click any case to see detailed inputs and outputs: ![Logfire Evals Case](/docs/ai/img/logfire-evals-case.png) ### Full Trace View View the complete execution trace including all spans generated during evaluation: ![Logfire Evals Case Trace](/docs/ai/img/logfire-evals-case-trace.png) This is especially useful for: - Debugging failed cases - Understanding performance bottlenecks - Analyzing tool usage patterns - Writing span-based evaluators ## Analyzing Traces ### Comparing Runs Run the same evaluation multiple times and compare in Logfire: ```python from pydantic_evals import Case, Dataset def original_task(inputs: str) -> str: return f'original result for {inputs}' def improved_task(inputs: str) -> str: return f'improved result for {inputs}' dataset = Dataset(name='comparison', cases=[Case(name='test', inputs='example')]) # Run 1: Original implementation report1 = dataset.evaluate_sync(original_task) # Run 2: Improved implementation report2 = dataset.evaluate_sync(improved_task) # Compare in Logfire by filtering by timestamp or attributes ``` ### Debugging Failed Cases Find failed cases quickly: 1. Search for `service_name = 'my_service_evals' AND is_exception` (replace with the actual service name you are using) 2. View the full span tree to see where the failure occurred 3. Inspect attributes and logs for error messages ## Span-Based Evaluation Logfire integration enables powerful span-based evaluators. See [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) for details. Example: Verify specific tools were called: ```python import logfire from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import HasMatchingSpan logfire.configure(send_to_logfire='if-token-present') def my_agent(inputs: str) -> str: return f'result for {inputs}' dataset = Dataset( name='logfire_demo', cases=[Case(name='test', inputs='example')], evaluators=[ HasMatchingSpan( query={'name_contains': 'search_tool'}, evaluation_name='used_search', ), ], ) report = dataset.evaluate_sync(my_agent) ``` The span tree is available in both: - Your evaluator code (via `ctx.span_tree`) - Logfire UI (visual trace view) ## Troubleshooting ### No Data Appearing in Logfire Check: 1. **Token is set**: `echo $LOGFIRE_TOKEN` 2. **Configuration is correct**: ```python import logfire logfire.configure(send_to_logfire='always') # Force sending ``` 3. **Network connectivity**: Check firewall settings 4. **Project exists**: Verify project name in Logfire UI ### Traces Missing Spans If some spans are missing: 1. **Ensure logfire is configured before imports**: ```python import logfire logfire.configure() # Must be first ``` 2. **Check instrumentation**: Ensure your code has enabled all instrumentations you want: ```python import logfire logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) ``` ## Best Practices ### 1\. 
Configure Early Always configure Logfire before running evaluations: ```python import logfire from pydantic_evals import Case, Dataset logfire.configure(send_to_logfire='if-token-present') # Now import and run evaluations def task(inputs: str) -> str: return f'result for {inputs}' dataset = Dataset(name='logfire_demo', cases=[Case(name='test', inputs='example')]) dataset.evaluate_sync(task) ``` ### 2\. Use Descriptive Service Names And Environments ```python import logfire logfire.configure( service_name='rag-pipeline-evals', environment='development', ) ``` ### 3\. Review Periodically - Check Logfire regularly to identify patterns - Look for consistently failing cases - Analyze performance trends - Adjust evaluators based on insights ## Next Steps - **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Use OpenTelemetry spans in evaluators - **[Logfire Documentation](https://logfire.pydantic.dev/docs/guides/web-ui/evals/)** - Complete Logfire guide - **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** - Add custom data to traces --- # [Metrics & Attributes](https://pydantic.dev/docs/ai/evals/how-to/metrics-attributes/) # Metrics & Attributes Track custom metrics and attributes during task execution for richer evaluation insights. While executing evaluation tasks, you can record: - **Metrics** - Numeric values (int/float) for quantitative measurements - **Attributes** - Any data for qualitative information These appear in evaluation reports and can be used by evaluators for assessment. ## Recording Metrics Use [`increment_eval_metric`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.increment_eval_metric) to track numeric values: ```python from dataclasses import dataclass from pydantic_evals.dataset import increment_eval_metric @dataclass class APIResult: output: str usage: 'Usage' @dataclass class Usage: total_tokens: int def call_api(inputs: str) -> APIResult: return APIResult(output=f'Result: {inputs}', usage=Usage(total_tokens=100)) def my_task(inputs: str) -> str: # Track API calls increment_eval_metric('api_calls', 1) result = call_api(inputs) # Track tokens used increment_eval_metric('tokens_used', result.usage.total_tokens) return result.output ``` ## Recording Attributes Use [`set_eval_attribute`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.set_eval_attribute) to store any data: ```python from pydantic_evals import set_eval_attribute def process(inputs: str) -> str: return f'Processed: {inputs}' def my_task(inputs: str) -> str: # Record which model was used set_eval_attribute('model', 'gpt-5.2') # Record feature flags set_eval_attribute('used_cache', True) set_eval_attribute('retry_count', 2) # Record structured data set_eval_attribute('config', { 'temperature': 0.7, 'max_tokens': 100, }) return process(inputs) ``` ## Accessing in Evaluators Metrics and attributes are available in the [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext): ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class EfficiencyChecker(Evaluator): max_api_calls: int = 5 def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool]: # Access metrics api_calls = ctx.metrics.get('api_calls', 0) tokens_used = ctx.metrics.get('tokens_used', 0) # Access attributes used_cache = ctx.attributes.get('used_cache', False) return { 'efficient_api_usage': api_calls <= self.max_api_calls, 'used_caching': used_cache, 'token_efficient': 
tokens_used < 1000, } ``` ## Viewing in Reports Metrics and attributes appear in report data: ```python from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='report_viewing', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync(task) for case in report.cases: print(f'{case.name}:') #> Case 1: print(f' Metrics: {case.metrics}') #> Metrics: {} print(f' Attributes: {case.attributes}') #> Attributes: {} ``` You can also display them in printed reports: ```python from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='report_printing', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync(task) # Metrics and attributes are available but not shown by default # Access them programmatically or via Logfire for case in report.cases: print(f'\nCase: {case.name}') """ Case: Case 1 """ print(f'Metrics: {case.metrics}') #> Metrics: {} print(f'Attributes: {case.attributes}') #> Attributes: {} ``` ## Automatic Metrics When using Pydantic AI and Logfire, some metrics are automatically tracked: ```python import logfire from pydantic_ai import Agent logfire.configure(send_to_logfire='if-token-present') agent = Agent('openai:gpt-5.2') async def ai_task(inputs: str) -> str: result = await agent.run(inputs) return result.output # Automatically tracked metrics: # - requests: Number of LLM calls # - input_tokens: Total input tokens # - output_tokens: Total output tokens # - prompt_tokens: Prompt tokens (if available) # - completion_tokens: Completion tokens (if available) # - cost: Estimated cost (if using genai-prices) ``` Access these in evaluators: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class CostChecker(Evaluator): max_cost: float = 0.01 # $0.01 def evaluate(self, ctx: EvaluatorContext) -> bool: cost = ctx.metrics.get('cost', 0.0) return cost <= self.max_cost ``` ## Practical Examples ### API Usage Tracking ```python from dataclasses import dataclass from pydantic_evals import increment_eval_metric, set_eval_attribute from pydantic_evals.evaluators import Evaluator, EvaluatorContext def check_cache(inputs: str) -> str | None: return None # No cache hit for demo @dataclass class APIResult: text: str usage: 'Usage' @dataclass class Usage: total_tokens: int async def call_api(inputs: str) -> APIResult: return APIResult(text=f'Result: {inputs}', usage=Usage(total_tokens=100)) def save_to_cache(inputs: str, result: str) -> None: pass # Save to cache async def smart_task(inputs: str) -> str: # Try cache first if cached := check_cache(inputs): set_eval_attribute('cache_hit', True) return cached set_eval_attribute('cache_hit', False) # Call API increment_eval_metric('api_calls', 1) result = await call_api(inputs) increment_eval_metric('tokens', result.usage.total_tokens) # Cache result save_to_cache(inputs, result.text) return result.text # Evaluate efficiency @dataclass class EfficiencyEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]: api_calls = ctx.metrics.get('api_calls', 0) cache_hit = ctx.attributes.get('cache_hit', False) return { 'used_cache': cache_hit, 'made_api_call': api_calls > 0, 'efficiency_score': 1.0 if cache_hit else 0.5, } ``` ### Tool Usage Tracking ```python from dataclasses import dataclass from pydantic_ai import Agent, RunContext from pydantic_evals import increment_eval_metric, set_eval_attribute 
from pydantic_evals.evaluators import Evaluator, EvaluatorContext agent = Agent('openai:gpt-5.2') def search(query: str) -> str: return f'Search results for: {query}' def call(endpoint: str) -> str: return f'API response from: {endpoint}' @agent.tool def search_database(ctx: RunContext, query: str) -> str: increment_eval_metric('db_searches', 1) set_eval_attribute('last_query', query) return search(query) @agent.tool def call_api(ctx: RunContext, endpoint: str) -> str: increment_eval_metric('api_calls', 1) set_eval_attribute('last_endpoint', endpoint) return call(endpoint) # Evaluate tool usage @dataclass class ToolUsageEvaluator(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | int]: db_searches = ctx.metrics.get('db_searches', 0) api_calls = ctx.metrics.get('api_calls', 0) return { 'used_database': db_searches > 0, 'used_api': api_calls > 0, 'tool_call_count': db_searches + api_calls, 'reasonable_tool_usage': (db_searches + api_calls) <= 5, } ``` ### Performance Tracking ```python import time from dataclasses import dataclass from pydantic_evals import increment_eval_metric, set_eval_attribute from pydantic_evals.evaluators import Evaluator, EvaluatorContext async def retrieve_context(inputs: str) -> list[str]: return ['context1', 'context2'] async def generate_response(context: list[str], inputs: str) -> str: return f'Generated response for {inputs}' async def monitored_task(inputs: str) -> str: # Track sub-operation timing t0 = time.perf_counter() context = await retrieve_context(inputs) retrieve_time = time.perf_counter() - t0 increment_eval_metric('retrieve_time', retrieve_time) t0 = time.perf_counter() result = await generate_response(context, inputs) generate_time = time.perf_counter() - t0 increment_eval_metric('generate_time', generate_time) # Record which operations were needed set_eval_attribute('needed_retrieval', len(context) > 0) set_eval_attribute('context_chunks', len(context)) return result # Evaluate performance @dataclass class PerformanceEvaluator(Evaluator): max_retrieve_time: float = 0.5 max_generate_time: float = 2.0 def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool]: retrieve_time = ctx.metrics.get('retrieve_time', 0.0) generate_time = ctx.metrics.get('generate_time', 0.0) return { 'fast_retrieval': retrieve_time <= self.max_retrieve_time, 'fast_generation': generate_time <= self.max_generate_time, } ``` ### Quality Tracking ```python from dataclasses import dataclass from pydantic_evals import set_eval_attribute from pydantic_evals.evaluators import Evaluator, EvaluatorContext async def llm_call(inputs: str) -> dict: return {'text': f'Response: {inputs}', 'confidence': 0.85, 'sources': ['doc1', 'doc2']} async def quality_task(inputs: str) -> str: result = await llm_call(inputs) # Extract quality indicators confidence = result.get('confidence', 0.0) sources_used = result.get('sources', []) set_eval_attribute('confidence', confidence) set_eval_attribute('source_count', len(sources_used)) set_eval_attribute('sources', sources_used) return result['text'] # Evaluate based on quality signals @dataclass class QualityEvaluator(Evaluator): min_confidence: float = 0.7 def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]: confidence = ctx.attributes.get('confidence', 0.0) source_count = ctx.attributes.get('source_count', 0) return { 'high_confidence': confidence >= self.min_confidence, 'used_sources': source_count > 0, 'quality_score': confidence * (1.0 + 0.1 * source_count), } ``` ## Experiment-Level Metadata In addition to 
case-level metadata, you can also pass experiment-level metadata when calling [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate): ```python from pydantic_evals import Case, Dataset dataset = Dataset( name='experiment_metadata', cases=[ Case( inputs='test', metadata={'difficulty': 'easy'}, # Case-level metadata ) ] ) async def task(inputs: str) -> str: return f'Result: {inputs}' # Pass experiment-level metadata async def main(): report = await dataset.evaluate( task, metadata={ 'model': 'gpt-5.2', 'prompt_version': 'v2.1', 'temperature': 0.7, }, ) # Access experiment metadata in the report print(report.experiment_metadata) #> {'model': 'gpt-5.2', 'prompt_version': 'v2.1', 'temperature': 0.7} ``` ### When to Use Experiment Metadata Experiment metadata is useful for tracking configuration that applies to the entire evaluation run: - **Model configuration**: Model name, version, parameters - **Prompt versioning**: Which prompt template was used - **Infrastructure**: Deployment environment, region - **Experiment context**: Developer name, feature branch, commit hash This metadata is especially valuable when: - Comparing multiple evaluation runs over time - Tracking which configuration produced which results - Reproducing evaluation results from historical data ### Viewing in Reports Experiment metadata appears at the top of printed reports: ```python from pydantic_evals import Case, Dataset dataset = Dataset(name='metadata_report', cases=[Case(inputs='hello', expected_output='HELLO')]) async def task(text: str) -> str: return text.upper() async def main(): report = await dataset.evaluate( task, metadata={'model': 'gpt-5.2', 'version': 'v1.0'}, ) print(report.render()) """ ╭─ Evaluation Summary: task ─╮ │ model: gpt-5.2 │ │ version: v1.0 │ ╰────────────────────────────╯ ┏━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━┩ │ Case 1 │ 10ms │ ├──────────┼──────────┤ │ Averages │ 10ms │ └──────────┴──────────┘ """ ``` ## Synchronization between Tasks and Experiment Metadata Experiment metadata is for _recording_ configuration, not _configuring_ the task. The metadata dict doesn't automatically configure your task's behavior; you must ensure the values in the metadata dict match what your task actually uses. For example, it's easy to accidentally have metadata claim `temperature: 0.7` while your task actually uses `temperature: 1.0`, leading to incorrect experiment tracking and unreproducible results. To avoid this problem, we recommend establishing a single source of truth for configuration that both your task and metadata reference. Below are a few suggested patterns for achieving this synchronization. ### Pattern 1: Shared Module Constants For simpler cases, use module-level constants: ```python from pydantic_ai import Agent from pydantic_evals import Case, Dataset # Module constants as single source of truth MODEL_NAME = 'openai:gpt-5-mini' TEMPERATURE = 0.7 INSTRUCTIONS = 'You are a helpful assistant.' 
agent = Agent(MODEL_NAME, model_settings={'temperature': TEMPERATURE}, instructions=INSTRUCTIONS) async def task(inputs: str) -> str: result = await agent.run(inputs) return result.output async def main(): dataset = Dataset(name='shared_constants', cases=[Case(inputs='What is the capital of France?')]) # Metadata references same constants await dataset.evaluate( task, metadata={ 'model': MODEL_NAME, 'temperature': TEMPERATURE, 'instructions': INSTRUCTIONS, }, ) ``` ### Pattern 2: Configuration Object (Recommended) Define configuration once and use it everywhere: ```python from dataclasses import asdict, dataclass from pydantic_ai import Agent from pydantic_evals import Case, Dataset @dataclass class TaskConfig: """Single source of truth for task configuration. Includes all variables you'd like to see in experiment metadata. """ model: str temperature: float max_tokens: int prompt_version: str # Define configuration once config = TaskConfig( model='openai:gpt-5-mini', temperature=0.7, max_tokens=500, prompt_version='v2.1', ) # Use config in task agent = Agent( config.model, model_settings={'temperature': config.temperature, 'max_tokens': config.max_tokens}, ) async def task(inputs: str) -> str: """Task uses the same config that's recorded in metadata.""" result = await agent.run(inputs) return result.output # Evaluate with metadata derived from the same config async def main(): dataset = Dataset(name='config_evaluation', cases=[Case(inputs='What is the capital of France?')]) report = await dataset.evaluate( task, metadata=asdict(config), # Guaranteed to match task behavior ) print(report.experiment_metadata) """ { 'model': 'openai:gpt-5-mini', 'temperature': 0.7, 'max_tokens': 500, 'prompt_version': 'v2.1', } """ ``` If it's problematic to have a global task configuration, you can also create your `TaskConfig` object at the task call-site and pass it to the agent via `deps` or similar, but in this case you would still need to guarantee that the value is always the same as the value passed to `metadata` in the call to `Dataset.evaluate`. ### Anti-Pattern: Duplicate Configuration **Avoid this common mistake**: ```python from pydantic_ai import Agent from pydantic_evals import Case, Dataset # ❌ BAD: Configuration defined in multiple places agent = Agent('openai:gpt-5-mini', model_settings={'temperature': 0.7}) async def task(inputs: str) -> str: result = await agent.run(inputs) return result.output async def main(): dataset = Dataset(name='anti_pattern', cases=[Case(inputs='test')]) # ❌ BAD: Metadata manually typed - easy to get out of sync await dataset.evaluate( task, metadata={ 'model': 'openai:gpt-5-mini', # Duplicated! Could diverge from agent definition 'temperature': 0.8, # ⚠️ WRONG! Task actually uses 0.7 }, ) ``` In this anti-pattern, the metadata claims `temperature: 0.8` but the task uses `0.7`. 
This leads to:

- Incorrect experiment tracking
- Inability to reproduce results
- Confusion when comparing runs
- Wasted time debugging "why results differ"

## Metrics vs Attributes vs Metadata

Understanding the differences:

| Feature | Metrics | Attributes | Case Metadata | Experiment Metadata |
| --- | --- | --- | --- | --- |
| **Set in** | Task execution | Task execution | Case definition | `evaluate()` call |
| **Type** | int, float | Any | Any | Any |
| **Purpose** | Quantitative | Qualitative | Test data | Experiment config |
| **Used for** | Aggregation | Context | Input to task | Tracking runs |
| **Available to** | Evaluators | Evaluators | Task & Evaluators | Report only |
| **Scope** | Per case | Per case | Per case | Per experiment |

```python
from pydantic_evals import Case, Dataset, increment_eval_metric, set_eval_attribute

# Case Metadata: Defined in case (before execution)
case = Case(
    inputs='question',
    metadata={'difficulty': 'hard', 'category': 'math'},  # Per-case metadata
)

dataset = Dataset(name='metrics_demo', cases=[case])


# Metrics & Attributes: Recorded during execution
async def task(inputs):
    # These are recorded during execution for each case
    increment_eval_metric('tokens', 100)
    set_eval_attribute('model', 'gpt-5.2')
    return f'Result: {inputs}'


async def main():
    # Experiment Metadata: Defined at evaluation time
    await dataset.evaluate(
        task,
        metadata={  # Experiment-level metadata
            'prompt_version': 'v2.1',
            'temperature': 0.7,
        },
    )
```

## Troubleshooting

### "Metrics/attributes not appearing"

Ensure you're calling the functions inside the task:

```python
from pydantic_evals import increment_eval_metric


def process(inputs: str) -> str:
    return f'Processed: {inputs}'


# Bad: Called outside task
increment_eval_metric('count', 1)


def bad_task(inputs):
    return process(inputs)


# Good: Called inside task
def good_task(inputs):
    increment_eval_metric('count', 1)
    return process(inputs)
```

### "Metrics not incrementing"

Check you're using `increment_eval_metric`, not `set_eval_attribute`:

```python
from pydantic_evals import increment_eval_metric, set_eval_attribute

# Bad: This will overwrite, not increment
set_eval_attribute('count', 1)
set_eval_attribute('count', 1)  # Still 1

# Good: This increments
increment_eval_metric('count', 1)
increment_eval_metric('count', 1)  # Now 2
```

### "Too much data in attributes"

Store summaries, not raw data:

```python
from pydantic_evals import set_eval_attribute

giant_response_object = {'key' + str(i): 'value' * 100 for i in range(1000)}

# Bad: Huge object
set_eval_attribute('full_response', giant_response_object)

# Good: Summary
set_eval_attribute('response_size_kb', len(str(giant_response_object)) / 1024)
set_eval_attribute('response_keys', list(giant_response_object.keys())[:10])  # First 10 keys
```

## Next Steps

- **[Case Lifecycle Hooks](/docs/ai/evals/how-to/lifecycle)** - Per-case setup, teardown, and context preparation
- **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Use metrics/attributes in evaluators
- **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - View metrics in Logfire
- **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance

---

# [Multi-Run Evaluation](https://pydantic.dev/docs/ai/evals/how-to/multi-run/)

# Multi-Run Evaluation

Run each case multiple times to measure variability and get more reliable aggregate results.

AI systems are inherently stochastic -- the same input can produce different outputs across runs.
The `repeat` parameter lets you run each case multiple times and automatically aggregates the results, giving you a clearer picture of your system's typical behavior. ## Basic Usage Pass `repeat` to [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) or [`evaluate_sync()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync): ```python from pydantic_evals import Case, Dataset dataset = Dataset( name='multi_run_basic', cases=[ Case(name='greeting', inputs='Say hello'), Case(name='farewell', inputs='Say goodbye'), ] ) def task(inputs: str) -> str: return inputs.upper() # Run each case 5 times report = dataset.evaluate_sync(task, repeat=5) # 2 cases × 5 repeats = 10 total runs print(len(report.cases)) #> 10 ``` When `repeat > 1`, each run gets an indexed name like `greeting [1/5]`, `greeting [2/5]`, etc., while the original case name is preserved in [`source_case_name`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCase.source_case_name) for grouping. ## Accessing Grouped Results Use [`case_groups()`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport.case_groups) to access runs organized by original case, with per-group aggregated statistics: ```python from pydantic_evals import Case, Dataset dataset = Dataset( name='grouped_results', cases=[ Case(name='greeting', inputs='Say hello'), Case(name='farewell', inputs='Say goodbye'), ] ) def task(inputs: str) -> str: return inputs.upper() report = dataset.evaluate_sync(task, repeat=3) groups = report.case_groups() assert groups is not None # None for single-run (repeat=1) print(len(groups)) #> 2 group_names = [g.name for g in groups] print(group_names) #> ['greeting', 'farewell'] # Each group has 3 runs and aggregated statistics for group in groups: assert len(group.runs) == 3 assert len(group.failures) == 0 assert group.summary.task_duration > 0 ``` Each [`ReportCaseGroup`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCaseGroup) contains: - `name` -- the original case name - `runs` -- the individual [`ReportCase`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCase) results - `failures` -- any runs that raised exceptions - `summary` -- a [`ReportCaseAggregate`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCaseAggregate) with averaged scores, metrics, labels, assertions, and durations ## Aggregation With `repeat > 1`, the report's `averages()` uses a two-level aggregation strategy: 1. **Per-group averages**: Each case's runs are averaged into a group summary 2. **Cross-group averages**: The group summaries are averaged to produce the final result This ensures each original case contributes equally to the overall averages, regardless of how many runs succeeded or failed. 
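To see what the two levels buy you, here's a small arithmetic sketch -- plain Python, not the library's internals. Suppose one case has three scored runs and another only two; averaging per group first weights each case equally, while naive pooling would overweight the case with more runs:

```python
# Assertion pass rates per run, grouped by original case
easy_runs = [1.0, 1.0, 1.0]  # all 3 runs passed
hard_runs = [0.0, 1.0]  # only 2 runs produced results

# Level 1: average within each group
easy_avg = sum(easy_runs) / len(easy_runs)  # 1.0
hard_avg = sum(hard_runs) / len(hard_runs)  # 0.5

# Level 2: average the group summaries
print((easy_avg + hard_avg) / 2)
#> 0.75

# A naive pooled average would let the 3-run case dominate
print(sum(easy_runs + hard_runs) / len(easy_runs + hard_runs))
#> 0.8
```

In practice you read the final averaged numbers off the report: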
```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import EqualsExpected dataset = Dataset( name='aggregation', cases=[ Case(name='easy', inputs='hello', expected_output='HELLO'), Case(name='hard', inputs='world', expected_output='WORLD'), ], evaluators=[EqualsExpected()], ) def task(inputs: str) -> str: return inputs.upper() report = dataset.evaluate_sync(task, repeat=3) averages = report.averages() assert averages is not None print(f'Overall assertion rate: {averages.assertions}') #> Overall assertion rate: 1.0 ``` ## Default Behavior When `repeat=1` (the default), behavior is identical to a standard evaluation -- no run indexing, no `source_case_name`, and `case_groups()` returns `None`: ```python from pydantic_evals import Case, Dataset dataset = Dataset(name='default_behavior', cases=[Case(name='test', inputs='hello')]) def task(inputs: str) -> str: return inputs.upper() report = dataset.evaluate_sync(task) # repeat=1 by default assert report.case_groups() is None assert all(c.source_case_name is None for c in report.cases) ``` ## Next Steps - **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** -- Control parallel execution with `max_concurrency` - **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** -- Track custom metrics across runs - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Visualize multi-run results --- # [Retry Strategies](https://pydantic.dev/docs/ai/evals/how-to/retry-strategies/) # Retry Strategies Handle transient failures in tasks and evaluators with automatic retry logic. LLM-based systems can experience transient failures: - Rate limits - Network timeouts - Temporary API outages - Context length errors Pydantic Evals supports retry configuration for both: - **Task execution** - The function being evaluated - **Evaluator execution** - The evaluators themselves ## Basic Retry Configuration Pass a retry configuration to `evaluate()` or `evaluate_sync()` using [Tenacity](https://tenacity.readthedocs.io/) parameters: ```python from tenacity import stop_after_attempt from pydantic_evals import Case, Dataset def my_function(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='basic_retry', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync( task=my_function, retry_task={'stop': stop_after_attempt(3)}, retry_evaluators={'stop': stop_after_attempt(2)}, ) ``` ## Retry Configuration Options Retry configurations use [Tenacity](https://tenacity.readthedocs.io/) and support the same options as Pydantic AI's [`RetryConfig`](/docs/ai/api/pydantic-ai/retries/#pydantic_ai.retries.RetryConfig): ```python from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset def my_function(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='retry_config', cases=[Case(inputs='test')], evaluators=[]) retry_config = { 'stop': stop_after_attempt(3), # Stop after 3 attempts 'wait': wait_exponential(multiplier=1, min=1, max=10), # Exponential backoff: 1s, 2s, 4s, 8s (capped at 10s) 'reraise': True, # Re-raise the original exception after exhausting retries } dataset.evaluate_sync( task=my_function, retry_task=retry_config, ) ``` ### Common Parameters The retry configuration accepts any parameters from the tenacity `retry` decorator. 
Common ones include:

| Parameter | Type | Description |
| --- | --- | --- |
| `stop` | `StopBaseT` | Stop strategy (e.g., `stop_after_attempt(3)`, `stop_after_delay(60)`) |
| `wait` | `WaitBaseT` | Wait strategy (e.g., `wait_exponential()`, `wait_fixed(2)`) |
| `retry` | `RetryBaseT` | Retry condition (e.g., `retry_if_exception_type(TimeoutError)`) |
| `reraise` | `bool` | Whether to reraise the original exception (default: `False`) |
| `before_sleep` | `Callable` | Callback before sleeping between retries |

See the [Tenacity documentation](https://tenacity.readthedocs.io/) for all available options.

## Task Retries

Retry the task function when it fails:

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


async def call_llm(inputs: str) -> str:
    return f'LLM response to: {inputs}'


async def flaky_llm_task(inputs: str) -> str:
    """This might hit rate limits or timeout."""
    response = await call_llm(inputs)
    return response


dataset = Dataset(name='task_retry', cases=[Case(inputs='test')])

report = dataset.evaluate_sync(
    task=flaky_llm_task,
    retry_task={
        'stop': stop_after_attempt(5),  # Try up to 5 times
        'wait': wait_exponential(multiplier=1, min=1, max=30),  # Exponential backoff, capped at 30s
        'reraise': True,
    },
)
```

### When Task Retries Trigger

Retries trigger when the task raises an exception:

```python
class RateLimitError(Exception):
    pass


class ValidationError(Exception):
    pass


async def call_api(inputs: str) -> str:
    return f'API response: {inputs}'


async def my_task(inputs: str) -> str:
    try:
        return await call_api(inputs)
    except RateLimitError:
        # Will trigger retry
        raise
    except ValidationError:
        # Will also trigger retry
        raise
```

### Exponential Backoff

When using `wait_exponential()`, delays increase exponentially:

```
Attempt 1: immediate
Attempt 2: ~1s delay (multiplier * 2^0)
Attempt 3: ~2s delay (multiplier * 2^1)
Attempt 4: ~4s delay (multiplier * 2^2)
Attempt 5: ~8s delay (multiplier * 2^3, capped at max)
```

The actual delay depends on the `multiplier`, `min`, and `max` parameters passed to `wait_exponential()`.
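If only certain exceptions should trigger retries, the `retry` option from the table above accepts a Tenacity condition. A minimal sketch -- `RateLimitError` here is a stand-in for whatever exception your client actually raises:

```python
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


class RateLimitError(Exception):
    """Stand-in for your API client's rate-limit exception."""


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='selective_retry', cases=[Case(inputs='test')], evaluators=[])

# Retry only rate-limit errors; let everything else fail immediately
report = dataset.evaluate_sync(
    task=my_task,
    retry_task={
        'retry': retry_if_exception_type(RateLimitError),
        'stop': stop_after_attempt(5),
        'wait': wait_exponential(multiplier=1, min=1, max=30),
        'reraise': True,
    },
)
```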
## Evaluator Retries Retry evaluators when they fail: ```python from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge def my_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset( name='evaluator_retry', cases=[Case(inputs='test')], evaluators=[ # LLMJudge might hit rate limits LLMJudge(rubric='Response is accurate'), ], ) report = dataset.evaluate_sync( task=my_task, retry_evaluators={ 'stop': stop_after_attempt(3), 'wait': wait_exponential(multiplier=1, min=0.5, max=10), 'reraise': True, }, ) ``` ### When Evaluator Retries Trigger Retries trigger when an evaluator raises an exception: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext async def external_api_call(output: str) -> bool: return len(output) > 0 @dataclass class APIEvaluator(Evaluator): async def evaluate(self, ctx: EvaluatorContext) -> bool: # If this raises an exception, retry logic will trigger result = await external_api_call(ctx.output) return result ``` ### Evaluator Failures If an evaluator fails after all retries, it's recorded as an [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure): ```python from tenacity import stop_after_attempt from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='evaluator_failures', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync(task, retry_evaluators={'stop': stop_after_attempt(3)}) # Check for evaluator failures for case in report.cases: if case.evaluator_failures: for failure in case.evaluator_failures: print(f'Evaluator {failure.name} failed: {failure.error_message}') #> (No output - no evaluator failures in this case) ``` View evaluator failures in reports: ```python from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='failure_report', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync(task) report.print(include_evaluator_failures=True) """ Evaluation Summary: task ┏━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━┩ │ Case 1 │ 10ms │ ├──────────┼──────────┤ │ Averages │ 10ms │ └──────────┴──────────┘ """ #> #> ✅ case_0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% (0/0) ``` ## Combining Task and Evaluator Retries You can configure both independently: ```python from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset def flaky_task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='combined_retry', cases=[Case(inputs='test')], evaluators=[]) report = dataset.evaluate_sync( task=flaky_task, retry_task={ 'stop': stop_after_attempt(5), # Retry task up to 5 times 'wait': wait_exponential(multiplier=1, min=1, max=30), 'reraise': True, }, retry_evaluators={ 'stop': stop_after_attempt(3), # Retry evaluators up to 3 times 'wait': wait_exponential(multiplier=1, min=0.5, max=10), 'reraise': True, }, ) ``` ## Practical Examples ### Rate Limit Handling ```python from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import LLMJudge async def expensive_llm_call(inputs: str) -> str: return f'LLM response: {inputs}' async def llm_task(inputs: str) -> str: """Task that might hit rate limits.""" return await expensive_llm_call(inputs) dataset = Dataset( 
name='rate_limit_retry', cases=[Case(inputs='test')], evaluators=[ LLMJudge(rubric='Quality check'), # Also might hit rate limits ], ) # Generous retries for rate limits report = dataset.evaluate_sync( task=llm_task, retry_task={ 'stop': stop_after_attempt(10), # Rate limits can take multiple retries 'wait': wait_exponential(multiplier=2, min=2, max=60), # Start at 2s, exponential up to 60s 'reraise': True, }, retry_evaluators={ 'stop': stop_after_attempt(5), 'wait': wait_exponential(multiplier=2, min=2, max=30), 'reraise': True, }, ) ``` ### Network Timeout Handling ```python import httpx from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset async def api_task(inputs: str) -> str: """Task that calls external API which might timeout.""" async with httpx.AsyncClient(timeout=10.0) as client: response = await client.post('https://api.example.com', json={'input': inputs}) return response.text dataset = Dataset(name='timeout_retry', cases=[Case(inputs='test')], evaluators=[]) # Quick retries for network issues report = dataset.evaluate_sync( task=api_task, retry_task={ 'stop': stop_after_attempt(4), # A few quick retries 'wait': wait_exponential(multiplier=0.5, min=0.5, max=5), # Fast retry, capped at 5s 'reraise': True, }, ) ``` ### Context Length Handling ```python from tenacity import stop_after_attempt from pydantic_evals import Case, Dataset class ContextLengthError(Exception): pass async def llm_call(inputs: str, max_tokens: int = 8000) -> str: return f'LLM response: {inputs[:100]}' async def smart_llm_task(inputs: str) -> str: """Task that might exceed context length.""" try: return await llm_call(inputs, max_tokens=8000) except ContextLengthError: # Retry with shorter context truncated_inputs = inputs[:4000] return await llm_call(truncated_inputs, max_tokens=4000) dataset = Dataset(name='context_length', cases=[Case(inputs='test')], evaluators=[]) # Don't retry context length errors (handle in task) report = dataset.evaluate_sync( task=smart_llm_task, retry_task={'stop': stop_after_attempt(1)}, # No retries, we handle it ) ``` ## Retry vs Error Handling **Use retries for:** - Transient failures (rate limits, timeouts) - Network issues - Temporary service outages - Recoverable errors **Use error handling for:** - Validation errors - Logic errors - Permanent failures - Expected error conditions ```python class RateLimitError(Exception): pass async def llm_call(inputs: str) -> str: return f'LLM response: {inputs}' def is_valid(result: str) -> bool: return len(result) > 0 async def smart_task(inputs: str) -> str: """Handle expected errors, let retries handle transient failures.""" try: result = await llm_call(inputs) # Validate output (don't retry validation errors) if not is_valid(result): return 'ERROR: Invalid output format' return result except RateLimitError: # Let retry logic handle this raise except ValueError as e: # Don't retry - this is a permanent error return f'ERROR: {e}' ``` ## Troubleshooting ### "Still failing after retries" Increase retry attempts or check if error is retriable: ```python import logging from tenacity import stop_after_attempt from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' # Add logging to see what's failing logging.basicConfig(level=logging.DEBUG) dataset = Dataset(name='troubleshooting', cases=[Case(inputs='test')], evaluators=[]) # Tenacity logs retry attempts report = dataset.evaluate_sync(task, retry_task={'stop': stop_after_attempt(5)}) ``` ### "Evaluations 
taking too long" Reduce retry attempts or wait times: ```python from tenacity import stop_after_attempt, wait_exponential # Faster retries retry_config = { 'stop': stop_after_attempt(3), # Fewer attempts 'wait': wait_exponential(multiplier=0.1, min=0.1, max=2), # Quick retries, capped at 2s 'reraise': True, } ``` ### "Hitting rate limits despite retries" Increase delays or use `max_concurrency`: ```python from tenacity import stop_after_attempt, wait_exponential from pydantic_evals import Case, Dataset def task(inputs: str) -> str: return f'Result: {inputs}' dataset = Dataset(name='rate_limit_config', cases=[Case(inputs='test')], evaluators=[]) # Longer delays retry_config = { 'stop': stop_after_attempt(5), 'wait': wait_exponential(multiplier=5, min=5, max=60), # Start at 5s, exponential up to 60s 'reraise': True, } # Also reduce concurrency report = dataset.evaluate_sync( task=task, retry_task=retry_config, max_concurrency=2, # Only 2 concurrent tasks ) ``` ## Next Steps - **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - View retries in Logfire --- # [Online Evaluation](https://pydantic.dev/docs/ai/evals/online-evaluation/) # Online Evaluation Online evaluation lets you attach evaluators to production (or staging) functions so that every call (or a sampled subset) is automatically evaluated in the background. The same [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) classes used with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) work here; the difference is just in how they're wired up. ## When to Use Online Evaluation Online evaluation is useful when you want to: - **Monitor production quality:** continuously score LLM outputs against rubrics - **Catch regressions:** detect degradation in agent behavior across deploys - **Collect evaluation data:** build datasets from real traffic for offline analysis - **Control costs:** sample expensive LLM judges on a fraction of traffic while running cheap checks on everything For testing against curated datasets before deployment, use [offline evaluation](/docs/ai/evals/quick-start) with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) instead. ## Quick Start The [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator attaches evaluators to any function. Evaluators run in the background without blocking the caller: ```python import asyncio from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import configure, evaluate, wait_for_evaluations results_log: list[str] = [] configure(default_sink=lambda results, failures, ctx: results_log.extend(f'{r.name}={r.value}' for r in results)) @dataclass class OutputNotEmpty(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) @evaluate(OutputNotEmpty()) async def summarize(text: str) -> str: return f'Summary of: {text}' async def main(): result = await summarize('Pydantic AI is a Python agent framework.') print(result) #> Summary of: Pydantic AI is a Python agent framework. 
await wait_for_evaluations() print(results_log) #> ['OutputNotEmpty=True'] asyncio.run(main()) ``` This uses the module-level [`configure()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.configure) and [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) functions, which delegate to a global [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig). For multiple configurations or isolated setups, create your own config instances (see [OnlineEvalConfig](#onlineevalconfig) below). ## Core Concepts ### OnlineEvaluator Different evaluators need different settings. A cheap heuristic could run on 100% of traffic; an expensive LLM judge might run on 1%. [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) wraps an [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) with per-evaluator configuration: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext, LLMJudge from pydantic_evals.online import OnlineEvaluator @dataclass class IsHelpful(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return len(str(ctx.output)) > 10 # Cheap evaluator: run on every request always_check = OnlineEvaluator(evaluator=IsHelpful(), sample_rate=1.0) # Expensive evaluator: run on 1% of requests, limit concurrency rare_check = OnlineEvaluator( evaluator=LLMJudge(rubric='Is the response helpful?'), sample_rate=0.01, max_concurrency=5, ) ``` When you pass a bare [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) to the [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator, it's automatically wrapped in an [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) with the config's default sample rate. ### OnlineEvalConfig [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig) holds cross-evaluator defaults (sink, sample rate, metadata). There's a global default instance, plus you can create custom instances for different configurations: ```python import asyncio from collections.abc import Sequence from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationResult, Evaluator, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import OnlineEvalConfig, wait_for_evaluations @dataclass class IsNonEmpty(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) results_log: list[str] = [] async def log_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: results_log.append(f'{r.name}={r.value}') my_eval = OnlineEvalConfig( default_sink=log_sink, default_sample_rate=1.0, metadata={'service': 'my-app'}, ) @my_eval.evaluate(IsNonEmpty()) async def my_function(query: str) -> str: return f'Answer to: {query}' async def main(): result = await my_function('What is 2+2?') print(result) #> Answer to: What is 2+2? await wait_for_evaluations() print(results_log) #> ['IsNonEmpty=True'] asyncio.run(main()) ``` ### EvaluationSink [`EvaluationSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink) is a protocol for evaluation result destinations. Results flow through pluggable sinks rather than being hardcoded to one destination. 
The built-in [`CallbackSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.CallbackSink) wraps any callable (sync or async) that accepts results, failures, and context. You can also pass a bare callable wherever a sink is expected -- it's auto-wrapped in a [`CallbackSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.CallbackSink). For custom sinks, implement the [`EvaluationSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink) protocol: ```python from collections.abc import Sequence from pydantic_evals.evaluators import ( EvaluationResult, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import SpanReference class PrintSink: """Prints evaluation results to stdout.""" async def submit( self, *, results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, span_reference: SpanReference | None, ) -> None: for r in results: print(f' {r.name}: {r.value}') for f in failures: print(f' FAILED {f.name}: {f.error_message}') ``` Evaluators require at least one sink Evaluators are designed to produce evaluation results, not side effects -- use sinks for any actions you want to take on results (logging, alerting, storing, etc.). Because of this, **evaluators are skipped entirely when no sinks are configured** (neither a per-evaluator sink nor a default sink). This avoids wasted work when there's nowhere to send results. If you need side effects produced _within_ an evaluator itself, configure a no-op sink to ensure the evaluator runs: ```python from pydantic_evals.online import OnlineEvalConfig config = OnlineEvalConfig(default_sink=lambda results, failures, context: None) ``` ## Sampling Control evaluation frequency with per-evaluator sample rates to balance quality monitoring against cost. Note Sampling is decided **before** the decorated function runs. When no evaluators are sampled for a given call, the function executes without any additional instrumentation overhead (no logfire span or span tree capture). ### Static Sample Rates A `sample_rate` between 0.0 and 1.0 sets the probability of evaluating each call: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvaluator @dataclass class QuickCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) # Run on every request always = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=1.0) # Run on 10% of requests sometimes = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=0.1) # Never run (effectively disabled) never = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=0.0) ``` ### Dynamic Sample Rates Pass a callable to enable runtime-configurable or input-dependent sampling. 
The callable receives a [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) with the evaluator instance, function inputs, config metadata, and a per-call random seed, and returns a `float` (probability) or `bool` (always/never): ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvaluator, SamplingContext def get_current_rate(ctx: SamplingContext) -> float: return 0.5 @dataclass class QuickCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) dynamic = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=get_current_rate) ``` This enables integration with feature flags, managed variables, or configuration systems -- for example, you could replace `get_current_rate` with a function that reads from a remote config service (such as [Logfire managed variables](https://logfire.pydantic.dev/docs/reference/advanced/managed-variables/)) at runtime, allowing you to change the probability without redeploying the application. You can also use the [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) to make sampling decisions based on the function inputs: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvaluator, SamplingContext def sample_long_inputs(ctx: SamplingContext) -> bool: """Only evaluate calls with long input text.""" return len(str(ctx.inputs.get('text', ''))) > 100 @dataclass class QualityCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return len(str(ctx.output)) > 10 expensive = OnlineEvaluator(evaluator=QualityCheck(), sample_rate=sample_long_inputs) ``` ### Correlated Sampling By default, each evaluator samples independently. With three evaluators each at 10%, roughly 27% of calls incur evaluation overhead (`1 − 0.9³`). If you'd prefer that the _same_ 10% of calls run _all_ evaluators, set `sampling_mode='correlated'`: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvalConfig, OnlineEvaluator @dataclass class CheckA(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True @dataclass class CheckB(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True config = OnlineEvalConfig( default_sink=lambda results, failures, ctx: None, sampling_mode='correlated', ) # Both run on the same ~10% of calls check_a = OnlineEvaluator(evaluator=CheckA(), sample_rate=0.1) check_b = OnlineEvaluator(evaluator=CheckB(), sample_rate=0.1) ``` In correlated mode, a single random `call_seed` (uniformly distributed between 0.0 and 1.0) is generated per function call and shared across all evaluators. An evaluator runs when `call_seed < sample_rate`, so lower-rate evaluators' calls are always a subset of higher-rate ones, and the total overhead probability equals the maximum rate rather than accumulating. The `call_seed` is also available on [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) for custom `sample_rate` callables that want to implement their own correlated logic regardless of mode. ### Disabling Evaluation Use [`disable_evaluation()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.disable_evaluation) to suppress all online evaluation in a scope. 
This may be useful in tests: ```python import asyncio from collections.abc import Sequence from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationResult, Evaluator, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import ( OnlineEvalConfig, disable_evaluation, wait_for_evaluations, ) @dataclass class OutputCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) results_log: list[str] = [] async def log_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: results_log.append(f'{r.name}={r.value}') config = OnlineEvalConfig(default_sink=log_sink) @config.evaluate(OutputCheck()) async def my_function(x: int) -> int: return x * 2 async def main(): # Evaluators suppressed inside this block with disable_evaluation(): result = await my_function(21) print(result) #> 42 await wait_for_evaluations() print(f'evaluations run: {len(results_log)}') #> evaluations run: 0 # Evaluators resume outside the block await my_function(21) await wait_for_evaluations() print(f'evaluations run: {len(results_log)}') #> evaluations run: 1 asyncio.run(main()) ``` ## Conditional Evaluation For cost control, you can run expensive evaluation logic conditionally within a single custom evaluator. Return a mapping where you only include keys for checks that have run -- checks you don't want to perform can simply be omitted from the results: ```python import asyncio from collections.abc import Sequence from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationResult, Evaluator, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import ( OnlineEvalConfig, wait_for_evaluations, ) results_log: list[str] = [] async def log_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: results_log.append(f'{r.name}={r.value}') @dataclass class ConditionalAnalysis(Evaluator): """Runs a cheap check on every call, and an expensive check only on long outputs.""" def evaluate(self, ctx: EvaluatorContext) -> dict[str, float | bool]: output = str(ctx.output) results: dict[str, float | bool] = { 'has_content': len(output) > 0, } # Only run the expensive analysis on long outputs if len(output) > 20: # pretend the following line is expensive.. results['detail_score'] = len(output) / 100.0 return results config = OnlineEvalConfig(default_sink=log_sink) @config.evaluate(ConditionalAnalysis()) async def generate(prompt: str) -> str: return f'Response to: {prompt}' async def main(): await generate('hi') # short output -- only cheap check runs await wait_for_evaluations() print(results_log) #> ['has_content=True'] results_log.clear() await generate('tell me a long story about dragons') # long output -- both checks run await wait_for_evaluations() print(sorted(results_log)) #> ['detail_score=0.47', 'has_content=True'] asyncio.run(main()) ``` This pattern lets you combine cheap and expensive checks in one evaluator, avoiding unnecessary work when conditions aren't met. 
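If the checks don't need to share intermediate state, the same split can also be expressed as two separate evaluators, so that each check gets its own sample rate and concurrency limit. A sketch combining the pieces shown above (the evaluator names and the 5% rate are illustrative):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator


@dataclass
class HasContent(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) > 0


@dataclass
class DetailScore(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> float:
        # pretend this is the expensive analysis
        return len(str(ctx.output)) / 100.0


# Cheap check on every call; expensive check on a small sample, with bounded concurrency
cheap = OnlineEvaluator(evaluator=HasContent(), sample_rate=1.0)
expensive = OnlineEvaluator(evaluator=DetailScore(), sample_rate=0.05, max_concurrency=2)
```

Prefer the single-evaluator pattern when the expensive check depends on the outcome of the cheap one for the same call; prefer separate `OnlineEvaluator`s when the checks are independent and you want to sample and limit them independently.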
## Sync Function Support The [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator works with both async and sync functions: ```python import asyncio from collections.abc import Sequence from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationResult, Evaluator, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import OnlineEvalConfig, wait_for_evaluations results_log: list[str] = [] async def log_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: results_log.append(f'{r.name}={r.value}') @dataclass class OutputCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return bool(ctx.output) config = OnlineEvalConfig(default_sink=log_sink) @config.evaluate(OutputCheck()) def process(text: str) -> str: return text.upper() async def main(): # Sync decorated functions work from async contexts too result = process('hello') print(result) #> HELLO await wait_for_evaluations() print(results_log) #> ['OutputCheck=True'] asyncio.run(main()) ``` Sync decorated functions work from both sync and async contexts. When a running event loop is available, evaluators are dispatched as background tasks on that loop. Otherwise, a background thread with its own event loop is spawned. ## Per-Evaluator Sink Overrides Individual evaluators can override the config's default sink. This is useful if different evaluators need to send results to different destinations: ```python import asyncio from collections.abc import Sequence from dataclasses import dataclass from pydantic_evals.evaluators import ( EvaluationResult, Evaluator, EvaluatorContext, EvaluatorFailure, ) from pydantic_evals.online import ( OnlineEvalConfig, OnlineEvaluator, wait_for_evaluations, ) default_log: list[str] = [] special_log: list[str] = [] async def default_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: default_log.append(r.name) async def special_sink( results: Sequence[EvaluationResult], failures: Sequence[EvaluatorFailure], context: EvaluatorContext, ) -> None: for r in results: special_log.append(r.name) @dataclass class FastCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True @dataclass class ImportantCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True config = OnlineEvalConfig(default_sink=default_sink) @config.evaluate( FastCheck(), # uses default sink OnlineEvaluator(evaluator=ImportantCheck(), sink=special_sink), # uses special sink ) async def my_function(x: int) -> int: return x async def main(): await my_function(42) await wait_for_evaluations() print(f'default: {default_log}') #> default: ['FastCheck'] print(f'special: {special_log}') #> special: ['ImportantCheck'] asyncio.run(main()) ``` ## Re-running Evaluators from Stored Data A key capability of online evaluation is re-running evaluators without re-executing the original function. This is useful when you want to evaluate historical data with updated rubrics, or run additional evaluators on existing traces. 
### run\_evaluators [`run_evaluators()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.run_evaluators) runs a list of evaluators against an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) and returns the results: ```python import asyncio from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import run_evaluators from pydantic_evals.otel.span_tree import SpanTree @dataclass class LengthCheck(Evaluator): min_length: int = 10 def evaluate(self, ctx: EvaluatorContext) -> bool: return len(str(ctx.output)) >= self.min_length @dataclass class HasKeyword(Evaluator): keyword: str = 'hello' def evaluate(self, ctx: EvaluatorContext) -> bool: return self.keyword in str(ctx.output).lower() async def main(): # Build a context manually (in practice, you'd get this from stored data) # Normally EvaluatorContext would not be manually constructed -- # it is built automatically by the @evaluate decorator or OnlineEvaluation capability, # or from an EvaluatorContextSource (see below). ctx = EvaluatorContext( name='example', inputs={'query': 'greet the user'}, output='Hello! How can I help you today?', expected_output=None, metadata=None, duration=0.5, _span_tree=SpanTree(), attributes={}, metrics={}, ) results, failures = await run_evaluators( [LengthCheck(min_length=10), HasKeyword(keyword='hello')], ctx, ) for r in results: print(f'{r.name}: {r.value}') #> LengthCheck: True #> HasKeyword: True print(f'failures: {len(failures)}') #> failures: 0 asyncio.run(main()) ``` ### EvaluatorContextSource Protocol For fetching context data from external storage (like Pydantic Logfire), implement the [`EvaluatorContextSource`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluatorContextSource) protocol. It defines `fetch()` and `fetch_many()` methods that return [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) objects from stored data: ```python import asyncio from collections.abc import Sequence from pydantic_evals.evaluators import EvaluatorContext from pydantic_evals.online import SpanReference from pydantic_evals.otel.span_tree import SpanTree class MyContextSource: """Example source that fetches context from a hypothetical store.""" def __init__(self, store: dict[str, EvaluatorContext]) -> None: self._store = store async def fetch(self, span: SpanReference) -> EvaluatorContext: return self._store[span.span_id] async def fetch_many(self, spans: Sequence[SpanReference]) -> list[EvaluatorContext]: return [self._store[s.span_id] for s in spans] def _make_context( *, inputs: object = None, output: object = None, metadata: object = None, duration: float = 0.0, ) -> EvaluatorContext: # Normally EvaluatorContext would not be manually constructed -- # it is built automatically by the @evaluate decorator or OnlineEvaluation capability. 
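    # An empty SpanTree stands in for the span data a real stored trace would provide.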
return EvaluatorContext( name=None, inputs=inputs, output=output, expected_output=None, metadata=metadata, duration=duration, _span_tree=SpanTree(), attributes={}, metrics={}, ) async def main(): source = MyContextSource({ 'span_abc': _make_context( inputs={'query': 'What is AI?'}, output='AI is artificial intelligence.', metadata={'model': 'gpt-4o'}, duration=1.2, ), 'span_def': _make_context( inputs={'query': 'What is ML?'}, output='ML is machine learning.', metadata={'model': 'gpt-4o'}, duration=0.8, ), }) # Fetch a single context ctx = await source.fetch(SpanReference(trace_id='t1', span_id='span_abc')) print(f'inputs: {ctx.inputs}') #> inputs: {'query': 'What is AI?'} print(f'output: {ctx.output}') #> output: AI is artificial intelligence. # Fetch multiple contexts in a batch spans = [ SpanReference(trace_id='t1', span_id='span_abc'), SpanReference(trace_id='t1', span_id='span_def'), ] contexts = await source.fetch_many(spans) print(f'batch size: {len(contexts)}') #> batch size: 2 asyncio.run(main()) ``` ## Concurrency Control Each [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) has a `max_concurrency` limit (default: 10). When the limit is reached, new evaluation requests for that evaluator are **dropped** (not queued). This prevents expensive evaluators from consuming unbounded resources: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvaluator @dataclass class ExpensiveCheck(Evaluator): async def evaluate(self, ctx: EvaluatorContext) -> bool: # Imagine this makes a slow call to an LLM return True # Allow at most 3 concurrent evaluations limited = OnlineEvaluator( evaluator=ExpensiveCheck(), sample_rate=0.1, max_concurrency=3, ) ``` To react to dropped evaluations, set `on_max_concurrency` on the [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) or as a default on [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig). The callback receives the [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) that would have been evaluated, and can be sync or async: ```python import warnings from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnlineEvalConfig, OnlineEvaluator @dataclass class ExpensiveCheck(Evaluator): async def evaluate(self, ctx: EvaluatorContext) -> bool: return True def warn_on_drop(ctx: EvaluatorContext) -> None: warnings.warn('Evaluation dropped due to max concurrency', stacklevel=1) # Per-evaluator handler limited = OnlineEvaluator( evaluator=ExpensiveCheck(), max_concurrency=3, on_max_concurrency=warn_on_drop, ) # Or set a global default for all evaluators in a config config = OnlineEvalConfig(on_max_concurrency=warn_on_drop) ``` Note If neither the per-evaluator nor the config-level `on_max_concurrency` is set, dropped evaluations are silently ignored. ## Error Handling There are two types of error handling: - **`on_sampling_error`**: Called synchronously when a `sample_rate` callable raises. Receives the exception and the [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator). Must be sync (not async). If set, the evaluator is skipped. If not set, the exception **propagates to the caller**. 
- **`on_error`**: Called when an exception occurs in a `sink` or `on_max_concurrency` callback. Receives the exception, [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext), [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator), and a location string. Can be sync or async. If not set, exceptions are **silently suppressed**. Set these on [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig) for global defaults, or on [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) to override per-evaluator: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.online import OnErrorLocation, OnlineEvalConfig, OnlineEvaluator def log_errors( exc: Exception, ctx: EvaluatorContext, evaluator: Evaluator, location: OnErrorLocation, ) -> None: print(f'[{location}] {type(exc).__name__}: {exc}') @dataclass class MyCheck(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return True # Global default -- applies to all evaluators in this config config = OnlineEvalConfig( default_sink=lambda results, failures, context: None, on_error=log_errors, ) # Per-evaluator override custom = OnlineEvaluator(evaluator=MyCheck(), on_error=log_errors) ``` Key behaviors: - **Evaluator exceptions** are handled by converting them to [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure) objects passed to sinks -- they do not go through `on_error`. - **One evaluator's error doesn't affect siblings** -- each evaluator runs in its own task with isolated error handling. - **One sink's error doesn't affect other sinks** -- each sink submission is wrapped individually. - **If `on_error` itself raises**, the exception is silently suppressed to protect sibling evaluators. - **If no `on_error` is set**, exceptions are silently suppressed -- this is the safe default. ## API Reference The complete API for the `pydantic_evals.online` module is documented in the [API reference](/docs/ai/api/pydantic-evals/online). ## Next Steps - **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Write evaluators for your domain - **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** -- Use ready-made evaluators - **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Visualize evaluation results in Logfire - **[Quick Start](/docs/ai/evals/quick-start)** -- Offline evaluation with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) --- # [Agent User Interaction (AG-UI)](https://pydantic.dev/docs/ai/examples/ag-ui/) # Agent User Interaction (AG-UI) Example of using Pydantic AI agents with the [AG-UI Dojo](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo) example app. See the [AG-UI docs](/docs/ai/integrations/ui/ag-ui) for more information about the AG-UI integration. Demonstrates: - [AG-UI](/docs/ai/integrations/ui/ag-ui) - [Tools](/docs/ai/tools-toolsets/tools) ## Prerequisites - An [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage) you will need two command line windows. 
### Pydantic AI AG-UI backend Set up your OpenAI API key Terminal ```bash export OPENAI_API_KEY= ``` Start the Pydantic AI AG-UI example backend. - [pip](#tab-panel-20) - [uv](#tab-panel-21) Terminal ```bash python -m pydantic_ai_examples.ag_ui ``` Terminal ```bash uv run -m pydantic_ai_examples.ag_ui ``` ### AG-UI Dojo example frontend Next run the AG-UI Dojo example frontend. 1. Clone the [AG-UI repository](https://github.com/ag-ui-protocol/ag-ui) Terminal ```shell git clone https://github.com/ag-ui-protocol/ag-ui.git ``` 2. Change into the `ag-ui/sdks/typescript` directory Terminal ```shell cd ag-ui/sdks/typescript ``` 3. Run the Dojo app following the [official instructions](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo#development-setup) 4. Visit [http://localhost:3000/pydantic-ai](http://localhost:3000/pydantic-ai) 5. Select the `Pydantic AI` view from the sidebar ## Feature Examples ### Agentic Chat This demonstrates a basic agent interaction including Pydantic AI server-side tools and AG-UI client-side tools. If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/agentic\_chat](http://localhost:3000/pydantic-ai/feature/agentic_chat). #### Agent Tools - `time` - Pydantic AI tool to check the current time for a time zone - `background` - AG-UI tool to set the background color of the client window #### Agent Prompts ```text What is the time in New York? ``` ```text Change the background to blue ``` A complex example which mixes both AG-UI and Pydantic AI tools: ```text Perform the following steps, waiting for the response of each step before continuing: 1. Get the time 2. Set the background to red 3. Get the time 4. Report how long the background set took by diffing the two times ``` #### Agentic Chat - Code agentic\_chat.py ```python from __future__ import annotations from datetime import datetime from zoneinfo import ZoneInfo from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('openai:gpt-5-mini') @agent.tool_plain async def current_time(timezone: str = 'UTC') -> str: """Get the current time in ISO format. Args: timezone: The timezone to use. Returns: The current time in ISO format string. """ tz: ZoneInfo = ZoneInfo(timezone) return datetime.now(tz=tz).isoformat() app = AGUIApp(agent) ``` ### Agentic Generative UI Demonstrates a long running task where the agent sends updates to the frontend to let the user know what's happening. If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/agentic\_generative\_ui](http://localhost:3000/pydantic-ai/feature/agentic_generative_ui).
#### Plan Prompts ```text Create a plan for breakfast and execute it ``` #### Agentic Generative UI - Code agentic\_generative\_ui.py ```python from __future__ import annotations from textwrap import dedent from typing import Any, Literal from pydantic import BaseModel, Field from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp StepStatus = Literal['pending', 'completed'] class Step(BaseModel): """Represents a step in a plan.""" description: str = Field(description='The description of the step') status: StepStatus = Field( default='pending', description='The status of the step (e.g., pending, completed)', ) class Plan(BaseModel): """Represents a plan with multiple steps.""" steps: list[Step] = Field( default_factory=list[Step], description='The steps in the plan' ) class JSONPatchOp(BaseModel): """A class representing a JSON Patch operation (RFC 6902).""" op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field( description='The operation to perform: add, remove, replace, move, copy, or test', ) path: str = Field(description='JSON Pointer (RFC 6901) to the target location') value: Any = Field( default=None, description='The value to apply (for add, replace operations)', ) from_: str | None = Field( default=None, alias='from', description='Source path (for move, copy operations)', ) agent = Agent( 'openai:gpt-5-mini', instructions=dedent( """ When planning use tools only, without any other messages. IMPORTANT: - Use the `create_plan` tool to set the initial state of the steps - Use the `update_plan_step` tool to update the status of each step - Do NOT repeat the plan or summarise it in a message - Do NOT confirm the creation or updates in a message - Do NOT ask the user for additional information or next steps Only one plan can be active at a time, so do not call the `create_plan` tool again until all the steps in current plan are completed. """ ), ) @agent.tool_plain async def create_plan(steps: list[str]) -> StateSnapshotEvent: """Create a plan with multiple steps. Args: steps: List of step descriptions to create the plan. Returns: StateSnapshotEvent containing the initial state of the steps. """ plan: Plan = Plan( steps=[Step(description=step) for step in steps], ) return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=plan.model_dump(), ) @agent.tool_plain async def update_plan_step( index: int, description: str | None = None, status: StepStatus | None = None ) -> StateDeltaEvent: """Update the plan with new steps or changes. Args: index: The index of the step to update. description: The new description for the step. status: The new status for the step. Returns: StateDeltaEvent containing the changes made to the plan. """ changes: list[JSONPatchOp] = [] if description is not None: changes.append( JSONPatchOp( op='replace', path=f'/steps/{index}/description', value=description ) ) if status is not None: changes.append( JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status) ) return StateDeltaEvent( type=EventType.STATE_DELTA, delta=changes, ) app = AGUIApp(agent) ``` ### Human in the Loop Demonstrates a simple human-in-the-loop workflow where the agent comes up with a plan and the user can approve it using checkboxes.
#### Task Planning Tools - `generate_task_steps` - AG-UI tool to generate and confirm steps #### Task Planning Prompt ```text Generate a list of steps for cleaning a car for me to review ``` #### Human in the Loop - Code human\_in\_the\_loop.py ```python from __future__ import annotations from textwrap import dedent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent( 'openai:gpt-5-mini', instructions=dedent( """ When planning tasks use tools only, without any other messages. IMPORTANT: - Use the `generate_task_steps` tool to display the suggested steps to the user - Never repeat the plan, or send a message detailing steps - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again """ ), ) app = AGUIApp(agent) ``` ### Predictive State Updates Demonstrates how to use the predictive state updates feature to update the state of the UI based on agent responses, including user interaction via user confirmation. If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/predictive\_state\_updates](http://localhost:3000/pydantic-ai/feature/predictive_state_updates). #### Story Tools - `write_document` - AG-UI tool to write the document to a window - `document_predict_state` - Pydantic AI tool that enables document state prediction for the `write_document` tool This also shows how to use custom instructions based on shared state information. #### Story Example Starting document text ```markdown Bruce was a good dog, ``` Agent prompt ```text Help me complete my story about bruce the dog, it should be no longer than a sentence. ``` #### Predictive State Updates - Code predictive\_state\_updates.py ```python from __future__ import annotations from textwrap import dedent from pydantic import BaseModel from ag_ui.core import CustomEvent, EventType from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[DocumentState]) # Tools which return AG-UI events will be sent to the client as part of the # event stream; single events and iterables of events are supported. @agent.tool_plain async def document_predict_state() -> list[CustomEvent]: """Enable document state prediction. Returns: CustomEvent containing the event to enable state prediction. """ return [ CustomEvent( type=EventType.CUSTOM, name='PredictState', value=[ { 'state_key': 'document', 'tool': 'write_document', 'tool_argument': 'document', }, ], ), ] @agent.instructions() async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str: """Provide instructions for writing document if present. Args: ctx: The run context containing document state information. Returns: Instructions string for the document writing agent. """ return dedent( f"""You are a helpful assistant for writing documents. Before you start writing, you MUST call the `document_predict_state` tool to enable state prediction. To present the document to the user for review, you MUST use the `write_document` tool. When you have written the document, DO NOT repeat it as a message. If accepted briefly summarize the changes you made, 2 sentences max, otherwise ask the user to clarify what they want to change.
This is the current document: {ctx.deps.state.document} """ ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) ``` ### Shared State Demonstrates how to use the shared state between the UI and the agent. State sent to the agent is detected by a function-based instruction, which validates the data using a custom Pydantic model before using it to build the instructions for the agent to follow; the agent then sends results back to the client using an AG-UI tool. If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/shared\_state](http://localhost:3000/pydantic-ai/feature/shared_state). #### Recipe Tools - `display_recipe` - AG-UI tool to display the recipe in a graphical format #### Recipe Example 1. Customise the basic settings of your recipe 2. Click `Improve with AI` #### Shared State - Code shared\_state.py ```python from __future__ import annotations from enum import Enum from textwrap import dedent from pydantic import BaseModel, Field from ag_ui.core import EventType, StateSnapshotEvent from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class SkillLevel(str, Enum): """The level of skill required for the recipe.""" BEGINNER = 'Beginner' INTERMEDIATE = 'Intermediate' ADVANCED = 'Advanced' class SpecialPreferences(str, Enum): """Special preferences for the recipe.""" HIGH_PROTEIN = 'High Protein' LOW_CARB = 'Low Carb' SPICY = 'Spicy' BUDGET_FRIENDLY = 'Budget-Friendly' ONE_POT_MEAL = 'One-Pot Meal' VEGETARIAN = 'Vegetarian' VEGAN = 'Vegan' class CookingTime(str, Enum): """The cooking time of the recipe.""" FIVE_MIN = '5 min' FIFTEEN_MIN = '15 min' THIRTY_MIN = '30 min' FORTY_FIVE_MIN = '45 min' SIXTY_PLUS_MIN = '60+ min' class Ingredient(BaseModel): """A class representing an ingredient in a recipe.""" icon: str = Field( default='ingredient', description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient", ) name: str amount: str class Recipe(BaseModel): """A class representing a recipe.""" skill_level: SkillLevel = Field( default=SkillLevel.BEGINNER, description='The skill level required for the recipe', ) special_preferences: list[SpecialPreferences] = Field( default_factory=list[SpecialPreferences], description='Any special preferences for the recipe', ) cooking_time: CookingTime = Field( default=CookingTime.FIVE_MIN, description='The cooking time of the recipe' ) ingredients: list[Ingredient] = Field( default_factory=list[Ingredient], description='Ingredients for the recipe', ) instructions: list[str] = Field( default_factory=list[str], description='Instructions for the recipe' ) class RecipeSnapshot(BaseModel): """A class representing the state of the recipe.""" recipe: Recipe = Field( default_factory=Recipe, description='The current state of the recipe' ) agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[RecipeSnapshot]) @agent.tool_plain async def display_recipe(recipe: Recipe) -> StateSnapshotEvent: """Display the recipe to the user. Args: recipe: The recipe to display. Returns: StateSnapshotEvent containing the recipe snapshot. """ return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot={'recipe': recipe}, ) @agent.instructions async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str: """Instructions for the recipe generation agent. Args: ctx: The run context containing recipe state information. Returns: Instructions string for the recipe generation agent.
""" return dedent( f""" You are a helpful assistant for creating recipes. IMPORTANT: - Create a complete recipe using the existing ingredients - Append new ingredients to the existing ones - Use the `display_recipe` tool to present the recipe to the user - Do NOT repeat the recipe in the message, use the tool instead - Do NOT run the `display_recipe` tool multiple times in a row Once you have created the updated recipe and displayed it to the user, summarise the changes in one sentence, don't describe the recipe in detail or send it as a message to the user. The current state of the recipe is: {ctx.deps.state.recipe.model_dump_json(indent=2)} """, ) app = AGUIApp(agent, deps=StateDeps(RecipeSnapshot())) ``` ### Tool Based Generative UI Demonstrates customised rendering for tool output with used confirmation. If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/tool\_based\_generative\_ui](http://localhost:3000/pydantic-ai/feature/tool_based_generative_ui). #### Haiku Tools - `generate_haiku` - AG-UI tool to display a haiku in English and Japanese #### Haiku Prompt ```text Generate a haiku about formula 1 ``` #### Tool Based Generative UI - Code tool\_based\_generative\_ui.py ```python from __future__ import annotations from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('openai:gpt-5-mini') app = AGUIApp(agent) ``` --- # [Flight Booking](https://pydantic.dev/docs/ai/examples/complex-workflows/flight-booking/) # Flight Booking Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent. Demonstrates: - [agent delegation](/docs/ai/guides/multi-agent-applications#agent-delegation) - [programmatic agent hand-off](/docs/ai/guides/multi-agent-applications#programmatic-agent-hand-off) - [usage limits](/docs/ai/core-concepts/agent#usage-limits) In this scenario, a group of agents work together to find the best flight for a user. 
The control flow for this example can be summarised as follows: ```mermaid graph TD START --> search_agent("search agent") search_agent --> extraction_agent("extraction agent") extraction_agent --> search_agent search_agent --> human_confirm("human confirm") human_confirm --> search_agent search_agent --> FAILED human_confirm --> find_seat_function("find seat function") find_seat_function --> human_seat_choice("human seat choice") human_seat_choice --> find_seat_agent("find seat agent") find_seat_agent --> find_seat_function find_seat_function --> buy_flights("buy flights") buy_flights --> SUCCESS ``` ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-136) - [uv](#tab-panel-137) Terminal ```bash python -m pydantic_ai_examples.flight_booking ``` Terminal ```bash uv run -m pydantic_ai_examples.flight_booking ``` ## Example Code flight\_booking.py ```python import datetime from dataclasses import dataclass from typing import Literal import logfire from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import ( Agent, ModelMessage, ModelRetry, RunContext, RunUsage, UsageLimits, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class FlightDetails(BaseModel): """Details of the most suitable flight.""" flight_number: str price: int origin: str = Field(description='Three-letter airport code') destination: str = Field(description='Three-letter airport code') date: datetime.date class NoFlightFound(BaseModel): """When no valid flight is found.""" @dataclass class Deps: web_page_text: str req_origin: str req_destination: str req_date: datetime.date # This agent is responsible for controlling the flow of the conversation. search_agent = Agent[Deps, FlightDetails | NoFlightFound]( 'openai:gpt-5.2', output_type=FlightDetails | NoFlightFound, # type: ignore retries=4, system_prompt=( 'Your job is to find the cheapest flight for the user on the given date. ' ), ) # This agent is responsible for extracting flight details from web page text. 
extraction_agent = Agent( 'openai:gpt-5.2', output_type=list[FlightDetails], system_prompt='Extract all the flight details from the given text.', ) @search_agent.tool async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]: """Get details of all flights.""" # we pass the usage to the search agent so requests within this agent are counted result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage) logfire.info('found {flight_count} flights', flight_count=len(result.output)) return result.output @search_agent.output_validator async def validate_output( ctx: RunContext[Deps], output: FlightDetails | NoFlightFound ) -> FlightDetails | NoFlightFound: """Procedural validation that the flight meets the constraints.""" if isinstance(output, NoFlightFound): return output errors: list[str] = [] if output.origin != ctx.deps.req_origin: errors.append( f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}' ) if output.destination != ctx.deps.req_destination: errors.append( f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}' ) if output.date != ctx.deps.req_date: errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}') if errors: raise ModelRetry('\n'.join(errors)) else: return output class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] class Failed(BaseModel): """Unable to extract a seat selection.""" # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( 'openai:gpt-5.2', output_type=SeatPreference | Failed, system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) # in reality this would be downloaded from a booking site, # potentially using another agent to navigate the site flights_web_page = """ 1. Flight SFO-AK123 - Price: $350 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 2. Flight SFO-AK456 - Price: $370 - Origin: San Francisco International Airport (SFO) - Destination: Fairbanks International Airport (FAI) - Date: January 10, 2025 3. Flight SFO-AK789 - Price: $400 - Origin: San Francisco International Airport (SFO) - Destination: Juneau International Airport (JNU) - Date: January 20, 2025 4. Flight NYC-LA101 - Price: $250 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 5. Flight CHI-MIA202 - Price: $200 - Origin: Chicago O'Hare International Airport (ORD) - Destination: Miami International Airport (MIA) - Date: January 12, 2025 6. Flight BOS-SEA303 - Price: $120 - Origin: Boston Logan International Airport (BOS) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 12, 2025 7. Flight DFW-DEN404 - Price: $150 - Origin: Dallas/Fort Worth International Airport (DFW) - Destination: Denver International Airport (DEN) - Date: January 10, 2025 8. 
Flight ATL-HOU505 - Price: $180 - Origin: Hartsfield-Jackson Atlanta International Airport (ATL) - Destination: George Bush Intercontinental Airport (IAH) - Date: January 10, 2025 """ # restrict how many requests this app can make to the LLM usage_limits = UsageLimits(request_limit=15) async def main(): deps = Deps( web_page_text=flights_web_page, req_origin='SFO', req_destination='ANC', req_date=datetime.date(2025, 1, 10), ) message_history: list[ModelMessage] | None = None usage: RunUsage = RunUsage() # run the agent until a satisfactory flight is found while True: result = await search_agent.run( f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}', deps=deps, usage=usage, message_history=message_history, usage_limits=usage_limits, ) if isinstance(result.output, NoFlightFound): print('No flight found') break else: flight = result.output print(f'Flight found: {flight}') answer = Prompt.ask( 'Do you want to buy this flight, or keep searching? (buy/*search)', choices=['buy', 'search', ''], show_choices=False, ) if answer == 'buy': seat = await find_seat(usage) await buy_tickets(flight, seat) break else: message_history = result.all_messages( output_tool_return_content='Please suggest another flight' ) async def find_seat(usage: RunUsage) -> SeatPreference: message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference): print(f'Purchasing flight {flight_details=!r} {seat=!r}...') if __name__ == '__main__': import asyncio asyncio.run(main()) ``` --- # [Question Graph](https://pydantic.dev/docs/ai/examples/complex-workflows/question-graph/) # Question Graph Example of a graph for asking and evaluating questions. 
Demonstrates: - [`pydantic_graph`](/docs/ai/graph/) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-140) - [uv](#tab-panel-141) Terminal ```bash python -m pydantic_ai_examples.question_graph ``` Terminal ```bash uv run -m pydantic_ai_examples.question_graph ``` ## Example Code question\_graph.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pathlib import Path import logfire from pydantic import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_graph.persistence.file import FileStatePersistence # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() ask_agent = Agent('openai:gpt-5.2', output_type=str) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list[ModelMessage]) evaluate_agent_messages: list[ModelMessage] = field( default_factory=list[ModelMessage] ) @dataclass class Ask(BaseNode[QuestionState]): async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.all_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationOutput(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationOutput, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> End[str] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.all_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) async def run_as_continuous(): state = QuestionState() node = Ask() end = await question_graph.run(node, state=state) print('END:', end.output) async def run_as_cli(answer: str | None): persistence = FileStatePersistence(Path('question_graph.json')) persistence.set_graph_types(question_graph) if snapshot := await persistence.load_next(): state = snapshot.state assert answer is not None, ( 'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli <answer>"' ) node = Evaluate(answer) else: state = QuestionState() node = Ask() # debug(state, node) async with
question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() if isinstance(node, End): print('END:', node.data) history = await persistence.load_all() print('history:', '\n'.join(str(e.node) for e in history), sep='\n') print('Finished!') break elif isinstance(node, Answer): print(node.question) break # otherwise just continue if __name__ == '__main__': import asyncio import sys try: sub_command = sys.argv[1] assert sub_command in ('continuous', 'cli', 'mermaid') except (IndexError, AssertionError): print( 'Usage:\n' ' uv run -m pydantic_ai_examples.question_graph mermaid\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph continuous\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph cli [answer]', file=sys.stderr, ) sys.exit(1) if sub_command == 'mermaid': print(question_graph.mermaid_code(start_node=Ask)) elif sub_command == 'continuous': asyncio.run(run_as_continuous()) else: a = sys.argv[2] if len(sys.argv) > 2 else None asyncio.run(run_as_cli(a)) ``` The mermaid diagram generated in this example looks like this: ```mermaid --- title: question_graph --- stateDiagram-v2 [*] --> Ask Ask --> Answer: ask the question Answer --> Evaluate: answer the question Evaluate --> [*]: success Evaluate --> Reprimand Reprimand --> Ask: try again ``` --- # [Bank Support](https://pydantic.dev/docs/ai/examples/conversational-agents/bank-support/) # Bank Support Small but complete example of using Pydantic AI to build a support agent for a bank. Demonstrates: - [dynamic system prompt](/docs/ai/core-concepts/agent#system-prompts) - [structured `output_type`](/docs/ai/core-concepts/output#structured-output) - [tools](/docs/ai/tools-toolsets/tools) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-156) - [uv](#tab-panel-157) Terminal ```bash python -m pydantic_ai_examples.bank_support ``` Terminal ```bash uv run -m pydantic_ai_examples.bank_support ``` (or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`) ## Example Code bank\_support.py ```python import sqlite3 from dataclasses import dataclass from pydantic import BaseModel from pydantic_ai import Agent, RunContext @dataclass class DatabaseConn: """A wrapper over the SQLite connection.""" sqlite_conn: sqlite3.Connection async def customer_name(self, *, id: int) -> str | None: res = self.sqlite_conn.execute('SELECT name FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] return None async def customer_balance(self, *, id: int) -> float: res = self.sqlite_conn.execute('SELECT balance FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] else: raise ValueError('Customer not found') @dataclass class SupportDependencies: customer_id: int db: DatabaseConn class SupportOutput(BaseModel): support_advice: str """Advice returned to the customer""" block_card: bool """Whether to block their card or not""" risk: int """Risk level of query""" support_agent = Agent( 'openai:gpt-5.2', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query. ' "Reply using the customer's name."
), ) @support_agent.instructions async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool async def customer_balance(ctx: RunContext[SupportDependencies]) -> str: """Returns the customer's current account balance.""" balance = await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, ) return f'${balance:.2f}' if __name__ == '__main__': with sqlite3.connect(':memory:') as con: cur = con.cursor() cur.execute('CREATE TABLE customers(id, name, balance)') cur.execute(""" INSERT INTO customers VALUES (123, 'John', 123.45) """) con.commit() deps = SupportDependencies(customer_id=123, db=DatabaseConn(sqlite_conn=con)) result = support_agent.run_sync('What is my balance?', deps=deps) print(result.output) """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = support_agent.run_sync('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` --- # [Chat App with FastAPI](https://pydantic.dev/docs/ai/examples/conversational-agents/chat-app/) # Chat App with FastAPI Simple chat app example built with FastAPI. Demonstrates: - [reusing chat history](/docs/ai/core-concepts/message-history) - [serializing messages](/docs/ai/core-concepts/message-history#accessing-messages-from-results) - [streaming responses](/docs/ai/core-concepts/output#streamed-results) This demonstrates storing chat history between requests and using it to give the model context for new responses. Most of the complex logic here is between `chat_app.py` which streams the response to the browser, and `chat_app.ts` which renders messages in the browser. ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-138) - [uv](#tab-panel-139) Terminal ```bash python -m pydantic_ai_examples.chat_app ``` Terminal ```bash uv run -m pydantic_ai_examples.chat_app ``` Then open the app at [localhost:8000](http://localhost:8000).
![Example conversation](/docs/ai/img/chat-app-example.png) ## Example Code Python code that runs the chat app: chat\_app.py ```python from __future__ import annotations as _annotations import asyncio import json import sqlite3 from collections.abc import AsyncIterator, Callable from concurrent.futures.thread import ThreadPoolExecutor from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import datetime, timezone from functools import partial from pathlib import Path from typing import Annotated, Any, Literal, TypeVar import fastapi import logfire from fastapi import Depends, Request from fastapi.responses import FileResponse, Response, StreamingResponse from typing_extensions import LiteralString, ParamSpec, TypedDict from pydantic_ai import ( Agent, ModelMessage, ModelMessagesTypeAdapter, ModelRequest, ModelResponse, TextPart, UnexpectedModelBehavior, UserPromptPart, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-5.2') THIS_DIR = Path(__file__).parent @asynccontextmanager async def lifespan(_app: fastapi.FastAPI): async with Database.connect() as db: yield {'db': db} app = fastapi.FastAPI(lifespan=lifespan) logfire.instrument_fastapi(app) @app.get('/') async def index() -> FileResponse: return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html') @app.get('/chat_app.ts') async def main_ts() -> FileResponse: """Get the raw typescript code, it's compiled in the browser, forgive me.""" return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain') async def get_db(request: Request) -> Database: return request.state.db @app.get('/chat/') async def get_chat(database: Database = Depends(get_db)) -> Response: msgs = await database.get_messages() return Response( b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs), media_type='text/plain', ) class ChatMessage(TypedDict): """Format of messages sent to the browser.""" role: Literal['user', 'model'] timestamp: str content: str def to_chat_message(m: ModelMessage) -> ChatMessage: first_part = m.parts[0] if isinstance(m, ModelRequest): if isinstance(first_part, UserPromptPart): assert isinstance(first_part.content, str) return { 'role': 'user', 'timestamp': first_part.timestamp.isoformat(), 'content': first_part.content, } elif isinstance(m, ModelResponse): if isinstance(first_part, TextPart): return { 'role': 'model', 'timestamp': m.timestamp.isoformat(), 'content': first_part.content, } raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}') @app.post('/chat/') async def post_chat( prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db) ) -> StreamingResponse: async def stream_messages(): """Streams new line delimited JSON `Message`s to the client.""" # stream the user prompt so that can be displayed straight away yield ( json.dumps( { 'role': 'user', 'timestamp': datetime.now(tz=timezone.utc).isoformat(), 'content': prompt, } ).encode('utf-8') + b'\n' ) # get the chat history so far to pass as context to the agent messages = await database.get_messages() # run the agent with the user prompt and the chat history async with agent.run_stream(prompt, message_history=messages) as result: async for text in result.stream_output(debounce_by=0.01): # text here is a `str` and the frontend wants # JSON encoded ModelResponse, so we create one m = 
ModelResponse(parts=[TextPart(text)], timestamp=result.timestamp()) yield json.dumps(to_chat_message(m)).encode('utf-8') + b'\n' # add new messages (e.g. the user prompt and the agent response in this case) to the database await database.add_messages(result.new_messages_json()) return StreamingResponse(stream_messages(), media_type='text/plain') P = ParamSpec('P') R = TypeVar('R') @dataclass class Database: """Rudimentary database to store chat messages in SQLite. The SQLite standard library package is synchronous, so we use a thread pool executor to run queries asynchronously. """ con: sqlite3.Connection _loop: asyncio.AbstractEventLoop _executor: ThreadPoolExecutor @classmethod @asynccontextmanager async def connect( cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite' ) -> AsyncIterator[Database]: with logfire.span('connect to DB'): loop = asyncio.get_event_loop() executor = ThreadPoolExecutor(max_workers=1) con = await loop.run_in_executor(executor, cls._connect, file) slf = cls(con, loop, executor) try: yield slf finally: await slf._asyncify(con.close) @staticmethod def _connect(file: Path) -> sqlite3.Connection: con = sqlite3.connect(str(file)) con = logfire.instrument_sqlite3(con) cur = con.cursor() cur.execute( 'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);' ) con.commit() return con async def add_messages(self, messages: bytes): await self._asyncify( self._execute, 'INSERT INTO messages (message_list) VALUES (?);', messages, commit=True, ) await self._asyncify(self.con.commit) async def get_messages(self) -> list[ModelMessage]: c = await self._asyncify( self._execute, 'SELECT message_list FROM messages order by id' ) rows = await self._asyncify(c.fetchall) messages: list[ModelMessage] = [] for row in rows: messages.extend(ModelMessagesTypeAdapter.validate_json(row[0])) return messages def _execute( self, sql: LiteralString, *args: Any, commit: bool = False ) -> sqlite3.Cursor: cur = self.con.cursor() cur.execute(sql, args) if commit: self.con.commit() return cur async def _asyncify( self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs ) -> R: return await self._loop.run_in_executor( # type: ignore self._executor, partial(func, **kwargs), *args, # type: ignore ) if __name__ == '__main__': import uvicorn uvicorn.run( 'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)] ) ``` Simple HTML page to render the app: chat\_app.html ```html Chat App

<!--
  NOTE: the original chat_app.html markup was lost when this page was extracted;
  what follows is a minimal, hypothetical reconstruction based on the element ids
  and classes that chat_app.ts expects (#conversation, #prompt-input, #spinner,
  #error) and on the text fragments that survived. The real page also loads and
  transpiles /chat_app.ts in the browser.
-->
<!DOCTYPE html>
<html>
<head>
  <title>Chat App</title>
</head>
<body>
  <main>
    <h1>Chat App</h1>
    <div id="conversation"></div>
    <div id="spinner"></div>
    <form method="post">
      <input id="prompt-input" name="prompt" placeholder="Ask me anything..." />
      <button>Send</button>
    </form>
    <div id="error" class="d-none">
      Error occurred, check the browser developer console for more information.
    </div>
  </main>
</body>
</html>
``` TypeScript to handle rendering the messages. To keep this simple (and at the risk of offending frontend developers), the TypeScript code is passed to the browser as plain text and transpiled in the browser. chat\_app.ts ```ts // BIG FAT WARNING: to avoid the complexity of npm, this typescript is compiled in the browser // there's currently no static type checking import { marked } from 'https://cdnjs.cloudflare.com/ajax/libs/marked/15.0.0/lib/marked.esm.js' const convElement = document.getElementById('conversation') const promptInput = document.getElementById('prompt-input') as HTMLInputElement const spinner = document.getElementById('spinner') // stream the response and render messages as each chunk is received // data is sent as newline-delimited JSON async function onFetchResponse(response: Response): Promise<void> { let text = '' let decoder = new TextDecoder() if (response.ok) { const reader = response.body.getReader() while (true) { const {done, value} = await reader.read() if (done) { break } text += decoder.decode(value) addMessages(text) spinner.classList.remove('active') } addMessages(text) promptInput.disabled = false promptInput.focus() } else { const text = await response.text() console.error(`Unexpected response: ${response.status}`, {response, text}) throw new Error(`Unexpected response: ${response.status}`) } } // The format of messages; this matches pydantic-ai both for brevity and understanding // in production, you might not want to keep this format all the way to the frontend interface Message { role: string content: string timestamp: string } // take raw response text and render messages into the `#conversation` element // Message timestamp is assumed to be a unique identifier of a message, and is used to deduplicate // hence you can send data about the same message multiple times, and it will be updated // instead of creating new message elements function addMessages(responseText: string) { const lines = responseText.split('\n') const messages: Message[] = lines.filter(line => line.length > 1).map(j => JSON.parse(j)) for (const message of messages) { // we use the timestamp as a crude element id const {timestamp, role, content} = message const id = `msg-${timestamp}` let msgDiv = document.getElementById(id) if (!msgDiv) { msgDiv = document.createElement('div') msgDiv.id = id msgDiv.title = `${role} at ${timestamp}` msgDiv.classList.add('border-top', 'pt-2', role) convElement.appendChild(msgDiv) } msgDiv.innerHTML = marked.parse(content) } window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' }) } function onError(error: any) { console.error(error) document.getElementById('error').classList.remove('d-none') document.getElementById('spinner').classList.remove('active') } async function onSubmit(e: SubmitEvent): Promise<void> { e.preventDefault() spinner.classList.add('active') const body = new FormData(e.target as HTMLFormElement) promptInput.value = '' promptInput.disabled = true const response = await fetch('/chat/', {method: 'POST', body}) await onFetchResponse(response) } // call onSubmit when the form is submitted (e.g. user clicks the send button or hits Enter) document.querySelector('form').addEventListener('submit', (e) => onSubmit(e).catch(onError)) // load messages on page load fetch('/chat/').then(onFetchResponse).catch(onError) ```
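To exercise the endpoint without the browser UI, you can consume the newline-delimited JSON stream directly. Here's a minimal sketch using `httpx` (hypothetical client code, not part of the example; it assumes the app is running locally on uvicorn's default port 8000):

```python
import asyncio
import json

import httpx


async def main():
    async with httpx.AsyncClient() as client:
        # POST the prompt as form data and read the newline-delimited JSON stream
        async with client.stream(
            'POST', 'http://localhost:8000/chat/', data={'prompt': 'Tell me a joke.'}
        ) as response:
            async for line in response.aiter_lines():
                if line:
                    message = json.loads(line)
                    print(f'{message["role"]}: {message["content"]}')


asyncio.run(main())
```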
--- # [Data Analyst](https://pydantic.dev/docs/ai/examples/data-analytics/data-analyst/) # Data Analyst Sometimes in an agent workflow, the agent does not need to know the exact tool output, but still needs to process the tool output in some way. This is especially common in data analytics: the agent needs to know that the result of a query tool is a `DataFrame` with certain named columns, but not necessarily the content of every single row. With Pydantic AI, you can use a [dependencies object](/docs/ai/core-concepts/dependencies) to store the result from one tool and use it in another tool. In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes). Demonstrates: - [agent dependencies](/docs/ai/core-concepts/dependencies) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-146) - [uv](#tab-panel-147) Terminal ```bash python -m pydantic_ai_examples.data_analyst ``` Terminal ```bash uv run -m pydantic_ai_examples.data_analyst ``` Output (debug): > Based on my analysis of the Cornell Movie Review dataset (rotten\_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset). ## Example Code data\_analyst.py ```python from dataclasses import dataclass, field import datasets import duckdb import pandas as pd from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class AnalystAgentDeps: output: dict[str, pd.DataFrame] = field(default_factory=dict[str, pd.DataFrame]) def store(self, value: pd.DataFrame) -> str: """Store the output in deps and return the reference such as Out[1] to be used by the LLM.""" ref = f'Out[{len(self.output) + 1}]' self.output[ref] = value return ref def get(self, ref: str) -> pd.DataFrame: if ref not in self.output: raise ModelRetry( f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.' ) return self.output[ref] analyst_agent = Agent( 'openai:gpt-5.2', deps_type=AnalystAgentDeps, instructions='You are a data analyst and your job is to analyze the data according to the user request.', ) @analyst_agent.tool def load_dataset( ctx: RunContext[AnalystAgentDeps], path: str, split: str = 'train', ) -> str: """Load the given `split` of dataset `path` from Hugging Face. Args: ctx: Pydantic AI agent RunContext path: name of the dataset in the form of `<owner>/<dataset_name>` split: load the split of the dataset (default: "train") """ # begin load data from hf builder = datasets.load_dataset_builder(path) # pyright: ignore[reportUnknownMemberType] splits: dict[str, datasets.SplitInfo] = builder.info.splits or {} if split not in splits: raise ModelRetry( f'{split} is not valid for dataset {path}.
Valid splits are {",".join(splits.keys())}' ) builder.download_and_prepare() # pyright: ignore[reportUnknownMemberType] dataset = builder.as_dataset(split=split) assert isinstance(dataset, datasets.Dataset) dataframe = dataset.to_pandas() assert isinstance(dataframe, pd.DataFrame) # end load data from hf # store the dataframe in the deps and get a ref like "Out[1]" ref = ctx.deps.store(dataframe) # construct a summary of the loaded dataset output = [ f'Loaded the dataset as `{ref}`.', f'Description: {dataset.info.description}' if dataset.info.description else None, f'Features: {dataset.info.features!r}' if dataset.info.features else None, ] return '\n'.join(filter(None, output)) @analyst_agent.tool def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str: """Run DuckDB SQL query on the DataFrame. Note that the virtual table name used in DuckDB SQL must be `dataset`. Args: ctx: Pydantic AI agent RunContext dataset: reference string to the DataFrame sql: the query to be executed using DuckDB """ data = ctx.deps.get(dataset) result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql) # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe) ref = ctx.deps.store(result.df()) return f'Executed SQL, result is `{ref}`' @analyst_agent.tool def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str: """Display at most 5 rows of the dataframe.""" dataset = ctx.deps.get(name) return dataset.head().to_string() # pyright: ignore[reportUnknownMemberType] if __name__ == '__main__': deps = AnalystAgentDeps() result = analyst_agent.run_sync( user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`', deps=deps, ) print(result.output) ``` ## Appendix ### Choosing a Model This example requires using a model that understands DuckDB SQL. You can check with `clai`: Terminal ```sh > clai -m bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0 clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0 clai ➤ do you understand duckdb sql? # DuckDB SQL Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database that uses syntax similar to PostgreSQL. It specializes in analytical queries and is designed for high-performance analysis of structured data. Some key features of DuckDB SQL include: • OLAP (Online Analytical Processing) optimized • Columnar-vectorized query execution • Standard SQL support with PostgreSQL compatibility • Support for complex analytical queries • Efficient handling of CSV/Parquet/JSON files I can help you with DuckDB SQL queries, schema design, optimization, or other DuckDB-related questions. ``` --- # [RAG](https://pydantic.dev/docs/ai/examples/data-analytics/rag/) # RAG RAG search example. This demo allows you to ask questions of the [logfire](https://pydantic.dev/logfire) documentation. Demonstrates: - [tools](/docs/ai/tools-toolsets/tools) - [agent dependencies](/docs/ai/core-concepts/dependencies) - RAG search This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the Pydantic AI agent. Logic for extracting sections from markdown files and a JSON file with that data is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992).
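The gist's extraction code isn't reproduced here, but a minimal sketch of splitting a markdown file into per-heading sections might look like this (hypothetical code, not the gist's actual implementation):

```python
from pathlib import Path


def extract_sections(path: Path) -> list[dict[str, str]]:
    """Split a markdown file into sections, one per heading."""
    sections: list[dict[str, str]] = []
    title, lines = '', []
    for line in path.read_text().splitlines():
        if line.startswith('#'):  # a new heading closes the previous section
            if title or lines:
                sections.append(
                    {'path': str(path), 'title': title, 'content': '\n'.join(lines)}
                )
            title, lines = line.lstrip('#').strip(), []
        else:
            lines.append(line)
    if title or lines:  # don't forget the final section
        sections.append({'path': str(path), 'title': title, 'content': '\n'.join(lines)})
    return sections
```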
[PostgreSQL with pgvector](https://github.com/pgvector/pgvector) is used as the search database; the easiest way to download and run pgvector is using Docker: Terminal ```bash mkdir postgres-data docker run --rm \ -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 ``` As with the [SQL gen](/docs/ai/examples/sql-gen) example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container. With that running and [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), we can build the search database with (**WARNING**: this requires the `OPENAI_API_KEY` env variable and will call the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation): - [pip](#tab-panel-148) - [uv](#tab-panel-149) Terminal ```bash python -m pydantic_ai_examples.rag build ``` Terminal ```bash uv run -m pydantic_ai_examples.rag build ``` (Note: building the database doesn't use Pydantic AI right now; instead it uses the OpenAI SDK directly.) You can then ask the agent a question with: - [pip](#tab-panel-150) - [uv](#tab-panel-151) Terminal ```bash python -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` Terminal ```bash uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ## Example Code rag.py ```python from __future__ import annotations as _annotations import asyncio import re import sys import unicodedata from contextlib import asynccontextmanager from dataclasses import dataclass import asyncpg import httpx import logfire import pydantic_core from anyio import create_task_group from openai import AsyncOpenAI from pydantic import TypeAdapter from typing_extensions import AsyncGenerator from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() @dataclass class Deps: openai: AsyncOpenAI pool: asyncpg.Pool agent = Agent('openai:gpt-5.2', deps_type=Deps) @agent.tool async def retrieve(context: RunContext[Deps], search_query: str) -> str: """Retrieve documentation sections based on a search query. Args: context: The call context. search_query: The search query.
""" with logfire.span( 'create embedding for {search_query=}', search_query=search_query ): embedding = await context.deps.openai.embeddings.create( input=search_query, model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() rows = await context.deps.pool.fetch( 'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8', embedding_json, ) return '\n\n'.join( f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n' for row in rows ) async def run_agent(question: str): """Entry point to run the agent and perform RAG based question answering.""" openai = AsyncOpenAI() logfire.instrument_openai(openai) logfire.info('Asking "{question}"', question=question) async with database_connect(False) as pool: deps = Deps(openai=openai, pool=pool) answer = await agent.run(question, deps=deps) print(answer.output) ####################################################### # The rest of this file is dedicated to preparing the # # search database, and some utilities. # ####################################################### # JSON document from # https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992 DOCS_JSON = ( 'https://gist.githubusercontent.com/' 'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/' '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json' ) async def build_search_db(): """Build the search database.""" async with httpx.AsyncClient() as client: response = await client.get(DOCS_JSON) response.raise_for_status() sections = sections_ta.validate_json(response.content) openai = AsyncOpenAI() logfire.instrument_openai(openai) async with database_connect(True) as pool: with logfire.span('create schema'): async with pool.acquire() as conn: async with conn.transaction(): await conn.execute(DB_SCHEMA) sem = asyncio.Semaphore(10) async with create_task_group() as tg: for section in sections: tg.start_soon(insert_doc_section, sem, openai, pool, section) async def insert_doc_section( sem: asyncio.Semaphore, openai: AsyncOpenAI, pool: asyncpg.Pool, section: DocsSection, ) -> None: async with sem: url = section.url() exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url) if exists: logfire.info('Skipping {url=}', url=url) return with logfire.span('create embedding for {url=}', url=url): embedding = await openai.embeddings.create( input=section.embedding_content(), model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() await pool.execute( 'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)', url, section.title, section.content, embedding_json, ) @dataclass class DocsSection: id: int parent: int | None path: str level: int title: str content: str def url(self) -> str: url_path = re.sub(r'\.md$', '', self.path) return ( f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}' ) def embedding_content(self) -> str: return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content)) sections_ta = TypeAdapter(list[DocsSection]) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect( create_db: bool = False, ) -> 
AsyncGenerator[asyncpg.Pool, None]: server_dsn, database = ( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_rag', ) if create_db: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() pool = await asyncpg.create_pool(f'{server_dsn}/{database}') try: yield pool finally: await pool.close() DB_SCHEMA = """ CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS doc_sections ( id serial PRIMARY KEY, url text NOT NULL UNIQUE, title text NOT NULL, content text NOT NULL, -- text-embedding-3-small returns a vector of 1536 floats embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops); """ def slugify(value: str, separator: str, unicode: bool = False) -> str: """Slugify a string, to make it URL friendly.""" # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38 if not unicode: # Replace Extended Latin characters with ASCII, i.e. `žlutý` => `zluty` value = unicodedata.normalize('NFKD', value) value = value.encode('ascii', 'ignore').decode('ascii') value = re.sub(r'[^\w\s-]', '', value).strip().lower() return re.sub(rf'[{separator}\s]+', separator, value) if __name__ == '__main__': action = sys.argv[1] if len(sys.argv) > 1 else None if action == 'build': asyncio.run(build_search_db()) elif action == 'search': if len(sys.argv) == 3: q = sys.argv[2] else: q = 'How do I configure logfire to work with FastAPI?' asyncio.run(run_agent(q)) else: print( 'uv run --extra examples -m pydantic_ai_examples.rag build|search', file=sys.stderr, ) sys.exit(1) ``` --- # [SQL Generation](https://pydantic.dev/docs/ai/examples/data-analytics/sql-gen/) # SQL Generation Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Demonstrates: - [dynamic system prompt](/docs/ai/core-concepts/agent#system-prompts) - [structured `output_type`](/docs/ai/core-concepts/output#structured-output) - [output validation](/docs/ai/core-concepts/output#output-validator-functions) - [agent dependencies](/docs/ai/core-concepts/dependencies) ## Running the Example The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker: Terminal ```bash docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres ``` _(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)_ With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-142) - [uv](#tab-panel-143) Terminal ```bash python -m pydantic_ai_examples.sql_gen ``` Terminal ```bash uv run -m pydantic_ai_examples.sql_gen ``` or to use a custom prompt: - [pip](#tab-panel-144) - [uv](#tab-panel-145) Terminal ```bash python -m pydantic_ai_examples.sql_gen "find me errors" ``` Terminal ```bash uv run -m pydantic_ai_examples.sql_gen "find me errors" ``` This example uses `gemini-3-flash-preview` by default, since Gemini is good at single-shot queries of this kind.
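The core of that validation approach is worth spelling out: PostgreSQL will parse and plan a query under `EXPLAIN` without executing it, so a failing `EXPLAIN` cheaply flags SQL the model got wrong. A rough sketch (hypothetical helper, assuming an open `asyncpg` connection):

```python
import asyncpg


async def is_valid_sql(conn: asyncpg.Connection, sql: str) -> bool:
    """Ask PostgreSQL to parse and plan the query without running it."""
    try:
        await conn.execute(f'EXPLAIN {sql}')
    except asyncpg.exceptions.PostgresError:
        return False
    return True
```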
## Example Code sql\_gen.py ```py import asyncio import sys from collections.abc import AsyncGenerator from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import date from typing import Annotated, Any, TypeAlias import asyncpg import logfire from annotated_types import MinLen from devtools import debug from pydantic import BaseModel, Field from pydantic_ai import Agent, ModelRetry, RunContext, format_as_xml # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() DB_SCHEMA = """ CREATE TABLE records ( created_at timestamptz, start_timestamp timestamptz, end_timestamp timestamptz, trace_id text, span_id text, parent_span_id text, level log_level, span_name text, message text, attributes_json_schema text, attributes jsonb, tags text[], is_exception boolean, otel_status_message text, service_name text ); """ SQL_EXAMPLES = [ { 'request': 'show me records where foobar is false', 'response': "SELECT * FROM records WHERE attributes->>'foobar' = false", }, { 'request': 'show me records where attributes include the key "foobar"', 'response': "SELECT * FROM records WHERE attributes ? 'foobar'", }, { 'request': 'show me records from yesterday', 'response': "SELECT * FROM records WHERE start_timestamp::date > CURRENT_TIMESTAMP - INTERVAL '1 day'", }, { 'request': 'show me error records with the tag "foobar"', 'response': "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)", }, ] @dataclass class Deps: conn: asyncpg.Connection class Success(BaseModel): """Response when SQL could be successfully generated.""" sql_query: Annotated[str, MinLen(1)] explanation: str = Field( '', description='Explanation of the SQL query, as markdown' ) class InvalidRequest(BaseModel): """Response when the user input didn't include enough information to generate SQL.""" error_message: str Response: TypeAlias = Success | InvalidRequest agent = Agent[Deps, Response]( 'google-gla:gemini-3-flash-preview', # Type ignore while we wait for PEP-0747, nonetheless unions will work fine everywhere else output_type=Response, # type: ignore deps_type=Deps, ) @agent.system_prompt async def system_prompt() -> str: return f"""\ Given the following PostgreSQL table of records, your job is to write a SQL query that suits the user's request.
Database schema: {DB_SCHEMA} today's date = {date.today()} {format_as_xml(SQL_EXAMPLES)} """ @agent.output_validator async def validate_output(ctx: RunContext[Deps], output: Response) -> Response: if isinstance(output, InvalidRequest): return output # gemini often adds extraneous backslashes to SQL output.sql_query = output.sql_query.replace('\\', '') if not output.sql_query.upper().startswith('SELECT'): raise ModelRetry('Please create a SELECT query') try: await ctx.deps.conn.execute(f'EXPLAIN {output.sql_query}') except asyncpg.exceptions.PostgresError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output async def main(): if len(sys.argv) == 1: prompt = 'show me logs from yesterday, with level "error"' else: prompt = sys.argv[1] async with database_connect( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_sql_gen' ) as conn: deps = Deps(conn) result = await agent.run(prompt, deps=deps) debug(result.output) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect(server_dsn: str, database: str) -> AsyncGenerator[Any, None]: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() conn = await asyncpg.connect(f'{server_dsn}/{database}') try: with logfire.span('create schema'): async with conn.transaction(): if not db_exists: await conn.execute( "CREATE TYPE log_level AS ENUM ('debug', 'info', 'warning', 'error', 'critical')" ) await conn.execute(DB_SCHEMA) yield conn finally: await conn.close() if __name__ == '__main__': asyncio.run(main()) ``` --- # [Pydantic Model](https://pydantic.dev/docs/ai/examples/getting-started/pydantic-model/) # Pydantic Model Simple example of using Pydantic AI to construct a Pydantic model from a text input. Demonstrates: - [structured `output_type`](/docs/ai/core-concepts/output#structured-output) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-152) - [uv](#tab-panel-153) Terminal ```bash python -m pydantic_ai_examples.pydantic_model ``` Terminal ```bash uv run -m pydantic_ai_examples.pydantic_model ``` This example uses `openai:gpt-5.2` by default, but it works well with other models, e.g.
you can run it with Gemini using: - [pip](#tab-panel-154) - [uv](#tab-panel-155) Terminal ```bash PYDANTIC_AI_MODEL=gemini-3-pro-preview python -m pydantic_ai_examples.pydantic_model ``` Terminal ```bash PYDANTIC_AI_MODEL=gemini-3-pro-preview uv run -m pydantic_ai_examples.pydantic_model ``` (or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`) ## Example Code pydantic\_model.py ```py import os import logfire from pydantic import BaseModel from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class MyModel(BaseModel): city: str country: str model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-5.2') print(f'Using model: {model}') agent = Agent(model, output_type=MyModel) if __name__ == '__main__': result = agent.run_sync('The windy city in the US of A.') print(result.output) print(result.usage()) ``` --- # [Weather Agent](https://pydantic.dev/docs/ai/examples/getting-started/weather-agent/) # Weather Agent Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. Demonstrates: - [tools](/docs/ai/tools-toolsets/tools) - [agent dependencies](/docs/ai/core-concepts/dependencies) - [streaming text responses](/docs/ai/core-concepts/output#streaming-text) - Building a [Gradio](https://www.gradio.app/) UI for the agent In this case the idea is a "weather" agent -- the user can ask for the weather in multiple locations, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations. ## Running the Example To run this example properly, you might want to add two extra API keys **(Note if either key is missing, the code will fall back to dummy data, so they're not required)**: - A weather API key from [tomorrow.io](https://www.tomorrow.io/weather-api/) set via `WEATHER_API_KEY` - A geocoding API key from [geocode.maps.co](https://geocode.maps.co/) set via `GEO_API_KEY` With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-158) - [uv](#tab-panel-159) Terminal ```bash python -m pydantic_ai_examples.weather_agent ``` Terminal ```bash uv run -m pydantic_ai_examples.weather_agent ``` ## Example Code weather\_agent.py ```python from __future__ import annotations as _annotations import asyncio from dataclasses import dataclass from typing import Any import logfire from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() @dataclass class Deps: client: AsyncClient weather_agent = Agent( 'openai:gpt-5-mini', # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use # the below tools appropriately, but others like anthropic and gemini require a bit more direction. instructions='Be concise, reply with one sentence.', deps_type=Deps, retries=2, ) class LatLng(BaseModel): lat: float lng: float @weather_agent.tool async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng: """Get the latitude and longitude of a location. Args: ctx: The context. location_description: A description of a location. 
""" # NOTE: the response here will be random, and is not related to the location description. r = await ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/latlng', params={'location': location_description}, ) r.raise_for_status() return LatLng.model_validate_json(r.content) @weather_agent.tool async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]: """Get the weather at a location. Args: ctx: The context. lat: Latitude of the location. lng: Longitude of the location. """ # NOTE: the responses here will be random, and are not related to the lat and lng. temp_response, descr_response = await asyncio.gather( ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/number', params={'min': 10, 'max': 30}, ), ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/weather', params={'lat': lat, 'lng': lng}, ), ) temp_response.raise_for_status() descr_response.raise_for_status() return { 'temperature': f'{temp_response.text} °C', 'description': descr_response.text, } async def main(): async with AsyncClient() as client: logfire.instrument_httpx(client, capture_all=True) deps = Deps(client=client) result = await weather_agent.run( 'What is the weather like in London and in Wiltshire?', deps=deps ) print('Response:', result.output) if __name__ == '__main__': asyncio.run(main()) ``` ## Running the UI You can build multi-turn chat applications for your agent with [Gradio](https://www.gradio.app/), a framework for building AI web applications entirely in python. Gradio comes with built-in chat components and agent support so the entire UI will be implemented in a single python file! Here's what the UI looks like for the weather agent: Terminal ```bash pip install gradio>=6.7.0 python/uv-run -m pydantic_ai_examples.weather_agent_gradio ``` ## UI Code weather\_agent\_gradio.py ```py from __future__ import annotations as _annotations import json from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import ToolCallPart, ToolReturnPart from pydantic_ai_examples.weather_agent import Deps, weather_agent try: import gradio as gr except ImportError as e: raise ImportError( 'Please install gradio with `pip install gradio`. You must use python>=3.10.' 
) from e TOOL_TO_DISPLAY_NAME = {'get_lat_lng': 'Geocoding API', 'get_weather': 'Weather API'} client = AsyncClient() deps = Deps(client=client) async def stream_from_agent(prompt: str, chatbot: list[dict], past_messages: list): chatbot.append({'role': 'user', 'content': prompt}) yield gr.Textbox(interactive=False, value=''), chatbot, gr.skip() async with weather_agent.run_stream( prompt, deps=deps, message_history=past_messages ) as result: for message in result.new_messages(): for call in message.parts: if isinstance(call, ToolCallPart): call_args = call.args_as_json_str() metadata = { 'title': f'🛠️ Using {TOOL_TO_DISPLAY_NAME[call.tool_name]}', } if call.tool_call_id is not None: metadata['id'] = call.tool_call_id gr_message = { 'role': 'assistant', 'content': 'Parameters: ' + call_args, 'metadata': metadata, } chatbot.append(gr_message) if isinstance(call, ToolReturnPart): for gr_message in chatbot: if ( gr_message.get('metadata', {}).get('id', '') == call.tool_call_id ): if isinstance(call.content, BaseModel): json_content = call.content.model_dump_json() else: json_content = json.dumps(call.content) gr_message['content'] += f'\nOutput: {json_content}' yield gr.skip(), chatbot, gr.skip() chatbot.append({'role': 'assistant', 'content': ''}) async for message in result.stream_text(): chatbot[-1]['content'] = message yield gr.skip(), chatbot, gr.skip() past_messages = result.all_messages() yield gr.Textbox(interactive=True), gr.skip(), past_messages async def handle_retry(chatbot, past_messages: list, retry_data: gr.RetryData): new_history = chatbot[: retry_data.index] previous_prompt = chatbot[retry_data.index]['content'] past_messages = past_messages[: retry_data.index] async for update in stream_from_agent(previous_prompt, new_history, past_messages): yield update def undo(chatbot, past_messages: list, undo_data: gr.UndoData): new_history = chatbot[: undo_data.index] past_messages = past_messages[: undo_data.index] return chatbot[undo_data.index]['content'], new_history, past_messages def select_data(message: gr.SelectData) -> str: return message.value['text'] with gr.Blocks() as demo: gr.HTML( """

<!--
  NOTE: the original markup inside this gr.HTML block was lost in extraction;
  this is a minimal, hypothetical reconstruction of the page header based on
  the text fragments that survived.
-->
<div style="text-align: center;">
  <h1>Weather Assistant</h1>
  <p>This assistant answers your weather questions.</p>
</div>
""" ) past_messages = gr.State([]) chatbot = gr.Chatbot( label='Packing Assistant', avatar_images=(None, 'https://ai.pydantic.dev/img/logo-white.svg'), examples=[ {'text': 'What is the weather like in Miami?'}, {'text': 'What is the weather like in London?'}, ], ) with gr.Row(): prompt = gr.Textbox( lines=1, show_label=False, placeholder='What is the weather like in New York City?', ) generation = prompt.submit( stream_from_agent, inputs=[prompt, chatbot, past_messages], outputs=[prompt, chatbot, past_messages], ) chatbot.example_select(select_data, None, [prompt]) chatbot.retry( handle_retry, [chatbot, past_messages], [prompt, chatbot, past_messages] ) chatbot.undo(undo, [chatbot, past_messages], [prompt, chatbot, past_messages]) if __name__ == '__main__': demo.launch() ``` --- # [Setup](https://pydantic.dev/docs/ai/examples/setup/) # Setup Here we include some examples of how to use Pydantic AI and what it can do. ## Usage These examples are distributed with `pydantic-ai` so you can run them either by cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`. ### Installing required dependencies Either way you'll need to install extra dependencies to run some examples, you just need to install the `examples` optional dependency group. If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with: - [pip](#tab-panel-10) - [uv](#tab-panel-11) Terminal ```bash pip install "pydantic-ai[examples]" ``` Terminal ```bash uv add "pydantic-ai[examples]" ``` If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies. ### Setting model environment variables These examples will need you to set up authentication with one or more of the LLMs, see the [model configuration](/docs/ai/models/overview) docs for details on how to do this. TL;DR: in most cases you'll need to set one of the following environment variables: - [OpenAI](#tab-panel-12) - [Google Gemini](#tab-panel-13) Terminal ```bash export OPENAI_API_KEY=your-api-key ``` Terminal ```bash export GEMINI_API_KEY=your-api-key ``` ### Running Examples To run the examples (this will work whether you installed `pydantic_ai`, or cloned the repo), run: - [pip](#tab-panel-14) - [uv](#tab-panel-15) Terminal ```bash python -m pydantic_ai_examples. ``` Terminal ```bash uv run -m pydantic_ai_examples. ``` For example, to run the very simple [`pydantic_model`](/docs/ai/examples/pydantic-model) example: - [pip](#tab-panel-16) - [uv](#tab-panel-17) Terminal ```bash python -m pydantic_ai_examples.pydantic_model ``` Terminal ```bash uv run -m pydantic_ai_examples.pydantic_model ``` If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup: Terminal ```bash OPENAI_API_KEY='your-api-key' \ uv run --with "pydantic-ai[examples]" \ -m pydantic_ai_examples.pydantic_model ``` * * * You'll probably want to edit examples in addition to just running them. 
You can copy the examples to a new directory with: - [pip](#tab-panel-18) - [uv](#tab-panel-19) Terminal ```bash python -m pydantic_ai_examples --copy-to examples/ ``` Terminal ```bash uv run -m pydantic_ai_examples --copy-to examples/ ``` --- # [Slack Lead Qualifier with Modal](https://pydantic.dev/docs/ai/examples/slack-lead-qualifier/) # Slack Lead Qualifier with Modal In this example, we're going to build an agentic app that: - automatically researches each new member that joins a company's public Slack community to see how good of a fit they are for the company's commercial product, - sends this analysis into a (private) Slack channel, and - sends a daily summary of the top 5 leads from the previous 24 hours into a (different) Slack channel. We'll be deploying the app on [Modal](https://modal.com), as it lets you use Python to define an app with web endpoints, scheduled functions, and background functions, and deploy them with a CLI, without needing to set up or manage any infrastructure. It's a great way to lower the barrier for people in your organization to start building and deploying AI agents to make their jobs easier. We also add [Pydantic Logfire](https://pydantic.dev/logfire) to get observability into the app and agent as they're running in response to webhooks and the schedule. ## Screenshots This is what the analysis sent into Slack will look like: ![Slack message](/docs/ai/img/slack-lead-qualifier-slack.png) This is what the corresponding trace in [Logfire](https://pydantic.dev/logfire) will look like: ![Logfire trace](/docs/ai/img/slack-lead-qualifier-logfire.png) All of these entries can be clicked on to get more details about what happened at that step, including the full conversation with the LLM and HTTP requests and responses. ## Prerequisites If you just want to see the code without actually going through the effort of setting up the bits necessary to run it, feel free to [jump ahead](#the-code). ### Slack app 1. You need to have a Slack workspace and the necessary permissions to create apps. 2. Create a new Slack app using the instructions at [https://docs.slack.dev/quickstart](https://docs.slack.dev/quickstart). 1. In step 2, "Requesting scopes", request the following scopes: - [`users.read`](https://docs.slack.dev/reference/scopes/users.read) - [`users.read.email`](https://docs.slack.dev/reference/scopes/users.read.email) - [`users.profile.read`](https://docs.slack.dev/reference/scopes/users.profile.read) 2. In step 3, "Installing and authorizing the app", note down the Access Token as we're going to need to store it as a Secret in Modal. 3. You can skip steps 4 and 5. We're going to need to subscribe to the `team_join` event, but at this point you don't have a webhook URL yet. 3. Create the channels the app will post into, and add the Slack app to them: - `#new-slack-leads` - `#daily-slack-leads-summary` These names are hard-coded in the example. If you want to use different channels, you can clone the repo and change them in `examples/pydantic_ai_examples/slack_lead_qualifier/functions.py`. ### Logfire Write Token 1. If you don't have a Logfire account yet, create one on [https://logfire-us.pydantic.dev/](https://logfire-us.pydantic.dev/). 2. Create a new project named, for example, `slack-lead-qualifier`. 3. Generate a new Write Token and note it down, as we're going to need to store it as a Secret in Modal. ### OpenAI API Key 1. If you don't have an OpenAI account yet, create one on [https://platform.openai.com/](https://platform.openai.com/). 2.
Create a new API Key in Settings and note it down, as we're going to need to store it as a Secret in Modal. ### Modal account 1. If you don't have a Modal account yet, create one on [https://modal.com/signup](https://modal.com/signup). 2. Create 3 Secrets of type "Custom" on [https://modal.com/secrets](https://modal.com/secrets): - Name: `slack`, key: `SLACK_API_KEY`, value: the Slack Access Token you generated earlier - Name: `logfire`, key: `LOGFIRE_TOKEN`, value: the Logfire Write Token you generated earlier - Name: `openai`, key: `OPENAI_API_KEY`, value: the OpenAI API Key you generated earlier ## Usage 1. Make sure you have the [dependencies installed](/docs/ai/examples/setup#usage). 2. Authenticate with Modal: Terminal ```bash python/uv-run -m modal setup ``` 3. Run the example as an [ephemeral Modal app](https://modal.com/docs/guide/apps#ephemeral-apps), meaning it will only run until you quit it using Ctrl+C: Terminal ```bash python/uv-run -m modal serve -m pydantic_ai_examples.slack_lead_qualifier.modal ``` 4. Note down the URL after `Created web function web_app =>`; this is your webhook endpoint URL. 5. Go back to [https://docs.slack.dev/quickstart](https://docs.slack.dev/quickstart) and follow step 4, "Configuring the app for event listening", to subscribe to the `team_join` event with the webhook endpoint URL you noted down as the Request URL. Now when someone new (possibly you with a throwaway email) joins the Slack workspace, you'll see the webhook event being processed in the terminal where you ran `modal serve` and in the Logfire Live view, and after waiting a few seconds you should see the result appear in the `#new-slack-leads` Slack channel! Faking a Slack signup You can also fake a Slack signup event and try out the agent like this, with any name or email you please (replace `<your-webhook-endpoint-URL>` with the URL you noted down earlier): Terminal ```bash curl -X POST <your-webhook-endpoint-URL> \ -H "Content-Type: application/json" \ -d '{ "type": "event_callback", "event": { "type": "team_join", "user": { "profile": { "email": "samuel@pydantic.dev", "first_name": "Samuel", "last_name": "Colvin", "display_name": "Samuel Colvin" } } } }' ``` Deploying to production If you'd like to deploy this app into your Modal workspace in a persistent fashion, you can use this command: Terminal ```bash python/uv-run -m modal deploy -m pydantic_ai_examples.slack_lead_qualifier.modal ``` You'll likely want to [download the code](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/slack_lead_qualifier) first, put it in a new repo, and then do [continuous deployment](https://modal.com/docs/guide/continuous-deployment#github-actions) using GitHub Actions. Don't forget to update the Slack event request URL to the new persistent URL! You'll also want to modify the [instructions for the agent](#agent) to your own situation. ## The code We're going to start with the basics, and then gradually build up into the full app. ### Models #### `Profile` First, we define a [Pydantic](https://docs.pydantic.dev) model that represents a Slack user profile. These are the fields we get from the [`team_join`](https://docs.slack.dev/reference/events/team_join) event that's sent to the webhook endpoint that we'll define in a bit. models.py ```py ... class Profile(BaseModel): first_name: str | None = None last_name: str | None = None display_name: str | None = None email: str ... ```
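As a quick sanity check, the profile from the fake signup payload shown earlier validates straight into this model. A minimal sketch (the import path is an assumption based on the example's module layout):

```python
# hypothetical usage; assumes the example package is installed
from pydantic_ai_examples.slack_lead_qualifier.models import Profile

profile = Profile.model_validate(
    {
        'email': 'samuel@pydantic.dev',
        'first_name': 'Samuel',
        'last_name': 'Colvin',
        'display_name': 'Samuel Colvin',
    }
)
print(profile.display_name)
#> Samuel Colvin
```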
We also define a `Profile.as_prompt()` helper method that uses [`format_as_xml`](/docs/ai/api/pydantic-ai/format_prompt/#pydantic_ai.format_prompt.format_as_xml) to turn the profile into a string that can be sent to the model. models.py ```py ... from pydantic_ai import format_as_xml ... class Profile(BaseModel): ... def as_prompt(self) -> str: return format_as_xml(self, root_tag='profile') ... ``` #### `Analysis` The second model we'll need represents the result of the analysis that the agent will perform. We include docstrings to provide additional context to the model on what these fields should contain. models.py ```py ... class Analysis(BaseModel): profile: Profile organization_name: str organization_domain: str job_title: str relevance: Annotated[int, Ge(1), Le(5)] """Estimated fit for Pydantic Logfire: 1 = low, 5 = high""" summary: str """One-sentence welcome note summarising who they are and how we might help""" ... ``` We also define an `Analysis.as_slack_blocks()` helper method that turns the analysis into some [Slack blocks](https://api.slack.com/reference/block-kit/blocks) that can be sent to the Slack API to post a new message. models.py ```py ... class Analysis(BaseModel): ... def as_slack_blocks(self, include_relevance: bool = False) -> list[dict[str, Any]]: profile = self.profile relevance = f'({self.relevance}/5)' if include_relevance else '' return [ { 'type': 'markdown', 'text': f'[{profile.display_name}](mailto:{profile.email}), {self.job_title} at [**{self.organization_name}**](https://{self.organization_domain}) {relevance}', }, { 'type': 'markdown', 'text': self.summary, }, ] ... ``` ### Agent Now it's time to get into Pydantic AI and define the agent that will do the actual analysis! We specify the model we'll use (`openai:gpt-5.2`), provide [instructions](/docs/ai/core-concepts/agent#instructions), give the agent access to the [DuckDuckGo search tool](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool), and tell it to output either an `Analysis` or `None` using the [Native Output](/docs/ai/core-concepts/output#native-output) structured output mode. The real meat of the app is in the instructions that tell the agent how to evaluate each new Slack member. If you plan to use this app yourself, you'll of course want to modify them to your own situation. agent.py ```python ... from pydantic_ai import Agent, NativeOutput from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool ... agent = Agent( 'openai:gpt-5.2', instructions=dedent( """ When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them. **What to include** 1. **Who they are:** Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio). 2. **Where they work:** Name of the organisation and its domain. 3. **How we can help:** On a scale of 1-5, estimate how likely they are to benefit from **Pydantic Logfire** (our paid observability tool) based on factors such as company size, product maturity, or AI usage. *1 = probably not relevant, 5 = very strong fit.* **Our products (for context only)** • **Pydantic Validation** - Python data-validation (open source) • **Pydantic AI** - Python agent framework (open source) • **Pydantic Logfire** - Observability for traces, logs & metrics with first-class AI support (commercial) **How to research** • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub.
• If you can't find enough to form a reasonable view, return **None**. """ ), tools=[duckduckgo_search_tool()], output_type=NativeOutput([Analysis, NoneType]), ) ... ``` #### `analyze_profile` We also define an `analyze_profile` helper function that takes a `Profile`, runs the agent, and returns an `Analysis` (or `None`), and instrument it using [Logfire](/docs/ai/integrations/logfire). agent.py ```py ... @logfire.instrument('Analyze profile') async def analyze_profile(profile: Profile) -> Analysis | None: result = await agent.run(profile.as_prompt()) return result.output ... ``` ### Analysis store The next building block we'll need is a place to store all the analyses that have been done so that we can look them up when we send the daily summary. Fortunately, Modal provides us with a convenient way to store some data that can be read back in a subsequent Modal run (webhook or scheduled): [`modal.Dict`](https://modal.com/docs/reference/modal.Dict). We define some convenience methods to easily add, list, and clear analyses. store.py ```py ... import modal ... class AnalysisStore: @classmethod @logfire.instrument('Add analysis to store') async def add(cls, analysis: Analysis): await cls._get_store().put.aio(analysis.profile.email, analysis.model_dump()) @classmethod @logfire.instrument('List analyses from store') async def list(cls) -> list[Analysis]: return [ Analysis.model_validate(analysis) async for analysis in cls._get_store().values.aio() ] @classmethod @logfire.instrument('Clear analyses from store') async def clear(cls): await cls._get_store().clear.aio() @classmethod def _get_store(cls) -> modal.Dict: return modal.Dict.from_name('analyses', create_if_missing=True) # type: ignore ... ``` Note Note that `# type: ignore` on the last line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. ### Send Slack message Next, we'll need a way to actually send a Slack message, so we define a simple function that uses Slack's [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API. slack.py ```py ... API_KEY = os.getenv('SLACK_API_KEY') assert API_KEY, 'SLACK_API_KEY is not set' @logfire.instrument('Send Slack message') async def send_slack_message(channel: str, blocks: list[dict[str, Any]]): client = httpx.AsyncClient() response = await client.post( 'https://slack.com/api/chat.postMessage', json={ 'channel': channel, 'blocks': blocks, }, headers={ 'Authorization': f'Bearer {API_KEY}', }, timeout=5, ) response.raise_for_status() result = response.json() if not result.get('ok', False): error = result.get('error', 'Unknown error') raise Exception(f'Failed to send to Slack: {error}') ... ``` ### Features Now we can start putting these building blocks together to implement the actual features we want! #### `process_slack_member` This function takes a [`Profile`](#profile), [analyzes](#analyze_profile) it using the agent, adds it to the [`AnalysisStore`](#analysis-store), and [sends](#send-slack-message) the analysis into the `#new-slack-leads` channel. functions.py ```py ... from .agent import analyze_profile from .models import Profile from .slack import send_slack_message from .store import AnalysisStore ... NEW_LEAD_CHANNEL = '#new-slack-leads' ...
@logfire.instrument('Process Slack member') async def process_slack_member(profile: Profile): analysis = await analyze_profile(profile) logfire.info('Analysis', analysis=analysis) if analysis is None: return await AnalysisStore().add(analysis) await send_slack_message( NEW_LEAD_CHANNEL, [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'New Slack member with score {analysis.relevance}/5', }, }, { 'type': 'divider', }, *analysis.as_slack_blocks(), ], ) ... ``` #### `send_daily_summary` This function lists all of the analyses in the [`AnalysisStore`](#analysis-store), takes the top 5 by relevance, [sends](#send-slack-message) them into the `#daily-slack-leads-summary` channel, and clears the `AnalysisStore` so that the next daily run won't process these analyses again. functions.py ```py ... from .slack import send_slack_message from .store import AnalysisStore ... DAILY_SUMMARY_CHANNEL = '#daily-slack-leads-summary' ... @logfire.instrument('Send daily summary') async def send_daily_summary(): analyses = await AnalysisStore().list() logfire.info('Analyses', analyses=analyses) if len(analyses) == 0: return sorted_analyses = sorted(analyses, key=lambda x: x.relevance, reverse=True) top_analyses = sorted_analyses[:5] blocks = [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'Top {len(top_analyses)} new Slack members from the last 24 hours', }, }, ] for analysis in top_analyses: blocks.extend( [ { 'type': 'divider', }, *analysis.as_slack_blocks(include_relevance=True), ] ) await send_slack_message( DAILY_SUMMARY_CHANNEL, blocks, ) await AnalysisStore().clear() ... ``` ### Web app As it stands, neither of these functions is actually being called from anywhere. Let's implement a [FastAPI](https://fastapi.tiangolo.com/) endpoint to handle the `team_join` Slack webhook (also known as the [Slack Events API](https://docs.slack.dev/apis/events-api)) and call the [`process_slack_member`](#process_slack_member) function we just defined. We also instrument FastAPI using Logfire for good measure. app.py ```py ... app = FastAPI() logfire.instrument_fastapi(app, capture_headers=True) @app.post('/') async def process_webhook(payload: dict[str, Any]) -> dict[str, Any]: if payload['type'] == 'url_verification': return {'challenge': payload['challenge']} elif ( payload['type'] == 'event_callback' and payload['event']['type'] == 'team_join' ): profile = Profile.model_validate(payload['event']['user']['profile']) process_slack_member(profile) return {'status': 'OK'} raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY) ... ``` #### `process_slack_member` with Modal I was a little sneaky there -- we're not actually calling the [`process_slack_member`](#process_slack_member) function we defined in `functions.py` directly, as Slack requires webhooks to respond within 3 seconds, and we need a bit more time than that to talk to the LLM, do some web searches, and send the Slack message. Instead, we're calling the following function defined alongside the app, which uses Modal's [`modal.Function.spawn`](https://modal.com/docs/reference/modal.Function#spawn) feature to run a function in the background. (If you're curious what the Modal side of this function looks like, you can [jump ahead](#backgrounded-process_slack_member).) Because `modal.py` (which we'll see in the next section) imports `app.py`, we import from `modal.py` inside the function definition because doing so at the top level would have resulted in a circular import error.
We also pass along the current Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), meaning that the background function execution will show up nested under the webhook request trace, so that we have everything related to that request in one place. app.py ```py ... def process_slack_member(profile: Profile): from .modal import process_slack_member as _process_slack_member _process_slack_member.spawn( profile.model_dump(), logfire_ctx=get_context() ) ... ``` ### Modal app Now let's see how easy Modal makes it to deploy all of this. #### Set up Modal The first thing we do is define the Modal app, by specifying the base image to use (Debian with Python 3.13), all the Python packages it needs, and all of the secrets defined in the Modal interface that need to be made available during runtime. modal.py ```py ... import modal image = modal.Image.debian_slim(python_version='3.13').pip_install( 'pydantic', 'pydantic_ai_slim[openai,duckduckgo]', 'logfire[httpx,fastapi]', 'fastapi[standard]', 'httpx', ) app = modal.App( name='slack-lead-qualifier', image=image, secrets=[ modal.Secret.from_name('logfire'), modal.Secret.from_name('openai'), modal.Secret.from_name('slack'), ], ) ... ``` #### Set up Logfire Next, we define a function to set up Logfire instrumentation for Pydantic AI and HTTPX. We cannot do this at the top level of the file, as the requested packages (like `logfire`) will only be available within functions running on Modal (like the ones we'll define next). This file, `modal.py`, runs on your local machine and only has access to the `modal` package. modal.py ```py ... def setup_logfire(): import logfire logfire.configure(service_name=app.name) logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) ... ``` #### Web app To deploy a [web endpoint](https://modal.com/docs/guide/webhooks) on Modal, we simply define a function that returns an ASGI app (like FastAPI) and decorate it with `@app.function()` and `@modal.asgi_app()`. This `web_app` function will be run on Modal, so inside the function we can call the `setup_logfire` function that requires the `logfire` package, and import `app.py` which uses the other requested packages. By default, Modal spins up a container to handle a function call (like a web request) on-demand, meaning there's a little bit of startup time to each request. However, Slack requires webhooks to respond within 3 seconds, so we specify `min_containers=1` to keep the web endpoint running and ready to answer requests at all times. This is a bit annoying and wasteful, but fortunately [Modal's pricing](https://modal.com/pricing) is pretty reasonable: you get $30 of free monthly compute, and they offer up to $50k in free credits for startups and academic researchers. modal.py ```py ... @app.function(min_containers=1) @modal.asgi_app() # type: ignore def web_app(): setup_logfire() from .app import app as _app return _app ... ``` Note Note that `# type: ignore` on the `@modal.asgi_app()` line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. #### Scheduled `send_daily_summary` To define a [scheduled function](https://modal.com/docs/guide/cron), we can use the `@app.function()` decorator with a `schedule` argument. This Modal function will call our imported [`send_daily_summary`](#send_daily_summary) function every day at 8 am UTC. modal.py ```py ... @app.function(schedule=modal.Cron('0 8 * * *')) # Every day at 8am UTC async def send_daily_summary(): setup_logfire() from .functions import send_daily_summary as _send_daily_summary await _send_daily_summary() ... ```
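For reference, `modal.Cron` takes a standard five-field cron expression (minute, hour, day-of-month, month, day-of-week), so other schedules are easy to express; a couple of hypothetical examples:

```python
import modal

modal.Cron('0 8 * * *')  # every day at 08:00 UTC (the schedule used above)
modal.Cron('*/30 * * * *')  # every 30 minutes
modal.Cron('0 9 * * 1')  # Mondays at 09:00 UTC
```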
#### Backgrounded `process_slack_member` Finally, we define a Modal function that wraps our [`process_slack_member`](#process_slack_member) function, so that it can run in the background. As you'll remember from when we [spawned this function from the web app](#process_slack_member-with-modal), we passed along the Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), so we need to attach it here. modal.py ```py ... @app.function() async def process_slack_member(profile_raw: dict[str, Any], logfire_ctx: Any): setup_logfire() from logfire.propagate import attach_context from .functions import process_slack_member as _process_slack_member from .models import Profile with attach_context(logfire_ctx): profile = Profile.model_validate(profile_raw) await _process_slack_member(profile) ... ``` ## Conclusion And that's it! Now, assuming you've met the [prerequisites](#prerequisites), you can run or deploy the app using the commands under [usage](#usage). --- # [Stream Markdown](https://pydantic.dev/docs/ai/examples/streaming/stream-markdown/) # Stream Markdown This example shows how to stream markdown from an agent, using the [`rich`](https://github.com/Textualize/rich) library to highlight the output in the terminal. It'll run the example with OpenAI, Google Gemini, and Groq models if the corresponding environment variables are set. Demonstrates: - [streaming text responses](/docs/ai/core-concepts/output#streaming-text) ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: - [pip](#tab-panel-160) - [uv](#tab-panel-161) Terminal ```bash python -m pydantic_ai_examples.stream_markdown ``` Terminal ```bash uv run -m pydantic_ai_examples.stream_markdown ``` ## Example Code stream\_markdown.py ```python import asyncio import os import logfire from rich.console import Console, ConsoleOptions, RenderResult from rich.live import Live from rich.markdown import CodeBlock, Markdown from rich.syntax import Syntax from rich.text import Text from pydantic_ai import Agent from pydantic_ai.models import KnownModelName # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent() # models to try, and the appropriate env var models: list[tuple[KnownModelName, str]] = [ ('google-gla:gemini-3-flash-preview', 'GEMINI_API_KEY'), ('openai:gpt-5-mini', 'OPENAI_API_KEY'), ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'), ] async def main(): prettier_code_blocks() console = Console() prompt = 'Show me a short example of using Pydantic.' console.log(f'Asking: {prompt}...', style='cyan') for model, env_var in models: if env_var in os.environ: console.log(f'Using model: {model}') with Live('', console=console, vertical_overflow='visible') as live: async with agent.run_stream(prompt, model=model) as result: async for message in result.stream_output(): live.update(Markdown(message)) console.log(result.usage()) else: console.log(f'{model} requires {env_var} to be set.') def prettier_code_blocks(): """Make rich code blocks prettier and easier to copy.
From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22 """ class SimpleCodeBlock(CodeBlock): def __rich_console__( self, console: Console, options: ConsoleOptions ) -> RenderResult: code = str(self.text).rstrip() yield Text(self.lexer_name, style='dim') yield Syntax( code, self.lexer_name, theme=self.theme, background_color='default', word_wrap=True, ) yield Text(f'/{self.lexer_name}', style='dim') Markdown.elements['fence'] = SimpleCodeBlock if __name__ == '__main__': asyncio.run(main()) ``` --- # [Stream Whales](https://pydantic.dev/docs/ai/examples/streaming/stream-whales/) # Stream Whales Information about whales -- an example of streamed structured response validation. Demonstrates: - [streaming structured output](/docs/ai/core-concepts/output#streaming-structured-output) This script streams structured responses about whales, validates the data, and displays it as a dynamic table using [`rich`](https://github.com/Textualize/rich) as the data is received. ## Running the Example With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run: Terminal ```bash python -m pydantic_ai_examples.stream_whales ``` or, with `uv`: Terminal ```bash uv run -m pydantic_ai_examples.stream_whales ``` ## Example Code stream\_whales.py ```python from typing import Annotated import logfire from pydantic import Field from rich.console import Console from rich.live import Live from rich.table import Table from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class Whale(TypedDict): name: str length: Annotated[ float, Field(description='Average length of an adult whale in meters.') ] weight: NotRequired[ Annotated[ float, Field(description='Average weight of an adult whale in kilograms.', ge=50), ] ] ocean: NotRequired[str] description: NotRequired[Annotated[str, Field(description='Short Description')]] agent = Agent('openai:gpt-5.2', output_type=list[Whale]) async def main(): console = Console() with Live('\n' * 36, console=console) as live: console.print('Requesting data...', style='cyan') async with agent.run_stream( 'Generate me details of 5 species of Whale.' ) as result: console.print('Response:', style='green') async for whales in result.stream_output(debounce_by=0.01): table = Table( title='Species of Whale', caption='Streaming Structured responses from OpenAI', width=120, ) table.add_column('ID', justify='right') table.add_column('Name') table.add_column('Avg. Length (m)', justify='right') table.add_column('Avg. Weight (kg)', justify='right') table.add_column('Ocean') table.add_column('Description', justify='right') for wid, whale in enumerate(whales, start=1): table.add_row( str(wid), whale['name'], f'{whale["length"]:0.0f}', f'{w:0.0f}' if (w := whale.get('weight')) else '...', whale.get('ocean') or '...', whale.get('description') or '...', ) live.update(table) if __name__ == '__main__': import asyncio asyncio.run(main()) ``` --- # [Decisions](https://pydantic.dev/docs/ai/graph/beta/decisions/) # Decisions Decision nodes enable conditional branching in your graph based on the type or value of data flowing through it.
A decision node evaluates incoming data and routes it to different branches based on: - Type matching (using `isinstance`) - Literal value matching - Custom predicate functions The first matching branch is taken, similar to pattern matching or `if-elif-else` chains. ## Creating Decisions Use [`g.decision()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.decision) to create a decision node, then add branches with [`g.match()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.match): simple\_decision.py ```python from dataclasses import dataclass from typing import Literal from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: path_taken: str | None = None async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def choose_path(ctx: StepContext[DecisionState, None, None]) -> Literal['left', 'right']: return 'left' @g.step async def left_path(ctx: StepContext[DecisionState, None, object]) -> str: ctx.state.path_taken = 'left' return 'Went left' @g.step async def right_path(ctx: StepContext[DecisionState, None, object]) -> str: ctx.state.path_taken = 'right' return 'Went right' g.add( g.edge_from(g.start_node).to(choose_path), g.edge_from(choose_path).to( g.decision() .branch(g.match(TypeExpression[Literal['left']]).to(left_path)) .branch(g.match(TypeExpression[Literal['right']]).to(right_path)) ), g.edge_from(left_path, right_path).to(g.end_node), ) graph = g.build() state = DecisionState() result = await graph.run(state=state) print(result) #> Went left print(state.path_taken) #> left ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Type Matching Match by type using regular Python types: type\_matching.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def return_int(ctx: StepContext[DecisionState, None, None]) -> int: return 42 @g.step async def handle_int(ctx: StepContext[DecisionState, None, int]) -> str: return f'Got int: {ctx.inputs}' @g.step async def handle_str(ctx: StepContext[DecisionState, None, str]) -> str: return f'Got str: {ctx.inputs}' g.add( g.edge_from(g.start_node).to(return_int), g.edge_from(return_int).to( g.decision() .branch(g.match(int).to(handle_int)) .branch(g.match(str).to(handle_str)) ), g.edge_from(handle_int, handle_str).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> Got int: 42 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### Matching Union Types For more complex type expressions like unions, you need to use [`TypeExpression`](/docs/ai/api/pydantic_graph/beta/#pydantic_graph.beta.TypeExpression) because Python's type system doesn't allow union types to be used directly as runtime values: union\_type\_matching.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def return_value(ctx: StepContext[DecisionState, None, None]) -> int | str: """Returns either an int or a str.""" return 42 @g.step 
async def handle_number(ctx: StepContext[DecisionState, None, int | float]) -> str: return f'Got number: {ctx.inputs}' @g.step async def handle_text(ctx: StepContext[DecisionState, None, str]) -> str: return f'Got text: {ctx.inputs}' g.add( g.edge_from(g.start_node).to(return_value), g.edge_from(return_value).to( g.decision() # Use TypeExpression for union types .branch(g.match(TypeExpression[int | float]).to(handle_number)) .branch(g.match(str).to(handle_text)) ), g.edge_from(handle_number, handle_text).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> Got number: 42 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ Note [`TypeExpression`](/docs/ai/api/pydantic_graph/beta/#pydantic_graph.beta.TypeExpression) is only necessary for complex type expressions like unions (`int | str`), `Literal`, and other type forms that aren't valid as runtime `type` objects. For simple types like `int`, `str`, or custom classes, you can pass them directly to `g.match()`. The `TypeForm` class introduced in [PEP 747](https://peps.python.org/pep-0747/) should eventually eliminate the need for this workaround. ## Custom Matchers Provide custom matching logic with the `matches` parameter: custom\_matcher.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def return_number(ctx: StepContext[DecisionState, None, None]) -> int: return 7 @g.step async def even_path(ctx: StepContext[DecisionState, None, int]) -> str: return f'{ctx.inputs} is even' @g.step async def odd_path(ctx: StepContext[DecisionState, None, int]) -> str: return f'{ctx.inputs} is odd' g.add( g.edge_from(g.start_node).to(return_number), g.edge_from(return_number).to( g.decision() .branch(g.match(TypeExpression[int], matches=lambda x: x % 2 == 0).to(even_path)) .branch(g.match(TypeExpression[int], matches=lambda x: x % 2 == 1).to(odd_path)) ), g.edge_from(even_path, odd_path).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> 7 is odd ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Branch Priority Branches are evaluated in the order they're added. 
The first matching branch is taken: branch\_priority.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def return_value(ctx: StepContext[DecisionState, None, None]) -> int: return 10 @g.step async def branch_a(ctx: StepContext[DecisionState, None, int]) -> str: return 'Branch A' @g.step async def branch_b(ctx: StepContext[DecisionState, None, int]) -> str: return 'Branch B' g.add( g.edge_from(g.start_node).to(return_value), g.edge_from(return_value).to( g.decision() .branch(g.match(TypeExpression[int], matches=lambda x: x >= 5).to(branch_a)) .branch(g.match(TypeExpression[int], matches=lambda x: x >= 0).to(branch_b)) ), g.edge_from(branch_a, branch_b).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> Branch A ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ Both branches could match `10`, but Branch A is first, so it's taken. ## Catch-All Branches Use `object` or `Any` to create a catch-all branch: catch\_all.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def return_value(ctx: StepContext[DecisionState, None, None]) -> int: return 100 @g.step async def catch_all(ctx: StepContext[DecisionState, None, object]) -> str: return f'Caught: {ctx.inputs}' g.add( g.edge_from(g.start_node).to(return_value), g.edge_from(return_value).to(g.decision().branch(g.match(TypeExpression[object]).to(catch_all))), g.edge_from(catch_all).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> Caught: 100 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Nested Decisions Decisions can be nested for complex conditional logic: nested\_decisions.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def get_number(ctx: StepContext[DecisionState, None, None]) -> int: return 15 @g.step async def is_positive(ctx: StepContext[DecisionState, None, int]) -> int: return ctx.inputs @g.step async def is_negative(ctx: StepContext[DecisionState, None, int]) -> str: return 'Negative' @g.step async def small_positive(ctx: StepContext[DecisionState, None, int]) -> str: return 'Small positive' @g.step async def large_positive(ctx: StepContext[DecisionState, None, int]) -> str: return 'Large positive' g.add( g.edge_from(g.start_node).to(get_number), g.edge_from(get_number).to( g.decision() .branch(g.match(TypeExpression[int], matches=lambda x: x > 0).to(is_positive)) .branch(g.match(TypeExpression[int], matches=lambda x: x <= 0).to(is_negative)) ), g.edge_from(is_positive).to( g.decision() .branch(g.match(TypeExpression[int], matches=lambda x: x < 10).to(small_positive)) .branch(g.match(TypeExpression[int], matches=lambda x: x >= 10).to(large_positive)) ), g.edge_from(is_negative, small_positive, large_positive).to(g.end_node), ) graph = g.build() result = await 
graph.run(state=DecisionState()) print(result) #> Large positive ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Branching with Labels Add labels to branches for documentation and diagram generation: labeled\_branches.py ```python from dataclasses import dataclass from typing import Literal from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression @dataclass class DecisionState: pass async def main(): g = GraphBuilder(state_type=DecisionState, output_type=str) @g.step async def choose(ctx: StepContext[DecisionState, None, None]) -> Literal['a', 'b']: return 'a' @g.step async def path_a(ctx: StepContext[DecisionState, None, object]) -> str: return 'Path A' @g.step async def path_b(ctx: StepContext[DecisionState, None, object]) -> str: return 'Path B' g.add( g.edge_from(g.start_node).to(choose), g.edge_from(choose).to( g.decision() .branch(g.match(TypeExpression[Literal['a']]).label('Take path A').to(path_a)) .branch(g.match(TypeExpression[Literal['b']]).label('Take path B').to(path_b)) ), g.edge_from(path_a, path_b).to(g.end_node), ) graph = g.build() result = await graph.run(state=DecisionState()) print(result) #> Path A ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Next Steps - Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping - Understand [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results - See the [API reference](/docs/ai/api/pydantic_graph/beta_decision/#pydantic_graph.beta.decision) for complete decision documentation --- # [Getting Started](https://pydantic.dev/docs/ai/graph/beta/) # Getting Started Beta API This is the new beta graph API. It provides enhanced capabilities for parallel execution, conditional branching, and complex workflows. The original graph API is still available (and interoperable with the new beta API) and is documented in the [main graph documentation](/docs/ai/graph/). The beta graph API in `pydantic-graph` provides a powerful builder pattern for constructing parallel execution graphs with: - **Step nodes** for executing async functions - **Decision nodes** for conditional branching - **Spread operations** for parallel processing of iterables - **Broadcast operations** for sending the same data to multiple parallel paths - **Join nodes and Reducers** for aggregating results from parallel execution This API is designed for advanced workflows where you want declarative control over parallelism, routing, and data aggregation.
## Installation The beta graph API is included with `pydantic-graph`: Terminal ```bash pip install pydantic-graph ``` Or as part of `pydantic-ai`: Terminal ```bash pip install pydantic-ai ``` ## Quick Start Here's a simple example to get you started: simple\_counter.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class CounterState: """State for tracking a counter value.""" value: int = 0 async def main(): # Create a graph builder with state and output types g = GraphBuilder(state_type=CounterState, output_type=int) # Define steps using the decorator @g.step async def increment(ctx: StepContext[CounterState, None, None]) -> int: """Increment the counter and return its value.""" ctx.state.value += 1 return ctx.state.value @g.step async def double_it(ctx: StepContext[CounterState, None, int]) -> int: """Double the input value.""" return ctx.inputs * 2 # Add edges connecting the nodes g.add( g.edge_from(g.start_node).to(increment), g.edge_from(increment).to(double_it), g.edge_from(double_it).to(g.end_node), ) # Build and run the graph graph = g.build() state = CounterState() result = await graph.run(state=state) print(f'Result: {result}') #> Result: 2 print(f'Final state: {state.value}') #> Final state: 1 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Key Concepts ### GraphBuilder The [`GraphBuilder`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder) is the main entry point for constructing graphs. It's generic over: - `StateT` - The type of mutable state shared across all nodes - `DepsT` - The type of dependencies injected into nodes - `InputT` - The type of initial input to the graph - `OutputT` - The type of final output from the graph ### Steps Steps are async functions decorated with [`@g.step`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.step) that define the actual work to be done in each node. They receive a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) with access to: - `ctx.state` - The mutable graph state - `ctx.deps` - Injected dependencies - `ctx.inputs` - Input data for this step ### Edges Edges define the connections between nodes. 
The builder provides multiple ways to create edges: - [`g.add()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add) - Add one or more edge paths - [`g.add_edge()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add_edge) - Add a simple edge between two nodes - [`g.edge_from()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.edge_from) - Start building a complex edge path ### Start and End Nodes Every graph has: - [`g.start_node`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.start_node) - The entry point receiving initial inputs - [`g.end_node`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.end_node) - The exit point producing final outputs ## A More Complex Example Here's an example showcasing parallel execution with a map operation: parallel\_processing.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class ProcessingState: """State for tracking processing metrics.""" items_processed: int = 0 async def main(): g = GraphBuilder( state_type=ProcessingState, input_type=list[int], output_type=list[int], ) @g.step async def square(ctx: StepContext[ProcessingState, None, int]) -> int: """Square a number and track that we processed it.""" ctx.state.items_processed += 1 return ctx.inputs * ctx.inputs # Create a join to collect results collect_results = g.join(reduce_list_append, initial_factory=list[int]) # Build the graph with map operation g.add( g.edge_from(g.start_node).map().to(square), g.edge_from(square).to(collect_results), g.edge_from(collect_results).to(g.end_node), ) graph = g.build() state = ProcessingState() result = await graph.run(state=state, inputs=[1, 2, 3, 4, 5]) print(f'Results: {sorted(result)}') #> Results: [1, 4, 9, 16, 25] print(f'Items processed: {state.items_processed}') #> Items processed: 5 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ In this example: 1. The start node receives a list of integers 2. The `.map()` operation fans out each item to a separate parallel execution of the `square` step 3. All results are collected back together using [`reduce_list_append`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_append) 4. The joined results flow to the end node ## Next Steps Explore the detailed documentation for each feature: - [**Steps**](/docs/ai/graph/beta/steps) - Learn about step nodes and execution contexts - [**Joins**](/docs/ai/graph/beta/joins) - Understand join nodes and reducer patterns - [**Decisions**](/docs/ai/graph/beta/decisions) - Implement conditional branching - [**Parallel Execution**](/docs/ai/graph/beta/parallel) - Master broadcasting and mapping ## Advanced Execution Control Beyond the basic [`graph.run()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.run) method, the beta API provides fine-grained control over graph execution. 
### Step-by-Step Execution Use [`graph.iter()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.iter) to execute the graph one step at a time: step\_by\_step.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class CounterState: value: int = 0 async def main(): g = GraphBuilder(state_type=CounterState, output_type=int) @g.step async def increment(ctx: StepContext[CounterState, None, None]) -> int: ctx.state.value += 1 return ctx.state.value @g.step async def double_it(ctx: StepContext[CounterState, None, int]) -> int: return ctx.inputs * 2 g.add( g.edge_from(g.start_node).to(increment), g.edge_from(increment).to(double_it), g.edge_from(double_it).to(g.end_node), ) graph = g.build() state = CounterState() # Use iter() for step-by-step execution async with graph.iter(state=state) as graph_run: print(f'Initial state: {state.value}') #> Initial state: 0 # Advance execution step by step async for event in graph_run: print(f'{state.value=} | {event=}') #> state.value=0 | event=[GraphTask(node_id='increment', inputs=None)] #> state.value=1 | event=[GraphTask(node_id='double_it', inputs=1)] #> state.value=1 | event=[GraphTask(node_id='__end__', inputs=2)] #> state.value=1 | event=EndMarker(_value=2) if graph_run.output is not None: print(f'Final output: {graph_run.output}') #> Final output: 2 break ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ The [`GraphRun`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.GraphRun) object provides: - **Async iteration**: Iterate through execution events - **`next_task` property**: Inspect upcoming tasks - **`output` property**: Check if the graph has completed and get the final output - **`next()` method**: Manually advance execution with optional value injection ### Visualizing Graphs Generate Mermaid diagrams of your graph structure using [`graph.render()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.render): visualize\_graph.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class SimpleState: pass g = GraphBuilder(state_type=SimpleState, output_type=str) @g.step async def step_a(ctx: StepContext[SimpleState, None, None]) -> int: return 10 @g.step async def step_b(ctx: StepContext[SimpleState, None, int]) -> str: return f'Result: {ctx.inputs}' g.add( g.edge_from(g.start_node).to(step_a), g.edge_from(step_a).to(step_b), g.edge_from(step_b).to(g.end_node), ) graph = g.build() # Generate a Mermaid diagram mermaid_diagram = graph.render(title='My Graph', direction='LR') print(mermaid_diagram) """ --- title: My Graph --- stateDiagram-v2 direction LR step_a step_b [*] --> step_a step_a --> step_b step_b --> [*] """ ``` The rendered diagram can be displayed in documentation, notebooks, or any tool that supports Mermaid syntax. ## Comparison with Original API The original graph API (documented in the [main graph page](/docs/ai/graph/)) uses a class-based approach with [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) subclasses. 
The beta API uses a builder pattern with decorated functions, which provides: **Advantages:** - More concise syntax for simple workflows - Explicit control over parallelism with map/broadcast - Built-in reducers for common aggregation patterns - Easier to visualize complex data flows **Trade-offs:** - Requires understanding of builder patterns - Less object-oriented, more functional style Both APIs are fully supported and can even be integrated together when needed. ## Persistence and Resumability No Native Persistence Unlike the [original Graph API](/docs/ai/graph/#state-persistence), the beta graph API does not include built-in state persistence. This is due to the [complexity of achieving consistent snapshotting with parallel execution](https://github.com/pydantic/pydantic-ai/issues/530#issuecomment-3504609992). For workflows that need to preserve progress across failures, restarts, or long-running operations, use one of the supported [durable execution](/docs/ai/integrations/durable_execution/overview) solutions. --- # [Joins & Reducers](https://pydantic.dev/docs/ai/graph/beta/joins/) # Joins & Reducers Join nodes synchronize and aggregate data from parallel execution paths. They use **Reducers** to combine multiple inputs into a single output. When you use [parallel execution](/docs/ai/graph/beta/parallel) (broadcasting or mapping), you often need to collect and combine the results. Join nodes serve this purpose by: 1. Waiting for all parallel tasks to complete 2. Aggregating their outputs using a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction) 3. Passing the aggregated result to the next node ## Creating Joins Create a join using `GraphBuilder.join` with a reducer function and initial value or factory: basic\_join.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def generate_numbers(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [1, 2, 3, 4, 5] @g.step async def square(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs * ctx.inputs # Create a join to collect all squared values collect = g.join(reduce_list_append, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(generate_numbers), g.edge_from(generate_numbers).map().to(square), g.edge_from(square).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() async def main(): result = await graph.run(state=SimpleState()) print(sorted(result)) #> [1, 4, 9, 16, 25] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Built-in Reducers Pydantic Graph provides several common reducer types out of the box: ### `reduce_list_append` [`reduce_list_append`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_append) collects all inputs into a list: list\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[str]) @g.step async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [10, 20, 30] @g.step async def to_string(ctx: StepContext[SimpleState, None, int]) -> str: 
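# Runs once per item fanned out by .map() from generate's list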
return f'value-{ctx.inputs}' collect = g.join(reduce_list_append, initial_factory=list[str]) g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(to_string), g.edge_from(to_string).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> ['value-10', 'value-20', 'value-30'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### `reduce_list_extend` [`reduce_list_extend`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_extend) extends a list with an iterable of items: list\_extend\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_extend @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [1, 2, 3] @g.step async def create_range(ctx: StepContext[SimpleState, None, int]) -> list[int]: """Create a range from 0 to the input value.""" return list(range(ctx.inputs)) collect = g.join(reduce_list_extend, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(create_range), g.edge_from(create_range).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> [0, 0, 0, 1, 1, 2] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### `reduce_dict_update` [`reduce_dict_update`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_dict_update) merges dictionaries together: dict\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_dict_update @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=dict[str, int]) @g.step async def generate_keys(ctx: StepContext[SimpleState, None, None]) -> list[str]: return ['apple', 'banana', 'cherry'] @g.step async def create_entry(ctx: StepContext[SimpleState, None, str]) -> dict[str, int]: return {ctx.inputs: len(ctx.inputs)} merge = g.join(reduce_dict_update, initial_factory=dict[str, int]) g.add( g.edge_from(g.start_node).to(generate_keys), g.edge_from(generate_keys).map().to(create_entry), g.edge_from(create_entry).to(merge), g.edge_from(merge).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) result = {k: result[k] for k in sorted(result)} # force deterministic ordering print(result) #> {'apple': 5, 'banana': 6, 'cherry': 6} ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### `reduce_null` [`reduce_null`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_null) discards all inputs and returns `None`. 
Useful when you only care about side effects: null\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_null @dataclass class CounterState: total: int = 0 async def main(): g = GraphBuilder(state_type=CounterState, output_type=int) @g.step async def generate(ctx: StepContext[CounterState, None, None]) -> list[int]: return [1, 2, 3, 4, 5] @g.step async def accumulate(ctx: StepContext[CounterState, None, int]) -> int: ctx.state.total += ctx.inputs return ctx.inputs # We don't care about the outputs, only the side effect on state ignore = g.join(reduce_null, initial=None) @g.step async def get_total(ctx: StepContext[CounterState, None, None]) -> int: return ctx.state.total g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(accumulate), g.edge_from(accumulate).to(ignore), g.edge_from(ignore).to(get_total), g.edge_from(get_total).to(g.end_node), ) graph = g.build() state = CounterState() result = await graph.run(state=state) print(result) #> 15 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### `reduce_sum` [`reduce_sum`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_sum) sums numeric values: sum\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_sum @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=int) @g.step async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [10, 20, 30, 40] @g.step async def identity(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs sum_join = g.join(reduce_sum, initial=0) g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(identity), g.edge_from(identity).to(sum_join), g.edge_from(sum_join).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(result) #> 100 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### `ReduceFirstValue` [`ReduceFirstValue`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReduceFirstValue) returns the first value it receives and cancels all other parallel tasks. 
This is useful for "race" scenarios where you want the first successful result: first\_value\_reducer.py ```python import asyncio from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import ReduceFirstValue @dataclass class SimpleState: tasks_completed: int = 0 async def main(): g = GraphBuilder(state_type=SimpleState, output_type=str) @g.step async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [1, 12, 13, 14, 15] @g.step async def slow_process(ctx: StepContext[SimpleState, None, int]) -> str: """Simulate variable processing times.""" # Simulate different delays await asyncio.sleep(ctx.inputs * 0.1) ctx.state.tasks_completed += 1 return f'Result from task {ctx.inputs}' # Use ReduceFirstValue to get the first result and cancel the rest first_result = g.join(ReduceFirstValue[str](), initial=None, node_id='first_result') g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(slow_process), g.edge_from(slow_process).to(first_result), g.edge_from(first_result).to(g.end_node), ) graph = g.build() state = SimpleState() result = await graph.run(state=state) print(result) #> Result from task 1 print(f'Tasks completed: {state.tasks_completed}') #> Tasks completed: 1 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Custom Reducers Create custom reducers by defining a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction): custom\_reducer.py ```python from pydantic_graph.beta import GraphBuilder, StepContext def reduce_sum(current: int, inputs: int) -> int: """A reducer that sums numbers.""" return current + inputs async def main(): g = GraphBuilder(output_type=int) @g.step async def generate(ctx: StepContext[None, None, None]) -> list[int]: return [5, 10, 15, 20] @g.step async def identity(ctx: StepContext[None, None, int]) -> int: return ctx.inputs sum_join = g.join(reduce_sum, initial=0) g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(identity), g.edge_from(identity).to(sum_join), g.edge_from(sum_join).to(g.end_node), ) graph = g.build() result = await graph.run() print(result) #> 50 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Reducers with State Access Reducers can access and modify the graph state: stateful\_reducer.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import ReducerContext @dataclass class MetricsState: total_count: int = 0 total_sum: int = 0 @dataclass class ReducedMetrics: count: int = 0 sum: int = 0 def reduce_metrics_sum(ctx: ReducerContext[MetricsState, None], current: ReducedMetrics, inputs: int) -> ReducedMetrics: ctx.state.total_count += 1 ctx.state.total_sum += inputs return ReducedMetrics(count=current.count + 1, sum=current.sum + inputs) def reduce_metrics_max(current: ReducedMetrics, inputs: ReducedMetrics) -> ReducedMetrics: return ReducedMetrics(count=max(current.count, inputs.count), sum=max(current.sum, inputs.sum)) async def main(): g = GraphBuilder(state_type=MetricsState, output_type=ReducedMetrics) @g.step async def generate(ctx: StepContext[object, None, None]) -> list[int]: return [1, 3, 5, 7, 9, 10, 20, 30, 40] @g.step async def process_even(ctx: StepContext[MetricsState, None, int]) -> int: return ctx.inputs *
2 @g.step async def process_odd(ctx: StepContext[MetricsState, None, int]) -> int: return ctx.inputs * 3 metrics_even = g.join(reduce_metrics_sum, initial_factory=ReducedMetrics, node_id='metrics_even') metrics_odd = g.join(reduce_metrics_sum, initial_factory=ReducedMetrics, node_id='metrics_odd') metrics_max = g.join(reduce_metrics_max, initial_factory=ReducedMetrics, node_id='metrics_max') g.add( g.edge_from(g.start_node).to(generate), # Send even and odd numbers to their respective `process` steps g.edge_from(generate).map().to( g.decision() .branch(g.match(int, matches=lambda x: x % 2 == 0).label('even').to(process_even)) .branch(g.match(int, matches=lambda x: x % 2 == 1).label('odd').to(process_odd)) ), # Reduce metrics for even and odd numbers separately g.edge_from(process_even).to(metrics_even), g.edge_from(process_odd).to(metrics_odd), # Aggregate the max values for each field g.edge_from(metrics_even).to(metrics_max), g.edge_from(metrics_odd).to(metrics_max), # Finish the graph run with the final reduced value g.edge_from(metrics_max).to(g.end_node), ) graph = g.build() state = MetricsState() result = await graph.run(state=state) print(f'Result: {result}') #> Result: ReducedMetrics(count=5, sum=200) print(f'State total_count: {state.total_count}') #> State total_count: 9 print(f'State total_sum: {state.total_sum}') #> State total_sum: 275 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### Canceling Sibling Tasks Reducers with access to [`ReducerContext`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerContext) can call [`ctx.cancel_sibling_tasks()`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerContext.cancel_sibling_tasks) to cancel all other parallel tasks in the same fork. This is useful for early termination when you've found what you need: cancel\_siblings.py ```python import asyncio from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import ReducerContext @dataclass class SearchState: searches_completed: int = 0 def reduce_find_match(ctx: ReducerContext[SearchState, None], current: str | None, inputs: str) -> str | None: """Return the first input that contains 'target' and cancel remaining tasks.""" if current is not None: # We already found a match, ignore subsequent inputs return current if 'target' in inputs: # Found a match! 
Cancel all other parallel tasks ctx.cancel_sibling_tasks() return inputs return None async def main(): g = GraphBuilder(state_type=SearchState, output_type=str | None) @g.step async def generate_searches(ctx: StepContext[SearchState, None, None]) -> list[str]: return ['item1', 'item2', 'target_item', 'item4', 'item5'] @g.step async def search(ctx: StepContext[SearchState, None, str]) -> str: """Simulate a slow search operation.""" # make the search artificially slower for 'item4' and 'item5' search_duration = 0.1 if ctx.inputs not in {'item4', 'item5'} else 1.0 await asyncio.sleep(search_duration) ctx.state.searches_completed += 1 return ctx.inputs find_match = g.join(reduce_find_match, initial=None) g.add( g.edge_from(g.start_node).to(generate_searches), g.edge_from(generate_searches).map().to(search), g.edge_from(search).to(find_match), g.edge_from(find_match).to(g.end_node), ) graph = g.build() state = SearchState() result = await graph.run(state=state) print(f'Found: {result}') #> Found: target_item print(f'Searches completed: {state.searches_completed}') #> Searches completed: 3 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ Note that only 3 searches completed instead of all 5, because the reducer canceled the remaining tasks after finding a match. ## Multiple Joins A graph can have multiple independent joins: multiple\_joins.py ```python from dataclasses import dataclass, field from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class MultiState: results: dict[str, list[int]] = field(default_factory=dict) async def main(): g = GraphBuilder(state_type=MultiState, output_type=dict[str, list[int]]) @g.step async def source_a(ctx: StepContext[MultiState, None, None]) -> list[int]: return [1, 2, 3] @g.step async def source_b(ctx: StepContext[MultiState, None, None]) -> list[int]: return [10, 20] @g.step async def process_a(ctx: StepContext[MultiState, None, int]) -> int: return ctx.inputs * 2 @g.step async def process_b(ctx: StepContext[MultiState, None, int]) -> int: return ctx.inputs * 3 join_a = g.join(reduce_list_append, initial_factory=list[int], node_id='join_a') join_b = g.join(reduce_list_append, initial_factory=list[int], node_id='join_b') @g.step async def store_a(ctx: StepContext[MultiState, None, list[int]]) -> None: ctx.state.results['a'] = ctx.inputs @g.step async def store_b(ctx: StepContext[MultiState, None, list[int]]) -> None: ctx.state.results['b'] = ctx.inputs @g.step async def combine(ctx: StepContext[MultiState, None, None]) -> dict[str, list[int]]: return ctx.state.results g.add( g.edge_from(g.start_node).to(source_a, source_b), g.edge_from(source_a).map().to(process_a), g.edge_from(source_b).map().to(process_b), g.edge_from(process_a).to(join_a), g.edge_from(process_b).to(join_b), g.edge_from(join_a).to(store_a), g.edge_from(join_b).to(store_b), g.edge_from(store_a, store_b).to(combine), g.edge_from(combine).to(g.end_node), ) graph = g.build() state = MultiState() result = await graph.run(state=state) print(f"Group A: {sorted(result['a'])}") #> Group A: [2, 4, 6] print(f"Group B: {sorted(result['b'])}") #> Group B: [30, 60] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Customizing Join Nodes ### Custom Node IDs Like steps, joins can have custom IDs: join\_custom\_id.py ```python from pydantic_graph.beta.join import 
reduce_list_append from basic_join import g my_join = g.join(reduce_list_append, initial_factory=list[int], node_id='my_custom_join_id') ``` ## How Joins Work Internally, the graph tracks which "fork" each parallel task belongs to. A join: 1. Identifies its parent fork (the fork that created the parallel paths) 2. Waits for all tasks from that fork to reach the join 3. Calls `reduce()` for each incoming value 4. Calls `finalize()` once all values are received 5. Passes the finalized result to downstream nodes This ensures proper synchronization even with nested parallel operations. ## Next Steps - Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping - Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes - See the [API reference](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join) for complete reducer documentation --- # [Parallel Execution](https://pydantic.dev/docs/ai/graph/beta/parallel/) # Parallel Execution The beta graph API provides two powerful mechanisms for parallel execution: **broadcasting** and **spreading**. - **Broadcasting** - Send the same data to multiple parallel paths - **Spreading** (also called mapping, via `.map()`) - Fan out items from an iterable to parallel paths Both create "forks" in the execution graph that can later be synchronized with [join nodes](/docs/ai/graph/beta/joins). ## Broadcasting Broadcasting sends identical data to multiple destinations simultaneously: basic\_broadcast.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def source(ctx: StepContext[SimpleState, None, None]) -> int: return 10 @g.step async def add_one(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 1 @g.step async def add_two(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 2 @g.step async def add_three(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 3 collect = g.join(reduce_list_append, initial_factory=list[int]) # Broadcasting: send the value from source to all three steps g.add( g.edge_from(g.start_node).to(source), g.edge_from(source).to(add_one, add_two, add_three), g.edge_from(add_one, add_two, add_three).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> [11, 12, 13] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ All three steps receive the same input value (`10`) and execute in parallel.
## Spreading Spreading fans out elements from an iterable, processing each element in parallel: basic\_map.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def generate_list(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [1, 2, 3, 4, 5] @g.step async def square(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs * ctx.inputs collect = g.join(reduce_list_append, initial_factory=list[int]) # Spreading: each item in the list gets its own parallel execution g.add( g.edge_from(g.start_node).to(generate_list), g.edge_from(generate_list).map().to(square), g.edge_from(square).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> [1, 4, 9, 16, 25] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### Spreading AsyncIterables The `.map()` operation also works with `AsyncIterable` values. When mapping over an async iterable, the graph creates parallel tasks dynamically as values are yielded. This is particularly useful for streaming data or processing data that's being generated on-the-fly. Note that the streaming step below is defined with `@g.stream` rather than `@g.step`, since it yields values instead of returning a single one: async\_iterable\_map.py ```python import asyncio from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.stream async def stream_numbers(ctx: StepContext[SimpleState, None, None]): """Stream numbers with delays to simulate real-time data.""" for i in range(1, 4): await asyncio.sleep(0.05) # Simulate delay yield i @g.step async def triple(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs * 3 collect = g.join(reduce_list_append, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(stream_numbers), # Map over the async iterable - tasks created as items are yielded g.edge_from(stream_numbers).map().to(triple), g.edge_from(triple).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> [3, 6, 9] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ This allows for progressive processing where downstream steps can start working on early results while later results are still being generated.
### Using `add_mapping_edge()` The convenience method [`add_mapping_edge()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add_mapping_edge) provides a simpler syntax: mapping\_convenience.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[str]) @g.step async def generate_numbers(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [10, 20, 30] @g.step async def stringify(ctx: StepContext[SimpleState, None, int]) -> str: return f'Value: {ctx.inputs}' collect = g.join(reduce_list_append, initial_factory=list[str]) g.add(g.edge_from(g.start_node).to(generate_numbers)) g.add_mapping_edge(generate_numbers, stringify) g.add( g.edge_from(stringify).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> ['Value: 10', 'Value: 20', 'Value: 30'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Empty Iterables When mapping an empty iterable, you can specify a `downstream_join_id` to ensure the join still executes: empty\_map.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def generate_empty(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [] @g.step async def double(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs * 2 collect = g.join(reduce_list_append, initial_factory=list[int]) g.add(g.edge_from(g.start_node).to(generate_empty)) g.add_mapping_edge(generate_empty, double, downstream_join_id=collect.id) g.add( g.edge_from(double).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(result) #> [] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Nested Parallel Operations You can nest broadcasts and maps for complex parallel patterns: ### Spread then Broadcast map\_then\_broadcast.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.step async def generate_list(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [10, 20] @g.step async def add_one(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 1 @g.step async def add_two(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 2 collect = g.join(reduce_list_append, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(generate_list), # Spread the list, then broadcast each item to both steps g.edge_from(generate_list).map().to(add_one, add_two), g.edge_from(add_one, add_two).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> [11, 12, 21, 22] ``` _(This example is complete, it can be run "as is" -- you'll need to add 
`import asyncio; asyncio.run(main())` to run `main`)_ The result contains: - From 10: `10+1=11` and `10+2=12` - From 20: `20+1=21` and `20+2=22` ### Multiple Sequential Spreads sequential\_maps.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[str]) @g.step async def generate_pairs(ctx: StepContext[SimpleState, None, None]) -> list[tuple[int, int]]: return [(1, 2), (3, 4)] @g.step async def unpack_pair(ctx: StepContext[SimpleState, None, tuple[int, int]]) -> list[int]: return [ctx.inputs[0], ctx.inputs[1]] @g.step async def stringify(ctx: StepContext[SimpleState, None, int]) -> str: return f'num:{ctx.inputs}' collect = g.join(reduce_list_append, initial_factory=list[str]) g.add( g.edge_from(g.start_node).to(generate_pairs), # First map: one task per tuple g.edge_from(generate_pairs).map().to(unpack_pair), # Second map: one task per number in each tuple g.edge_from(unpack_pair).map().to(stringify), g.edge_from(stringify).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> ['num:1', 'num:2', 'num:3', 'num:4'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Edge Labels Add labels to parallel edges for better documentation: labeled\_parallel.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[str]) @g.step async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]: return [1, 2, 3] @g.step async def process(ctx: StepContext[SimpleState, None, int]) -> str: return f'item-{ctx.inputs}' collect = g.join(reduce_list_append, initial_factory=list[str]) g.add(g.edge_from(g.start_node).to(generate)) g.add_mapping_edge( generate, process, pre_map_label='before map', post_map_label='after map', ) g.add( g.edge_from(process).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> ['item-1', 'item-2', 'item-3'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## State Sharing in Parallel Execution All parallel tasks share the same graph state. 
Be careful with mutations: parallel\_state.py ```python from dataclasses import dataclass, field from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class CounterState: values: list[int] = field(default_factory=list) async def main(): g = GraphBuilder(state_type=CounterState, output_type=list[int]) @g.step async def generate(ctx: StepContext[CounterState, None, None]) -> list[int]: return [1, 2, 3] @g.step async def track_and_square(ctx: StepContext[CounterState, None, int]) -> int: # All parallel tasks mutate the same state ctx.state.values.append(ctx.inputs) return ctx.inputs * ctx.inputs collect = g.join(reduce_list_append, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(generate), g.edge_from(generate).map().to(track_and_square), g.edge_from(track_and_square).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() state = CounterState() result = await graph.run(state=state) print(f'Squared: {sorted(result)}') #> Squared: [1, 4, 9] print(f'Tracked: {sorted(state.values)}') #> Tracked: [1, 2, 3] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Edge Transformations You can transform data inline as it flows along edges using the `.transform()` method: edge\_transform.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=str) @g.step async def generate_number(ctx: StepContext[SimpleState, None, None]) -> int: return 42 @g.step async def format_output(ctx: StepContext[SimpleState, None, str]) -> str: return f'The answer is: {ctx.inputs}' # Transform the number to a string inline g.add( g.edge_from(g.start_node).to(generate_number), g.edge_from(generate_number).transform(lambda ctx: str(ctx.inputs * 2)).to(format_output), g.edge_from(format_output).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(result) #> The answer is: 84 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ The transform function receives a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) with the current inputs and has access to state and dependencies. 
This is useful for: - Converting data types between incompatible steps - Extracting specific fields from complex objects - Applying simple computations without creating a full step - Adapting data formats during routing Transforms can be chained and combined with other edge operations like `.map()` and `.label()`: chained\_transforms.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=list[str]) @g.step async def generate_data(ctx: StepContext[SimpleState, None, None]) -> list[dict[str, int]]: return [{'value': 10}, {'value': 20}, {'value': 30}] @g.step async def process_number(ctx: StepContext[SimpleState, None, int]) -> str: return f'Processed: {ctx.inputs}' collect = g.join(reduce_list_append, initial_factory=list[str]) g.add( g.edge_from(g.start_node).to(generate_data), # Transform to extract values, then map over them g.edge_from(generate_data) .transform(lambda ctx: [item['value'] for item in ctx.inputs]) .label('Extract values') .map() .to(process_number), g.edge_from(process_number).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState()) print(sorted(result)) #> ['Processed: 10', 'Processed: 20', 'Processed: 30'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Next Steps - Learn about [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results - Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes - See the [steps documentation](/docs/ai/graph/beta/steps) for more on step execution --- # [Steps](https://pydantic.dev/docs/ai/graph/beta/steps/) # Steps Steps are the fundamental units of work in a graph. They're async functions that receive a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) and return a value. ## Creating Steps Steps are created using the [`@g.step`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.step) decorator on the [`GraphBuilder`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder): basic\_step.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class MyState: counter: int = 0 g = GraphBuilder(state_type=MyState, output_type=int) @g.step async def increment(ctx: StepContext[MyState, None, None]) -> int: ctx.state.counter += 1 return ctx.state.counter g.add( g.edge_from(g.start_node).to(increment), g.edge_from(increment).to(g.end_node), ) graph = g.build() async def main(): state = MyState() result = await graph.run(state=state) print(result) #> 1 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Step Context Every step function receives a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) as its first parameter. 
The context provides access to: - `ctx.state` - The mutable graph state (type: `StateT`) - `ctx.deps` - Injected dependencies (type: `DepsT`) - `ctx.inputs` - Input data for this step (type: `InputT`) ### Accessing State State is shared across all steps in a graph and can be freely mutated: state\_access.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class AppState: messages: list[str] async def main(): g = GraphBuilder(state_type=AppState, output_type=list[str]) @g.step async def add_hello(ctx: StepContext[AppState, None, None]) -> None: ctx.state.messages.append('Hello') @g.step async def add_world(ctx: StepContext[AppState, None, None]) -> None: ctx.state.messages.append('World') @g.step async def get_messages(ctx: StepContext[AppState, None, None]) -> list[str]: return ctx.state.messages g.add( g.edge_from(g.start_node).to(add_hello), g.edge_from(add_hello).to(add_world), g.edge_from(add_world).to(get_messages), g.edge_from(get_messages).to(g.end_node), ) graph = g.build() state = AppState(messages=[]) result = await graph.run(state=state) print(result) #> ['Hello', 'World'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### Working with Inputs Steps can receive and transform input data: step\_inputs.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class SimpleState: pass async def main(): g = GraphBuilder( state_type=SimpleState, input_type=int, output_type=str, ) @g.step async def double_it(ctx: StepContext[SimpleState, None, int]) -> int: """Double the input value.""" return ctx.inputs * 2 @g.step async def stringify(ctx: StepContext[SimpleState, None, int]) -> str: """Convert to a formatted string.""" return f'Result: {ctx.inputs}' g.add( g.edge_from(g.start_node).to(double_it), g.edge_from(double_it).to(stringify), g.edge_from(stringify).to(g.end_node), ) graph = g.build() result = await graph.run(state=SimpleState(), inputs=21) print(result) #> Result: 42 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Dependency Injection Steps can access injected dependencies through `ctx.deps`: dependencies.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class AppState: pass @dataclass class AppDeps: """Dependencies injected into the graph.""" multiplier: int async def main(): g = GraphBuilder( state_type=AppState, deps_type=AppDeps, input_type=int, output_type=int, ) @g.step async def multiply(ctx: StepContext[AppState, AppDeps, int]) -> int: """Multiply input by the injected multiplier.""" return ctx.inputs * ctx.deps.multiplier g.add( g.edge_from(g.start_node).to(multiply), g.edge_from(multiply).to(g.end_node), ) graph = g.build() deps = AppDeps(multiplier=10) result = await graph.run(state=AppState(), deps=deps, inputs=5) print(result) #> 50 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Customizing Steps ### Custom Node IDs By default, step node IDs are inferred from the function name. 
You can override this: custom\_id.py ```python from pydantic_graph.beta import StepContext from basic_step import MyState, g @g.step(node_id='my_custom_id') async def my_step(ctx: StepContext[MyState, None, None]) -> int: return 42 # The node ID is now 'my_custom_id' instead of 'my_step' ``` ### Human-Readable Labels Labels provide documentation for diagram generation: labels.py ```python from pydantic_graph.beta import StepContext from basic_step import MyState, g @g.step(label='Increment the counter') async def increment(ctx: StepContext[MyState, None, None]) -> int: ctx.state.counter += 1 return ctx.state.counter # Access the label programmatically print(increment.label) #> Increment the counter ``` ## Sequential Steps Multiple steps can be chained sequentially: sequential.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class MathState: operations: list[str] async def main(): g = GraphBuilder( state_type=MathState, input_type=int, output_type=int, ) @g.step async def add_five(ctx: StepContext[MathState, None, int]) -> int: ctx.state.operations.append('add 5') return ctx.inputs + 5 @g.step async def multiply_by_two(ctx: StepContext[MathState, None, int]) -> int: ctx.state.operations.append('multiply by 2') return ctx.inputs * 2 @g.step async def subtract_three(ctx: StepContext[MathState, None, int]) -> int: ctx.state.operations.append('subtract 3') return ctx.inputs - 3 # Connect steps sequentially g.add( g.edge_from(g.start_node).to(add_five), g.edge_from(add_five).to(multiply_by_two), g.edge_from(multiply_by_two).to(subtract_three), g.edge_from(subtract_three).to(g.end_node), ) graph = g.build() state = MathState(operations=[]) result = await graph.run(state=state, inputs=10) print(f'Result: {result}') #> Result: 27 print(f'Operations: {state.operations}') #> Operations: ['add 5', 'multiply by 2', 'subtract 3'] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ The computation is: `(10 + 5) * 2 - 3 = 27` ## Streaming Steps In addition to regular steps that return a single value, you can create streaming steps that yield multiple values over time using the [`@g.stream`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.stream) decorator: streaming\_step.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext from pydantic_graph.beta.join import reduce_list_append @dataclass class SimpleState: pass g = GraphBuilder(state_type=SimpleState, output_type=list[int]) @g.stream async def generate_stream(ctx: StepContext[SimpleState, None, None]): """Stream numbers from 1 to 5.""" for i in range(1, 6): yield i @g.step async def square(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs * ctx.inputs collect = g.join(reduce_list_append, initial_factory=list[int]) g.add( g.edge_from(g.start_node).to(generate_stream), # The stream output is an AsyncIterable, so we can map over it g.edge_from(generate_stream).map().to(square), g.edge_from(square).to(collect), g.edge_from(collect).to(g.end_node), ) graph = g.build() async def main(): result = await graph.run(state=SimpleState()) print(sorted(result)) #> [1, 4, 9, 16, 25] ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ### How Streaming Steps Work Streaming steps return an `AsyncIterable` that yields values over time. 
When you use `.map()` on a streaming step's output, the graph processes each yielded value as it becomes available, creating parallel tasks dynamically. This is particularly useful for: - Processing data from APIs that stream responses - Handling real-time data feeds - Progressive processing of large datasets - Any scenario where you want to start processing results before all data is available Like regular steps, streaming steps can also have custom node IDs and labels: labeled\_stream.py ```python from pydantic_graph.beta import StepContext from streaming_step import SimpleState, g @g.stream(node_id='my_stream', label='Generate numbers progressively') async def labeled_stream(ctx: StepContext[SimpleState, None, None]): for i in range(10): yield i ``` ## Edge Building Convenience Methods The builder provides helper methods for common edge patterns: ### Simple Edges with `add_edge()` add\_edge\_example.py ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class SimpleState: pass async def main(): g = GraphBuilder(state_type=SimpleState, output_type=int) @g.step async def step_a(ctx: StepContext[SimpleState, None, None]) -> int: return 10 @g.step async def step_b(ctx: StepContext[SimpleState, None, int]) -> int: return ctx.inputs + 5 # Using add_edge() for simple connections g.add_edge(g.start_node, step_a) g.add_edge(step_a, step_b, label='from a to b') g.add_edge(step_b, g.end_node) graph = g.build() result = await graph.run(state=SimpleState()) print(result) #> 15 ``` _(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_ ## Type Safety The beta graph API provides strong type checking through generics. Type parameters on [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) ensure: - State access is properly typed - Dependencies are correctly typed - Input/output types match across edges ```python from dataclasses import dataclass from pydantic_graph.beta import GraphBuilder, StepContext @dataclass class MyState: pass g = GraphBuilder(state_type=MyState, output_type=str) # Type checker will catch mismatches @g.step async def expects_int(ctx: StepContext[MyState, None, int]) -> str: return str(ctx.inputs) @g.step async def returns_str(ctx: StepContext[MyState, None, None]) -> str: return 'hello' # This would be a type error - expects_int needs int input, but returns_str outputs str # g.add(g.edge_from(returns_str).to(expects_int)) # Type error! ``` ## Next Steps - Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping - Understand [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results - Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes --- # [Overview](https://pydantic.dev/docs/ai/graph/graph/) # Overview Don't use a nail gun unless you need a nail gun If Pydantic AI [agents](/docs/ai/core-concepts/agent) are a hammer, and [multi-agent workflows](/docs/ai/guides/multi-agent-applications) are a sledgehammer, then graphs are a nail gun: - sure, nail guns look cooler than hammers - but nail guns take a lot more setup than hammers - and nail guns don't make you a better builder, they make you a builder with a nail gun - Lastly, (and at the risk of torturing this metaphor), if you're a fan of medieval tools like mallets and untyped Python, you probably won't like nail guns or our approach to graphs. 
(But then again, if you're not a fan of type hints in Python, you've probably already bounced off Pydantic AI to use one of the toy agent frameworks -- good luck, and feel free to borrow my sledgehammer when you realize you need it) In short, graphs are a powerful tool, but they're not the right tool for every job. Please consider other [multi-agent approaches](/docs/ai/guides/multi-agent-applications) before proceeding. If you're not confident a graph-based approach is a good idea, it might be unnecessary. Graphs and finite state machines (FSMs) are a powerful abstraction to model, execute, control and visualize complex workflows. Alongside Pydantic AI, we've developed `pydantic-graph` -- an async graph and state machine library for Python where nodes and edges are defined using type hints. While this library is developed as part of Pydantic AI, it has no dependency on `pydantic-ai` and can be considered a pure graph-based state machine library. You may find it useful whether or not you're using Pydantic AI or even building with GenAI. `pydantic-graph` is designed for advanced users and makes heavy use of Python generics and type hints. It is not designed to be as beginner-friendly as Pydantic AI. ## Installation `pydantic-graph` is a required dependency of `pydantic-ai`, and an optional dependency of `pydantic-ai-slim`; see [installation instructions](/docs/ai/overview/install#slim-install) for more information. You can also install it directly: - [pip](#tab-panel-22) - [uv](#tab-panel-23) Terminal ```bash pip install pydantic-graph ``` Terminal ```bash uv add pydantic-graph ``` ## Graph Types `pydantic-graph` is made up of a few key components: ### GraphRunContext [`GraphRunContext`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.GraphRunContext) -- The context for the graph run, similar to Pydantic AI's [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). This holds the state of the graph and dependencies and is passed to nodes when they're run. `GraphRunContext` is generic in the state type of the graph it's used in, [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT). ### End [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) -- return value to indicate the graph run should end. `End` is generic in the graph return type of the graph it's used in, [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT). ### Nodes Subclasses of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) define nodes for execution in the graph.
Nodes, which are generally [`dataclass`es](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass), consist of: - fields containing any parameters required/optional when calling the node - the business logic to execute the node, in the [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method - return annotations of the [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method, which are read by `pydantic-graph` to determine the outgoing edges of the node Nodes are generic in: - **state**, which must have the same type as the state of graphs they're included in. [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT) has a default of `None`, so if you're not using state you can omit this generic parameter; see [stateful graphs](#stateful-graphs) for more information - **deps**, which must have the same type as the deps of the graph they're included in. [`DepsT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.DepsT) has a default of `None`, so if you're not using deps you can omit this generic parameter; see [dependency injection](#dependency-injection) for more information - **graph return type** -- this only applies if the node returns [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End). [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT) has a default of [Never](https://docs.python.org/3/library/typing.html#typing.Never) so this generic parameter can be omitted if the node doesn't return `End`, but must be included if it does. Here's an example of a start or intermediate node in a graph -- it can't end the run as it doesn't return [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End): intermediate\_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, GraphRunContext @dataclass class MyNode(BaseNode[MyState]): # (1) foo: int # (2) async def run( self, ctx: GraphRunContext[MyState], # (3) ) -> AnotherNode: # (4) ... return AnotherNode() ``` State in this example is `MyState` (not shown), hence `BaseNode` is parameterized with `MyState`. This node can't end the run, so the `RunEndT` generic parameter is omitted and defaults to `Never`. `MyNode` is a dataclass and has a single field `foo`, an `int`. The `run` method takes a `GraphRunContext` parameter, again parameterized with state `MyState`. The return type of the `run` method is `AnotherNode` (not shown); this is used to determine the outgoing edges of the node. We could extend `MyNode` to optionally end the run if `foo` is divisible by 5: intermediate\_or\_end\_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, End, GraphRunContext @dataclass class MyNode(BaseNode[MyState, None, int]): # (1) foo: int async def run( self, ctx: GraphRunContext[MyState], ) -> AnotherNode | End[int]: # (2) if self.foo % 5 == 0: return End(self.foo) else: return AnotherNode() ``` We parameterize the node with the return type (`int` in this case) as well as state. Because generic parameters are positional-only, we have to include `None` as the second parameter representing deps. The return type of the `run` method is now a union of `AnotherNode` and `End[int]`; this allows the node to end the run if `foo` is divisible by 5. ### Graph [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) -- this is the execution graph itself, made up of a set of [node classes](#nodes) (i.e., `BaseNode` subclasses).
`Graph` is generic in: - **state** the state type of the graph, [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT) - **deps** the deps type of the graph, [`DepsT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.DepsT) - **graph return type** the return type of the graph run, [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT) Here's an example of a simple graph: graph\_example.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class DivisibleBy5(BaseNode[None, None, int]): # (1) foo: int async def run( self, ctx: GraphRunContext, ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode): # (2) foo: int async def run(self, ctx: GraphRunContext) -> DivisibleBy5: return DivisibleBy5(self.foo + 1) fives_graph = Graph(nodes=[DivisibleBy5, Increment]) # (3) result = fives_graph.run_sync(DivisibleBy5(4)) # (4) print(result.output) #> 5 ``` The `DivisibleBy5` node is parameterized with `None` for the state param and `None` for the deps param as this graph doesn't use state or deps, and `int` as it can end the run. The `Increment` node doesn't return `End`, so the `RunEndT` generic parameter is omitted; state can also be omitted as the graph doesn't use state. The graph is created with a sequence of nodes. The graph is run synchronously with [`run_sync`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run_sync). The initial node is `DivisibleBy5(4)`. Because the graph doesn't use external state or deps, we don't pass `state` or `deps`. _(This example is complete, it can be run "as is")_ A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: graph\_example\_diagram.py ```py from graph_example import DivisibleBy5, fives_graph fives_graph.mermaid_code(start_node=DivisibleBy5) ``` ```mermaid --- title: fives_graph --- stateDiagram-v2 [*] --> DivisibleBy5 DivisibleBy5 --> Increment DivisibleBy5 --> [*] Increment --> DivisibleBy5 ``` To visualize a graph within a Jupyter notebook, use `IPython.display`: jupyter\_display\_mermaid.py ```python from graph_example import DivisibleBy5, fives_graph from IPython.display import Image, display display(Image(fives_graph.mermaid_image(start_node=DivisibleBy5))) ``` ## Stateful Graphs The "state" concept in `pydantic-graph` provides an optional way to access and mutate an object (often a `dataclass` or Pydantic model) as nodes run in a graph. If you think of graphs as a production line, then your state is the engine being passed along the line and built up by each node as the graph is run. `pydantic-graph` provides state persistence, with the state recorded after each node is run. (See [State Persistence](#state-persistence).) Here's an example of a graph which represents a vending machine where the user may insert coins and select a product to purchase.
vending\_machine.py ```python from __future__ import annotations from dataclasses import dataclass from rich.prompt import Prompt from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MachineState: # (1) user_balance: float = 0.0 product: str | None = None @dataclass class InsertCoin(BaseNode[MachineState]): # (3) async def run(self, ctx: GraphRunContext[MachineState]) -> CoinsInserted: # (16) return CoinsInserted(float(Prompt.ask('Insert coins'))) # (4) @dataclass class CoinsInserted(BaseNode[MachineState]): amount: float # (5) async def run( self, ctx: GraphRunContext[MachineState] ) -> SelectProduct | Purchase: # (17) ctx.state.user_balance += self.amount # (6) if ctx.state.product is not None: # (7) return Purchase(ctx.state.product) else: return SelectProduct() @dataclass class SelectProduct(BaseNode[MachineState]): async def run(self, ctx: GraphRunContext[MachineState]) -> Purchase: return Purchase(Prompt.ask('Select product')) PRODUCT_PRICES = { # (2) 'water': 1.25, 'soda': 1.50, 'crisps': 1.75, 'chocolate': 2.00, } @dataclass class Purchase(BaseNode[MachineState, None, None]): # (18) product: str async def run( self, ctx: GraphRunContext[MachineState] ) -> End | InsertCoin | SelectProduct: if price := PRODUCT_PRICES.get(self.product): # (8) ctx.state.product = self.product # (9) if ctx.state.user_balance >= price: # (10) ctx.state.user_balance -= price return End(None) else: diff = price - ctx.state.user_balance print(f'Not enough money for {self.product}, need {diff:0.2f} more') #> Not enough money for crisps, need 0.75 more return InsertCoin() # (11) else: print(f'No such product: {self.product}, try again') return SelectProduct() # (12) vending_machine_graph = Graph( # (13) nodes=[InsertCoin, CoinsInserted, SelectProduct, Purchase] ) async def main(): state = MachineState() # (14) await vending_machine_graph.run(InsertCoin(), state=state) # (15) print(f'purchase successful item={state.product} change={state.user_balance:0.2f}') #> purchase successful item=crisps change=0.25 ``` The state of the vending machine is defined as a dataclass with the user's balance and the product they've selected, if any. A dictionary of products mapped to prices. The `InsertCoin` node: [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) is parameterized with `MachineState` as that's the state used in this graph. The `InsertCoin` node prompts the user to insert coins. We keep things simple by just entering a monetary amount as a float. Before you start thinking this is a toy too since it's using [rich's `Prompt.ask`](https://rich.readthedocs.io/en/stable/reference/prompt.html#rich.prompt.PromptBase.ask) within nodes, see [below](#example-human-in-the-loop) for how control flow can be managed when nodes require external input. The `CoinsInserted` node; again, this is a [`dataclass`](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass) with a single field, `amount`. Update the user's balance with the amount inserted. If the user has already selected a product, go to `Purchase`; otherwise go to `SelectProduct`. In the `Purchase` node, look up the price of the product if the user entered a valid product. If the user did enter a valid product, set the product in the state so we don't revisit `SelectProduct`. If the balance is enough to purchase the product, adjust the balance to reflect the purchase and return [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) to end the graph.
We're not using the run return type, so we call `End` with `None`. If the balance is insufficient, go to `InsertCoin` to prompt the user to insert more coins. If the product is invalid, go to `SelectProduct` to prompt the user to select a product again. The graph is created by passing a list of nodes to [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph). Order of nodes is not important, but it can affect how [diagrams](#mermaid-diagrams) are displayed. Initialize the state. This will be passed to the graph run and mutated as the graph runs. Run the graph with the initial state. Since the graph can be run from any node, we must pass the start node -- in this case, `InsertCoin`. [`Graph.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) returns a [`GraphRunResult`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRunResult) that provides the final data and a history of the run. The return type of the node's [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method is important as it is used to determine the outgoing edges of the node. This information in turn is used to render [mermaid diagrams](#mermaid-diagrams) and is enforced at runtime to detect misbehavior as soon as possible. The return type of `CoinsInserted`'s [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method is a union, meaning multiple outgoing edges are possible. Unlike other nodes, `Purchase` can end the run, so the [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT) generic parameter must be set. In this case it's `None` since the graph run return type is `None`. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: vending\_machine\_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin) ``` The diagram generated by the above code is: ```mermaid --- title: vending_machine_graph --- stateDiagram-v2 [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` See [below](#mermaid-diagrams) for more information on generating diagrams. ## GenAI Example So far we haven't shown an example of a Graph that actually uses Pydantic AI or GenAI at all. In this example, one agent generates a welcome email to a user and the other agent provides feedback on the email. 
This graph has a very simple structure: ```mermaid --- title: feedback_graph --- stateDiagram-v2 [*] --> WriteEmail WriteEmail --> Feedback Feedback --> WriteEmail Feedback --> [*] ``` genai\_email\_feedback.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pydantic import BaseModel, EmailStr from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class User: name: str email: EmailStr interests: list[str] @dataclass class Email: subject: str body: str @dataclass class State: user: User write_agent_messages: list[ModelMessage] = field(default_factory=list) email_writer_agent = Agent( 'google-gla:gemini-3-pro-preview', output_type=Email, instructions='Write a welcome email to our tech blog.', ) @dataclass class WriteEmail(BaseNode[State]): email_feedback: str | None = None async def run(self, ctx: GraphRunContext[State]) -> Feedback: if self.email_feedback: prompt = ( f'Rewrite the email for the user:\n' f'{format_as_xml(ctx.state.user)}\n' f'Feedback: {self.email_feedback}' ) else: prompt = ( f'Write a welcome email for the user:\n' f'{format_as_xml(ctx.state.user)}' ) result = await email_writer_agent.run( prompt, message_history=ctx.state.write_agent_messages, ) ctx.state.write_agent_messages += result.new_messages() return Feedback(result.output) class EmailRequiresWrite(BaseModel): feedback: str class EmailOk(BaseModel): pass feedback_agent = Agent[None, EmailRequiresWrite | EmailOk]( 'openai:gpt-5.2', output_type=EmailRequiresWrite | EmailOk, # type: ignore instructions=( "Review the email and provide feedback; the email must reference the user's specific interests." ), ) @dataclass class Feedback(BaseNode[State, None, Email]): email: Email async def run( self, ctx: GraphRunContext[State], ) -> WriteEmail | End[Email]: prompt = format_as_xml({'user': ctx.state.user, 'email': self.email}) result = await feedback_agent.run(prompt) if isinstance(result.output, EmailRequiresWrite): return WriteEmail(email_feedback=result.output.feedback) else: return End(self.email) async def main(): user = User( name='John Doe', email='john.joe@example.com', interests=['Haskell', 'Lisp', 'Fortran'], ) state = State(user) feedback_graph = Graph(nodes=(WriteEmail, Feedback)) result = await feedback_graph.run(WriteEmail(), state=state) print(result.output) """ Email( subject='Welcome to our tech blog!', body='Hello John, Welcome to our tech blog! ...', ) """ ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Iterating Over a Graph ### Using `Graph.iter` for `async for` iteration Sometimes you want direct control or insight into each node as the graph executes. The easiest way to do that is with the [`Graph.iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter) method, which returns a **context manager** that yields a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun) object. The `GraphRun` is an async-iterable over the nodes of your graph, allowing you to record or modify them as they execute.
Here's an example: count\_down.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class CountDownState: counter: int @dataclass class CountDown(BaseNode[CountDownState, None, int]): async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]: if ctx.state.counter <= 0: return End(ctx.state.counter) ctx.state.counter -= 1 return CountDown() count_down_graph = Graph(nodes=[CountDown]) async def main(): state = CountDownState(counter=3) async with count_down_graph.iter(CountDown(), state=state) as run: # (1) async for node in run: # (2) print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) print('Final output:', run.result.output) # (3) #> Final output: 0 ``` `Graph.iter(...)` returns a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun). Here, we step through each node as it is executed. Once the graph returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End), the loop ends, and `run.result` becomes a [`GraphRunResult`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRunResult) containing the final outcome (`0` here). ### Using `GraphRun.next(node)` manually Alternatively, you can drive iteration manually with the [`GraphRun.next`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) method, which allows you to pass in whichever node you want to run next. You can modify or selectively skip nodes this way. Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that: count\_down\_next.py ```python from pydantic_graph import End, FullStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): state = CountDownState(counter=5) persistence = FullStatePersistence() # (7) async with count_down_graph.iter( CountDown(), state=state, persistence=persistence ) as run: node = run.next_node # (1) while not isinstance(node, End): # (2) print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() if state.counter == 2: break # (3) node = await run.next(node) # (4) print(run.result) # (5) #> None for step in persistence.history: # (6) print('History Step:', step.state) #> History Step: CountDownState(counter=5) #> History Step: CountDownState(counter=4) #> History Step: CountDownState(counter=3) #> History Step: CountDownState(counter=2) ``` We start by grabbing the first node that will be run in the graph. The graph run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`. If the user decides to stop early, we break out of the loop. The graph run won't have a real final result in that case (`run.result` remains `None`). At each step, we call `await run.next(node)` to run it and get the next node (or an `End`). Because we did not continue the run until it finished, the `result` is not set. The run's history is still populated with the steps we executed so far. Use [`FullStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.FullStatePersistence) so we can show the history of the run; see [State Persistence](#state-persistence) below for more information.
## State Persistence One of the biggest benefits of finite state machine (FSM) graphs is how they simplify the handling of interrupted execution. This might happen for a variety of reasons: - the state machine logic might fundamentally need to be paused -- e.g. the returns workflow for an e-commerce order needs to wait for the item to be posted to the returns center, or because execution of the next node needs input from a user and so must wait for a new HTTP request, - the execution takes so long that the entire graph can't reliably be executed in a single continuous run -- e.g. a deep research agent that might take hours to run, - you want to run multiple graph nodes in parallel in different processes / hardware instances (note: parallel node execution is not yet supported in `pydantic-graph`, see [#704](https://github.com/pydantic/pydantic-ai/issues/704)). Trying to make a conventional control flow (i.e., boolean logic and nested function calls) implementation compatible with these usage scenarios generally results in brittle and over-complicated spaghetti code, with the logic required to interrupt and resume execution dominating the implementation. To allow graph runs to be interrupted and resumed, `pydantic-graph` provides state persistence -- a system for snapshotting the state of a graph run before and after each node is run, allowing a graph run to be resumed from any point in the graph. `pydantic-graph` includes three state persistence implementations: - [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) -- Simple in-memory state persistence that just holds the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. - [`FullStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.FullStatePersistence) -- In-memory state persistence that holds a list of snapshots. - [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence) -- File-based state persistence that saves snapshots to a JSON file. In production applications, developers should implement their own state persistence by subclassing the [`BaseStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence) abstract base class, which might persist runs in a relational database like PostgreSQL. At a high level, the role of `StatePersistence` implementations is to store and retrieve [`NodeSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot) and [`EndSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot) objects. [`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) may be used to run the graph based on the state stored in persistence. We can run the `count_down_graph` from [above](#iterating-over-a-graph), using [`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) and [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence). As you can see in the code below, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems.
count\_down\_from\_persistence.py ```python from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): run_id = 'run_abc123' persistence = FileStatePersistence(Path(f'count_down_{run_id}.json')) # (1) state = CountDownState(counter=5) await count_down_graph.initialize( # (2) CountDown(), state=state, persistence=persistence ) done = False while not done: done = await run_node(run_id) async def run_node(run_id: str) -> bool: # (3) persistence = FileStatePersistence(Path(f'count_down_{run_id}.json')) async with count_down_graph.iter_from_persistence(persistence) as run: # (4) node_or_end = await run.next() # (5) print('Node:', node_or_end) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) return isinstance(node_or_end, End) # (6) ``` Create a [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence) to use to start the graph. Call [`graph.initialize()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.initialize) to set the initial graph state in the persistence object. `run_node` is a pure function that doesn't need access to any other process state to run the next node of the graph, except the ID of the run. Call [`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) to create a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun) object that will run the next node of the graph from the state stored in persistence. [`GraphRun.next()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) will return either a [node](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) object. Check if the node is an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) object; if it is, the graph run is complete. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ### Example: Human in the loop As noted above, state persistence allows graphs to be interrupted and resumed. One use case of this is to pause a run while waiting for user input. In this example, an AI asks the user a question, the user provides an answer, and the AI evaluates the answer, ending the run if the user got it right or asking another question if they got it wrong. Instead of running the entire graph in a single process invocation, we run the graph by running the process repeatedly, optionally providing an answer to the question as a command line argument.
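For instance, a session driving this graph might look like the following (a sketch: it assumes the run script shown below is saved as `ai_q_and_a_run.py` with the `asyncio.run(main())` call added, and the actual question asked will of course vary from run to run): Terminal ```bash # First invocation: no persisted state yet, so the graph asks a question and exits python ai_q_and_a_run.py # Second invocation: resume from the persisted state, passing the user's answer python ai_q_and_a_run.py Paris ```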
`ai_q_and_a_graph.py` -- `question_graph` definition ai\_q\_and\_a\_graph.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from typing import Annotated from pydantic import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import ( BaseNode, Edge, End, Graph, GraphRunContext, ) ask_agent = Agent('openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` _(This example is complete, it can be run "as is")_ ai\_q\_and\_a\_run.py ```python import sys from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from pydantic_ai import ModelMessage # noqa: F401 from ai_q_and_a_graph import Answer, Ask, Evaluate, QuestionState, question_graph async def main(): answer: str | None = sys.argv[1] if len(sys.argv) > 1 else None # (1) persistence = FileStatePersistence(Path('question_graph.json')) # (2) persistence.set_graph_types(question_graph) # (3) if snapshot := await persistence.load_next(): # (4) state = snapshot.state assert answer is not None node = Evaluate(answer) else: state = QuestionState() node = Ask() # (5) async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() # (6) if isinstance(node, End): # (7) print('END:', node.data) history = await persistence.load_all() # (8) print([e.node for e
in history]) break elif isinstance(node, Answer): # (9) print(node.question) #> What is the capital of France? break # otherwise just continue ``` Get the user's answer from the command line, if provided. See [question graph example](/docs/ai/examples/question-graph) for a complete example. Create a state persistence instance; the `'question_graph.json'` file may or may not already exist. Since we're using the [persistence interface](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence) outside a graph, we need to call [`set_graph_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_graph_types) to set the graph generic types `StateT` and `RunEndT` for the persistence instance. This is necessary to allow the persistence instance to know how to serialize and deserialize graph nodes. If we've run the graph before, [`load_next`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_next) will return a snapshot of the next node to run; here we use `state` from that snapshot and create a new `Evaluate` node with the answer provided on the command line. If the graph hasn't been run before, we create a new `QuestionState` and start with the `Ask` node. Call [`GraphRun.next()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) to run the node. This will return either a node or an `End` object. If the node is an `End` object, the graph run is complete. The `data` field of the `End` object contains the comment returned by the `evaluate_agent` about the correct answer. To demonstrate the state persistence, we call [`load_all`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_all) to get all the snapshots from the persistence instance. This will return a list of [`Snapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.Snapshot) objects. If the node is an `Answer` object, we print the question and break out of the loop to end the process and wait for user input. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ For a complete example of this graph, see the [question graph example](/docs/ai/examples/question-graph). ## Dependency Injection As with Pydantic AI, `pydantic-graph` supports dependency injection via a generic parameter on [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) and [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode), and the [`GraphRunContext.deps`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.GraphRunContext.deps) field.
As an example of dependency injection, let's modify the `DivisibleBy5` example [above](#graph) to use a [`ProcessPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor) to run the compute load in a separate process (this is a contrived example; `ProcessPoolExecutor` wouldn't actually improve performance here): deps\_example.py ```py from __future__ import annotations import asyncio from concurrent.futures import ProcessPoolExecutor from dataclasses import dataclass from pydantic_graph import BaseNode, End, FullStatePersistence, Graph, GraphRunContext @dataclass class GraphDeps: executor: ProcessPoolExecutor @dataclass class DivisibleBy5(BaseNode[None, GraphDeps, int]): foo: int async def run( self, ctx: GraphRunContext[None, GraphDeps], ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode[None, GraphDeps]): foo: int async def run(self, ctx: GraphRunContext[None, GraphDeps]) -> DivisibleBy5: loop = asyncio.get_running_loop() compute_result = await loop.run_in_executor( ctx.deps.executor, self.compute, ) return DivisibleBy5(compute_result) def compute(self) -> int: return self.foo + 1 fives_graph = Graph(nodes=[DivisibleBy5, Increment]) async def main(): with ProcessPoolExecutor() as executor: deps = GraphDeps(executor) result = await fives_graph.run(DivisibleBy5(3), deps=deps, persistence=FullStatePersistence()) print(result.output) #> 5 # the full history is quite verbose, so we'll just print the summary print([item.node for item in result.persistence.history]) """ [ DivisibleBy5(foo=3), Increment(foo=3), DivisibleBy5(foo=4), Increment(foo=4), DivisibleBy5(foo=5), End(data=5), ] """ ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Mermaid Diagrams Pydantic Graph can generate [mermaid](https://mermaid.js.org/) [`stateDiagram-v2`](https://mermaid.js.org/syntax/stateDiagram.html) diagrams for graphs, as shown above.
These diagrams can be generated with: - [`Graph.mermaid_code`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) to generate the mermaid code for a graph - [`Graph.mermaid_image`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_image) to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) - [`Graph.mermaid_save`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_save) to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) and save it to a file Beyond the diagrams shown above, you can also customize mermaid diagrams with the following options: - [`Edge`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.Edge) allows you to apply a label to an edge - [`BaseNode.docstring_notes`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.docstring_notes) and [`BaseNode.get_note`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.get_note) allow you to add notes to nodes - The [`highlighted_nodes`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) parameter allows you to highlight specific node(s) in the diagram Putting that together, we can edit the last [`ai_q_and_a_graph.py`](#example-human-in-the-loop) example to: - add labels to some edges - add a note to the `Ask` node - highlight the `Answer` node - save the diagram as a `PNG` image to a file ai\_q\_and\_a\_graph\_extra.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from typing import Annotated from pydantic import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import BaseNode, Edge, End, Graph, GraphRunContext ask_agent = Agent('openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask:
print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` _(This example is complete, it can be run "as is")_ The customized diagram can then be saved to a file with [`Graph.mermaid_save`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_save), e.g. `question_graph.mermaid_save('image.png', highlighted_nodes=[Answer])`. This would generate an image that looks like this: ```mermaid --- title: question_graph --- stateDiagram-v2 Ask --> Answer: Ask the question note right of Ask Generate question using GPT-5. end note Answer --> Evaluate Evaluate --> Reprimand Evaluate --> [*]: success Reprimand --> Ask classDef highlighted fill:#fdff32 class Answer highlighted ``` ### Setting Direction of the State Diagram You can specify the direction of the state diagram using one of the following values: - `'TB'`: Top to bottom, the diagram flows vertically from top to bottom. - `'LR'`: Left to right, the diagram flows horizontally from left to right. - `'RL'`: Right to left, the diagram flows horizontally from right to left. - `'BT'`: Bottom to top, the diagram flows vertically from bottom to top. Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB): vending\_machine\_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR') ``` ```mermaid --- title: vending_machine_graph --- stateDiagram-v2 direction LR [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` --- # [Embeddings](https://pydantic.dev/docs/ai/guides/embeddings/) # Embeddings Embeddings are vector representations of text that capture semantic meaning. They're essential for building: - **Semantic search** -- Find documents based on meaning, not just keyword matching - **RAG (Retrieval-Augmented Generation)** -- Retrieve relevant context for your AI agents - **Similarity detection** -- Find similar documents, detect duplicates, or cluster content - **Classification** -- Use embeddings as features for downstream ML models Pydantic AI provides a unified interface for generating embeddings across multiple providers. ## Quick Start The [`Embedder`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder) class is the high-level interface for generating embeddings: embeddings\_quickstart.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Embed a search query result = await embedder.embed_query('What is machine learning?') print(f'Embedding dimensions: {len(result.embeddings[0])}') #> Embedding dimensions: 1536 # Embed multiple documents at once docs = [ 'Machine learning is a subset of AI.', 'Deep learning uses neural networks.', 'Python is a programming language.', ] result = await embedder.embed_documents(docs) print(f'Embedded {len(result.embeddings)} documents') #> Embedded 3 documents ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ Queries vs Documents Some embedding models optimize differently for queries and documents. Use [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query) for search queries and [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents) for content you're indexing.
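To make the semantic search and similarity use cases concrete, here's a minimal sketch that ranks documents by cosine similarity to a query. The `cosine_similarity` helper and the `similarity_search.py` file name are our own illustrations, not part of Pydantic AI -- the library returns the raw vectors, and the ranking is plain Python: similarity\_search.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') def cosine_similarity(a: list[float], b: list[float]) -> float: # cos(a, b) = dot(a, b) / (|a| * |b|) dot = sum(x * y for x, y in zip(a, b)) norm_a = sum(x * x for x in a) ** 0.5 norm_b = sum(y * y for y in b) ** 0.5 return dot / (norm_a * norm_b) async def main(): docs = [ 'Machine learning is a subset of AI.', 'Python is a programming language.', ] doc_result = await embedder.embed_documents(docs) query_result = await embedder.embed_query('What is machine learning?') query_embedding = list(query_result.embeddings[0]) # Rank documents by similarity to the query, highest first ranked = sorted( zip(docs, doc_result.embeddings), key=lambda pair: cosine_similarity(query_embedding, list(pair[1])), reverse=True, ) print(ranked[0][0]) #> Machine learning is a subset of AI. ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_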
## Embedding Result All embed methods return an [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings along with useful metadata. For convenience, you can access embeddings either by index (`result[0]`) or by the original input text (`result['Hello world']`). embedding\_result.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('Hello world') # Access embeddings - each is a sequence of floats embedding = result.embeddings[0] # By index via .embeddings embedding = result[0] # Or directly via __getitem__ embedding = result['Hello world'] # Or by original input text print(f'Dimensions: {len(embedding)}') #> Dimensions: 1536 # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 2 # Calculate cost (requires `genai-prices` to have pricing data for the model) cost = result.cost() print(f'Cost: ${cost.total_price:.6f}') #> Cost: $0.000000 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Providers ### OpenAI [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel) works with OpenAI's embeddings API and any [OpenAI-compatible provider](/docs/ai/models/openai#openai-compatible-models). #### Install To use OpenAI embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: - [pip](#tab-panel-26) - [uv](#tab-panel-27) Terminal ```bash pip install "pydantic-ai-slim[openai]" ``` Terminal ```bash uv add "pydantic-ai-slim[openai]" ``` #### Configuration To use `OpenAIEmbeddingModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable: Terminal ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use the model: openai\_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1536 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available models. 
#### Dimension Control OpenAI's `text-embedding-3-*` models support dimension reduction via the `dimensions` setting: openai\_dimensions.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings embedder = Embedder( 'openai:text-embedding-3-small', settings=EmbeddingSettings(dimensions=256), ) async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 256 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ #### OpenAI-Compatible Providers Since [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel) uses the same provider system as [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), you can use it with any [OpenAI-compatible provider](/docs/ai/models/openai#openai-compatible-models): openai\_compatible\_embeddings.py ```python # Using Azure OpenAI from openai import AsyncAzureOpenAI from pydantic_ai import Embedder from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel from pydantic_ai.providers.openai import OpenAIProvider azure_client = AsyncAzureOpenAI( azure_endpoint='https://your-resource.openai.azure.com', api_version='2024-02-01', api_key='your-azure-key', ) model = OpenAIEmbeddingModel( 'text-embedding-3-small', provider=OpenAIProvider(openai_client=azure_client), ) embedder = Embedder(model) # Using any OpenAI-compatible API model = OpenAIEmbeddingModel( 'your-model-name', provider=OpenAIProvider( base_url='https://your-provider.com/v1', api_key='your-api-key', ), ) embedder = Embedder(model) ``` For providers with dedicated provider classes (like [`OllamaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ollama.OllamaProvider) or [`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider)), you can use the shorthand syntax: ```python from pydantic_ai import Embedder embedder = Embedder('azure:text-embedding-3-small') embedder = Embedder('ollama:nomic-embed-text') ``` See [OpenAI-compatible Models](/docs/ai/models/openai#openai-compatible-models) for the full list of supported providers. ### Google [`GoogleEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingModel) works with Google's embedding models via the Gemini API (Google AI Studio) or Vertex AI. #### Install To use Google embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: - [pip](#tab-panel-28) - [uv](#tab-panel-29) Terminal ```bash pip install "pydantic-ai-slim[google]" ``` Terminal ```bash uv add "pydantic-ai-slim[google]" ``` #### Configuration To use `GoogleEmbeddingModel` with the Gemini API, go to [aistudio.google.com](https://aistudio.google.com/) and generate an API key. Once you have the API key, you can set it as an environment variable: Terminal ```bash export GOOGLE_API_KEY='your-api-key' ``` You can then use the model: google\_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('google-gla:gemini-embedding-001') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 3072 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models. 
##### Vertex AI To use Google's embedding models via Vertex AI instead of the Gemini API, use the `google-vertex` provider prefix: google\_vertex\_embeddings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.google import GoogleEmbeddingModel from pydantic_ai.providers.google import GoogleProvider # Using provider prefix embedder = Embedder('google-vertex:gemini-embedding-001') # Or with explicit provider configuration model = GoogleEmbeddingModel( 'gemini-embedding-001', provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'), ) embedder = Embedder(model) ``` See the [Google provider documentation](/docs/ai/models/google#vertex-ai-enterprisecloud) for more details on Vertex AI authentication options, including application default credentials, service accounts, and API keys. #### Dimension Control Google's embedding models support dimension reduction via the `dimensions` setting: google\_dimensions.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings embedder = Embedder( 'google-gla:gemini-embedding-001', settings=EmbeddingSettings(dimensions=768), ) async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 768 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ #### Google-Specific Settings Google models support additional settings via [`GoogleEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingSettings): google\_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.google import GoogleEmbeddingSettings embedder = Embedder( 'google-gla:gemini-embedding-001', settings=GoogleEmbeddingSettings( dimensions=768, google_task_type='SEMANTIC_SIMILARITY', # Optimize for similarity comparison ), ) ``` See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available task types. By default, `embed_query()` uses `RETRIEVAL_QUERY` and `embed_documents()` uses `RETRIEVAL_DOCUMENT`. ### Cohere [`CohereEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingModel) provides access to Cohere's embedding models, which offer multilingual support and various model sizes. #### Install To use Cohere embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: - [pip](#tab-panel-30) - [uv](#tab-panel-31) Terminal ```bash pip install "pydantic-ai-slim[cohere]" ``` Terminal ```bash uv add "pydantic-ai-slim[cohere]" ``` #### Configuration To use `CohereEmbeddingModel`, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable: Terminal ```bash export CO_API_KEY='your-api-key' ``` You can then use the model: cohere\_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('cohere:embed-v4.0') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models. 
#### Cohere-Specific Settings Cohere models support additional settings via [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings): cohere\_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.cohere import CohereEmbeddingSettings embedder = Embedder( 'cohere:embed-v4.0', settings=CohereEmbeddingSettings( dimensions=512, cohere_truncate='END', # Truncate long inputs instead of erroring cohere_max_tokens=256, # Limit tokens per input ), ) ``` ### VoyageAI [`VoyageAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingModel) provides access to VoyageAI's embedding models, which are optimized for retrieval with specialized models for code, finance, and legal domains. #### Install To use VoyageAI embedding models, you need to install `pydantic-ai-slim` with the `voyageai` optional group: - [pip](#tab-panel-32) - [uv](#tab-panel-33) Terminal ```bash pip install "pydantic-ai-slim[voyageai]" ``` Terminal ```bash uv add "pydantic-ai-slim[voyageai]" ``` #### Configuration To use `VoyageAIEmbeddingModel`, go to [dash.voyageai.com](https://dash.voyageai.com/) to generate an API key. Once you have the API key, you can set it as an environment variable: Terminal ```bash export VOYAGE_API_KEY='your-api-key' ``` You can then use the model: voyageai\_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('voyageai:voyage-3.5') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ See the [VoyageAI Embeddings documentation](https://docs.voyageai.com/docs/embeddings) for available models. #### VoyageAI-Specific Settings VoyageAI models support additional settings via [`VoyageAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingSettings): voyageai\_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingSettings embedder = Embedder( 'voyageai:voyage-3.5', settings=VoyageAIEmbeddingSettings( dimensions=512, # Reduce output dimensions voyageai_input_type='document', # Override input type for all requests ), ) ``` ### Bedrock [`BedrockEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel) provides access to embedding models through AWS Bedrock, including Amazon Titan, Cohere, and Amazon Nova models. #### Install To use Bedrock embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: - [pip](#tab-panel-34) - [uv](#tab-panel-35) Terminal ```bash pip install "pydantic-ai-slim[bedrock]" ``` Terminal ```bash uv add "pydantic-ai-slim[bedrock]" ``` #### Configuration Authentication with AWS Bedrock uses standard AWS credentials. See the [Bedrock provider documentation](/docs/ai/models/bedrock#environment-variables) for details on configuring credentials via environment variables, AWS credentials file, or IAM roles. Ensure your AWS account has access to the Bedrock embedding models you want to use. See [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) for details. 
#### Basic Usage bedrock\_embeddings.py ```python from pydantic_ai import Embedder # Using Amazon Titan embedder = Embedder('bedrock:amazon.titan-embed-text-v2:0') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` _(This example requires AWS credentials configured)_ #### Supported Models Bedrock supports three families of embedding models. See the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for the full list of available models. **Amazon Titan:** - `amazon.titan-embed-text-v1` -- 1536 dimensions (fixed), 8K tokens - `amazon.titan-embed-text-v2:0` -- 256/384/1024 dimensions (configurable, default: 1024), 8K tokens **Cohere Embed:** - `cohere.embed-english-v3` -- English-only, 1024 dimensions (fixed), 512 tokens - `cohere.embed-multilingual-v3` -- Multilingual, 1024 dimensions (fixed), 512 tokens - `cohere.embed-v4:0` -- 256/512/1024/1536 dimensions (configurable, default: 1536), 128K tokens **Amazon Nova:** - `amazon.nova-2-multimodal-embeddings-v1:0` -- 256/384/1024/3072 dimensions (configurable, default: 3072), 8K tokens #### Titan-Specific Settings Titan v2 supports vector normalization for direct similarity calculations via `bedrock_titan_normalize` (default: `True`). Titan v1 does not support this setting. bedrock\_titan.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:amazon.titan-embed-text-v2:0', settings=BedrockEmbeddingSettings( dimensions=512, bedrock_titan_normalize=True, ), ) ``` Note Titan models do not support the `truncate` setting. The `dimensions` setting is only supported by Titan v2. #### Cohere-Specific Settings Cohere models on Bedrock support additional settings via [`BedrockEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings): - `bedrock_cohere_input_type` -- By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`. - `bedrock_cohere_truncate` -- Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting. - `bedrock_cohere_max_tokens` -- Limits tokens per input (default: 128000). Only supported by Cohere v4. bedrock\_cohere.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:cohere.embed-v4:0', settings=BedrockEmbeddingSettings( dimensions=512, bedrock_cohere_max_tokens=1000, bedrock_cohere_truncate='END', ), ) ``` Note The `dimensions` and `bedrock_cohere_max_tokens` settings are only supported by Cohere v4. Cohere v3 models have fixed 1024 dimensions. #### Nova-Specific Settings Nova models on Bedrock support additional settings via [`BedrockEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings): - `bedrock_nova_truncate` -- Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting. - `bedrock_nova_embedding_purpose` -- By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`. 
bedrock\_nova.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0', settings=BedrockEmbeddingSettings( dimensions=1024, bedrock_nova_embedding_purpose='TEXT_RETRIEVAL', truncate=True, ), ) ``` #### Concurrency Settings Models that don't support batch embedding (Titan and Nova) make individual API requests for each input text. By default, these requests run concurrently with a maximum of 5 parallel requests. You can adjust this with the `bedrock_max_concurrency` setting: bedrock\_concurrency.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings # Increase concurrency for faster throughput embedder = Embedder( 'bedrock:amazon.titan-embed-text-v2:0', settings=BedrockEmbeddingSettings(bedrock_max_concurrency=10), ) # Or reduce concurrency to avoid rate limits embedder = Embedder( 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0', settings=BedrockEmbeddingSettings(bedrock_max_concurrency=2), ) ``` #### Regional Prefixes (Cross-Region Inference) Bedrock supports cross-region inference using geographic prefixes like `us.`, `eu.`, or `apac.`: bedrock\_regional.py ```python from pydantic_ai import Embedder embedder = Embedder('bedrock:us.amazon.titan-embed-text-v2:0') ``` #### Using AWS Application Inference Profiles Set [`bedrock_inference_profile`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings.bedrock_inference_profile) to route requests through an inference profile while keeping the base model name for detecting model capabilities: bedrock\_inference\_profile.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider provider = BedrockProvider(region_name='us-east-1') model = BedrockEmbeddingModel( 'amazon.titan-embed-text-v2:0', provider=provider, settings={ 'bedrock_inference_profile': 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-embed-profile', }, ) embedder = Embedder(model) ``` #### Using a Custom Provider For advanced configuration like explicit credentials or a custom boto3 client, you can create a [`BedrockProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.bedrock.BedrockProvider) directly. See the [Bedrock provider documentation](/docs/ai/models/bedrock#provider-argument) for more details. bedrock\_provider.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider provider = BedrockProvider( region_name='us-west-2', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ) model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0', provider=provider) embedder = Embedder(model) ``` Token Counting Bedrock embedding models do not support the `count_tokens()` method because AWS Bedrock's token counting API only works with text generation models (Claude, Llama, etc.), not embedding models. Calling `count_tokens()` will raise `NotImplementedError`. ### Sentence Transformers (Local) [`SentenceTransformerEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel) runs embeddings locally using the [sentence-transformers](https://www.sbert.net/) library. 
This is ideal for: - **Privacy** -- Data never leaves your infrastructure - **Cost** -- No API charges for high-volume workloads - **Offline use** -- No internet connection required after model download #### Install To use Sentence Transformers embedding models, you need to install `pydantic-ai-slim` with the `sentence-transformers` optional group: - [pip](#tab-panel-36) - [uv](#tab-panel-37) Terminal ```bash pip install "pydantic-ai-slim[sentence-transformers]" ``` Terminal ```bash uv add "pydantic-ai-slim[sentence-transformers]" ``` #### Usage sentence\_transformers\_embeddings.py ```python from pydantic_ai import Embedder # Model is downloaded from Hugging Face on first use embedder = Embedder('sentence-transformers:all-MiniLM-L6-v2') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 384 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ See the [Sentence-Transformers pretrained models](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) documentation for available models. #### Device Selection Control which device to use for inference: sentence\_transformers\_device.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformersEmbeddingSettings, ) embedder = Embedder( 'sentence-transformers:all-MiniLM-L6-v2', settings=SentenceTransformersEmbeddingSettings( sentence_transformers_device='cuda', # Use GPU sentence_transformers_normalize_embeddings=True, # L2 normalize ), ) ``` #### Using an Existing Model Instance If you need more control over model initialization: sentence\_transformers\_instance.py ```python from sentence_transformers import SentenceTransformer from pydantic_ai import Embedder from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformerEmbeddingModel, ) # Create and configure the model yourself st_model = SentenceTransformer('all-MiniLM-L6-v2', device='cpu') # Wrap it for use with Pydantic AI model = SentenceTransformerEmbeddingModel(st_model) embedder = Embedder(model) ``` ## Settings [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) provides common configuration options that work across providers: - `dimensions`: Reduce the output embedding dimensions (supported by OpenAI, Google, Cohere, Bedrock, VoyageAI) - `truncate`: When `True`, truncate input text that exceeds the model's context length instead of raising an error (supported by Cohere, Bedrock, VoyageAI) Settings can be specified at the embedder level (applied to all calls) or per-call: embedding\_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings # Default settings for all calls embedder = Embedder( 'openai:text-embedding-3-small', settings=EmbeddingSettings(dimensions=512), ) async def main(): # Override for a specific call result = await embedder.embed_query( 'Hello world', settings=EmbeddingSettings(dimensions=256), ) print(len(result.embeddings[0])) #> 256 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Token Counting You can check token counts before embedding to avoid exceeding model limits: token\_counting.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): text = 'Hello world, this is a test.' 
# Count tokens in text token_count = await embedder.count_tokens(text) print(f'Tokens: {token_count}') #> Tokens: 7 # Check model's maximum input tokens (returns None if unknown) max_tokens = await embedder.max_input_tokens() print(f'Max tokens: {max_tokens}') #> Max tokens: 1024 ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Testing Use [`TestEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.TestEmbeddingModel) for testing without making API calls: testing\_embeddings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel async def test_my_rag_system(): embedder = Embedder('openai:text-embedding-3-small') test_model = TestEmbeddingModel() with embedder.override(model=test_model): result = await embedder.embed_query('test query') # TestEmbeddingModel returns deterministic embeddings assert result.embeddings[0] == [1.0] * 8 # Check what settings were used assert test_model.last_settings is not None ``` ## Instrumentation Enable OpenTelemetry instrumentation for debugging and monitoring: instrumented\_embeddings.py ```python import logfire from pydantic_ai import Embedder logfire.configure() # Instrument a specific embedder embedder = Embedder('openai:text-embedding-3-small', instrument=True) # Or instrument all embedders globally Embedder.instrument_all() ``` See the [Debugging and Monitoring guide](/docs/ai/integrations/logfire) for more details on using Logfire with Pydantic AI. ## Building Custom Embedding Models To integrate a custom embedding provider, subclass [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel): custom\_embedding\_model.py ```python from collections.abc import Sequence from pydantic_ai.embeddings import EmbeddingModel, EmbeddingResult, EmbeddingSettings from pydantic_ai.embeddings.result import EmbedInputType class MyCustomEmbeddingModel(EmbeddingModel): @property def model_name(self) -> str: return 'my-custom-model' @property def system(self) -> str: return 'my-provider' async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) # Call your embedding API here embeddings = [[0.1, 0.2, 0.3] for _ in inputs] # Placeholder return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, model_name=self.model_name, provider_name=self.system, ) ``` Use [`WrapperEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.WrapperEmbeddingModel) if you want to wrap an existing model to add custom behavior like caching or logging. --- # [Extensibility](https://pydantic.dev/docs/ai/guides/extensibility/) # Extensibility Pydantic AI is designed to be extended. [Capabilities](/docs/ai/core-concepts/capabilities) are the primary extension point -- they bundle tools, lifecycle hooks, instructions, and model settings into reusable units that can be shared across agents, packaged as libraries, and loaded from [spec files](/docs/ai/core-concepts/agent-spec). Beyond capabilities, Pydantic AI provides several other extension mechanisms for specialized needs. ## Capabilities Capabilities are the recommended way to extend Pydantic AI. 
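For example, here's a minimal sketch of a caching wrapper. It assumes `WrapperEmbeddingModel` exposes the inner model as `wrapped` and shares the `embed()` signature shown above -- check the API reference for the exact interface:

caching\_embedding\_model.py

```python
from collections.abc import Sequence

from pydantic_ai.embeddings import EmbeddingResult, EmbeddingSettings, WrapperEmbeddingModel
from pydantic_ai.embeddings.result import EmbedInputType


class CachingEmbeddingModel(WrapperEmbeddingModel):
    """Illustrative in-memory cache around another embedding model."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._cache: dict = {}

    async def embed(
        self,
        inputs: str | Sequence[str],
        *,
        input_type: EmbedInputType,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        texts = (inputs,) if isinstance(inputs, str) else tuple(inputs)
        # A real implementation would include the settings in the cache key
        key = (texts, input_type)
        if key not in self._cache:
            # Cache miss: delegate to the wrapped model (attribute name assumed)
            self._cache[key] = await self.wrapped.embed(
                inputs, input_type=input_type, settings=settings
            )
        return self._cache[key]
```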
They are useful for: - **Teams** building reusable internal agent components (guardrails, audit logging, authentication) - **Package authors** shipping extensions that work across models and agents - **Community contributors** sharing solutions to common problems See [Capabilities](/docs/ai/core-concepts/capabilities) for using and building capabilities, and [Hooks](/docs/ai/core-concepts/hooks) for the lightweight decorator-based approach. Tip If you want to contribute a capability, open an issue on [**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness) rather than on pydantic-ai. Most capabilities belong in the harness -- see [What goes where?](/docs/ai/harness/overview#what-goes-where) for the distinction. ## Publishing capability packages To make a capability installable and usable in [agent specs](/docs/ai/core-concepts/agent-spec): 1. **Implement [`get_serialization_name()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name)** -- defaults to the class name. Return `None` to opt out of spec support. 2. **Implement [`from_spec()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec)** -- defaults to `cls(*args, **kwargs)`. Override when your constructor takes non-serializable types. 3. **Package naming** -- use the `pydantic-ai-` prefix (e.g. `pydantic-ai-guardrails`) so users can find your package. 4. **Registration** -- users pass custom capability types via `custom_capability_types` on [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) or [`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file). ```python from pydantic_ai import Agent from my_package import MyCapability agent = Agent.from_file('agent.yaml', custom_capability_types=[MyCapability]) ``` See [Custom capabilities in specs](/docs/ai/core-concepts/agent-spec#custom-capabilities-in-specs) for implementation details. ## Pydantic AI Harness [**Pydantic AI Harness**](/docs/ai/harness/overview) is the official capability library for Pydantic AI -- standalone capabilities like memory, guardrails, and context management live there rather than in core. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full breakdown, or jump to the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix). ## Third-party ecosystem ### Capabilities [Capabilities](/docs/ai/core-concepts/capabilities) are the recommended extension mechanism for packages that need to bundle tools with hooks, instructions, or model settings. See [Third-party capabilities](/docs/ai/core-concepts/capabilities#third-party-capabilities) for community packages. ### Toolsets Many third-party extensions are available as [toolsets](/docs/ai/tools-toolsets/toolsets), which can also be wrapped as [capabilities](/docs/ai/core-concepts/capabilities) to take advantage of hooks, instructions, and model settings. See [Third-party toolsets](/docs/ai/tools-toolsets/toolsets#third-party-toolsets) for the full list. 
## Other extension points ### Custom toolsets For specialized tool execution needs (custom transport, tool filtering, execution wrapping), implement [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) or subclass [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset): - [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) -- full control over tool definitions and execution - [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) -- delegates to a wrapped toolset, override specific methods See [Building a Custom Toolset](/docs/ai/tools-toolsets/toolsets#building-a-custom-toolset) for details. Tip If your toolset also needs to provide instructions, model settings, or hooks, consider building a [custom capability](/docs/ai/core-concepts/capabilities#building-custom-capabilities) instead. ### Custom models For connecting to model providers not yet supported by Pydantic AI, implement [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model): - [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) -- the base interface for model implementations - [`WrapperModel`](/docs/ai/api/models/wrapper/#pydantic_ai.models.wrapper.WrapperModel) -- delegates to a wrapped model, useful for adding instrumentation or transformations See [Custom Models](/docs/ai/models/overview#custom-models) for details. ### Custom agents For custom agent behavior, subclass [`AbstractAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent) or [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent): - [`AbstractAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent) -- the base interface for agent implementations, providing `run`, `run_sync`, and `run_stream` - [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent) -- delegates to a wrapped agent, useful for adding pre/post-processing or context management --- # [Multi-Agent Patterns](https://pydantic.dev/docs/ai/guides/multi-agent-applications/) # Multi-Agent Patterns There are roughly five levels of complexity when building applications with Pydantic AI: 1. Single agent workflows -- what most of the `pydantic_ai` documentation covers 2. [Agent delegation](#agent-delegation) -- agents using another agent via tools 3. [Programmatic agent hand-off](#programmatic-agent-hand-off) -- one agent runs, then application code calls another agent 4. [Graph based control flow](/docs/ai/graph/) -- for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents 5. [Deep Agents](#deep-agents) -- autonomous agents with planning, file operations, task delegation, and sandboxed code execution Of course, you can combine multiple strategies in a single application. ## Agent delegation "Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes. If you want to hand off control to another agent completely, without coming back to the first agent, you can use an [output function](/docs/ai/core-concepts/output#output-functions). Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](/docs/ai/core-concepts/dependencies). 
You'll generally want to pass [`ctx.usage`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.usage) to the [`usage`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run. Multiple models Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.usage) of the run will not be possible, but you can still use [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) -- including `request_limit`, `total_tokens_limit`, and `tool_calls_limit` -- to avoid unexpected costs or runaway tool loops. agent\_delegation\_simple.py ```python from pydantic_ai import Agent, RunContext, UsageLimits joke_selection_agent = Agent( # (1) 'openai:gpt-5.2', instructions=( 'Use the `joke_factory` to generate some jokes, then choose the best. ' 'You must return just a single joke.' ), ) joke_generation_agent = Agent( # (2) 'google-gla:gemini-3-flash-preview', output_type=list[str] ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[None], count: int) -> list[str]: r = await joke_generation_agent.run( # (3) f'Please generate {count} jokes.', usage=ctx.usage, # (4) ) return r.output # (5) result = joke_selection_agent.run_sync( 'Tell me a joke.', usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500), ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) #> RunUsage(input_tokens=165, output_tokens=24, requests=3, tool_calls=1) ``` 1. The "parent" or controlling agent. 2. The "delegate" agent, which is called from within a tool of the parent agent. 3. Call the delegate agent from within a tool of the parent agent. 4. Pass the usage from the parent agent to the delegate agent so the final [`result.usage()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.usage) includes the usage from both agents. 5. Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool. _(This example is complete, it can be run "as is")_ The control flow for this example is pretty simple and can be summarised as follows: ```mermaid graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ### Agent delegation and dependencies Generally the delegate agent needs to either have the same [dependencies](/docs/ai/core-concepts/dependencies) as the calling agent, or dependencies which are a subset of the calling agent's dependencies. Initializing dependencies We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using dependencies in a delegate agent that are not available on the parent. However, this should generally be avoided since it can be significantly slower than reusing connections etc. from the parent agent.
agent\_delegation\_deps.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class ClientAndKey: # (1) http_client: httpx.AsyncClient api_key: str joke_selection_agent = Agent( 'openai:gpt-5.2', deps_type=ClientAndKey, # (2) instructions=( 'Use the `joke_factory` tool to generate some jokes on the given subject, ' 'then choose the best. You must return just a single joke.' ), ) joke_generation_agent = Agent( 'google-gla:gemini-3-flash-preview', deps_type=ClientAndKey, # (4) output_type=list[str], instructions=( 'Use the "get_jokes" tool to get some jokes on the given subject, ' 'then extract each joke into a list.' ), ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]: r = await joke_generation_agent.run( f'Please generate {count} jokes.', deps=ctx.deps, # (3) usage=ctx.usage, ) return r.output @joke_generation_agent.tool # (5) async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str: response = await ctx.deps.http_client.get( 'https://example.com', params={'count': count}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text async def main(): async with httpx.AsyncClient() as client: deps = ClientAndKey(client, 'foobar') result = await joke_selection_agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) # (6) #> RunUsage(input_tokens=220, output_tokens=32, requests=4, tool_calls=2) ``` 1. Define a dataclass to hold the client and API key dependencies. 2. Set the `deps_type` of the calling agent -- `joke_selection_agent` here. 3. Pass the dependencies to the delegate agent's run method within the tool call. 4. Also set the `deps_type` of the delegate agent -- `joke_generation_agent` here. 5. Define a tool on the delegate agent that uses the dependencies to make an HTTP request. 6. Usage now includes 4 requests -- 2 from the calling agent and 2 from the delegate agent. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ This example shows how even a fairly simple agent delegation can lead to a complex control flow: ```mermaid graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> get_jokes["get_jokes (tool)"] get_jokes --> http_request["HTTP request"] http_request --> get_jokes get_jokes --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ## Programmatic agent hand-off "Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next. Here agents don't need to use the same deps. Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference.
programmatic\_handoff.py ```python from typing import Literal from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import Agent, ModelMessage, RunContext, RunUsage, UsageLimits class FlightDetails(BaseModel): flight_number: str class Failed(BaseModel): """Unable to find a satisfactory choice.""" flight_search_agent = Agent[None, FlightDetails | Failed]( # (1) 'openai:gpt-5.2', output_type=FlightDetails | Failed, # type: ignore instructions=( 'Use the "flight_search" tool to find a flight ' 'from the given origin to the given destination.' ), ) @flight_search_agent.tool # (2) async def flight_search( ctx: RunContext[None], origin: str, destination: str ) -> FlightDetails | None: # in reality, this would call a flight search API or # use a browser to scrape a flight search website return FlightDetails(flight_number='AK456') usage_limits = UsageLimits(request_limit=15) # (3) async def find_flight(usage: RunUsage) -> FlightDetails | None: # (4) message_history: list[ModelMessage] | None = None for _ in range(3): prompt = Prompt.ask( 'Where would you like to fly from and to?', ) result = await flight_search_agent.run( prompt, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, FlightDetails): return result.output else: message_history = result.all_messages( output_tool_return_content='Please try again.' ) class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( # (5) 'openai:gpt-5.2', output_type=SeatPreference | Failed, # type: ignore instructions=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14 and 20 also have extra leg room. ' ), ) async def find_seat(usage: RunUsage) -> SeatPreference: # (6) message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def main(): # (7) usage: RunUsage = RunUsage() opt_flight_details = await find_flight(usage) if opt_flight_details is not None: print(f'Flight found: {opt_flight_details.flight_number}') #> Flight found: AK456 seat_preference = await find_seat(usage) print(f'Seat preference: {seat_preference}') #> Seat preference: row=1 seat='A' ``` 1. Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands; see [structured output](/docs/ai/core-concepts/output#structured-output). We use a union as the output type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool. 2. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary. 3. Define usage limits for the entire app. 4. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight.
5. As with `flight_search_agent` above, we use an explicit type annotation to define the agent. 6. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference. 7. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ The control flow for this example can be summarised as follows: ```mermaid graph TB START --> ask_user_flight["ask user for flight"] subgraph find_flight flight_search_agent --> ask_user_flight ask_user_flight --> flight_search_agent end flight_search_agent --> ask_user_seat["ask user for seat"] flight_search_agent --> END subgraph find_seat seat_preference_agent --> ask_user_seat ask_user_seat --> seat_preference_agent end seat_preference_agent --> END ``` ## Pydantic Graphs See the [graph](/docs/ai/graph/) documentation on when and how to use graphs. ## Deep Agents Deep agents are autonomous agents that combine multiple architectural patterns and capabilities to handle complex, multi-step tasks reliably. These patterns can be implemented using Pydantic AI's built-in features and (third-party) toolsets: - **Planning and progress tracking** -- agents break down complex tasks into steps and track their progress, giving users visibility into what the agent is working on. See [Task Management toolsets](/docs/ai/tools-toolsets/toolsets#task-management). - **File system operations** -- reading, writing, and editing files with proper abstraction layers that work across in-memory storage, real file systems, and sandboxed containers. See [File Operations toolsets](/docs/ai/tools-toolsets/toolsets#file-operations). - **Task delegation** -- spawning specialized sub-agents for specific tasks, with isolated context to prevent recursive delegation issues. See [Agent Delegation](#agent-delegation) above. - **Sandboxed code execution** -- running AI-generated code in isolated environments (typically Docker containers) to prevent accidents. See [Code Execution toolsets](/docs/ai/tools-toolsets/toolsets#code-execution). - **Context management** -- automatic conversation summarization to handle long sessions that would otherwise exceed token limits. See [Processing Message History](/docs/ai/core-concepts/message-history#processing-message-history). - **Human-in-the-loop** -- approval workflows for dangerous operations like code execution or file deletion. See [Requiring Tool Approval](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval). - **Durable execution** -- preserving agent state across transient API failures and application errors or restarts. See [Durable Execution](/docs/ai/integrations/durable_execution/overview). In addition, the community maintains packages that bring these concepts together in a more opinionated way: - [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) by [Vstorm](https://vstorm.co/) ## Observing Multi-Agent Systems Multi-agent systems can be challenging to debug due to their complexity; when multiple agents interact, understanding the flow of execution becomes essential. ### Tracing Agent Delegation With [Logfire](/docs/ai/integrations/logfire), you can trace the entire flow across multiple agents: ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() # Your multi-agent code here...
``` Logfire shows you: - **Which agent handled which part** of the request - **Delegation decisions** -- when and why one agent called another - **End-to-end latency** broken down by agent - **Token usage and costs** per agent - **What triggered the agent run** -- the HTTP request, scheduled job, or user action that started it all - **What happened inside tool calls** -- database queries, HTTP requests, file operations, and any other instrumented code that tools execute This is essential for understanding and optimizing complex agent workflows. When something goes wrong in a multi-agent system, you'll see exactly which agent failed and what it was trying to do, and whether the problem was in the agent's reasoning or in the backend systems it called. ### Full-Stack Visibility If your Pydantic AI application includes a TypeScript frontend, API gateway, or services in other languages, Logfire can trace them too -- Logfire provides SDKs for Python, JavaScript/TypeScript, and Rust, plus compatibility with any OpenTelemetry-instrumented application. See traces from your entire stack in a unified view. For details on sending data from other languages using standard OpenTelemetry, see the [alternative clients guide](https://logfire.pydantic.dev/docs/how-to-guides/alternative-clients/). Pydantic AI's instrumentation is built on [OpenTelemetry](https://opentelemetry.io/), so you can also use any OTel-compatible backend. See the [Logfire integration guide](/docs/ai/integrations/logfire) for details. ## Examples The following examples demonstrate how to use multi-agent patterns in Pydantic AI: - [Flight booking](/docs/ai/examples/flight-booking) --- # [Testing](https://pydantic.dev/docs/ai/guides/testing/) # Testing Writing unit tests for Pydantic AI code is just like writing unit tests for any other Python code. Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests. Unless you're really sure you know better, you'll probably want to follow roughly this strategy: - Use [`pytest`](https://docs.pytest.org/en/stable/) as your test harness - If you find yourself typing out long assertions, use [inline-snapshot](https://15r10nk.github.io/inline-snapshot/latest/) - Similarly, [dirty-equals](https://dirty-equals.helpmanual.io/latest/) can be useful for comparing large data structures - Use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) or [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) in place of your actual model to avoid the usage, latency and variability of real LLM calls - Use [`Agent.override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) to replace an agent's model, dependencies, or toolsets inside your application logic - Set [`ALLOW_MODEL_REQUESTS=False`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS) globally to block any requests from being made to non-test models accidentally ### Unit testing with `TestModel` The simplest and fastest way to exercise most of your application code is using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel). By default, this will call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent.
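Before the full worked example below, here's what the smallest possible `TestModel` test looks like for an agent with no tools and a `str` output type (the exact placeholder text `TestModel` returns may vary between versions, so treat the asserted value as illustrative):

test\_simple\_agent.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

my_agent = Agent('openai:gpt-5.2', instructions='Be helpful.')


async def test_my_agent():
    """Unit test for my_agent, to be run by pytest."""
    with my_agent.override(model=TestModel()):
        result = await my_agent.run('Testing my agent...')
        # With no tools registered, TestModel returns a fixed summary string
        assert result.output == 'success (no tool calls)'
```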
`TestModel` is not magic The "clever" (but not too clever) part of `TestModel` is that it will attempt to generate valid structured data for [function tools](/docs/ai/tools-toolsets/tools) and [output types](/docs/ai/core-concepts/output#structured-output) based on the schema of the registered tools. There's no ML or AI in `TestModel`; it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool. The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases. If you want something more sophisticated, use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) and write your own data generation logic. Let's write unit tests for the following application code: weather\_app.py ```python import asyncio from datetime import date from pydantic_ai import Agent, RunContext from fake_database import DatabaseConn # (1) from weather_service import WeatherService # (2) weather_agent = Agent( 'openai:gpt-5.2', deps_type=WeatherService, instructions='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date ) -> str: if forecast_date < date.today(): # (3) return ctx.deps.get_historic_weather(location, forecast_date) else: return ctx.deps.get_forecast(location, forecast_date) async def run_weather_forecast( # (4) user_prompts: list[tuple[str, int]], conn: DatabaseConn ): """Run weather forecast for a list of user prompts and save.""" async with WeatherService() as weather_service: async def run_forecast(prompt: str, user_id: int): result = await weather_agent.run(prompt, deps=weather_service) await conn.store_forecast(user_id, result.output) # run all prompts in parallel await asyncio.gather( *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts) ) ``` 1. `DatabaseConn` is a class that holds a database connection. 2. `WeatherService` has methods to get weather forecasts and historic data about the weather. 3. We need to call a different endpoint depending on whether the date is in the past or the future; you'll see why this nuance is important below. 4. This function is the code we want to test, together with the agent it uses. Here we have a function that takes a list of `(user_prompt, user_id)` tuples, gets a weather forecast for each prompt, and stores the result in the database. **We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.** Here's how we would write tests using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel): test\_weather\_app.py ```python from datetime import timezone import pytest from dirty_equals import IsNow, IsStr from pydantic_ai import models, capture_run_messages, RequestUsage from pydantic_ai.models.test import TestModel from pydantic_ai import ( ModelResponse, TextPart, ToolCallPart, ToolReturnPart, UserPromptPart, ModelRequest, ) from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio # (1) models.ALLOW_MODEL_REQUESTS = False # (2) async def test_forecast(): conn = DatabaseConn() user_id = 1 with capture_run_messages() as messages: with weather_agent.override(model=TestModel()): # (3) prompt = 'What will the weather be like in London on 2024-11-28?'
await run_weather_forecast([(prompt, user_id)], conn) # (4) forecast = await conn.get_forecast(user_id) assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}' # (5) assert messages == [ # (6) ModelRequest( parts=[ UserPromptPart( content='What will the weather be like in London on 2024-11-28?', timestamp=IsNow(tz=timezone.utc), # (7) ), ], instructions='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelResponse( parts=[ ToolCallPart( tool_name='weather_forecast', args={ 'location': 'a', 'forecast_date': '2024-01-01', # (8) }, tool_call_id=IsStr(), ) ], usage=RequestUsage( input_tokens=60, output_tokens=7, ), model_name='test', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelRequest( parts=[ ToolReturnPart( tool_name='weather_forecast', content='Sunny with a chance of rain', tool_call_id=IsStr(), timestamp=IsNow(tz=timezone.utc), ), ], instructions='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelResponse( parts=[ TextPart( content='{"weather_forecast":"Sunny with a chance of rain"}', ) ], usage=RequestUsage( input_tokens=66, output_tokens=16, ), model_name='test', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ] ``` 1. We're using [anyio](https://anyio.readthedocs.io/en/stable/) to run async tests. 2. This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing; see [`ALLOW_MODEL_REQUESTS`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS) for more details. 3. We're using [`Agent.override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) to replace the agent's model with [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel); the nice thing about `override` is that we can replace the model inside the agent without needing access to the agent `run*` methods' call sites. 4. Now we call the function we want to test inside the `override` context manager. 5. By default, `TestModel` will return a JSON string summarising the tool calls made, and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add [`custom_output_text='Sunny'`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel.custom_output_text) when defining `TestModel`. 6. So far we don't actually know which tools were called and with which values; we can use [`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected. 7. The [`IsNow`](https://dirty-equals.helpmanual.io/latest/types/datetime/#dirty_equals.IsNow) helper allows us to use declarative asserts even with data which will contain timestamps that change over time. 8. `TestModel` isn't doing anything clever to extract values from the prompt, so these values are hardcoded. ### Unit testing with `FunctionModel` The above tests are a great start, but careful readers will notice that `WeatherService.get_forecast` is never called since `TestModel` calls `weather_forecast` with a date in the past. To fully exercise `weather_forecast`, we need to use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to customise how the tool is called.
Here's an example of using `FunctionModel` to test the `weather_forecast` tool with custom inputs: test\_weather\_app2.py ```python import re import pytest from pydantic_ai import models from pydantic_ai import ( ModelMessage, ModelResponse, TextPart, ToolCallPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio models.ALLOW_MODEL_REQUESTS = False def call_weather_forecast( # (1) messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: if len(messages) == 1: # first call, call the weather forecast tool user_prompt = messages[0].parts[-1] m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content) assert m is not None args = {'location': 'London', 'forecast_date': m.group()} # (2) return ModelResponse(parts=[ToolCallPart('weather_forecast', args)]) else: # second call, return the forecast msg = messages[-1].parts[0] assert msg.part_kind == 'tool-return' return ModelResponse(parts=[TextPart(f'The forecast is: {msg.content}')]) async def test_forecast_future(): conn = DatabaseConn() user_id = 1 with weather_agent.override(model=FunctionModel(call_weather_forecast)): # (3) prompt = 'What will the weather be like in London on 2032-01-01?' await run_weather_forecast([(prompt, user_id)], conn) forecast = await conn.get_forecast(user_id) assert forecast == 'The forecast is: Rainy with a chance of sun' ``` 1. We define a function `call_weather_forecast` that will be called by `FunctionModel` in place of the LLM; this function has access to the list of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)s that make up the run, and [`AgentInfo`](/docs/ai/api/models/function/#pydantic_ai.models.function.AgentInfo) which contains information about the agent and the function tools and return tools. 2. Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard codes the location. 3. We use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to replace the agent's model with our custom function. ### Overriding model via pytest fixtures If you're writing lots of tests that all require the model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) or [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) in a reusable way. Here's an example of a fixture that overrides the model with `TestModel`: test\_agent.py ```python import pytest from pydantic_ai.models.test import TestModel from weather_app import weather_agent @pytest.fixture def override_weather_agent(): with weather_agent.override(model=TestModel()): yield async def test_forecast(override_weather_agent: None): ... # test code here ``` --- # [Web Chat UI](https://pydantic.dev/docs/ai/guides/web/) # Web Chat UI Pydantic AI includes a built-in web chat interface that you can use to interact with your agents through a browser. ![Web Chat UI](/docs/ai/img/web-chat-ui.png) For CLI usage with `clai web`, see the [CLI - Web Chat UI documentation](/docs/ai/integrations/cli#web-chat-ui). Note The web UI is meant for local development and debugging. In production, you can use one of the [UI Event Stream integrations](/docs/ai/integrations/ui/overview) to connect your agent to a custom frontend.
--- # [Web Chat UI](https://pydantic.dev/docs/ai/guides/web/) # Web Chat UI Pydantic AI includes a built-in web chat interface that you can use to interact with your agents through a browser. ![Web Chat UI](/docs/ai/img/web-chat-ui.png) For CLI usage with `clai web`, see the [CLI - Web Chat UI documentation](/docs/ai/integrations/cli#web-chat-ui). Note The web UI is meant for local development and debugging. In production, you can use one of the [UI Event Stream integrations](/docs/ai/integrations/ui/overview) to connect your agent to a custom frontend. ## Installation Install the `web` extra (installs Starlette and Uvicorn): - [pip](#tab-panel-24) - [uv](#tab-panel-25) Terminal

```bash
pip install 'pydantic-ai-slim[web]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[web]'
```

## Basic Usage Create a web app from an agent instance using [`Agent.to_web()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.to_web):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You are a helpful assistant.')


@agent.tool_plain
def get_weather(city: str) -> str:
    return f'The weather in {city} is sunny'


app = agent.to_web()
```

Run the app with any ASGI server: Terminal

```bash
uvicorn my_module:app --host 127.0.0.1 --port 7932
```

## Configuring Models You can specify additional models to make available in the UI. Models can be provided as a list of model names/instances or a dictionary mapping display labels to model names/instances.

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

# Model with custom configuration
anthropic_model = AnthropicModel('claude-sonnet-4-5')

agent = Agent('openai:gpt-5.2')
app = agent.to_web(
    models=['openai:gpt-5.2', anthropic_model],
)

# Or with custom display labels
app = agent.to_web(
    models={'GPT 5.2': 'openai:gpt-5.2', 'Claude': anthropic_model},
)
```

## Builtin Tool Support You can specify a list of [builtin tools](/docs/ai/tools-toolsets/builtin-tools) that will be shown as options to the user, if the selected model supports them:

```python
from pydantic_ai import Agent
from pydantic_ai.builtin_tools import CodeExecutionTool, WebSearchTool

agent = Agent('openai:gpt-5.2')
app = agent.to_web(
    models=['anthropic:claude-sonnet-4-6'],
    builtin_tools=[CodeExecutionTool(), WebSearchTool()],
)
```

Memory Tool The `memory` builtin tool is not supported via `to_web()` or `clai web`. If your agent needs memory, configure the [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) directly on the agent at construction time. ## Extra Instructions You can pass extra instructions that will be included in each agent run:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
app = agent.to_web(instructions='Always respond in a friendly tone.')
```

## Reserved Routes The web UI app uses the following routes, which should not be overridden: - `/` and `/{id}` - Serves the chat UI - `/api/chat` - Chat endpoint (POST, OPTIONS) - `/api/configure` - Frontend configuration (GET) - `/api/health` - Health check (GET) The app cannot currently be mounted at a subpath (e.g., `/chat`) because the UI expects these routes at the root. You can add additional routes to the app, but avoid conflicts with these reserved paths.
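For instance, since the returned app is a plain Starlette application (the `web` extra installs Starlette), you can register extra routes on it; a sketch, where the `/ping` endpoint is illustrative and not part of Pydantic AI:

```python
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
app = agent.to_web()


async def ping(request: Request) -> JSONResponse:
    return JSONResponse({'status': 'ok'})


# Insert before the catch-all `/{id}` route so the new path isn't shadowed
app.router.routes.insert(0, Route('/ping', ping, methods=['GET']))
```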
## Custom HTML Source By default, the web UI is fetched from a CDN and cached locally. You can provide `html_source` to override this for offline usage or enterprise environments. For offline usage, download the HTML file once while you have internet access:

```python
from pydantic_ai.ui import DEFAULT_HTML_URL

print(DEFAULT_HTML_URL)  # Use this URL to download the UI HTML file
#> https://cdn.jsdelivr.net/npm/@pydantic/ai-chat-ui@1.0.0/dist/index.html
```

You can then download the file using the URL printed above: Terminal

```bash
curl -o ~/pydantic-ai-ui.html https://cdn.jsdelivr.net/npm/@pydantic/ai-chat-ui@1.0.0/dist/index.html
```

Then use `html_source` to point to your local file or custom URL:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

# Use a local file (e.g., for offline usage)
app = agent.to_web(html_source='~/pydantic-ai-ui.html')

# Or use a custom URL (e.g., for enterprise environments)
app = agent.to_web(html_source='https://cdn.example.com/ui/index.html')
```

--- # [Code Mode](https://pydantic.dev/docs/ai/harness/code-mode/) # Code Mode Code mode is one of the capabilities in [**Pydantic AI Harness**](/docs/ai/harness/overview), the official capability library for Pydantic AI. The full docs live in the [harness repo](https://github.com/pydantic/pydantic-ai-harness) -- this page is a short intro. [`CodeMode`](https://github.com/pydantic/pydantic-ai-harness/blob/main/pydantic_ai_harness/code_mode/README.md) wraps your tools into a single `run_code` tool powered by our [Monty](https://github.com/pydantic/monty) sandbox. The model writes Python that calls multiple tools with loops, conditionals, variables, and `asyncio.gather` -- all inside one tool call. Standard tool calling requires one model round-trip per tool call. An agent that needs to fetch 10 items and process each one makes 11+ model calls -- slow, expensive, and context-heavy. Code mode collapses that into one. ## Usage Terminal

```bash
uv add "pydantic-ai-harness[code-mode]"
```

```python
from pydantic_ai import Agent
from pydantic_ai_harness import CodeMode

agent = Agent('anthropic:claude-sonnet-4-6', capabilities=[CodeMode()])


@agent.tool_plain
def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    return {'city': city, 'temp_f': 72, 'condition': 'sunny'}


@agent.tool_plain
def convert_temp(fahrenheit: float) -> float:
    """Convert Fahrenheit to Celsius."""
    return round((fahrenheit - 32) * 5 / 9, 1)


result = agent.run_sync("What's the weather in Paris and Tokyo, in Celsius?")
print(result.output)
```

The model writes code like:

```python
paris, tokyo = await asyncio.gather(
    get_weather(city='Paris'),
    get_weather(city='Tokyo'),
)
paris_c = await convert_temp(fahrenheit=paris['temp_f'])
tokyo_c = await convert_temp(fahrenheit=tokyo['temp_f'])
{'paris': paris_c, 'tokyo': tokyo_c}
```

## Full documentation See the [Code Mode README](https://github.com/pydantic/pydantic-ai-harness/blob/main/pydantic_ai_harness/code_mode/README.md) in the harness repo for selective tool sandboxing, metadata-based selection, return value handling, REPL state, observability, sandbox restrictions, the full API, and agent spec usage. --- # [Overview](https://pydantic.dev/docs/ai/harness/overview/) # Overview [**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness) is the official [capability](/docs/ai/core-concepts/capabilities) library for Pydantic AI -- the batteries for your agent. Terminal

```bash
uv add pydantic-ai-harness
```

## What goes where?
Pydantic AI core ships the agent loop, model providers, the capabilities/hooks abstraction, and two kinds of capabilities: - **Capabilities that require model or framework support** -- anything backed by provider built-in tools (like [image generation](/docs/ai/core-concepts/capabilities#provider-adaptive-tools)), provider-specific APIs (like compaction via the OpenAI or Anthropic APIs), or deep agent graph integration. These go hand-in-hand with model class code and need to ship together. - **Capabilities that are fundamental to the agent experience** -- things nearly every agent benefits from, like [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [tool search](/docs/ai/tools-toolsets/tools-advanced#tool-search), and [thinking](/docs/ai/core-concepts/capabilities#thinking). These feel like qualities of the agent itself, not accessories. **Pydantic AI Harness** is where everything else lives: standalone capabilities that make specific categories of agents powerful, or that are still finding their final shape. Context management, memory, guardrails, file system access, code execution, multi-agent orchestration -- these are the building blocks you pick and choose based on what your agent needs to do. The harness is also where new capabilities _start_. It ships as a separate package so capabilities can iterate faster without the strict backward-compatibility requirements of core. As a capability stabilizes and proves itself broadly essential, it can graduate into core -- [code mode](https://github.com/pydantic/pydantic-ai-harness/tree/main/pydantic_ai_harness/code_mode) is an early candidate. Many capabilities benefit from a "fall up" pattern: they typically start as a local implementation that works with every model, then gain provider-native support that uses the provider's built-in API when available -- auto-switching between the two. This is how [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [web fetch](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), and [image generation](/docs/ai/core-concepts/capabilities#provider-adaptive-tools) already work in core, and the same approach is coming for skills, code mode, and context compaction. ## What's in it? See the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix) on the harness README for the full list of what's available and what's coming, along with community alternatives. ## Contributing capabilities Capability contributions should go to the [harness repo](https://github.com/pydantic/pydantic-ai-harness), not to pydantic-ai. The capabilities abstraction gives contributions clear boundaries, which makes them easier to review. See [Contributing](https://github.com/pydantic/pydantic-ai-harness#contributing) for details. ## Build your own You can also publish capabilities as standalone packages. See [Building custom capabilities](/docs/ai/core-concepts/capabilities#building-custom-capabilities) for the API and [Publishing capability packages](/docs/ai/guides/extensibility#publishing-capability-packages) for packaging guidance. 
## Links - [GitHub](https://github.com/pydantic/pydantic-ai-harness) - [PyPI](https://pypi.org/project/pydantic-ai-harness/) - [Capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix) - [Version policy](https://github.com/pydantic/pydantic-ai-harness#version-policy) - [Contributing](https://github.com/pydantic/pydantic-ai-harness#contributing) --- # [Agent2Agent (A2A)](https://pydantic.dev/docs/ai/integrations/a2a/) # Agent2Agent (A2A) The [Agent2Agent (A2A) Protocol](https://google.github.io/A2A/) is an open standard introduced by Google that enables communication and interoperability between AI agents, regardless of the framework or vendor they are built on. At Pydantic, we built the [FastA2A](#fasta2a) library to make it easier to implement the A2A protocol in Python. We also built a convenience method that exposes Pydantic AI agents as A2A servers -- let's have a quick look at how to use it: agent\_to\_a2a.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = agent.to_a2a()
```

_You can run the example with `uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000`_ This will expose the agent as an A2A server, and you can start sending requests to it. See more about [exposing Pydantic AI agents as A2A servers](#pydantic-ai-agent-to-a2a-server). ## FastA2A **FastA2A** is an agentic-framework-agnostic implementation of the A2A protocol in Python. The library is designed to be used with any agentic framework, and is **not exclusive to Pydantic AI**. ### Design **FastA2A** is built on top of [Starlette](https://www.starlette.io), which means it's fully compatible with any ASGI server. Given the nature of the A2A protocol, it's important to understand the design before using it; as a developer, you'll need to provide some components: - [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage): to save and load tasks, as well as store context for conversations - [`Broker`](/docs/ai/api/fasta2a/#fasta2a.Broker): to schedule tasks - [`Worker`](/docs/ai/api/fasta2a/#fasta2a.Worker): to execute tasks Let's have a look at how those components fit together:
```mermaid
flowchart TB
    Server["HTTP Server"] <--> |Sends Requests/<br>Receives Results| TM

    subgraph CC[Core Components]
        direction RL
        TM["TaskManager<br>(coordinates)"] --> |Schedules Tasks| Broker
        TM <--> Storage
        Broker["Broker<br>(queues & schedules)"] <--> Storage["Storage<br>(persistence)"]
        Broker --> |Delegates Execution| Worker
    end

    Worker["Worker<br>(implementation)"]
```
FastA2A allows you to bring your own [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage), [`Broker`](/docs/ai/api/fasta2a/#fasta2a.Broker) and [`Worker`](/docs/ai/api/fasta2a/#fasta2a.Worker). #### Understanding Tasks and Context In the A2A protocol: - **Task**: Represents one complete execution of an agent. When a client sends a message to the agent, a new task is created. The agent runs until completion (or failure), and this entire execution is considered one task. The final output is stored as a task artifact. - **Context**: Represents a conversation thread that can span multiple tasks. The A2A protocol uses a `context_id` to maintain conversation continuity: - When a new message is sent without a `context_id`, the server generates a new one - Subsequent messages can include the same `context_id` to continue the conversation - All tasks sharing the same `context_id` have access to the complete message history #### Storage Architecture The [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage) component serves two purposes: 1. **Task Storage**: Stores tasks in A2A protocol format, including their status, artifacts, and message history 2. **Context Storage**: Stores conversation context in a format optimized for the specific agent implementation This design allows agents to store rich internal state (e.g., tool calls, reasoning traces) as well as task-specific A2A-formatted messages and artifacts. For example, a Pydantic AI agent might store its complete internal message format (including tool calls and responses) in the context storage, while storing only the A2A-compliant messages in the task history. ### Installation FastA2A is available on PyPI as [`fasta2a`](https://pypi.org/project/fasta2a/) so installation is as simple as: - [pip](#tab-panel-46) - [uv](#tab-panel-47) Terminal

```bash
pip install fasta2a
```

Terminal

```bash
uv add fasta2a
```

The only dependencies are: - [starlette](https://www.starlette.io): to expose the A2A server as an [ASGI application](https://asgi.readthedocs.io/en/latest/) - [pydantic](https://pydantic.dev): to validate the request/response messages - [opentelemetry-api](https://opentelemetry-python.readthedocs.io/en/latest): to provide tracing capabilities You can install Pydantic AI with the `a2a` extra to include **FastA2A**: - [pip](#tab-panel-48) - [uv](#tab-panel-49) Terminal

```bash
pip install 'pydantic-ai-slim[a2a]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[a2a]'
```

### Pydantic AI Agent to A2A Server To expose a Pydantic AI agent as an A2A server, you can use the `to_a2a` method: agent\_to\_a2a.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = agent.to_a2a()
```

Since `app` is an ASGI application, it can be used with any ASGI server. Terminal

```bash
uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000
```

Since the goal of `to_a2a` is to be a convenience method, it accepts the same arguments as the [`FastA2A`](/docs/ai/api/fasta2a/#fasta2a.FastA2A) constructor. When using `to_a2a()`, Pydantic AI automatically: - Stores the complete conversation history (including tool calls and responses) in the context storage - Ensures that subsequent messages with the same `context_id` have access to the full conversation history - Persists agent results as A2A artifacts: - String results become `TextPart` artifacts and also appear in the message history - Structured data (Pydantic models, dataclasses, tuples, etc.) become `DataPart` artifacts with the data wrapped as `{"result": ...}` - Artifacts include metadata with type information and JSON schema when available
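As a sketch of the pluggable design, you can pass the components explicitly -- assuming the in-memory `Storage` and `Broker` implementations shipped with `fasta2a` (its defaults), and that `to_a2a` forwards these constructor arguments as described above:

```python
from fasta2a.broker import InMemoryBroker
from fasta2a.storage import InMemoryStorage

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

# Explicitly pass the (default) in-memory implementations; swap in your own
# Storage/Broker subclasses here to persist tasks or distribute work.
app = agent.to_a2a(storage=InMemoryStorage(), broker=InMemoryBroker())
```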
--- # [Command Line Interface (CLI)](https://pydantic.dev/docs/ai/integrations/cli/) # Command Line Interface (CLI) **Pydantic AI** comes with a CLI, `clai` (pronounced "clay"). You can use it to chat with various LLMs and quickly get answers, right from the command line, or spin up a uvicorn server to chat with your Pydantic AI agents from your browser. ## Installation You can run `clai` using [`uvx`](https://docs.astral.sh/uv/guides/tools/): Terminal

```bash
uvx clai
```

Or install `clai` globally [with `uv`](https://docs.astral.sh/uv/guides/tools/#installing-tools): Terminal

```bash
uv tool install clai
...
clai
```

Or with `pip`: Terminal

```bash
pip install clai
...
clai
```

## CLI Usage You'll need to set an environment variable depending on the provider you intend to use. E.g. if you're using OpenAI, set the `OPENAI_API_KEY` environment variable: Terminal

```bash
export OPENAI_API_KEY='your-api-key-here'
```

Then running `clai` will start an interactive session where you can chat with the AI model. Special commands available in interactive mode: - `/exit`: Exit the session - `/markdown`: Show the last response in markdown format - `/multiline`: Toggle multiline input mode (use Ctrl+D to submit) - `/cp`: Copy the last response to clipboard ### CLI Options

| Option | Description |
| --- | --- |
| `prompt` | AI prompt for one-shot mode (positional). If omitted, starts interactive mode. |
| `-m`, `--model` | Model to use in `provider:model` format (e.g., `openai:gpt-5.2`) |
| `-a`, `--agent` | Custom agent in `module:variable` format |
| `-t`, `--code-theme` | Syntax highlighting theme (`dark`, `light`, or a [pygments theme](https://pygments.org/styles/)) |
| `--no-stream` | Disable streaming from the model |
| `-l`, `--list-models` | List all available models and exit |
| `--version` | Show version and exit |

### Choose a model You can specify which model to use with the `--model` flag: Terminal

```bash
clai --model anthropic:claude-sonnet-4-6
```

(a full list of models available can be printed with `clai --list-models`) ### Custom Agents You can specify a custom agent using the `--agent` flag with a module path and variable name: custom\_agent.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')
```

Then run: Terminal

```bash
clai --agent custom_agent:agent "What's the weather today?"
```
The format must be `module:variable` where: - `module` is the importable Python module path - `variable` is the name of the Agent instance in that module Additionally, you can directly launch CLI mode from an `Agent` instance using `Agent.to_cli_sync()`: agent\_to\_cli\_sync.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')
agent.to_cli_sync()
```

You can also use the async interface with `Agent.to_cli()`: agent\_to\_cli.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')


async def main():
    await agent.to_cli()
```

_(You'll need to add `asyncio.run(main())` to run `main`)_ ### Message History Both `Agent.to_cli()` and `Agent.to_cli_sync()` support a `message_history` parameter, allowing you to continue an existing conversation or provide conversation context: agent\_with\_history.py

```python
from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)

agent = Agent('openai:gpt-5.2')

# Create some conversation history
message_history: list[ModelMessage] = [
    ModelRequest([UserPromptPart(content='What is 2+2?')]),
    ModelResponse([TextPart(content='2+2 equals 4.')]),
]

# Start CLI with existing conversation context
agent.to_cli_sync(message_history=message_history)
```

The CLI will start with the provided conversation history, allowing the agent to refer back to previous exchanges and maintain context throughout the session. ## Web Chat UI Launch a web-based chat interface by running: Terminal

```bash
clai web -m openai:gpt-5.2
```

This will start a web server (default: [http://127.0.0.1:7932](http://127.0.0.1:7932)) with a chat interface. You can also serve an existing agent. For example, if you have an agent defined in `my_agent.py`:

```python
from pydantic_ai import Agent

my_agent = Agent('openai:gpt-5.2', instructions='You are a helpful assistant.')
```

Launch the web UI: Terminal

```bash
# With a custom agent
clai web --agent my_agent:my_agent

# With specific models (first is default when no --agent)
clai web -m openai:gpt-5.2 -m anthropic:claude-sonnet-4-6

# With builtin tools
clai web -m openai:gpt-5.2 -t web_search -t code_execution

# Generic agent with system instructions
clai web -m openai:gpt-5.2 -i 'You are a helpful coding assistant'

# Custom agent with extra instructions for each run
clai web --agent my_agent:my_agent -i 'Always respond in Spanish'
```

Memory Tool The [`memory`](/docs/ai/tools-toolsets/builtin-tools#memory-tool) builtin tool cannot be enabled via `-t memory`. If your agent needs memory, configure the [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) directly on the agent and provide it via `--agent`. ### Web UI Options

| Option | Description |
| --- | --- |
| `--agent`, `-a` | Agent to serve in [`module:variable` format](#custom-agents) |
| `--model`, `-m` | Models to list as options in the UI (repeatable) |
| `--tool`, `-t` | [Builtin tools](/docs/ai/tools-toolsets/builtin-tools) to list as options in the UI (repeatable). See [available tools](/docs/ai/guides/web#builtin-tool-support). |
| `--instructions`, `-i` | System instructions. When `--agent` is specified, these are additional to the agent's existing instructions. |
| `--host` | Host to bind server (default: 127.0.0.1) |
| `--port` | Port to bind server (default: 7932) |
| `--html-source` | URL or file path for the chat UI HTML |

When using `--agent`, the agent's configured model becomes the default.
CLI models (`-m`) are additional options. Without `--agent`, the first `-m` model is the default. The web chat UI can also be launched programmatically using [`Agent.to_web()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.to_web), see the [Web UI documentation](/docs/ai/guides/web). Run the `web` command with `--help` to see all available options: Terminal

```bash
clai web --help
```

--- # [DBOS](https://pydantic.dev/docs/ai/integrations/durable_execution/dbos/) # DBOS [DBOS](https://www.dbos.dev/) is a lightweight [durable execution](https://docs.dbos.dev/architecture) library natively integrated with Pydantic AI. ## Durable Execution DBOS workflows make your program **durable** by checkpointing its state in a database. If your program ever fails, when it restarts all your workflows will automatically resume from the last completed step. - **Workflows** must be deterministic and generally cannot include I/O. - **Steps** may perform I/O (network, disk, API calls). If a step fails, it restarts from the beginning. Every workflow input and step output is durably stored in the system database. When workflow execution fails, whether from crashes, network issues, or server restarts, DBOS leverages these checkpoints to recover workflows from their last completed step. DBOS **queues** provide durable, database-backed alternatives to systems like Celery or BullMQ, supporting features such as concurrency limits, rate limits, timeouts, and prioritization. See the [DBOS docs](https://docs.dbos.dev/architecture) for details. The diagram below shows the overall architecture of an agentic application in DBOS. DBOS runs fully in-process as a library. Functions remain normal Python functions but are checkpointed into a database (Postgres or SQLite).

```text
Clients (HTTP, RPC, Kafka, etc.)
                |
                v
+------------------------------------------------------+
|                 Application Servers                   |
|                                                       |
|   +----------------------------------------------+    |
|   |         Pydantic AI + DBOS Libraries         |    |
|   |                                              |    |
|   |   [ Workflows (Agent Run Loop) ]             |    |
|   |   [ Steps (Tool, MCP, Model) ]               |    |
|   |   [ Queues ] [ Cron Jobs ] [ Messaging ]     |    |
|   +----------------------------------------------+    |
|                                                       |
+------------------------------------------------------+
                |
                v
+------------------------------------------------------+
|                       Database                        |
|  (Stores workflow and step state, schedules tasks)    |
+------------------------------------------------------+
```

See the [DBOS documentation](https://docs.dbos.dev/architecture) for more information. ## Durable Agent Any agent can be wrapped in a [`DBOSAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent) to get durable execution. `DBOSAgent` automatically: - Wraps `Agent.run` and `Agent.run_sync` as DBOS workflows. - Wraps [model requests](/docs/ai/models/overview) and [MCP communication](/docs/ai/mcp/client) as DBOS steps. Custom tool functions and event stream handlers are **not automatically wrapped** by DBOS. If they involve non-deterministic behavior or perform I/O, you should explicitly decorate them with `@DBOS.step`. The original agent, model, and MCP server can still be used as normal outside the DBOS workflow. Here is a simple but complete example of wrapping an agent for durable execution.
All it requires is installing Pydantic AI with the DBOS [open-source library](https://github.com/dbos-inc/dbos-transact-py): - [pip](#tab-panel-168) - [uv](#tab-panel-169) Terminal

```bash
pip install 'pydantic-ai[dbos]'
```

Terminal

```bash
uv add 'pydantic-ai[dbos]'
```

Or if you're using the slim package, you can install it with the `dbos` optional group: - [pip](#tab-panel-170) - [uv](#tab-panel-171) Terminal

```bash
pip install 'pydantic-ai-slim[dbos]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[dbos]'
```

dbos\_agent.py

```python
from dbos import DBOS, DBOSConfig

from pydantic_ai import Agent
from pydantic_ai.durable_exec.dbos import DBOSAgent

dbos_config: DBOSConfig = {
    'name': 'pydantic_dbos_agent',
    'system_database_url': 'sqlite:///dbostest.sqlite',  # (3)
}
DBOS(config=dbos_config)

agent = Agent(
    'gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',  # (4)
)

dbos_agent = DBOSAgent(agent)  # (1)


async def main():
    DBOS.launch()
    result = await dbos_agent.run('What is the capital of Mexico?')  # (2)
    print(result.output)
    #> Mexico City (Ciudad de México, CDMX)
```

Workflows and `DBOSAgent` must be defined before `DBOS.launch()` so that recovery can correctly find all workflows. [`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run) works like `Agent.run()`, but runs as a DBOS workflow and executes model requests, decorated tool calls, and MCP communication as DBOS steps. This example uses SQLite. Postgres is recommended for production. The agent's `name` is used to uniquely identify its workflows. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ Because DBOS workflows need to be defined before calling `DBOS.launch()` and the `DBOSAgent` instance automatically registers `run` and `run_sync` as workflows, it needs to be defined before calling `DBOS.launch()` as well. For more information on how to use DBOS in Python applications, see their [Python SDK guide](https://docs.dbos.dev/python/programming-guide). ## DBOS Integration Considerations When using DBOS with Pydantic AI agents, there are a few important considerations to ensure workflows and toolsets behave correctly. ### Agent and Toolset Requirements Each agent instance must have a unique `name` so DBOS can correctly resume workflows after a failure or restart. Tools and event stream handlers are not automatically wrapped by DBOS. You can decide how to integrate them: - Decorate with `@DBOS.step` if the function involves non-determinism or I/O, as shown in the sketch after this list. - Skip the decorator if durability isn't needed, so you avoid the extra DB checkpoint write. - If the function needs to enqueue tasks or invoke other DBOS workflows, run it inside the agent's main workflow (not as a step). Other than that, any agent and toolset will just work!
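For instance, a tool that performs a network call can be checkpointed as a step, so that a recovered workflow replays the saved result instead of repeating the request. A minimal sketch (the weather endpoint is illustrative, not part of Pydantic AI or DBOS):

```python
import httpx
from dbos import DBOS

from pydantic_ai import Agent

agent = Agent('gpt-5.2', name='weather')


@agent.tool_plain
@DBOS.step()  # checkpoint the result so recovery skips the network call
def fetch_weather(city: str) -> str:
    """Fetch the current weather for a city (performs I/O, so it runs as a DBOS step)."""
    response = httpx.get(f'https://wttr.in/{city}', params={'format': '3'})  # illustrative endpoint
    return response.text
```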
### Agent Run Context and Dependencies DBOS checkpoints workflow inputs/outputs and step outputs into a database using [`pickle`](https://docs.python.org/3/library/pickle.html). This means you need to make sure the [dependencies](/docs/ai/core-concepts/dependencies) object provided to [`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run) or [`DBOSAgent.run_sync()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run_sync), as well as tool outputs, can be serialized using pickle. You may also want to keep the inputs and outputs small (under ~2 MB). PostgreSQL and SQLite support up to 1 GB per field, but large objects may impact performance. ### Streaming Because DBOS cannot stream output directly to the workflow or step call site, `Agent.run_stream()` and `Agent.run_stream_events()` are not supported when running inside of a DBOS workflow. Instead, you can implement streaming by setting an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `DBOSAgent` instance and using [`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run). The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events). ### Parallel Tool Execution When using `DBOSAgent`, tools are executed in parallel by default to minimize latency. To guarantee deterministic replay and reliable recovery, DBOS waits for all parallel tool calls to complete before emitting events **in order**. This is equivalent to the behavior of [`with agent.parallel_tool_call_execution_mode('parallel_ordered_events')`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.parallel_tool_call_execution_mode). If you prefer strictly sequential execution, you can configure the agent to run tools sequentially by setting [`parallel_execution_mode='sequential'`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent) when initializing the `DBOSAgent`. ## Step Configuration You can customize DBOS step behavior, such as retries, by passing [`StepConfig`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.StepConfig) objects to the `DBOSAgent` constructor: - `mcp_step_config`: The DBOS step config to use for MCP server communication. No retries if omitted. - `model_step_config`: The DBOS step config to use for model request steps. No retries if omitted. For custom tools, you can annotate them directly with [`@DBOS.step`](https://docs.dbos.dev/python/reference/decorators#step) or [`@DBOS.workflow`](https://docs.dbos.dev/python/reference/decorators#workflow) decorators as needed. These decorators have no effect outside DBOS workflows, so tools remain usable in non-DBOS agents. ## Step Retries On top of the automatic retries for request failures that DBOS will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling. When using DBOS, it's recommended not to use [HTTP Request Retries](/docs/ai/advanced-features/retries) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](/docs/ai/models/openai#custom-openai-client). You can customize DBOS's retry policy using [step configuration](#step-configuration). ## Observability with Logfire DBOS can be configured to generate OpenTelemetry spans for each workflow and step execution, and Pydantic AI emits spans for each agent run, model request, and tool invocation. You can send these spans to [Pydantic Logfire](/docs/ai/integrations/logfire) to get a full, end-to-end view of what's happening in your application.
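On the Pydantic AI side, that can be as simple as the following sketch (DBOS's own OpenTelemetry export is configured separately; see the DBOS docs linked below):

```python
import logfire

logfire.configure()  # send spans to Pydantic Logfire
logfire.instrument_pydantic_ai()  # spans for agent runs, model requests, and tool calls
```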
For more information about DBOS logging and tracing, see the [DBOS docs](https://docs.dbos.dev/python/tutorials/logging-and-tracing). --- # [Overview](https://pydantic.dev/docs/ai/integrations/durable_execution/overview/) # Overview Pydantic AI allows you to build durable agents that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. Durable agents have full support for [streaming](/docs/ai/core-concepts/agent#streaming-all-events) and [MCP](/docs/ai/mcp/client), with the added benefit of fault tolerance. Pydantic AI natively supports three durable execution solutions: - [Temporal](/docs/ai/integrations/durable_execution/temporal) - [DBOS](/docs/ai/integrations/durable_execution/dbos) - [Prefect](/docs/ai/integrations/durable_execution/prefect) These integrations only use Pydantic AI's public interface, so they also serve as a reference for integrating with other durable systems. --- # [Prefect](https://pydantic.dev/docs/ai/integrations/durable_execution/prefect/) # Prefect [Prefect](https://www.prefect.io/) is a workflow orchestration framework for building resilient data pipelines in Python, natively integrated with Pydantic AI. ## Durable Execution Prefect 3.0 brings [transactional semantics](https://www.prefect.io/blog/transactional-ml-pipelines-with-prefect-3-0) to your Python workflows, allowing you to group tasks into atomic units and define failure modes. If any part of a transaction fails, the entire transaction can be rolled back to a clean state. - **Flows** are the top-level entry points for your workflow. They can contain tasks and other flows. - **Tasks** are individual units of work that can be retried, cached, and monitored independently. Prefect 3.0's approach to transactional orchestration makes your workflows automatically **idempotent**: rerunnable without duplication or inconsistency across any environment. Every task is executed within a transaction that governs when and where the task's result record is persisted. If the task runs again under an identical context, it will not re-execute but instead load its previous result. The diagram below shows the overall architecture of an agentic application with Prefect. Prefect uses client-side task orchestration by default, with optional server connectivity for advanced features like scheduling and monitoring.

```text
              +---------------------+
              |   Prefect Server    |    (Monitoring,
              |      or Cloud       |     scheduling, UI,
              +---------------------+     orchestration)
                 ^               |
    Flow state,  |               |  Schedule flows,
    metadata,    |               |  track execution
    logs         |               v
+------------------------------------------------------+
|                 Application Process                   |
|  +----------------------------------------------+     |
|  |               Flow (Agent.run)               |     |
|  +----------------------------------------------+     |
|        |                |                  |          |
|        v                v                  v          |
|  +-----------+   +------------+   +-------------+     |
|  |   Task    |   |    Task    |   |    Task     |     |
|  |  (Tool)   |   | (MCP Tool) |   | (Model API) |     |
|  +-----------+   +------------+   +-------------+     |
|        |                |                  |          |
|     Cache &          Cache &            Cache &       |
|     persist          persist            persist       |
|        to               to                 to         |
|        v                v                  v          |
|  +----------------------------------------------+     |
|  |     Result Storage (Local FS, S3, etc.)      |     |
|  +----------------------------------------------+     |
+------------------------------------------------------+
          |                |                  |
          v                v                  v
      [External APIs, services, databases, etc.]
```

See the [Prefect documentation](https://docs.prefect.io/) for more information.
## Durable Agent Any agent can be wrapped in a [`PrefectAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent) to get durable execution. `PrefectAgent` automatically: - Wraps `Agent.run` and `Agent.run_sync` as Prefect flows. - Wraps [model requests](/docs/ai/models/overview) as Prefect tasks. - Wraps [tool calls](/docs/ai/tools-toolsets/tools) as Prefect tasks (configurable per-tool). - Wraps [MCP communication](/docs/ai/mcp/client) as Prefect tasks. Event stream handlers are **automatically wrapped** by Prefect when running inside a Prefect flow. Each event from the stream is processed in a separate Prefect task for durability. You can customize the task behavior using the `event_stream_handler_task_config` parameter when creating the `PrefectAgent`. Do **not** manually decorate event stream handlers with `@task`. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events). The original agent, model, and MCP server can still be used as normal outside the Prefect flow. Here is a simple but complete example of wrapping an agent for durable execution. All it requires is installing Pydantic AI with Prefect: - [pip](#tab-panel-164) - [uv](#tab-panel-165) Terminal

```bash
pip install 'pydantic-ai[prefect]'
```

Terminal

```bash
uv add 'pydantic-ai[prefect]'
```

Or if you're using the slim package, you can install it with the `prefect` optional group: - [pip](#tab-panel-166) - [uv](#tab-panel-167) Terminal

```bash
pip install 'pydantic-ai-slim[prefect]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[prefect]'
```

prefect\_agent.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent

agent = Agent(
    'gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',  # (1)
)

prefect_agent = PrefectAgent(agent)  # (2)


async def main():
    result = await prefect_agent.run('What is the capital of Mexico?')  # (3)
    print(result.output)
    #> Mexico City (Ciudad de México, CDMX)
```

The agent's `name` is used to uniquely identify its flows and tasks. Wrapping the agent with `PrefectAgent` enables durable execution for all agent runs. [`PrefectAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent.run) works like `Agent.run()`, but runs as a Prefect flow and executes model requests, decorated tool calls, and MCP communication as Prefect tasks. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ For more information on how to use Prefect in Python applications, see their [Python documentation](https://docs.prefect.io/v3/how-to-guides/workflows/write-and-run). ## Prefect Integration Considerations When using Prefect with Pydantic AI agents, there are a few important considerations to ensure workflows behave correctly. ### Agent Requirements Each agent instance must have a unique `name` so Prefect can correctly identify and track its flows and tasks.
### Tool Wrapping Agent tools are automatically wrapped as Prefect tasks, which means they benefit from: - **Retry logic**: Failed tool calls can be retried automatically - **Caching**: Tool results are cached based on their inputs - **Observability**: Tool execution is tracked in the Prefect UI You can customize tool task behavior using `tool_task_config` (applies to all tools) or `tool_task_config_by_name` (per-tool configuration): prefect\_agent\_config.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig

agent = Agent('gpt-5.2', name='my_agent')


@agent.tool_plain
def fetch_data(url: str) -> str:
    # This tool will be wrapped as a Prefect task
    ...


prefect_agent = PrefectAgent(
    agent,
    tool_task_config=TaskConfig(retries=3),  # Default for all tools
    tool_task_config_by_name={
        'fetch_data': TaskConfig(timeout_seconds=10.0),  # Specific to fetch_data
        'simple_tool': None,  # Disable task wrapping for simple_tool
    },
)
```

Set a tool's config to `None` in `tool_task_config_by_name` to disable task wrapping for that specific tool. ### Streaming When running inside a Prefect flow, `Agent.run_stream()` works but doesn't provide real-time streaming, because Prefect tasks run to completion before returning their results. The method will execute fully and return the complete result at once. For real-time streaming behavior inside Prefect flows, you can set an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `PrefectAgent` instance and use [`PrefectAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent.run). **Note**: Event stream handlers behave differently when running inside a Prefect flow versus outside: - **Outside a flow**: The handler receives events as they stream from the model - **Inside a flow**: Each event is wrapped as a Prefect task for durability, which may affect timing but ensures reliability The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events).
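A minimal sketch of such a handler, set on the `Agent` at construction time (printing stands in for whatever transport your frontend uses):

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.prefect import PrefectAgent
from pydantic_ai.messages import AgentStreamEvent


async def handle_events(
    ctx: RunContext[None], events: AsyncIterable[AgentStreamEvent]
) -> None:
    async for event in events:
        print(event)  # in a real app, forward to a queue or websocket


agent = Agent('gpt-5.2', name='streaming_agent', event_stream_handler=handle_events)
prefect_agent = PrefectAgent(agent)
```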
## Task Configuration You can customize Prefect task behavior, such as retries and timeouts, by passing [`TaskConfig`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.TaskConfig) objects to the `PrefectAgent` constructor: - `mcp_task_config`: Configuration for MCP server communication tasks - `model_task_config`: Configuration for model request tasks - `tool_task_config`: Default configuration for all tool calls - `tool_task_config_by_name`: Per-tool task configuration (overrides `tool_task_config`) - `event_stream_handler_task_config`: Configuration for event stream handler tasks (applies when running inside a Prefect flow) Available `TaskConfig` options: - `retries`: Maximum number of retries for the task (default: `0`) - `retry_delay_seconds`: Delay between retries in seconds (can be a single value or list for exponential backoff, default: `1.0`) - `timeout_seconds`: Maximum time in seconds for the task to complete - `cache_policy`: Custom Prefect cache policy for the task - `persist_result`: Whether to persist the task result - `result_storage`: Prefect result storage for the task (e.g., `'s3-bucket/my-storage'` or a `WritableFileSystem` block) - `log_prints`: Whether to log print statements from the task (default: `False`) Example: prefect\_agent\_config.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig

agent = Agent(
    'gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',
)

prefect_agent = PrefectAgent(
    agent,
    model_task_config=TaskConfig(
        retries=3,
        retry_delay_seconds=[1.0, 2.0, 4.0],  # Exponential backoff
        timeout_seconds=30.0,
    ),
)


async def main():
    result = await prefect_agent.run('What is the capital of France?')
    print(result.output)
    #> Paris
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ### Retry Considerations Pydantic AI and provider API clients have their own retry logic. When using Prefect, you may want to: - Disable [HTTP Request Retries](/docs/ai/advanced-features/retries) in Pydantic AI - Turn off your provider API client's retry logic (e.g., `max_retries=0` on a [custom OpenAI client](/docs/ai/models/openai#custom-openai-client)) - Rely on Prefect's task-level retry configuration for consistency This prevents requests from being retried multiple times at different layers. ## Caching and Idempotency Prefect 3.0 provides built-in caching and transactional semantics. Tasks with identical inputs will not re-execute if their results are already cached, making workflows naturally idempotent and resilient to failures. Cache keys are computed from: - **Task inputs**: Messages, settings, parameters, tool arguments, and serializable dependencies **Note**: For user dependencies to be included in cache keys, they must be serializable (e.g., Pydantic models or basic Python types). Non-serializable dependencies are automatically excluded from cache computation. ## Observability with Prefect and Logfire Prefect provides a built-in UI for monitoring flow runs, task executions, and failures. You can: - View real-time flow run status - Debug failures with full stack traces - Set up alerts and notifications To access the Prefect UI, you can either: 1. Use [Prefect Cloud](https://www.prefect.io/cloud) (managed service) 2. Run a local [Prefect server](https://docs.prefect.io/v3/how-to-guides/self-hosted/server-cli) with `prefect server start` You can also use [Pydantic Logfire](/docs/ai/integrations/logfire) for detailed observability.
When using both Prefect and Logfire, you'll get complementary views: - **Prefect**: Workflow-level orchestration, task status, and retry history - **Logfire**: Fine-grained tracing of agent runs, model requests, and tool invocations When using Logfire with Prefect, you can enable distributed tracing to see spans for your Prefect runs included with your agent runs, model requests, and tool invocations. For more information about Prefect monitoring, see the [Prefect documentation](https://docs.prefect.io/). ## Deployments and Scheduling To deploy and schedule a `PrefectAgent`, wrap it in a Prefect flow and use the flow's [`serve()`](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments#create-a-deployment-with-serve) or [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) methods: serve\_agent.py

```python
from prefect import flow

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent


@flow
async def daily_report_flow(user_prompt: str):
    """Generate a daily report using the agent."""
    agent = Agent(  # (1)
        'openai:gpt-5.2',
        name='daily_report_agent',
        instructions='Generate a daily summary report.',
    )
    prefect_agent = PrefectAgent(agent)
    result = await prefect_agent.run(user_prompt)
    return result.output


# Serve the flow with a daily schedule
if __name__ == '__main__':
    daily_report_flow.serve(
        name='daily-report-deployment',
        cron='0 9 * * *',  # Run daily at 9am
        parameters={'user_prompt': "Generate today's report"},
        tags=['production', 'reports'],
    )
```

Each flow run executes in an isolated process, and all inputs and dependencies must be serializable. Because Agent instances cannot be serialized, instantiate the agent inside the flow rather than at the module level. The `serve()` method accepts scheduling options: - **`cron`**: Cron schedule string (e.g., `'0 9 * * *'` for daily at 9am) - **`interval`**: Schedule interval in seconds or as a timedelta - **`rrule`**: iCalendar RRule schedule string For production deployments with Docker, Kubernetes, or other infrastructure, use the flow's [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) method. See the [Prefect deployment documentation](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments) for more information. --- # [Temporal](https://pydantic.dev/docs/ai/integrations/durable_execution/temporal/) # Temporal [Temporal](https://temporal.io) is a popular [durable execution](https://docs.temporal.io/evaluate/understanding-temporal#durable-execution) platform that's natively supported by Pydantic AI. ## Durable Execution In Temporal's durable execution implementation, a program that crashes or encounters an exception while interacting with a model or API will retry until it can successfully complete. Temporal relies primarily on a replay mechanism to recover from failures. As the program makes progress, Temporal saves key inputs and decisions, allowing a re-started program to pick up right where it left off. The key to making this work is to separate the application's repeatable (deterministic) and non-repeatable (non-deterministic) parts: 1. Deterministic pieces, termed [**workflows**](https://docs.temporal.io/workflow-definition), execute the same way when re-run with the same inputs. 2. Non-deterministic pieces, termed [**activities**](https://docs.temporal.io/activities), can run arbitrary code, performing I/O and any other operations.
Workflow code can run for extended periods and, if interrupted, resume exactly where it left off. Critically, workflow code generally _cannot_ include any kind of I/O, over the network, disk, etc. Activity code faces no restrictions on I/O or external interactions, but if an activity fails part-way through it is restarted from the beginning. Note If you are familiar with Celery, it may be helpful to think of Temporal activities as similar to Celery tasks, but where you wait for the task to complete and obtain its result before proceeding to the next step in the workflow. However, Temporal workflows and activities offer a great deal more flexibility and functionality than Celery tasks. See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information. In the case of Pydantic AI agents, integration with Temporal means that [model requests](/docs/ai/models/overview), [tool calls](/docs/ai/tools-toolsets/tools) that may require I/O, and [MCP server communication](/docs/ai/mcp/client) all need to be offloaded to Temporal activities due to their I/O requirements, while the logic that coordinates them (i.e. the agent run) lives in the workflow. Code that handles a scheduled job or web request can then execute the workflow, which will in turn execute the activities as needed. The diagram below shows the overall architecture of an agentic application in Temporal. The Temporal Server is responsible for tracking program execution and making sure the associated state is preserved reliably (i.e., stored to an internal database, and possibly replicated across cloud regions). Temporal Server manages data in encrypted form, so all data processing occurs on the Worker, which runs the workflow and activities.

```text
         +---------------------+
         |   Temporal Server   |   (Stores workflow state,
         +---------------------+    schedules activities,
            ^               |       persists progress)
            |               |
Save state, |               | Schedule Tasks,
progress,   |               | load state on resume
timeouts    |               v
+------------------------------------------------------+
|                        Worker                         |
|  +----------------------------------------------+     |
|  |                Workflow Code                 |     |
|  |               (Agent Run Loop)               |     |
|  +----------------------------------------------+     |
|        |                |                  |          |
|        v                v                  v          |
|  +-----------+   +------------+   +-------------+     |
|  | Activity  |   |  Activity  |   |  Activity   |     |
|  |  (Tool)   |   | (MCP Tool) |   | (Model API) |     |
|  +-----------+   +------------+   +-------------+     |
|        |                |                  |          |
+------------------------------------------------------+
         |                |                  |
         v                v                  v
     [External APIs, services, databases, etc.]
```

See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information. ## Durable Agent Any agent can be wrapped in a [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent) to get a durable agent that can be used inside a deterministic Temporal workflow, by automatically offloading all work that requires I/O (namely model requests, tool calls, and MCP server communication) to non-deterministic activities. At the time of wrapping, the agent's [model](/docs/ai/models/overview) and [toolsets](/docs/ai/tools-toolsets/toolsets) (including function tools registered on the agent and MCP servers) are frozen, activities are dynamically created for each, and the original model and toolsets are wrapped to call on the worker to execute the corresponding activities instead of directly performing the actions inside the workflow.
The original agent can still be used as normal outside the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Here is a simple but complete example of wrapping an agent for durable execution, creating a Temporal workflow with durable execution logic, connecting to a Temporal server, and running the workflow from non-durable code. All it requires is a Temporal server to be [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally): Terminal

```sh
brew install temporal
temporal server start-dev
```

temporal\_agent.py

```python
import uuid

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import (
    PydanticAIPlugin,
    PydanticAIWorkflow,
    TemporalAgent,
)

agent = Agent(
    'openai:gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',  # (9)
)

temporal_agent = TemporalAgent(agent)  # (1)


@workflow.defn
class GeographyWorkflow(PydanticAIWorkflow):  # (2)
    __pydantic_ai_agents__ = [temporal_agent]  # (3)

    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await temporal_agent.run(prompt)  # (4)
        return result.output


async def main():
    client = await Client.connect(  # (5)
        'localhost:7233',  # (6)
        plugins=[PydanticAIPlugin()],  # (7)
    )

    async with Worker(  # (8)
        client,
        task_queue='geography',
        workflows=[GeographyWorkflow],
    ):
        output = await client.execute_workflow(  # (10)
            GeographyWorkflow.run,
            args=['What is the capital of Mexico?'],
            id=f'geography-{uuid.uuid4()}',
            task_queue='geography',
        )
        print(output)
        #> Mexico City (Ciudad de México, CDMX)
```

The original `Agent` cannot be used inside a deterministic Temporal workflow, but the `TemporalAgent` can. As explained above, the workflow represents a deterministic piece of code that can use non-deterministic activities for operations that require I/O. Subclassing [`PydanticAIWorkflow`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIWorkflow) is optional but provides proper typing for the `__pydantic_ai_agents__` class variable. List the `TemporalAgent`s used by this workflow. The [`PydanticAIPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIPlugin) will automatically register their activities with the worker. Alternatively, if modifying the worker initialization is easier than the workflow class, you can use [`AgentPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.AgentPlugin) to register agents directly on the worker. [`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) works just like `Agent.run()`, but it will automatically offload model requests, tool calls, and MCP server communication to Temporal activities. We connect to the Temporal server which keeps track of workflow and activity execution. This assumes the Temporal server is [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally). The [`PydanticAIPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIPlugin) tells Temporal to use Pydantic for serialization and deserialization, treats [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) exceptions as non-retryable, and automatically registers activities for agents listed in `__pydantic_ai_agents__`.
We start the worker that will listen on the specified task queue and run workflows and activities. In a real world application, this might be run in a separate service. The agent's `name` is used to uniquely identify its activities. We call on the server to execute the workflow on a worker that's listening on the specified task queue. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ In a real world application, the agent, workflow, and worker are typically defined separately from the code that calls for a workflow to be executed. Because Temporal workflows need to be defined at the top level of the file and the `TemporalAgent` instance is needed inside the workflow and when starting the worker (to register the activities), it needs to be defined at the top level of the file as well. For more information on how to use Temporal in Python applications, see their [Python SDK guide](https://docs.temporal.io/develop/python). ## Temporal Integration Considerations There are a few considerations specific to agents and toolsets when using Temporal for durable execution. These are important to understand to ensure that your agents and toolsets work correctly with Temporal's workflow and activity model. ### Agent Names and Toolset IDs To ensure that Temporal knows what code to run when an activity fails or is interrupted and then restarted, even if your code is changed in between, each activity needs to have a name that's stable and unique. When `TemporalAgent` dynamically creates activities for the wrapped agent's model requests and toolsets (specifically those that implement their own tool listing and calling, i.e. [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) and [`MCPServer`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer)), their names are derived from the agent's [`name`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.name) and the toolsets' [`id`s](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.id). These fields are normally optional, but are required to be set when using Temporal. They should not be changed once the durable agent has been deployed to production as this would break active workflows. For dynamic toolsets created with the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator, the `id` parameter must be set explicitly. Note that with Temporal, `per_run_step=False` is not respected, as the toolset always needs to be created on-the-fly in the activity. Other than that, any agent and toolset will just work! ### Agent Run Context and Dependencies As workflows and activities run in separate processes, any values passed between them need to be serializable. As these payloads are stored in the workflow execution event history, Temporal limits their size to 2MB. To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), and it's your responsibility to make sure that the [dependencies](/docs/ai/core-concepts/dependencies) object provided to [`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) can be serialized using Pydantic. 
Specifically, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error. If you need one or more of these attributes to be available inside activities, you can create a [`TemporalRunContext`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalRunContext) subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent) as `run_context_type`.
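For illustration, here's a minimal sketch of such a subclass that additionally exposes `prompt` inside activities. The `RunContextWithPrompt` name and the exact classmethod signature are assumptions based on the description above rather than a verbatim API, so check the API reference before relying on them:

```python
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent, TemporalRunContext

agent = Agent('openai:gpt-5.2', name='geography')


class RunContextWithPrompt(TemporalRunContext):  # hypothetical subclass name
    @classmethod
    def serialize_run_context(cls, ctx: RunContext) -> dict[str, Any]:
        # Assumed signature: keep the default serialized fields and additionally
        # carry `prompt` across the workflow/activity process boundary. Anything
        # you add must be Pydantic-serializable and counts towards Temporal's
        # payload size limit; depending on the default `deserialize_run_context`,
        # a matching override on the deserialization side may also be needed.
        return {**super().serialize_run_context(ctx), 'prompt': ctx.prompt}


temporal_agent = TemporalAgent(agent, run_context_type=RunContextWithPrompt)
```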
### Streaming

Because Temporal activities cannot stream output directly to the activity call site, `Agent.run_stream()`, `Agent.run_stream_events()`, and [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) are not supported.

Instead, you can implement streaming by setting an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `TemporalAgent` instance and using [`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) inside the workflow. The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events).

As the streaming model request activity, workflow, and workflow execution call all take place in separate processes, passing data between them requires some care:

- To get data from the workflow call site or workflow to the event stream handler, you can use a [dependencies object](#agent-run-context-and-dependencies).
- To get data from the event stream handler to the workflow, workflow call site, or a frontend, you need to use an external system that the event stream handler can write to and the event consumer can read from, like a message queue. You can use the dependency object to make sure the same connection string or other unique ID is available in all the places that need it.

### Model Selection at Runtime

`Agent.run(model=...)` normally supports both model strings (like `'openai:gpt-5.2'`) and model instances. However, `TemporalAgent` does not support arbitrary model instances because they cannot be serialized for Temporal's replay mechanism.

To use model instances with `TemporalAgent`, you need to pre-register them by passing a dict of model instances to `TemporalAgent(models={...})`. You can then reference them by name or by passing the registered instance directly. If the wrapped agent doesn't have a model set, the first registered model will be used as the default. Model strings work as expected.

For scenarios where you need to customize the provider used by the model string (e.g., inject API keys from deps), you can pass a `provider_factory` to `TemporalAgent`, which is passed the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and provider name.

Here's an example showing how to pre-register and use multiple models:

multi\_model\_temporal.py

```python
from dataclasses import dataclass
from typing import Any

from temporalio import workflow

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers import Provider


@dataclass
class Deps:
    openai_api_key: str | None = None
    anthropic_api_key: str | None = None


# Create models from different providers
default_model = OpenAIResponsesModel('gpt-5.2')
fast_model = AnthropicModel('claude-sonnet-4-5')
reasoning_model = GoogleModel('gemini-3-pro-preview')


# Optional: provider factory for dynamic model configuration
def my_provider_factory(run_context: RunContext[Deps], provider_name: str) -> Provider[Any]:
    """Create providers with custom configuration based on run context."""
    if provider_name == 'openai':
        from pydantic_ai.providers.openai import OpenAIProvider

        return OpenAIProvider(api_key=run_context.deps.openai_api_key)
    elif provider_name == 'anthropic':
        from pydantic_ai.providers.anthropic import AnthropicProvider

        return AnthropicProvider(api_key=run_context.deps.anthropic_api_key)
    else:
        raise ValueError(f'Unknown provider: {provider_name}')


agent = Agent(default_model, name='multi_model_agent', deps_type=Deps)

temporal_agent = TemporalAgent(
    agent,
    models={
        'fast': fast_model,
        'reasoning': reasoning_model,
    },
    provider_factory=my_provider_factory,  # Optional
)


@workflow.defn
class MultiModelWorkflow:
    @workflow.run
    async def run(self, prompt: str, use_reasoning: bool, use_fast: bool) -> str:
        if use_reasoning:
            # Select by registered name
            result = await temporal_agent.run(prompt, model='reasoning')
        elif use_fast:
            # Or pass the registered instance directly
            result = await temporal_agent.run(prompt, model=fast_model)
        else:
            # Or pass a model string (uses provider_factory if set)
            result = await temporal_agent.run(prompt, model='openai:gpt-5-mini')
        return result.output
```

## Activity Configuration

Temporal activity configuration, like timeouts and retry policies, can be customized by passing [`temporalio.workflow.ActivityConfig`](https://python.temporal.io/temporalio.workflow.ActivityConfig.html) objects to the `TemporalAgent` constructor:

- `activity_config`: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used.
- `model_activity_config`: The Temporal activity config to use for model request activities. This is merged with the base activity config.
- `toolset_activity_config`: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config.
- `tool_activity_config`: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use I/O, you can specify `False` to disable using an activity. Note that such a tool must be defined as an `async` function, as non-async tools are run in threads, which are non-deterministic and thus not supported outside of activities.
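As an illustration of how these options fit together, here's a minimal sketch. The timeout and retry values are arbitrary, the toolset ID and tool name are hypothetical, and the nested `tool_activity_config` shape follows the description above:

```python
from datetime import timedelta

from temporalio.common import RetryPolicy
from temporalio.workflow import ActivityConfig

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import TemporalAgent

agent = Agent('openai:gpt-5.2', name='geography')

temporal_agent = TemporalAgent(
    agent,
    # Base config applied to all activities (replaces the 60s default).
    activity_config=ActivityConfig(start_to_close_timeout=timedelta(minutes=5)),
    # Merged on top of the base config for model request activities only.
    model_activity_config=ActivityConfig(retry_policy=RetryPolicy(maximum_attempts=3)),
    # Per-tool overrides, keyed by toolset ID and then tool name;
    # `False` disables the activity for an async tool that does no I/O.
    tool_activity_config={
        'my_toolset': {'pure_function_tool': False},  # hypothetical IDs
    },
)
```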
## Activity Retries

On top of the automatic retries for request failures that Temporal will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling.

When using Temporal, it's recommended not to use [HTTP Request Retries](/docs/ai/advanced-features/retries) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](/docs/ai/models/openai#custom-openai-client). You can customize Temporal's retry policy using [activity configuration](#activity-configuration).

## Observability with Logfire

Temporal generates telemetry events and metrics for each workflow and activity execution, and Pydantic AI generates events for each agent run, model request and tool call. These can be sent to [Pydantic Logfire](/docs/ai/integrations/logfire) to get a complete picture of what's happening in your application.

To use Logfire with Temporal, you need to pass a [`LogfirePlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.LogfirePlugin) object to Temporal's `Client.connect()`:

logfire\_plugin.py

```py
from temporalio.client import Client

from pydantic_ai.durable_exec.temporal import LogfirePlugin, PydanticAIPlugin


async def main():
    client = await Client.connect(
        'localhost:7233',
        plugins=[PydanticAIPlugin(), LogfirePlugin()],
    )
```

By default, the `LogfirePlugin` will instrument Temporal (including metrics) and Pydantic AI and send all data to Logfire. To customize Logfire configuration and instrumentation, you can pass a `logfire_setup` function to the `LogfirePlugin` constructor and return a custom `Logfire` instance (i.e. the result of `logfire.configure()`). To disable sending Temporal metrics to Logfire, you can pass `metrics=False` to the `LogfirePlugin` constructor.

## Known Issues

### Pandas

When `logfire.info` is used inside an activity and the `pandas` package is among your project's dependencies, you may encounter the following error, which seems to be the result of an import race condition:

```
AttributeError: partially initialized module 'pandas' has no attribute '_pandas_parser_CAPI' (most likely due to a circular import)
```

To fix this, you can use the [`temporalio.workflow.unsafe.imports_passed_through()`](https://python.temporal.io/temporalio.workflow.unsafe.html#imports_passed_through) context manager to proactively import the package and not have it be reloaded in the workflow sandbox:

temporal\_activity.py

```python
from temporalio import workflow

with workflow.unsafe.imports_passed_through():
    import pandas
```

---

# [Debugging & Monitoring with Pydantic Logfire](https://pydantic.dev/docs/ai/integrations/logfire/)

# Debugging & Monitoring with Pydantic Logfire

Applications that use LLMs have some challenges that are well known and understood: LLMs are **slow**, **unreliable** and **expensive**. These applications also have some challenges that most developers have encountered much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why.

Warning

From a software engineer's point of view, you can think of LLMs as the worst database you've ever heard of, but worse. If LLMs weren't so bloody useful, we'd never touch them.

To build successful applications with LLMs, we need new tools to understand both model performance and the behavior of applications that rely on them.
LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard.

## Pydantic Logfire

[Pydantic Logfire](https://pydantic.dev/logfire) is an observability platform developed by the team who created and maintain Pydantic Validation and Pydantic AI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs, all using OpenTelemetry.

Pydantic Logfire is a commercial product

Logfire is a commercially supported, hosted platform with an extremely generous and perpetual [free tier](https://pydantic.dev/pricing/). You can sign up and start using Logfire in a couple of minutes. Logfire can also be self-hosted on the enterprise tier.

Pydantic AI has built-in (but optional) support for Logfire. That means if the `logfire` package is installed and configured and agent instrumentation is enabled then detailed information about agent runs is sent to Logfire. Otherwise there's virtually no overhead and nothing is sent.

Here's an example showing details of running the [Weather Agent](/docs/ai/examples/weather-agent) in Logfire:

![Weather Agent Logfire](/docs/ai/img/logfire-weather-agent.png)

A trace is generated for the agent run, and spans are emitted for each model request and tool call.

## Using Logfire

To use Logfire, you'll need a Logfire [account](https://logfire.pydantic.dev). The Logfire Python SDK is included with `pydantic-ai`:

- [pip](#tab-panel-38)
- [uv](#tab-panel-39)

Terminal

```bash
pip install pydantic-ai
```

Terminal

```bash
uv add pydantic-ai
```

Or if you're using the slim package, you can install it with the `logfire` optional group:

- [pip](#tab-panel-40)
- [uv](#tab-panel-41)

Terminal

```bash
pip install "pydantic-ai-slim[logfire]"
```

Terminal

```bash
uv add "pydantic-ai-slim[logfire]"
```

Then authenticate your local environment with Logfire:

- [pip](#tab-panel-42)
- [uv](#tab-panel-43)

Terminal

```bash
logfire auth
```

Terminal

```bash
uv run logfire auth
```

And configure a project to send data to:

- [pip](#tab-panel-44)
- [uv](#tab-panel-45)

Terminal

```bash
logfire projects new
```

Terminal

```bash
uv run logfire projects new
```

(Or use an existing project with `logfire projects use`)

This will write to a `.logfire` directory in the current working directory, which the Logfire SDK will use for configuration at run time.

With that, you can start using Logfire to instrument Pydantic AI code:

instrument\_pydantic\_ai.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()  # (1)
logfire.instrument_pydantic_ai()  # (2)

agent = Agent('openai:gpt-5.2', instructions='Be concise, reply with one sentence.')
result = agent.run_sync('Where does "hello world" come from?')  # (3)
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

1. [`logfire.configure()`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.configure) configures the SDK; by default it will find the write token from the `.logfire` directory, but you can also pass a token directly.
2. [`logfire.instrument_pydantic_ai()`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) enables instrumentation of Pydantic AI.
3. Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for model calls and tool function execution.

_(This example is complete, it can be run "as is")_

Which will display in Logfire thus:

![Logfire Simple Agent Run](/docs/ai/img/logfire-simple-agent.png)

The [Logfire documentation](https://logfire.pydantic.dev/docs/) has more details on how to use Logfire, including how to instrument other libraries like [HTTPX](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) and [FastAPI](https://logfire.pydantic.dev/docs/integrations/web-frameworks/fastapi/).

Since Logfire is built on [OpenTelemetry](https://opentelemetry.io/), you can use the Logfire Python SDK to send data to any OpenTelemetry collector, see [below](#using-opentelemetry).

### Debugging

To demonstrate how Logfire can let you visualise the flow of a Pydantic AI run, here's the view you get from Logfire while running the [chat app examples](/docs/ai/examples/chat-app):

### Monitoring Performance

We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real-world example of using Logfire to monitor Pydantic AI runs inside Logfire itself:

![Logfire monitoring Pydantic AI](/docs/ai/img/logfire-monitoring-pydanticai.png)

### Monitoring HTTP Requests

As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/) (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers.

To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs (except for [Bedrock](/docs/ai/models/bedrock)) use the [HTTPX](https://www.python-httpx.org/) library internally:

with\_logfire\_instrument\_httpx.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)  # (1)

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

1. See the [`logfire.instrument_httpx` docs](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_httpx) for more details; `capture_all=True` means both headers and body are captured for both the request and response.

![Logfire with HTTPX instrumentation](/docs/ai/img/logfire-with-httpx.png)

## Using OpenTelemetry

Pydantic AI's instrumentation uses [OpenTelemetry](https://opentelemetry.io/) (OTel), which Logfire is based on. This means you can debug and monitor Pydantic AI with any OpenTelemetry backend.

Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so while we think you'll have the best experience using the Logfire platform 😉, you should be able to use any OTel service with GenAI support.

### Logfire with an alternative OTel backend

You can use the Logfire SDK completely freely and send the data to any OpenTelemetry backend. Here's an example of configuring the Logfire library to send data to the excellent [otel-tui](https://github.com/ymtdzzz/otel-tui) -- an open source terminal based OTel backend and viewer (no association with Pydantic Validation).
Run `otel-tui` with docker (see [the otel-tui readme](https://github.com/ymtdzzz/otel-tui) for more instructions):

Terminal

```txt
docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest
```

then run:

otel\_tui.py

```python
import os

import logfire

from pydantic_ai import Agent

os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318'  # (1)
logfire.configure(send_to_logfire=False)  # (2)
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```

1. Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`.
2. We [configure](https://logfire.pydantic.dev/docs/api/logfire/#logfire.configure) Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend.

Running the above code will send tracing data to `otel-tui`, which will display like this:

![otel tui simple](/docs/ai/img/otel-tui-simple.png)

Running the [weather agent](/docs/ai/examples/weather-agent) example connected to `otel-tui` shows how it can be used to visualise a more complex trace:

![otel tui weather agent](/docs/ai/img/otel-tui-weather.png)

For more information on using the Logfire SDK to send data to alternative backends, see [the Logfire documentation](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/).

### OTel without Logfire

You can also emit OpenTelemetry data from Pydantic AI without using Logfire at all. To do this, you'll need to install and configure the OpenTelemetry packages you need. To run the following examples, use:

Terminal

```txt
uv run \
  --with 'pydantic-ai-slim[openai]' \
  --with opentelemetry-sdk \
  --with opentelemetry-exporter-otlp \
  raw_otel.py
```

raw\_otel.py

```python
import os

from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace import set_tracer_provider

from pydantic_ai import Agent

os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318'
exporter = OTLPSpanExporter()
span_processor = BatchSpanProcessor(exporter)
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(span_processor)

set_tracer_provider(tracer_provider)

Agent.instrument_all()
agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```

### Alternative Observability backends

Because Pydantic AI uses OpenTelemetry for observability, you can easily configure it to send data to any OpenTelemetry-compatible backend, not just our observability platform [Pydantic Logfire](#pydantic-logfire).
The following providers have dedicated documentation on Pydantic AI:

- [Langfuse](https://langfuse.com/docs/integrations/pydantic-ai)
- [W&B Weave](https://weave-docs.wandb.ai/guides/integrations/pydantic_ai/)
- [Arize](https://arize.com/docs/ax/observe/tracing-integrations-auto/pydantic-ai)
- [Openlayer](https://www.openlayer.com/docs/integrations/pydantic-ai)
- [OpenLIT](https://docs.openlit.io/latest/integrations/pydantic)
- [LangWatch](https://docs.langwatch.ai/integration/python/integrations/pydantic-ai)
- [Patronus AI](https://docs.patronus.ai/docs/percival/pydantic)
- [Opik](https://www.comet.com/docs/opik/tracing/integrations/pydantic-ai)
- [mlflow](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/pydantic_ai)
- [Agenta](https://docs.agenta.ai/observability/integrations/pydanticai)
- [Confident AI](https://documentation.confident-ai.com/docs/llm-tracing/integrations/pydanticai)
- [Braintrust](https://www.braintrust.dev/docs/integrations/sdk-integrations/pydantic-ai)
- [SigNoz](https://signoz.io/docs/pydantic-ai-observability/)
- [Laminar](https://docs.laminar.sh/tracing/integrations/pydantic-ai)
- [Respan](https://respan.ai/docs/integrations/pydantic-ai)

## Advanced usage

### Aggregated usage attribute names

By default, both model/request spans and agent run spans use the standard `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` attributes. Some observability backends (e.g., Datadog, New Relic, LangSmith, Opik) aggregate these attributes across all spans, which can cause double-counting since agent run spans report the sum of their child spans' usage.

To avoid this, you can enable `use_aggregated_usage_attribute_names` so that agent run spans use distinct attribute names (e.g., `gen_ai.aggregated_usage.input_tokens`, `gen_ai.aggregated_usage.output_tokens`, and `gen_ai.aggregated_usage.details.*`):

Custom namespace

The `gen_ai.aggregated_usage.*` namespace is a custom extension not part of the [OpenTelemetry Semantic Conventions for GenAI](https://opentelemetry.io/docs/specs/semconv/gen-ai/). It was introduced to work around double-counting in observability backends. If OpenTelemetry introduces an official convention for aggregated usage in the future, this namespace may be updated or deprecated.

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

Agent.instrument_all(InstrumentationSettings(use_aggregated_usage_attribute_names=True))
```

### Configuring data format

Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), specifically version 1.37.0 of the conventions. The instrumentation format can be configured using the `version` parameter of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings). **The default is `version=2`**, which provides a good balance between spec compliance and compatibility.

#### Version 1 (Legacy, deprecated)

Based on [OpenTelemetry semantic conventions version 1.36.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/gen-ai/README.md) or older. Messages are captured as individual events (logs) that are children of the request span.
Use `event_mode='logs'` to emit events as OpenTelemetry log-based events:

instrumentation\_settings\_event\_mode.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()
logfire.instrument_pydantic_ai(version=1, event_mode='logs')

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

This version won't look as good in the Logfire UI and will be removed from Pydantic AI in a future release, but may be useful for backwards compatibility.

#### Version 2 (Default)

Uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes:

- `gen_ai.system_instructions` for instructions passed to the agent
- `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans
- `pydantic_ai.all_messages` on agent run spans

Some span and attribute names are not fully spec-compliant for compatibility reasons. Use version 3 or 4 for better compliance.

#### Version 3

Builds on version 2 with the following improvements:

- **Spec-compliant span names:**
  - `agent run` becomes `invoke_agent {gen_ai.agent.name}` (with the agent name filled in)
  - `running tool` becomes `execute_tool {gen_ai.tool.name}` (with the tool name filled in)
- **Spec-compliant attribute names:**
  - `tool_arguments` becomes `gen_ai.tool.call.arguments`
  - `tool_response` becomes `gen_ai.tool.call.result`
- **Thinking tokens support:** Captures thinking/reasoning tokens when available

#### Version 4

Builds on version 3 with improved multimodal content handling to better align with the [GenAI semantic conventions for multimodal inputs](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example):

**URL-based media (ImageUrl, AudioUrl, VideoUrl):**

- Old (v1-3): `{"type": "image-url", "url": "..."}`
- New (v4): `{"type": "uri", "modality": "image", "uri": "...", "mime_type": "..."}`

**Inline binary content (BinaryContent, FilePart):**

- Old (v1-3): `{"type": "binary", "media_type": "...", "content": "..."}`
- New (v4): `{"type": "blob", "modality": "image", "mime_type": "...", "content": "..."}`

Note: The `modality` field is only included for image, audio, and video content types as specified in the OTel spec. DocumentUrl and unsupported media types omit the `modality` field.

#### Version 5

Builds on version 4 with improved handling of deferred tool calls:

- [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) and [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired) exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors.

* * *

Note that the OpenTelemetry Semantic Conventions are still experimental and are likely to change.

### Setting OpenTelemetry SDK providers

By default, the global `TracerProvider` and `LoggerProvider` are used. These are set automatically by `logfire.configure()`. They can also be set by the `set_tracer_provider` and `set_logger_provider` functions in the OpenTelemetry Python SDK. You can set custom providers with [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings).
instrumentation\_settings\_providers.py

```python
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.sdk.trace import TracerProvider

from pydantic_ai import Agent, InstrumentationSettings

instrumentation_settings = InstrumentationSettings(
    tracer_provider=TracerProvider(),
    logger_provider=LoggerProvider(),
)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

### Instrumenting a specific `Model`

instrumented\_model\_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings, InstrumentedModel

settings = InstrumentationSettings()
model = InstrumentedModel('openai:gpt-5.2', settings)
agent = Agent(model)
```

### Excluding binary content

excluding\_binary\_content.py

```python
from pydantic_ai import Agent, InstrumentationSettings

instrumentation_settings = InstrumentationSettings(include_binary_content=False)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

### Excluding prompts and completions

For privacy and security reasons, you may want to monitor your agent's behavior and performance without exposing sensitive user data or proprietary prompts in your observability platform. Pydantic AI allows you to exclude the actual content from telemetry while preserving the structural information needed for debugging and monitoring.

When `include_content=False` is set, Pydantic AI will exclude sensitive content from telemetry, including user prompts and model completions, tool call arguments and responses, and any other message content.

excluding\_sensitive\_content.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

instrumentation_settings = InstrumentationSettings(include_content=False)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

This setting is particularly useful in production environments where compliance requirements or data sensitivity concerns make it necessary to limit what content is sent to your observability platform.

### Adding Custom Metadata

Use the agent's `metadata` parameter to attach additional data to the agent's span. When instrumentation is enabled, the computed metadata is recorded on the agent span under the `metadata` attribute. See the [usage and metadata example in the agents guide](/docs/ai/core-concepts/agent#run-metadata) for details and usage.

---

# [AG-UI](https://pydantic.dev/docs/ai/integrations/ui/ag-ui/)

# AG-UI

The [Agent-User Interaction (AG-UI) Protocol](https://docs.ag-ui.com/introduction) is an open standard introduced by the [CopilotKit](https://webflow.copilotkit.ai/blog/introducing-ag-ui-the-protocol-where-agents-meet-users) team that standardises how frontend applications communicate with AI agents, with support for streaming, frontend tools, shared state, and custom events.

Note

The AG-UI integration was originally built by the team at [Rocket Science](https://www.rocketscience.gg/) and contributed in collaboration with the Pydantic AI and CopilotKit teams. Thanks Rocket Science!

## Installation

The only dependencies are:

- [ag-ui-protocol](https://docs.ag-ui.com/introduction): to provide the AG-UI types and encoder.
- [starlette](https://www.starlette.io): to handle [ASGI](https://asgi.readthedocs.io/en/latest/) requests from a framework like FastAPI.

You can install Pydantic AI with the `ag-ui` extra to ensure you have all the required AG-UI dependencies:

- [pip](#tab-panel-172)
- [uv](#tab-panel-173)

Terminal

```bash
pip install 'pydantic-ai-slim[ag-ui]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[ag-ui]'
```

To run the examples you'll also need:

- [uvicorn](https://www.uvicorn.org/) or another ASGI compatible server

- [pip](#tab-panel-174)
- [uv](#tab-panel-175)

Terminal

```bash
pip install uvicorn
```

Terminal

```bash
uv add uvicorn
```

## Usage

There are three ways to run a Pydantic AI agent based on AG-UI run input with streamed AG-UI events as output, from most to least flexible. If you're using a Starlette-based web framework like FastAPI, you'll typically want to use the second method.

1. The `AGUIAdapter.run_stream()` method, when called on an [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) instantiated with an agent and an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object, will run the agent and return a stream of AG-UI events. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`. Use this if you're using a web framework not based on Starlette (e.g. Django or Flask) or want to modify the input or output in some way.
2. The `AGUIAdapter.dispatch_request()` class method takes an agent and a Starlette request (e.g. from FastAPI) coming from an AG-UI frontend, and returns a streaming Starlette response of AG-UI events that you can return directly from your endpoint. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`, that you can vary for each request (e.g. based on the authenticated user). This is a convenience method that combines [`AGUIAdapter.from_request()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.from_request), `AGUIAdapter.run_stream()`, and `AGUIAdapter.streaming_response()`.
3. [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) represents an ASGI application that handles every AG-UI request by running the agent. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`, but these will be the same for each request, with the exception of the AG-UI state that's injected as described under [state management](#state-management). This ASGI app can be [mounted](https://fastapi.tiangolo.com/advanced/sub-applications/) at a given path in an existing FastAPI app.

### Handle run input and output directly

This example uses `AGUIAdapter.run_stream()` and performs its own request parsing and response generation. This can be modified to work with any web framework.
run\_ag\_ui.py

```python
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

app = FastAPI()


@app.post('/')
async def run_agent(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = AGUIAdapter.build_run_input(await request.body())  # (1)
    except ValidationError as e:
        return Response(
            content=e.json(),  # already a JSON string, no further encoding needed
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = AGUIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream()  # (2)
    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept)  # (3)
```

1. [`AGUIAdapter.build_run_input()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.build_run_input) takes the request body as bytes and returns an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object. You can also use the [`AGUIAdapter.from_request()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.from_request) class method to build an adapter directly from a request.
2. `AGUIAdapter.run_stream()` runs the agent and returns a stream of AG-UI events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), including `deps`. You can also use `AGUIAdapter.run_stream_native()` to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into AG-UI events using `AGUIAdapter.transform_stream()`.
3. `AGUIAdapter.encode_stream()` encodes the stream of AG-UI events as strings according to the accept header value. You can also use `AGUIAdapter.streaming_response()` to generate a streaming response directly from the AG-UI event stream returned by `run_stream()`.

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn run_ag_ui:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.

### Handle a Starlette request

This example uses `AGUIAdapter.dispatch_request()` to directly handle a FastAPI request and return a response. Something analogous to this will work with any Starlette-based web framework.

handle\_ag\_ui\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

app = FastAPI()


@app.post('/')
async def run_agent(request: Request) -> Response:
    return await AGUIAdapter.dispatch_request(request, agent=agent)  # (1)
```

1. This method essentially does the same as the previous example, but it's more convenient to use when you're already using a Starlette/FastAPI app.

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn handle_ag_ui_request:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.
### Stand-alone ASGI app

This example uses [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) to turn the agent into a stand-alone ASGI application:

ag\_ui\_app.py

```python
from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

app = AGUIApp(agent)
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn ag_ui_app:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.

## Design

The Pydantic AI AG-UI integration supports all features of the spec:

- [Events](https://docs.ag-ui.com/concepts/events)
- [Messages](https://docs.ag-ui.com/concepts/messages)
- [State Management](https://docs.ag-ui.com/concepts/state)
- [Tools](https://docs.ag-ui.com/concepts/tools)

The integration receives messages in the form of a [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object that describes the details of the requested agent run including message history, state, and available tools. These are converted to Pydantic AI types and passed to the agent's run method. Events from the agent, including tool calls, are converted to AG-UI events and streamed back to the caller as Server-Sent Events (SSE). A user request may require multiple round trips between client UI and Pydantic AI server, depending on the tools and events needed.

## Features

### State management

The integration provides full support for [AG-UI state management](https://docs.ag-ui.com/concepts/state), which enables real-time synchronization between agents and frontend applications.

In the example below we have document state which is shared between the UI and server using the [`StateDeps`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateDeps) [dependencies type](/docs/ai/core-concepts/dependencies) that can be used to automatically validate state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/js/core/types#runagentinput) using a Pydantic `BaseModel` specified as a generic parameter.

Custom dependencies type with AG-UI state

If you want to use your own dependencies type to hold AG-UI state as well as other things, it needs to implement the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol, meaning it needs to be a [dataclass](https://docs.python.org/3/library/dataclasses.html) with a non-optional `state` field. This lets Pydantic AI ensure that state is properly isolated between requests by building a new dependencies object each time.

If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method.

If AG-UI state is provided but your dependencies do not implement [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler), Pydantic AI will emit a warning and ignore the state. Use [`StateDeps`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateDeps) or a custom [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) implementation to receive and validate the incoming state.
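For illustration, a custom dependencies type satisfying the protocol might look like this minimal sketch (the `MyDeps` name and `api_key` field are hypothetical, not part of the API):

```python
from dataclasses import dataclass

from pydantic import BaseModel


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


@dataclass
class MyDeps:  # hypothetical custom dependencies type
    # Non-optional `state` field required by the `StateHandler` protocol;
    # since `DocumentState` is a `BaseModel` subclass, the raw request
    # state is validated into it automatically.
    state: DocumentState
    # Any other dependencies your tools need (hypothetical field).
    api_key: str = ''
```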
ag\_ui\_state.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


agent = Agent(
    'openai:gpt-5.2',
    instructions='Be fun!',
    deps_type=StateDeps[DocumentState],
)

app = AGUIApp(agent, deps=StateDeps(DocumentState()))
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```bash
uvicorn ag_ui_state:app --host 0.0.0.0 --port 9000
```

### Tools

AG-UI frontend tools are seamlessly provided to the Pydantic AI agent, enabling rich user experiences with frontend user interfaces.

### Events

Pydantic AI tools can send [AG-UI events](https://docs.ag-ui.com/concepts/events) simply by returning a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object with a [`BaseEvent`](https://docs.ag-ui.com/sdk/python/core/events#baseevent) (or a list of events) as `metadata`, which allows for custom events and state updates.

ag\_ui\_tool\_events.py

```python
from ag_ui.core import CustomEvent, EventType, StateSnapshotEvent
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ToolReturn
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


agent = Agent(
    'openai:gpt-5.2',
    instructions='Be fun!',
    deps_type=StateDeps[DocumentState],
)

app = AGUIApp(agent, deps=StateDeps(DocumentState()))


@agent.tool
async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> ToolReturn:
    return ToolReturn(
        return_value='State updated',
        metadata=[
            StateSnapshotEvent(
                type=EventType.STATE_SNAPSHOT,
                snapshot=ctx.deps.state,
            ),
        ],
    )


@agent.tool_plain
async def custom_events() -> ToolReturn:
    return ToolReturn(
        return_value='Count events sent',
        metadata=[
            CustomEvent(
                type=EventType.CUSTOM,
                name='count',
                value=1,
            ),
            CustomEvent(
                type=EventType.CUSTOM,
                name='count',
                value=2,
            ),
        ],
    )
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```bash
uvicorn ag_ui_tool_events:app --host 0.0.0.0 --port 9000
```

## Examples

For more examples of how to use [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) see [`pydantic_ai_examples.ag_ui`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/ag_ui), which includes a server for use with the [AG-UI Dojo](https://docs.ag-ui.com/tutorials/debugging#the-ag-ui-dojo).

---

# [Overview](https://pydantic.dev/docs/ai/integrations/ui/overview/)

# Overview

If you're building a chat app or other interactive frontend for an AI agent, your backend will need to receive agent run input (like a chat message or complete [message history](/docs/ai/core-concepts/message-history)) from the frontend, and will need to stream the [agent's events](/docs/ai/core-concepts/agent#streaming-all-events) (like text, thinking, and tool calls) to the frontend so that the user knows what's happening in real time.

While your frontend could use Pydantic AI's [`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) and [`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) directly, you'll typically want to use a UI event stream protocol that's natively supported by your frontend framework.
Pydantic AI natively supports two UI event stream protocols:

- [Agent-User Interaction (AG-UI) Protocol](/docs/ai/integrations/ui/ag-ui)
- [Vercel AI Data Stream Protocol](/docs/ai/integrations/ui/vercel-ai)

These integrations are implemented as subclasses of the abstract [`UIAdapter`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) class, so they also serve as a reference for integrating with other UI event stream protocols.

## Usage

The protocol-specific [`UIAdapter`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) subclass (i.e. [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) or [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter)) is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) subclass, but you typically won't use this directly.

If you're using a Starlette-based web framework like FastAPI, you can use the [`UIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) class method from an endpoint function to directly handle a request and return a streaming response of protocol-specific events. This is demonstrated in the next section.

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods. This is demonstrated in the "Advanced Usage" section below.

### Usage with Starlette/FastAPI

Besides the request, [`UIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) takes the agent, the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

Note

These examples use the `VercelAIAdapter`, but the same patterns apply to all `UIAdapter` subclasses.

dispatch\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent)
```

### Advanced Usage

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `UIAdapter.dispatch_request()` class method shown above:

1. The [`UIAdapter.build_run_input()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.build_run_input) class method takes the request body as bytes and returns a protocol-specific run input object, which you can then pass to the [`UIAdapter()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) constructor along with the agent.
   - You can also use the [`UIAdapter.from_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) class method to build an adapter directly from a Starlette/FastAPI request.
2. The [`UIAdapter.run_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream) method runs the agent and returns a stream of protocol-specific events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents) and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.
   - You can also use [`UIAdapter.run_stream_native()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream_native) to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into protocol-specific events using [`UIAdapter.transform_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.transform_stream).
3. The [`UIAdapter.encode_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.encode_stream) method encodes the stream of protocol-specific events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response.
   - You can also use [`UIAdapter.streaming_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.streaming_response) to generate a Starlette/FastAPI streaming response directly from the protocol-specific event stream returned by `run_stream()`.

Note

This example uses FastAPI, but can be modified to work with any web framework.

run\_stream.py

```python
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = VercelAIAdapter.build_run_input(await request.body())
    except ValidationError as e:
        return Response(
            content=e.json(),  # already a JSON string, no further encoding needed
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream()
    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept)
```

---

# [Vercel AI](https://pydantic.dev/docs/ai/integrations/ui/vercel-ai/)

# Vercel AI

Pydantic AI natively supports the [Vercel AI Data Stream Protocol](https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) to receive agent run input from, and stream events to, a frontend using [AI SDK UI](https://ai-sdk.dev/docs/ai-sdk-ui/overview) hooks like [`useChat`](https://ai-sdk.dev/docs/reference/ai-sdk-ui/use-chat). You can optionally use [AI Elements](https://ai-sdk.dev/elements) for pre-built UI components.

Note

By default, the adapter targets AI SDK v5 for backwards compatibility. To use features introduced in AI SDK v6, set `sdk_version=6` on the adapter.
## Usage

The [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter) class is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), running the agent, and then transforming Pydantic AI events into Vercel AI events. The event stream transformation is handled by the [`VercelAIEventStream`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIEventStream) class, but you typically won't use this directly.

If you're using a Starlette-based web framework like FastAPI, you can use the [`VercelAIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) class method from an endpoint function to directly handle a request and return a streaming response of Vercel AI events. This is demonstrated in the next section.

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods. This is demonstrated in the "Advanced Usage" section below.

### Usage with Starlette/FastAPI

Besides the request, [`VercelAIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) takes the agent, the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional Vercel AI events.

dispatch\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent)
```

### Advanced Usage

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `VercelAIAdapter.dispatch_request()` class method shown above:

1. The [`VercelAIAdapter.build_run_input()`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter.build_run_input) class method takes the request body as bytes and returns a Vercel AI [`RequestData`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.request_types.RequestData) run input object, which you can then pass to the [`VercelAIAdapter()`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter) constructor along with the agent.
   - You can also use the [`VercelAIAdapter.from_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) class method to build an adapter directly from a Starlette/FastAPI request.
2. The [`VercelAIAdapter.run_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream) method runs the agent and returns a stream of Vercel AI events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents) and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional Vercel AI events.
   - You can also use [`VercelAIAdapter.run_stream_native()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream_native) to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into Vercel AI events using [`VercelAIAdapter.transform_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.transform_stream).
3. The [`VercelAIAdapter.encode_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.encode_stream) method encodes the stream of Vercel AI events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response.
   - You can also use [`VercelAIAdapter.streaming_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.streaming_response) to generate a Starlette/FastAPI streaming response directly from the Vercel AI event stream returned by `run_stream()`.

Note

This example uses FastAPI, but can be modified to work with any web framework.

run\_stream.py

```python
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = VercelAIAdapter.build_run_input(await request.body())
    except ValidationError as e:
        return Response(
            content=e.json(),  # already a JSON string, no further encoding needed
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream()
    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept)
```

### Data Chunks

Pydantic AI tools can send [Vercel AI data stream chunks](https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) by returning a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object with a data-carrying chunk (or a list of chunks) as `metadata`. The supported chunk types are [`DataChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.DataChunk), [`SourceUrlChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.SourceUrlChunk), [`SourceDocumentChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.SourceDocumentChunk), and [`FileChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.FileChunk). This is useful for attaching structured data to the frontend alongside the tool result, such as source URLs or custom data payloads.
vercel\_ai\_tool\_chunks.py

```python
from pydantic_ai import Agent, ToolReturn
from pydantic_ai.ui.vercel_ai.response_types import DataChunk, SourceUrlChunk

agent = Agent('openai:gpt-5.2')


@agent.tool_plain
async def search_docs(query: str) -> ToolReturn:
    return ToolReturn(
        return_value=f'Found 2 results for "{query}"',
        metadata=[
            SourceUrlChunk(
                source_id='doc-1',
                url='https://example.com/docs/intro',
                title='Introduction',
            ),
            DataChunk(
                type='data-search-results',
                data={'query': query, 'count': 2},
            ),
        ],
    )
```

Note

Protocol-control chunks such as `StartChunk`, `FinishChunk`, `StartStepChunk`, or `FinishStepChunk` are automatically filtered out -- only the four data-carrying chunk types listed above are forwarded to the stream and preserved in `dump_messages`.

## Tool Approval

Note

Tool approval requires AI SDK UI v6 or later on the frontend.

Pydantic AI supports human-in-the-loop tool approval workflows with AI SDK UI, allowing users to approve or deny tool executions before they run. See the [deferred tool calls documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for details on setting up tools that require approval.

To enable tool approval streaming, pass `sdk_version=6` to `dispatch_request`:

```py
@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent, sdk_version=6)
```

When `sdk_version=6`, the adapter will:

1. Emit `tool-approval-request` chunks when tools with `requires_approval=True` are called
2. Automatically extract approval responses from follow-up requests
3. Emit `tool-output-denied` chunks for rejected tools

On the frontend, AI SDK UI's [`useChat`](https://ai-sdk.dev/docs/reference/ai-sdk-ui/use-chat) hook handles the approval flow. You can use the [`Confirmation`](https://ai-sdk.dev/elements/components/confirmation) component from AI Elements for a pre-built approval UI, or build your own using the hook's `addToolApprovalResponse` function.

---

# [Client](https://pydantic.dev/docs/ai/mcp/client/)

# Client

Pydantic AI can act as an [MCP client](https://modelcontextprotocol.io/quickstart/client), connecting to MCP servers to use their tools.

## Install

You need to either install [`pydantic-ai`](/docs/ai/overview/install), or [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `mcp` optional group:

- [pip](#tab-panel-50)
- [uv](#tab-panel-51)

Terminal

```bash
pip install "pydantic-ai-slim[mcp]"
```

Terminal

```bash
uv add "pydantic-ai-slim[mcp]"
```

## Usage

Pydantic AI comes with three ways to connect to MCP servers:

- [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) which connects to an MCP server using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport
- [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE) which connects to an MCP server using the [HTTP SSE](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) transport
- [`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) which runs the server as a subprocess and connects to it using the [stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) transport

Examples of all three are shown below.
Each MCP server instance is a [toolset](/docs/ai/tools-toolsets/toolsets) and can be registered with an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) using the `toolsets` argument. You can use the [`async with agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__aenter__) context manager to open and close connections to all registered servers (and in the case of stdio servers, start and stop the subprocesses) around the context where they'll be used in agent runs. You can also use [`async with server`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.__aenter__) to manage the connection or subprocess of a specific server, for example if you'd like to use it with multiple agents. If you don't explicitly enter one of these context managers to set up the server, this will be done automatically when it's needed (e.g. to list the available tools or call a specific tool), but it's more efficient to do so around the entire context where you expect the servers to be used. ### Streamable HTTP Client [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) connects over HTTP using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport to a server. Note [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) requires an MCP server to be running and accepting HTTP connections before running the agent. Running the server is not managed by Pydantic AI. Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport. streamable\_http\_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='streamable-http') ``` Then we can create the client: mcp\_streamable\_http\_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1) agent = Agent('openai:gpt-5.2', toolsets=[server]) # (2) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` Define the MCP server with the URL used to connect. Create an agent with the MCP server attached. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ **What's happening here?** - The model receives the prompt "What is 7 plus 5?" - The model decides "Oh, I've got this `add` tool, that will be a good way to answer this question" - The model returns a tool call - Pydantic AI sends the tool call to the MCP server using the Streamable HTTP transport - The model is called again with the return value of running the `add` tool (12) - The model returns the final answer You can visualise this clearly, and even see the tool call, by adding three lines of code to instrument the example with [logfire](https://logfire.pydantic.dev/docs): mcp\_sse\_client\_logfire.py ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() ``` ### SSE Client [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE) connects over HTTP using the [HTTP + Server Sent Events transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) to a server. Note The SSE transport in MCP is deprecated; you should use Streamable HTTP instead.
Before creating the SSE client, we need to run a server that supports the SSE transport. sse\_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='sse') ``` Then we can create the client: mcp\_sse\_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') # (1) agent = Agent('openai:gpt-5.2', toolsets=[server]) # (2) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` Define the MCP server with the URL used to connect. Create an agent with the MCP server attached. _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ### MCP "stdio" Server MCP also offers [stdio transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) where the server is run as a subprocess and communicates with the client over `stdin` and `stdout`. In this case, you'd use the [`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) class. In this example we use a simple MCP server that provides weather tools. mcp\_stdio\_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['mcp_server.py'], timeout=10) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('What is the weather in Paris?') print(result.output) #> The weather in Paris is sunny and 26 degrees Celsius. ``` ## Loading MCP Servers from Configuration Instead of creating MCP server instances individually in code, you can load multiple servers from a JSON configuration file using [`load_mcp_servers()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.load_mcp_servers). This is particularly useful when you need to manage multiple MCP servers or want to configure servers externally without modifying code. ### Configuration Format The configuration file should be a JSON file with an `mcpServers` object containing server definitions. Each server is identified by a unique key and contains the configuration for that server type: mcp\_config.json ```json { "mcpServers": { "python-runner": { "command": "uv", "args": ["run", "mcp-run-python", "stdio"] }, "weather": { "command": "python", "args": ["mcp_server.py"] }, "weather-api": { "url": "http://localhost:3001/sse" }, "calculator": { "url": "http://localhost:8000/mcp" } } } ``` Note The MCP server is only inferred to be an SSE server because of the `/sse` suffix. Any other server with the "url" field will be inferred to be a Streamable HTTP server. We made this decision given that the SSE transport is deprecated. ### Environment Variables The configuration file supports environment variable expansion using the `${VAR}` and `${VAR:-default}` syntax, [like Claude Code](https://code.claude.com/docs/en/mcp#environment-variable-expansion-in-mcp-json). 
This is useful for keeping sensitive information like API keys or host names out of your configuration files: mcp\_config\_with\_env.json ```json { "mcpServers": { "python-runner": { "command": "${PYTHON_CMD:-python3}", "args": ["run", "${MCP_MODULE}", "stdio"], "env": { "API_KEY": "${MY_API_KEY}" } }, "weather-api": { "url": "https://${SERVER_HOST:-localhost}:${SERVER_PORT:-8080}/sse" } } } ``` When loading this configuration with [`load_mcp_servers()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.load_mcp_servers): - `${VAR}` references will be replaced with the corresponding environment variable values. - `${VAR:-default}` references will use the environment variable value if set, otherwise the default value. Caution If a referenced environment variable using `${VAR}` syntax is not defined, a `ValueError` will be raised. Use the `${VAR:-default}` syntax to provide a fallback value. ### Usage mcp\_config\_loader.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import load_mcp_servers # Load all servers from configuration file servers = load_mcp_servers('mcp_config.json') # Create agent with all loaded servers agent = Agent('openai:gpt-5.2', toolsets=servers) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) ``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ ## Tool call customization MCP servers support a `process_tool_call` function that allows you to customize tool call requests and their responses. A common use case is injecting metadata into the requests that the server needs: mcp\_process\_tool\_call.py ```python from typing import Any from pydantic_ai import Agent, RunContext from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult from pydantic_ai.models.test import TestModel async def process_tool_call( ctx: RunContext[int], call_tool: CallToolFunc, name: str, tool_args: dict[str, Any], ) -> ToolResult: """A tool call processor that passes along the deps.""" return await call_tool(name, tool_args, {'deps': ctx.deps}) server = MCPServerStdio('python', args=['mcp_server.py'], process_tool_call=process_tool_call) agent = Agent( model=TestModel(call_tools=['echo_deps']), deps_type=int, toolsets=[server] ) async def main(): result = await agent.run('Echo with deps set to 42', deps=42) print(result.output) #> {"echo_deps":{"echo":"This is an echo message","deps":42}} ``` How you access the metadata is specific to the MCP server SDK. For example, with the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk), it is accessible via the [`ctx: Context`](https://github.com/modelcontextprotocol/python-sdk#context) argument that can be included on tool call handlers: mcp\_server.py ```python from typing import Any from mcp.server.fastmcp import Context, FastMCP from mcp.server.session import ServerSession mcp = FastMCP('Pydantic AI MCP Server') @mcp.tool() async def echo_deps(ctx: Context[ServerSession, None]) -> dict[str, Any]: """Echo the run context. Args: ctx: Context object containing request and session information. Returns: Dictionary with an echo message and the deps.
""" await ctx.info('This is an info message') deps: Any = getattr(ctx.request_context.meta, 'deps') return {'echo': 'This is an echo message', 'deps': deps} if __name__ == '__main__': mcp.run() ``` ## Using Tool Prefixes to Avoid Naming Conflicts When connecting to multiple MCP servers that might provide tools with the same name, you can use the `tool_prefix` parameter to avoid naming conflicts. This parameter adds a prefix to all tool names from a specific server. This allows you to use multiple servers that might have overlapping tool names without conflicts: mcp\_tool\_prefix\_http\_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Create two servers with different prefixes weather_server = MCPServerSSE( 'http://localhost:3001/sse', tool_prefix='weather' # Tools will be prefixed with 'weather_' ) calculator_server = MCPServerSSE( 'http://localhost:3002/sse', tool_prefix='calc' # Tools will be prefixed with 'calc_' ) # Both servers might have a tool named 'get_data', but they'll be exposed as: # - 'weather_get_data' # - 'calc_get_data' agent = Agent('openai:gpt-5.2', toolsets=[weather_server, calculator_server]) ``` ## Server Instructions MCP servers can provide instructions during initialization that give context about how to best interact with the server's tools. These are accessible via the [`instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.instructions) property after the connection is established, and can be automatically injected into the agent's instructions by setting `include_instructions=True` when creating the server. mcp\_server\_include\_instructions.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp', include_instructions=True) agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` ## Tool metadata MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset). The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) object that's passed to filter functions. ## Resources MCP servers can provide [resources](https://modelcontextprotocol.io/docs/concepts/resources) - files, data, or content that can be accessed by the client. Resources in MCP are application-driven, with host applications determining how to incorporate context manually, based on their needs. This means they will _not_ be exposed to the LLM automatically (unless a tool returns a `ResourceLink` or `EmbeddedResource`). Pydantic AI provides methods to discover and read resources from MCP servers: - [`list_resources()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.list_resources) - List all available resources on the server - [`list_resource_templates()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.list_resource_templates) - List resource templates with parameter placeholders - [`read_resource(uri)`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.read_resource) - Read the contents of a specific resource by URI Resources are automatically converted: text content is returned as `str`, and binary content is returned as [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). 
Before consuming resources, we need to run a server that exposes some: mcp\_resource\_server.py ```python from mcp.server.fastmcp import FastMCP mcp = FastMCP('Pydantic AI MCP Server') @mcp.resource('resource://user_name.txt', mime_type='text/plain') async def user_name_resource() -> str: return 'Alice' if __name__ == '__main__': mcp.run() ``` Then we can create the client: mcp\_resources.py ```python import asyncio from pydantic_ai.mcp import MCPServerStdio async def main(): server = MCPServerStdio('python', args=['-m', 'mcp_resource_server']) async with server: # List all available resources resources = await server.list_resources() for resource in resources: print(f' - {resource.name}: {resource.uri} ({resource.mime_type})') #> - user_name_resource: resource://user_name.txt (text/plain) # Read a text resource user_name = await server.read_resource('resource://user_name.txt') print(f'Text content: {user_name}') #> Text content: Alice if __name__ == '__main__': asyncio.run(main()) ``` _(This example is complete, it can be run "as is")_ ## Custom TLS / SSL configuration In some environments you need to tweak how HTTPS connections are established - for example to trust an internal Certificate Authority, present a client certificate for **mTLS**, or (during local development only!) disable certificate verification altogether. All HTTP-based MCP client classes ([`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) and [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE)) expose an `http_client` parameter that lets you pass your own pre-configured [`httpx.AsyncClient`](https://www.python-httpx.org/async/). mcp\_custom\_tls\_client.py ```python import ssl import httpx from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Trust an internal / self-signed CA ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem') # OPTIONAL: if the server requires **mutual TLS**, load your client certificate ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key') http_client = httpx.AsyncClient( verify=ssl_ctx, timeout=httpx.Timeout(10.0), ) server = MCPServerSSE( 'https://localhost:3001/sse', http_client=http_client, # (1) ) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` When you supply `http_client`, Pydantic AI reuses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic. ## Client Identification When connecting to an MCP server, you can optionally specify an [Implementation](https://modelcontextprotocol.io/specification/2025-11-25/schema#implementation) object as client information that will be sent to the server during initialization.
This is useful for: - Identifying your application in server logs - Allowing servers to provide custom behavior based on the client - Debugging and monitoring MCP connections - Version-specific feature negotiation All MCP client classes ([`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio), [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP), and [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE)) support the `client_info` parameter: mcp\_client\_with\_name.py ```python from mcp import types as mcp_types from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE( 'http://localhost:3001/sse', client_info=mcp_types.Implementation( name='MyApplication', version='2.1.0', ), ) ``` ## MCP Sampling What is MCP Sampling? In MCP [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a system by which an MCP server can make LLM calls via the MCP client - effectively proxying requests to an LLM via the client over whatever transport is being used. Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision them each with their own LLM credentials or when a public MCP server would like the connecting client to pay for LLM calls. Confusingly it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain. Sampling Diagram Here's a mermaid diagram that may or may not make the data flow clearer: ```mermaid sequenceDiagram participant LLM participant MCP_Client as MCP client participant MCP_Server as MCP server MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM tool call response MCP_Client->>MCP_Server: tool call MCP_Server->>MCP_Client: sampling "create message" MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM text response MCP_Client->>MCP_Server: sampling response MCP_Server->>MCP_Client: tool call response ``` Pydantic AI supports sampling as both a client and server. See the [server](/docs/ai/mcp/server#mcp-sampling) documentation for details on how to use sampling within a server. Sampling is automatically supported by Pydantic AI agents when they act as a client. To be able to use sampling, an MCP server instance needs to have a [`sampling_model`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.sampling_model) set. This can be done either directly on the server using the constructor keyword argument or the property, or by using [`agent.set_mcp_sampling_model()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.set_mcp_sampling_model) to set the agent's model or one specified as an argument as the sampling model on all MCP servers registered with that agent. Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments). 
Sampling MCP Server generate\_svg.py ```python import re from pathlib import Path from mcp import SamplingMessage from mcp.server.fastmcp import Context, FastMCP from mcp.types import TextContent app = FastMCP() @app.tool() async def image_generator(ctx: Context, subject: str, style: str) -> str: prompt = f'{subject=} {style=}' # `ctx.session.create_message` is the sampling call result = await ctx.session.create_message( [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))], max_tokens=1_024, system_prompt='Generate an SVG image as per the user input', ) assert isinstance(result.content, TextContent) path = Path(f'{subject}_{style}.svg') # remove triple backticks if the svg was returned within markdown if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M): path.write_text(m.group(1), encoding='utf-8') else: path.write_text(result.content.text, encoding='utf-8') return f'See {path}' if __name__ == '__main__': # run the server via stdio app.run() ``` Using this server with an `Agent` will automatically allow sampling: sampling\_mcp\_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['generate_svg.py']) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): agent.set_mcp_sampling_model() result = await agent.run('Create an image of a robot in a punk style.') print(result.output) #> Image file written to robot_punk.svg. ``` _(This example is complete, it can be run "as is")_ You can disallow sampling by setting [`allow_sampling=False`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.allow_sampling) when creating the server reference, e.g.: sampling\_disallowed.py ```python from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( 'python', args=['generate_svg.py'], allow_sampling=False, ) ``` ## Elicitation In MCP, [elicitation](https://modelcontextprotocol.io/docs/concepts/elicitation) allows a server to request [structured input](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) from the client for missing or additional context during a session. Elicitation lets a server essentially say "Hold on - I need to know X before I can continue" rather than requiring everything upfront or taking a shot in the dark. ### How Elicitation works Elicitation introduces a new protocol message type called [`ElicitRequest`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitrequest), which is sent from the server to the client when it needs additional information. The client can then respond with an [`ElicitResult`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitresult) or an `ErrorData` message. Here's a typical interaction: - User makes a request to the MCP server (e.g. "Book a table at that Italian place") - The server identifies that it needs more information (e.g. "Which Italian place?", "What date and time?") - The server sends an `ElicitRequest` to the client asking for the missing information. - The client receives the request, presents it to the user (e.g. via a terminal prompt, GUI dialog, or web interface). - User provides the requested information, or chooses to `decline` or `cancel` the request. - The client sends an `ElicitResult` back to the server with the user's response. - With the structured data, the server can continue processing the original request. This allows for a more interactive and user-friendly experience, especially for multi-stage workflows.
Instead of requiring all information upfront, the server can ask for it as needed, making the interaction feel more natural. ### Setting up Elicitation To enable elicitation, provide an [`elicitation_callback`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.elicitation_callback) function when creating your MCP server instance: restaurant\_server.py ```python from mcp.server.fastmcp import Context, FastMCP from pydantic import BaseModel, Field mcp = FastMCP(name='Restaurant Booking') class BookingDetails(BaseModel): """Schema for restaurant booking information.""" restaurant: str = Field(description='Choose a restaurant') party_size: int = Field(description='Number of people', ge=1, le=8) date: str = Field(description='Reservation date (DD-MM-YYYY)') @mcp.tool() async def book_table(ctx: Context) -> str: """Book a restaurant table with user input.""" # Ask user for booking details using Pydantic schema result = await ctx.elicit(message='Please provide your booking details:', schema=BookingDetails) if result.action == 'accept' and result.data: booking = result.data return f'✅ Booked table for {booking.party_size} at {booking.restaurant} on {booking.date}' elif result.action == 'decline': return 'No problem! Maybe another time.' else: # cancel return 'Booking cancelled.' if __name__ == '__main__': mcp.run(transport='stdio') ``` This server demonstrates elicitation by requesting structured booking details from the client when the `book_table` tool is called. Here's how to create a client that handles these elicitation requests: client\_example.py ```python import asyncio from typing import Any from mcp.client.session import ClientSession from mcp.shared.context import RequestContext from mcp.types import ElicitRequestParams, ElicitResult from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio async def handle_elicitation( context: RequestContext[ClientSession, Any, Any], params: ElicitRequestParams, ) -> ElicitResult: """Handle elicitation requests from MCP server.""" print(f'\n{params.message}') if not params.requestedSchema: response = input('Response: ') return ElicitResult(action='accept', content={'response': response}) # Collect data for each field properties = params.requestedSchema['properties'] data = {} for field, info in properties.items(): description = info.get('description', field) value = input(f'{description}: ') # Convert to proper type based on JSON schema if info.get('type') == 'integer': data[field] = int(value) else: data[field] = value # Confirm confirm = input('\nConfirm booking? (y/n/c): ').lower() if confirm == 'y': print('Booking details:', data) return ElicitResult(action='accept', content=data) elif confirm == 'n': return ElicitResult(action='decline') else: return ElicitResult(action='cancel') # Set up MCP server connection restaurant_server = MCPServerStdio( 'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation ) # Create agent agent = Agent('openai:gpt-5.2', toolsets=[restaurant_server]) async def main(): """Run the agent to book a restaurant table.""" result = await agent.run('Book me a table') print(f'\nResult: {result.output}') if __name__ == '__main__': asyncio.run(main()) ``` ### Supported Schema Types MCP elicitation supports string, number, boolean, and enum types with flat object structures only. These limitations ensure reliable cross-client compatibility. See [supported schema types](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) for details. 
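To make these limits concrete, here's a minimal sketch of a schema that stays within them (the model name and its fields are hypothetical, loosely following the booking example above): flat primitive fields and string-valued enums are fine, while nested models or lists fall outside the supported schema types.

flat\_booking\_schema.py
```python
from enum import Enum

from pydantic import BaseModel, Field


class Cuisine(str, Enum):
    """String-valued enums map to the supported 'enum' schema type."""

    ITALIAN = 'italian'
    JAPANESE = 'japanese'


class TablePreferences(BaseModel):
    """A flat schema using only the supported primitive types."""

    cuisine: Cuisine = Field(description='Preferred cuisine')
    party_size: int = Field(description='Number of people', ge=1)
    outdoor_seating: bool = Field(description='Prefer outdoor seating?')
    notes: str = Field(description='Anything else we should know')
```

A schema like this can be passed to `ctx.elicit()` exactly as in the restaurant example above; a nested `BaseModel` field or a `list` field would not be guaranteed to work across clients.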
### Security MCP Elicitation requires careful handling - servers must not request sensitive information, and clients must implement user approval controls with clear explanations. See [security considerations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#security-considerations) for details. --- # [FastMCP Client](https://pydantic.dev/docs/ai/mcp/fastmcp-client/) # FastMCP Client [FastMCP](https://gofastmcp.com/) is a higher-level MCP framework that bills itself as "The fast, Pythonic way to build MCP servers and clients." It supports additional capabilities on top of the MCP specification like [Tool Transformation](https://gofastmcp.com/patterns/tool-transformation), [OAuth](https://gofastmcp.com/clients/auth/oauth), and more. As an alternative to Pydantic AI's standard [`MCPServer` MCP client](/docs/ai/mcp/client) built on the [MCP SDK](https://github.com/modelcontextprotocol/python-sdk), you can use the [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset) [toolset](/docs/ai/tools-toolsets/toolsets) that leverages the [FastMCP Client](https://gofastmcp.com/clients/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers/). Note that it does not yet support elicitation or sampling, which are supported by the [standard `MCPServer` client](/docs/ai/mcp/client). ## Install To use the `FastMCPToolset`, you will need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `fastmcp` optional group: Using pip: Terminal ```bash pip install "pydantic-ai-slim[fastmcp]" ``` Using uv: Terminal ```bash uv add "pydantic-ai-slim[fastmcp]" ``` ## Usage A `FastMCPToolset` can then be created from: - A FastMCP Server: `FastMCPToolset(fastmcp.FastMCP('my_server'))` - A FastMCP Client: `FastMCPToolset(fastmcp.Client(...))` - A FastMCP Transport: `FastMCPToolset(fastmcp.StdioTransport(command='python', args=['mcp_server.py']))` - A Streamable HTTP URL: `FastMCPToolset('http://localhost:8000/mcp')` - An HTTP SSE URL: `FastMCPToolset('http://localhost:8000/sse')` - A Python Script: `FastMCPToolset('my_server.py')` - A Node.js Script: `FastMCPToolset('my_server.js')` - A JSON MCP Configuration: `FastMCPToolset({'mcpServers': {'my_server': {'command': 'python', 'args': ['mcp_server.py']}}})` If you already have a [FastMCP Server](https://gofastmcp.com/servers) in the same codebase as your Pydantic AI agent, you can create a `FastMCPToolset` directly from it and save the agent a network round trip: ```python from fastmcp import FastMCP from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset fastmcp_server = FastMCP('my_server') @fastmcp_server.tool() async def add(a: int, b: int) -> int: return a + b toolset = FastMCPToolset(fastmcp_server) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12.
``` _(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_ Connecting your agent to a Streamable HTTP MCP Server is as simple as: ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset toolset = FastMCPToolset('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` _(This example is complete, it can be run "as is")_ You can also create a `FastMCPToolset` from a JSON MCP Configuration: ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset mcp_config = { 'mcpServers': { 'python_runner': { 'command': 'uvx', 'args': ['mcp-run-python', 'stdio'] }, 'weather_server': { 'command': 'python', 'args': ['mcp_server.py'] } } } toolset = FastMCPToolset(mcp_config) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` _(This example is complete, it can be run "as is")_ --- # [Overview](https://pydantic.dev/docs/ai/mcp/overview/) # Overview Pydantic AI supports [Model Context Protocol (MCP)](https://modelcontextprotocol.io) in multiple ways: 1. [Agents](/docs/ai/core-concepts/agent) can connect to MCP servers and use their tools using three different methods: 1. Pydantic AI can act as an MCP client and connect directly to local and remote MCP servers. [Learn more](/docs/ai/mcp/client) about [`MCPServer`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer). 2. Pydantic AI can use the [FastMCP Client](https://gofastmcp.com/clients/client/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers). [Learn more](/docs/ai/mcp/fastmcp-client) about [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset). 3. Some model providers can themselves connect to remote MCP servers using a "built-in tool". [Learn more](/docs/ai/tools-toolsets/builtin-tools#mcp-server-tool) about [`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool). 2. Agents can be used within MCP servers. [Learn more](/docs/ai/mcp/server) ## What is MCP? The Model Context Protocol is a standardized protocol that allows AI applications (including programmatic agents like Pydantic AI, coding agents like [Cursor](https://www.cursor.com/), and desktop applications like [Claude Desktop](https://claude.ai/download)) to connect to external tools and services using a common interface. As with other protocols, the dream of MCP is that a wide range of applications can speak to each other without the need for specific integrations. There is a great list of MCP servers at [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers). Some examples of what this means: - Pydantic AI could use a web search service implemented as an MCP server to implement a deep research agent - Cursor could connect to the [Pydantic Logfire](https://github.com/pydantic/logfire-mcp) MCP server to search logs, traces and metrics to gain context while fixing a bug - Pydantic AI, or any other MCP client, could connect to our [Run Python](https://github.com/pydantic/mcp-run-python) MCP server to run arbitrary Python code in a sandboxed environment --- # [Server](https://pydantic.dev/docs/ai/mcp/server/) # Server Pydantic AI models can also be used within MCP Servers.
## MCP Server Here's a simple example of a [Python MCP server](https://github.com/modelcontextprotocol/python-sdk) using Pydantic AI within a tool call: mcp\_server.py ```python from mcp.server.fastmcp import FastMCP from pydantic_ai import Agent server = FastMCP('Pydantic AI Server') server_agent = Agent( 'anthropic:claude-haiku-4-5', instructions='always reply in rhyme' ) @server.tool() async def poet(theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}') return r.output if __name__ == '__main__': server.run() ``` ## Simple client This server can be queried with any MCP client. Here is an example using the Python SDK directly: mcp\_client.py ```py import asyncio import os from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client async def client(): server_params = StdioServerParameters( command='python', args=['mcp_server.py'], env=os.environ ) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) """ Oh, socks, those garments soft and sweet, That nestle softly 'round our feet, From cotton, wool, or blended thread, They keep our toes from feeling dread. """ if __name__ == '__main__': asyncio.run(client()) ``` ## MCP Sampling What is MCP Sampling? See the [MCP client docs](/docs/ai/mcp/client#mcp-sampling) for details of what MCP sampling is, and how you can support it when using Pydantic AI as an MCP client. When Pydantic AI agents are used within MCP servers, they can use sampling via [`MCPSamplingModel`](/docs/ai/api/models/mcp-sampling/#pydantic_ai.models.mcp_sampling.MCPSamplingModel). We can extend the above example to use sampling so instead of connecting directly to the LLM, the agent calls back through the MCP client to make LLM calls. mcp\_server\_sampling.py ```py from mcp.server.fastmcp import Context, FastMCP from pydantic_ai import Agent from pydantic_ai.models.mcp_sampling import MCPSamplingModel server = FastMCP('Pydantic AI Server with sampling') server_agent = Agent(instructions='always reply in rhyme') @server.tool() async def poet(ctx: Context, theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}', model=MCPSamplingModel(session=ctx.session)) return r.output if __name__ == '__main__': server.run() # run the server over stdio ``` The [above](#simple-client) client does not support sampling, so if you tried to use it with this server you'd get an error. 
The simplest way to support sampling in an MCP client is to [use](/docs/ai/mcp/client#mcp-sampling) a Pydantic AI agent as the client, but if you wanted to support sampling with the vanilla MCP SDK, you could do so like this: mcp\_client\_sampling.py ```py import asyncio from typing import Any from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client from mcp.shared.context import RequestContext from mcp.types import ( CreateMessageRequestParams, CreateMessageResult, ErrorData, TextContent, ) async def sampling_callback( context: RequestContext[ClientSession, Any], params: CreateMessageRequestParams ) -> CreateMessageResult | ErrorData: print('sampling system prompt:', params.systemPrompt) #> sampling system prompt: always reply in rhyme print('sampling messages:', params.messages) """ sampling messages: [ SamplingMessage( role='user', content=TextContent( type='text', text='write a poem about socks', annotations=None, meta=None, ), meta=None, ) ] """ # TODO get the response content by calling an LLM... response_content = 'Socks for a fox.' return CreateMessageResult( role='assistant', content=TextContent(type='text', text=response_content), model='fictional-llm', ) async def client(): server_params = StdioServerParameters(command='python', args=['mcp_server_sampling.py']) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write, sampling_callback=sampling_callback) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) #> Socks for a fox. if __name__ == '__main__': asyncio.run(client()) ``` _(This example is complete, it can be run "as is")_ --- # [Anthropic](https://pydantic.dev/docs/ai/models/anthropic/) # Anthropic ## Install To use `AnthropicModel` models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `anthropic` optional group: Using pip: Terminal ```bash pip install "pydantic-ai-slim[anthropic]" ``` Using uv: Terminal ```bash uv add "pydantic-ai-slim[anthropic]" ``` ## Configuration To use [Anthropic](https://anthropic.com) through their API, go to [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) to generate an API key. `AnthropicModelName` contains a list of available Anthropic models. ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export ANTHROPIC_API_KEY='your-api-key' ``` You can then use `AnthropicModel` by name: ```python from pydantic_ai import Agent agent = Agent('anthropic:claude-sonnet-4-6') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel model = AnthropicModel('claude-sonnet-4-5') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider model = AnthropicModel( 'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key') ) agent = Agent(model) ...
``` ## Custom HTTP Client You can customize the `AnthropicProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider custom_http_client = AsyncClient(timeout=30) model = AnthropicModel( 'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` ## Cloud Platform Integrations You can use Anthropic models through cloud platforms by passing a custom client to [`AnthropicProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.anthropic.AnthropicProvider). ### AWS Bedrock To use Claude models via [AWS Bedrock](https://aws.amazon.com/bedrock/claude/), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock) on how to set up an `AsyncAnthropicBedrock` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicBedrock from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider bedrock_client = AsyncAnthropicBedrock() # Uses AWS credentials from environment provider = AnthropicProvider(anthropic_client=bedrock_client) model = AnthropicModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0', provider=provider) agent = Agent(model) ... ``` Bedrock vs BedrockConverseModel This approach uses Anthropic's SDK with AWS Bedrock credentials. For an alternative using AWS SDK (boto3) directly, see [`BedrockConverseModel`](/docs/ai/models/bedrock). ### Google Vertex AI To use Claude models via [Google Cloud Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-vertex-ai) on how to set up an `AsyncAnthropicVertex` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicVertex from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider vertex_client = AsyncAnthropicVertex(region='us-east5', project_id='your-project-id') provider = AnthropicProvider(anthropic_client=vertex_client) model = AnthropicModel('claude-sonnet-4-5', provider=provider) agent = Agent(model) ... ``` Vertex vs GoogleModel This approach uses Anthropic's SDK with Vertex AI credentials. For an alternative using Google's Vertex AI SDK directly, see [`GoogleModel`](/docs/ai/models/google). ### Microsoft Foundry To use Claude models via [Microsoft Foundry](https://ai.azure.com/), follow the [Anthropic documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) on how to set up an `AsyncAnthropicFoundry` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicFoundry from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider foundry_client = AsyncAnthropicFoundry( api_key='your-foundry-api-key', # Or set ANTHROPIC_FOUNDRY_API_KEY resource='your-resource-name', ) provider = AnthropicProvider(anthropic_client=foundry_client) model = AnthropicModel('claude-sonnet-4-5', provider=provider) agent = Agent(model) ... 
``` See [Anthropic's Microsoft Foundry documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) for setup instructions including Entra ID authentication. ## Prompt Caching Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides four ways to use prompt caching: 1. **Cache User Messages with [`CachePoint`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CachePoint)**: Insert a `CachePoint` marker in your user messages to cache everything before it 2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly. When you have both static and dynamic [instructions](/docs/ai/core-concepts/agent#instructions), the cache point is automatically placed after the last static instruction, so dynamic instructions can change without invalidating the static cache. 3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly 4. **Cache All Messages**: Set [`AnthropicModelSettings.anthropic_cache_messages`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_messages) to `True` to automatically cache all messages ### Example 1: Automatic Message Caching Use `anthropic_cache_messages` to automatically cache all messages up to and including the newest user message: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='You are a helpful assistant.', model_settings=AnthropicModelSettings( anthropic_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ### Example 2: Comprehensive Caching Strategy Combine multiple cache settings for maximum savings: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # Cache system instructions anthropic_cache_tool_definitions='1h', # Cache tool definitions with 1h TTL anthropic_cache_messages=True, # Also cache the last message ), ) @agent.tool def search_docs(ctx: RunContext, query: str) -> str: """Search documentation.""" return f'Results for {query}' result = agent.run_sync('Search for Python best practices') print(result.output) ``` ### Smart Instruction Caching When you use `anthropic_cache_instructions` with both static and dynamic [instructions](/docs/ai/core-concepts/agent#instructions), Pydantic AI automatically places the cache boundary at the optimal point. 
Static instructions (from `Agent(instructions=...)`) are sorted before dynamic instructions (from `@agent.instructions` functions or [toolsets](/docs/ai/tools-toolsets/toolsets)), and the cache point is placed after the last static instruction block. This means your stable, static instructions are cached efficiently, while dynamic instructions (which may change between requests) remain outside the cache boundary and don't cause cache invalidation. ```python from datetime import date from pydantic_ai import Agent, RunContext from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', deps_type=str, instructions='You are a helpful customer service agent. Follow company policy.', # (1) model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # (2) ), ) @agent.instructions def dynamic_context(ctx: RunContext[str]) -> str: # (3) return f"Customer name: {ctx.deps}. Today's date: {date.today()}." result = agent.run_sync('What is your return policy?', deps='Alice') print(result.output) ``` Static instructions are cached across requests. Enables smart cache placement at the static/dynamic boundary. Dynamic instructions change per-request and are not cached. ### Example 3: Fine-Grained Control with CachePoint Use manual `CachePoint` markers to control cache locations precisely: ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ### Accessing Cache Usage Statistics Access cache usage statistics via `result.usage()`: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True # Default 5m TTL ), ) result = agent.run_sync('Your question') usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` ### Cache Point Limits Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors. #### How Cache Points Are Allocated Cache points can be placed in three locations: 1. **System Prompt**: Via `anthropic_cache_instructions` setting (adds cache point to last system prompt block) 2. **Tool Definitions**: Via `anthropic_cache_tool_definitions` setting (adds cache point to last tool definition) 3. **Messages**: Via `CachePoint` markers or `anthropic_cache_messages` setting (adds cache points to message content) Each setting uses **at most 1 cache point**, but you can combine them. 
#### Example: Using All 3 Cache Point Sources Define an agent with all cache settings enabled: ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point anthropic_cache_messages=True, # 1 cache point ), ) @agent.tool_plain def my_tool() -> str: return 'result' # This uses 3 cache points (instructions + tools + last message) # You can add 1 more CachePoint marker before hitting the limit result = agent.run_sync([ 'Context', CachePoint(), # 4th cache point - OK 'Question' ]) print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` #### Automatic Cache Point Limiting When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones). Define an agent with 2 cache points from settings: ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-6', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` **Key Points**: - System and tool cache points are **always preserved** - The cache point created by `anthropic_cache_messages` is **always preserved** (as it's the newest message cache point) - Additional `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded - This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching ## Message Compaction Anthropic supports [automatic context compaction](https://docs.anthropic.com/en/docs/build-with-claude/compaction) to manage long conversations. When input tokens exceed a configured threshold, the API automatically generates a summary that replaces older messages while preserving context. The easiest way to enable compaction is with the [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) capability: anthropic\_compaction.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicCompaction agent = Agent( 'anthropic:claude-sonnet-4-6', capabilities=[AnthropicCompaction(token_threshold=100_000)], ) ``` The capability accepts: - **`token_threshold`** (default: 150,000, minimum: 50,000): Compaction triggers when input tokens exceed this value. - **`instructions`**: Custom instructions for how the summary should be generated. 
- **`pause_after_compaction`**: When `True`, the response stops after the compaction block with `stop_reason='compaction'`, allowing explicit handling before continuing. Alternatively, you can configure compaction directly via model settings using [`anthropic_context_management`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_context_management): anthropic\_compaction\_settings.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent('anthropic:claude-sonnet-4-6') result = agent.run_sync( 'Hello!', model_settings=AnthropicModelSettings( anthropic_context_management={ 'edits': [{'type': 'compact_20260112', 'trigger': {'type': 'input_tokens', 'value': 100_000}}] } ), ) ``` Note Compaction blocks returned by Anthropic contain readable text summaries. They are automatically round-tripped in subsequent requests when included in the message history. --- # [Bedrock](https://pydantic.dev/docs/ai/models/bedrock/) # Bedrock ## Install To use `BedrockConverseModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: Using pip: Terminal ```bash pip install "pydantic-ai-slim[bedrock]" ``` Using uv: Terminal ```bash uv add "pydantic-ai-slim[bedrock]" ``` ## Configuration To use [AWS Bedrock](https://aws.amazon.com/bedrock/), you'll need an AWS account with Bedrock enabled and appropriate credentials. You can use either AWS credentials directly or a pre-configured boto3 client. `BedrockModelName` contains a list of available Bedrock models, including models from Anthropic, Amazon, Cohere, Meta, and Mistral. ## Environment variables You can set your AWS credentials as environment variables ([among other options](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables)): Terminal ```bash export AWS_BEARER_TOKEN_BEDROCK='your-api-key' # or: export AWS_ACCESS_KEY_ID='your-access-key' export AWS_SECRET_ACCESS_KEY='your-secret-key' export AWS_DEFAULT_REGION='us-east-1' # or your preferred region ``` You can then use `BedrockConverseModel` by name: ```python from pydantic_ai import Agent agent = Agent('bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0') ... ``` Or initialize the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel model = BedrockConverseModel('anthropic.claude-sonnet-4-5-20250929-v1:0') agent = Agent(model) ... ``` ## Customizing Bedrock Runtime API You can customize the Bedrock Runtime API calls by adding additional parameters, such as [guardrail configurations](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) and [performance settings](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html). For a complete list of configurable parameters, refer to the documentation for [`BedrockModelSettings`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings).
customize\_bedrock\_model\_settings.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings # Define Bedrock model settings with guardrail and performance configurations bedrock_model_settings = BedrockModelSettings( bedrock_guardrail_config={ 'guardrailIdentifier': 'v1', 'guardrailVersion': 'v1', 'trace': 'enabled' }, bedrock_performance_configuration={ 'latency': 'optimized' } ) model = BedrockConverseModel(model_name='us.amazon.nova-pro-v1:0') agent = Agent(model=model, model_settings=bedrock_model_settings) ``` ## Prompt Caching Bedrock supports [prompt caching](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) on Anthropic models so you can reuse expensive context across requests. Pydantic AI provides four ways to use prompt caching: 1. **Cache User Messages with [`CachePoint`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CachePoint)**: Insert a `CachePoint` marker to cache everything before it in the current user message. 2. **Cache System Instructions**: Enable [`BedrockModelSettings.bedrock_cache_instructions`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_instructions) to append a cache point after the system prompt. When you have both static and dynamic [instructions](/docs/ai/core-concepts/agent#instructions), the cache point is placed after the last static instruction, so dynamic instructions can change without invalidating the static cache. 3. **Cache Tool Definitions**: Enable [`BedrockModelSettings.bedrock_cache_tool_definitions`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_tool_definitions) to cache your tool schemas. 4. **Cache All Messages**: Set [`BedrockModelSettings.bedrock_cache_messages`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_messages) to `True` to automatically cache the last user message. No TTL Support Unlike the direct Anthropic API, Bedrock manages cache TTL automatically. All cache settings are boolean only -- no `'5m'` or `'1h'` options. Minimum Token Threshold AWS only serves cached content once a segment crosses the provider-specific minimum token thresholds (see the [Bedrock prompt caching docs](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html)). Short prompts or tool definitions below those limits will bypass the cache, so don't expect savings for tiny payloads. 
### Example 1: Automatic Message Caching Use `bedrock_cache_messages` to automatically cache the last user message: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='You are a helpful assistant.', model_settings=BedrockModelSettings( bedrock_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ### Example 2: Comprehensive Caching Strategy Combine multiple cache settings for maximum savings: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0') agent = Agent( model, system_prompt='Detailed instructions...', model_settings=BedrockModelSettings( bedrock_cache_instructions=True, # Cache system instructions bedrock_cache_tool_definitions=True, # Cache tool definitions bedrock_cache_messages=True, # Also cache the last message ), ) @agent.tool def search_docs(ctx: RunContext, query: str) -> str: """Search documentation.""" return f'Results for {query}' result = agent.run_sync('Search for Python best practices') print(result.output) ``` ### Example 3: Fine-Grained Control with CachePoint Use manual `CachePoint` markers to control cache locations precisely: ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ### Accessing Cache Usage Statistics Access cache usage statistics via [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage): ```python from pydantic_ai import Agent, CachePoint agent = Agent('bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0') async def main(): result = await agent.run( [ 'Reference material...', CachePoint(), 'What changed since last time?', ] ) usage = result.usage() print(f'Cache writes: {usage.cache_write_tokens}') print(f'Cache reads: {usage.cache_read_tokens}') ``` ### Cache Point Limits Bedrock enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors. #### How Cache Points Are Allocated Cache points can be placed in three locations: 1. **System Prompt**: Via `bedrock_cache_instructions` setting (adds cache point to last system prompt block) 2. **Tool Definitions**: Via `bedrock_cache_tool_definitions` setting (adds cache point to last tool definition) 3. **Messages**: Via `CachePoint` markers or `bedrock_cache_messages` setting (adds cache points to message content) Each setting uses **at most 1 cache point**, but you can combine them. 
#### Automatic Cache Point Limiting When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones). ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', model_settings=BedrockModelSettings( bedrock_cache_instructions=True, # 1 cache point bedrock_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) ``` **Key Points**: - System and tool cache points are **always preserved** - The cache point created by `bedrock_cache_messages` is **always preserved** (as it's the newest message cache point) - Additional `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded - This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching ## `provider` argument You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using AWS credentials directly model = BedrockConverseModel( 'anthropic.claude-sonnet-4-5-20250929-v1:0', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ), ) agent = Agent(model) ... ``` You can also pass a pre-configured boto3 client: ```python import boto3 from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using a pre-configured boto3 client bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1') model = BedrockConverseModel( 'anthropic.claude-sonnet-4-5-20250929-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ... ``` ## Using AWS Application Inference Profiles AWS Bedrock supports [custom application inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-create.html) for cost tracking and resource management. 
Set [`bedrock_inference_profile`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_inference_profile) to route requests through an inference profile while keeping the base model name for detecting model capabilities: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider provider = BedrockProvider(region_name='us-east-2') model = BedrockConverseModel( 'us.anthropic.claude-opus-4-5-20251101-v1:0', provider=provider, settings={ 'bedrock_inference_profile': 'arn:aws:bedrock:us-east-2:123456789012:application-inference-profile/my-profile', }, ) agent = Agent(model) ``` ## Configuring Retries Bedrock uses boto3's built-in retry mechanisms. You can configure retry behavior by passing a custom boto3 client with retry settings: ```python import boto3 from botocore.config import Config from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Configure retry settings config = Config( retries={ 'max_attempts': 5, 'mode': 'adaptive' # Recommended for rate limiting } ) bedrock_client = boto3.client( 'bedrock-runtime', region_name='us-east-1', config=config ) model = BedrockConverseModel( 'us.amazon.nova-micro-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ``` ### Retry Modes - `'legacy'` (default): 5 attempts, basic retry behavior - `'standard'`: 3 attempts, more comprehensive error coverage - `'adaptive'`: 3 attempts with client-side rate limiting (recommended for handling `ThrottlingException`) For more details on boto3 retry configuration, see the [AWS boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html). Note Unlike other providers that use httpx for HTTP requests, Bedrock uses boto3's native retry mechanisms. The retry strategies described in [HTTP Request Retries](/docs/ai/advanced-features/retries) do not apply to Bedrock. --- # [Cerebras](https://pydantic.dev/docs/ai/models/cerebras/) # Cerebras ## Install To use `CerebrasModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cerebras` optional group: - [pip](#tab-panel-68) - [uv](#tab-panel-69) Terminal ```bash pip install "pydantic-ai-slim[cerebras]" ``` Terminal ```bash uv add "pydantic-ai-slim[cerebras]" ``` ## Configuration To use [Cerebras](https://cerebras.ai/) through their API, go to [cloud.cerebras.ai](https://cloud.cerebras.ai/?utm_source=3pi_pydantic-ai&utm_campaign=partner_doc) and generate an API key. For a list of available models, see the [Cerebras models documentation](https://inference-docs.cerebras.ai/models). ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export CEREBRAS_API_KEY='your-api-key' ``` You can then use `CerebrasModel` by name: ```python from pydantic_ai import Agent agent = Agent('cerebras:llama-3.3-70b') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel model = CerebrasModel('llama-3.3-70b') agent = Agent(model) ... 
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel from pydantic_ai.providers.cerebras import CerebrasProvider model = CerebrasModel( 'llama-3.3-70b', provider=CerebrasProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `CerebrasProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel from pydantic_ai.providers.cerebras import CerebrasProvider custom_http_client = AsyncClient(timeout=30) model = CerebrasModel( 'llama-3.3-70b', provider=CerebrasProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` --- # [Cohere](https://pydantic.dev/docs/ai/models/cohere/) # Cohere ## Install To use `CohereModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: - [pip](#tab-panel-70) - [uv](#tab-panel-71) Terminal ```bash pip install "pydantic-ai-slim[cohere]" ``` Terminal ```bash uv add "pydantic-ai-slim[cohere]" ``` ## Configuration To use [Cohere](https://cohere.com/) through their API, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. `CohereModelName` contains a list of the most popular Cohere models. ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export CO_API_KEY='your-api-key' ``` You can then use `CohereModel` by name: ```python from pydantic_ai import Agent agent = Agent('cohere:command-r7b-12-2024') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel model = CohereModel('command-r7b-12-2024') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider model = CohereModel('command-r7b-12-2024', provider=CohereProvider(api_key='your-api-key')) agent = Agent(model) ... ``` You can also customize the `CohereProvider` with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider custom_http_client = AsyncClient(timeout=30) model = CohereModel( 'command-r7b-12-2024', provider=CohereProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` --- # [Google](https://pydantic.dev/docs/ai/models/google/) # Google The `GoogleModel` is a model that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. 
## Install To use `GoogleModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: - [pip](#tab-panel-60) - [uv](#tab-panel-61) Terminal ```bash pip install "pydantic-ai-slim[google]" ``` Terminal ```bash uv add "pydantic-ai-slim[google]" ``` ## Configuration `GoogleModel` lets you use Google's Gemini models through their [Generative Language API](https://ai.google.dev/api/all-methods) (`generativelanguage.googleapis.com`) or [Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) (`*-aiplatform.googleapis.com`). ### API Key (Generative Language API) To use Gemini via the Generative Language API, go to [aistudio.google.com](https://aistudio.google.com/apikey) and create an API key. Once you have the API key, set it as an environment variable: Terminal ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` by name (where GLA stands for Generative Language API): ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(api_key='your-api-key') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... ``` ### Vertex AI (Enterprise/Cloud) If you are an enterprise user, you can also use `GoogleModel` to access Gemini via Vertex AI. This interface has a number of advantages over the Generative Language API: 1. The Vertex AI API comes with more enterprise readiness guarantees. 2. You can [purchase provisioned throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput#purchase-provisioned-throughput) with Vertex AI to guarantee capacity. 3. If you're running Pydantic AI inside GCP, you don't need to set up authentication; it should "just work". 4. You can decide which region to use, which might be important from a regulatory perspective, and might improve latency. You can authenticate using [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials), a service account, or an [API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode). Whichever way you authenticate, you'll need to have Vertex AI enabled in your GCP account. #### Application Default Credentials If you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you can use `GoogleProvider` in Vertex AI mode by name: ```python from pydantic_ai import Agent agent = Agent('google-vertex:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider and model: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True) model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ...
``` #### Service Account To use a service account JSON file, explicitly create the provider and model: google\_model\_service\_account.py ```python from google.oauth2 import service_account from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider credentials = service_account.Credentials.from_service_account_file( 'path/to/service-account.json', scopes=['https://www.googleapis.com/auth/cloud-platform'], ) provider = GoogleProvider(credentials=credentials, project='your-project-id') model = GoogleModel('gemini-3-flash-preview', provider=provider) agent = Agent(model) ... ``` #### API Key To use Vertex AI with an API key, [create a key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode) and set it as an environment variable: Terminal ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` in Vertex AI mode by name: ```python from pydantic_ai import Agent agent = Agent('google-vertex:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider and model: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, api_key='your-api-key') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... ``` #### Customizing Location or Project You can specify the location and/or project when using Vertex AI: google\_model\_location.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, location='asia-east1', project='your-gcp-project-id') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... ``` #### Vertex AI service tier (`google_service_tier`) On **Vertex AI**, optional HTTP headers control how each request uses [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput/use-provisioned-throughput) (PT) and [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) pricing. Set the [`google_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_service_tier) field on [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings) to one of the [`GoogleServiceTier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleServiceTier) values. **Flex PayGo example** ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(location='global') model = GoogleModel('gemini-3-flash-preview', provider=provider) agent = Agent(model) result = agent.run_sync( 'Hello!', model_settings=GoogleModelSettings(google_service_tier='pt_then_flex'), ) ``` After a Flex request, you can inspect [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) `provider_details.get('traffic_type')` (e.g. `ON_DEMAND_FLEX` when Flex was used) if the API returns it. 
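For example, continuing the Flex run above, a minimal check might look like this (the field is only present when the API returns traffic metadata):

```python
# provider_details may be None if the API returned no extra metadata
details = result.response.provider_details or {}
if details.get('traffic_type') == 'ON_DEMAND_FLEX':
    print('This request was served under Flex PayGo pricing.')
```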
#### Model Garden You can access models from the [Model Garden](https://cloud.google.com/model-garden?hl=en) that support the `generateContent` API and are available under your GCP project, including but not limited to Gemini, using one of the following `model_name` patterns: - `{model_id}` for Gemini models - `{publisher}/{model_id}` - `publishers/{publisher}/models/{model_id}` - `projects/{project}/locations/{location}/publishers/{publisher}/models/{model_id}` ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider( project='your-gcp-project-id', location='us-central1', # the region where the model is available ) model = GoogleModel('meta/llama-3.3-70b-instruct-maas', provider=provider) agent = Agent(model) ... ``` ## Custom HTTP Client You can customize the `GoogleProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider custom_http_client = AsyncClient(timeout=30) model = GoogleModel( 'gemini-3-pro-preview', provider=GoogleProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` ## Document, Image, Audio, and Video Input `GoogleModel` supports multi-modal input, including documents, images, audio, and video. YouTube video URLs can be passed directly to Google models: youtube\_input.py ```py from pydantic_ai import Agent, VideoUrl from pydantic_ai.models.google import GoogleModel agent = Agent(GoogleModel('gemini-3-flash-preview')) result = agent.run_sync( [ 'What is this video about?', VideoUrl(url='https://www.youtube.com/watch?v=dQw4w9WgXcQ'), ] ) print(result.output) ``` Files can be uploaded via the [Files API](https://ai.google.dev/gemini-api/docs/files) and passed as URLs: file\_upload.py ```py from pydantic_ai import Agent, DocumentUrl from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider() file = provider.client.files.upload(file='pydantic-ai-logo.png') assert file.uri is not None agent = Agent(GoogleModel('gemini-3-flash-preview', provider=provider)) result = agent.run_sync( [ 'What company is this logo from?', DocumentUrl(url=file.uri, media_type=file.mime_type), ] ) print(result.output) ``` See the [input documentation](/docs/ai/advanced-features/input) for more details and examples. ## Model settings You can customize model behavior using [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings): ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings settings = GoogleModelSettings( temperature=0.2, max_tokens=1024, google_thinking_config={'thinking_level': 'low'}, google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-3-pro-preview') agent = Agent(model, model_settings=settings) ... 
``` ### Configure thinking Gemini 3 models use `thinking_level` to control thinking behavior: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings # Set thinking level for Gemini 3 models model_settings = GoogleModelSettings(google_thinking_config={'thinking_level': 'low'}) # 'low' or 'high' model = GoogleModel('gemini-3-flash-preview') agent = Agent(model, model_settings=model_settings) ... ``` For older models (pre-Gemini 3), you can use `thinking_budget` instead: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings # Disable thinking on older models by setting budget to 0 model_settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0}) model = GoogleModel('gemini-2.5-flash') # Older model agent = Agent(model, model_settings=model_settings) ... ``` Check out the [Gemini API docs](https://ai.google.dev/gemini-api/docs/thinking) for more on thinking. ### Safety settings You can customize the safety settings by setting the `google_safety_settings` field. ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings( google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-3-flash-preview') agent = Agent(model, model_settings=model_settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings. ### Logprobs You can return logprobs from the model in your response by setting `google_logprobs` and `google_top_logprobs` in the [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings). This feature is only supported for non-streaming requests, and only when using Vertex AI. ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings from pydantic_ai.providers.google import GoogleProvider model_settings = GoogleModelSettings( google_logprobs=True, google_top_logprobs=2, ) model = GoogleModel( model_name='gemini-2.5-flash', provider=GoogleProvider(location='europe-west1', vertexai=True), ) agent = Agent(model, model_settings=model_settings) result = agent.run_sync('Your prompt here') # Access logprobs from provider_details logprobs = result.response.provider_details.get('logprobs') avg_logprobs = result.response.provider_details.get('avg_logprobs') ``` See the [Google Dev Blog](https://developers.googleblog.com/unlock-gemini-reasoning-with-logprobs-on-vertex-ai/) for more information. --- # [Groq](https://pydantic.dev/docs/ai/models/groq/) # Groq ## Install To use `GroqModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `groq` optional group: - [pip](#tab-panel-72) - [uv](#tab-panel-73) Terminal ```bash pip install "pydantic-ai-slim[groq]" ``` Terminal ```bash uv add "pydantic-ai-slim[groq]" ``` ## Configuration To use [Groq](https://groq.com/) through their API, go to [console.groq.com/keys](https://console.groq.com/keys) and follow your nose until you find the place to generate an API key. `GroqModelName` contains a list of available Groq models.
## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export GROQ_API_KEY='your-api-key' ``` You can then use `GroqModel` by name: ```python from pydantic_ai import Agent agent = Agent('groq:llama-3.3-70b-versatile') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel model = GroqModel('llama-3.3-70b-versatile') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `GroqProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider custom_http_client = AsyncClient(timeout=30) model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` --- # [Hugging Face](https://pydantic.dev/docs/ai/models/huggingface/) # Hugging Face [Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos. You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure. ## Install To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group: - [pip](#tab-panel-74) - [uv](#tab-panel-75) Terminal ```bash pip install "pydantic-ai-slim[huggingface]" ``` Terminal ```bash uv add "pydantic-ai-slim[huggingface]" ``` ## Configuration To use [Hugging Face](https://huggingface.co/) inference, you'll need to set up an account, which will give you a [free tier](https://huggingface.co/docs/inference-providers/pricing) allowance on [Inference Providers](https://huggingface.co/docs/inference-providers). To set up inference, follow these steps: 1. Go to [Hugging Face](https://huggingface.co/join) and sign up for an account. 2. Create a new access token in [Hugging Face](https://huggingface.co/settings/tokens). 3. Set the `HF_TOKEN` environment variable to the token you just created. Once you have a Hugging Face access token, you can set it as an environment variable: Terminal ```bash export HF_TOKEN='hf_token' ``` ## Usage You can then use [`HuggingFaceModel`](/docs/ai/api/models/huggingface/#pydantic_ai.models.huggingface.HuggingFaceModel) by name: ```python from pydantic_ai import Agent agent = Agent('huggingface:Qwen/Qwen3-235B-A22B') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel model = HuggingFaceModel('Qwen/Qwen3-235B-A22B') agent = Agent(model) ...
``` By default, the [`HuggingFaceModel`](/docs/ai/api/models/huggingface/#pydantic_ai.models.huggingface.HuggingFaceModel) uses the [`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider), which will automatically select the first of the inference providers (Cerebras, Together AI, Cohere, etc.) available for the model, sorted by your preferred order in [https://hf.co/settings/inference-providers](https://hf.co/settings/inference-providers). ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the [`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider) and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius')) agent = Agent(model) ... ``` ## Custom Hugging Face client [`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider) also accepts a custom [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client via the `hf_client` parameter, so you can customise the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the [Hugging Face Hub python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client). ```python from huggingface_hub import AsyncInferenceClient from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider client = AsyncInferenceClient( bill_to='openai', api_key='hf_token', provider='fireworks-ai', ) model = HuggingFaceModel( 'Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(hf_client=client), ) agent = Agent(model) ... ``` --- # [Mistral](https://pydantic.dev/docs/ai/models/mistral/) # Mistral ## Install To use `MistralModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `mistral` optional group: - [pip](#tab-panel-76) - [uv](#tab-panel-77) Terminal ```bash pip install "pydantic-ai-slim[mistral]" ``` Terminal ```bash uv add "pydantic-ai-slim[mistral]" ``` ## Configuration To use [Mistral](https://mistral.ai) through their API, go to [console.mistral.ai/api-keys/](https://console.mistral.ai/api-keys/) and follow your nose until you find the place to generate an API key. `LatestMistralModelNames` contains a list of the most popular Mistral models. ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export MISTRAL_API_KEY='your-api-key' ``` You can then use `MistralModel` by name: ```python from pydantic_ai import Agent agent = Agent('mistral:mistral-large-latest') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel model = MistralModel('mistral-small-latest') agent = Agent(model) ...
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', base_url='https://<mistral-api-endpoint>') ) agent = Agent(model) ... ``` You can also customize the provider with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider custom_http_client = AsyncClient(timeout=30) model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` --- # [OpenAI](https://pydantic.dev/docs/ai/models/openai/) # OpenAI ## Install To use OpenAI models or OpenAI-compatible APIs, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: - [pip](#tab-panel-58) - [uv](#tab-panel-59) Terminal ```bash pip install "pydantic-ai-slim[openai]" ``` Terminal ```bash uv add "pydantic-ai-slim[openai]" ``` ## Configuration To use `OpenAIChatModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use `OpenAIChatModel` by name: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel model = OpenAIChatModel('gpt-5.2') agent = Agent(model) ... ``` By default, the `OpenAIChatModel` uses the `OpenAIProvider` with the `base_url` set to `https://api.openai.com/v1`. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the [OpenAIProvider](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openai.OpenAIProvider) and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(api_key='your-api-key')) agent = Agent(model) ... ``` ## Custom OpenAI Client `OpenAIProvider` also accepts a custom `AsyncOpenAI` client via the `openai_client` parameter, so you can customise the `organization`, `project`, `base_url` etc. as defined in the [OpenAI API docs](https://platform.openai.com/docs/api-reference). custom\_openai\_client.py ```python from openai import AsyncOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncOpenAI(max_retries=3) model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(openai_client=client)) agent = Agent(model) ... ``` You could also use the [`AsyncAzureOpenAI`](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) client to use the Azure OpenAI API. Note that the `AsyncAzureOpenAI` is a subclass of `AsyncOpenAI`.
```python from openai import AsyncAzureOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncAzureOpenAI( azure_endpoint='...', api_version='2024-07-01-preview', api_key='your-api-key', ) model = OpenAIChatModel( 'gpt-5.2', provider=OpenAIProvider(openai_client=client), ) agent = Agent(model) ... ``` ## OpenAI Responses API Pydantic AI also supports OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) through [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel): ```python from pydantic_ai import Agent agent = Agent('openai-responses:gpt-5.2') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model) ... ``` You can learn more about the differences between the Responses API and Chat Completions API in the [OpenAI API docs](https://platform.openai.com/docs/guides/migrate-to-responses). ### Built-in tools The Responses API has built-in tools that you can use instead of building your own: - [Web search](https://platform.openai.com/docs/guides/tools-web-search): allow models to search the web for the latest information before generating a response. - [Code interpreter](https://platform.openai.com/docs/guides/tools-code-interpreter): allow models to write and run Python code in a sandboxed environment before generating a response. - [Image generation](https://platform.openai.com/docs/guides/tools-image-generation): allow models to generate images based on a text prompt. - [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response. - [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf. Web search, Code interpreter, Image generation, and File search are natively supported through the [Built-in tools](/docs/ai/tools-toolsets/builtin-tools) feature. Computer use can be enabled by passing an [`openai.types.responses.ComputerToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/computer_tool_param.py) in the `openai_builtin_tools` setting on [`OpenAIResponsesModelSettings`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings). It doesn't currently generate [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) or [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) parts in the message history, or streamed events; please submit an issue if you need native support for this built-in tool. 
computer\_use\_tool.py ```python from openai.types.responses import ComputerToolParam from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model_settings = OpenAIResponsesModelSettings( openai_builtin_tools=[ ComputerToolParam( type='computer_use_preview', display_width=1024, display_height=768, environment='browser', ) ], openai_truncation='auto', # the computer use tool requires truncation set to 'auto' ) model = OpenAIResponsesModel('computer-use-preview') # computer use requires OpenAI's dedicated model agent = Agent(model=model, model_settings=model_settings) result = agent.run_sync('Open a new browser tab') print(result.output) ``` #### Referencing earlier responses The Responses API supports referencing earlier model responses in a new request using a `previous_response_id` parameter, to ensure the full [conversation state](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#passing-context-from-the-previous-response) including [reasoning items](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) is kept in context. This is available through the `openai_previous_response_id` field in [`OpenAIResponsesModelSettings`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings). ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model=model) result = agent.run_sync('The secret is 1234') model_settings = OpenAIResponsesModelSettings( openai_previous_response_id=result.all_messages()[-1].provider_response_id ) result = agent.run_sync('What is the secret code?', model_settings=model_settings) print(result.output) #> 1234 ``` By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history. ##### Automatically referencing earlier responses When the `openai_previous_response_id` field is set to `'auto'`, Pydantic AI will automatically select the most recent `provider_response_id` from message history and omit messages that came before it, letting the OpenAI API leverage server-side history instead for improved efficiency. ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model=model) result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # When set to 'auto', the most recent provider_response_id # and messages after it are sent as request. model_settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto') result2 = agent.run_sync( 'Explain?', message_history=result1.new_messages(), model_settings=model_settings ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. ``` #### Message Compaction The Responses API supports [compacting message history](https://platform.openai.com/docs/guides/conversation-state#compaction-advanced) to reduce token usage in long conversations. Compaction produces an encrypted summary that replaces older messages while preserving context.
The easiest way to enable compaction is with the [`OpenAICompaction`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction) capability, which automatically calls the compact endpoint when the message count exceeds a threshold: openai\_compaction.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAICompaction agent = Agent( 'openai-responses:gpt-4o', capabilities=[OpenAICompaction(message_count_threshold=10)], ) message_history = [] for question in ['What is 2+2?', 'And 3+3?', 'And 4+4?']: result = agent.run_sync(question, message_history=message_history) message_history = result.all_messages() # After 10 messages, history is automatically compacted before each request ``` For advanced use cases, you can call [`compact_messages`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel.compact_messages) directly on the model. Tip For best results, combine compaction with [`openai_previous_response_id='auto'`](#automatically-referencing-earlier-responses) in your model settings. This lets OpenAI's server-side history work alongside compaction for improved context continuity. Note that this sends conversation data to OpenAI's server-side storage. ## OpenAI-compatible Models Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI. Before getting started, check the [installation and configuration](#install) instructions above. To use another OpenAI-compatible API, you can set the `OPENAI_BASE_URL` and `OPENAI_API_KEY` environment variables, or make use of the `base_url` and `api_key` arguments from [`OpenAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openai.OpenAIProvider): ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'model_name', provider=OpenAIProvider( base_url='https://<openai-compatible-base-url>', api_key='your-api-key' ), ) agent = Agent(model) ... ``` Various providers also have their own provider classes so that you don't need to specify the base URL yourself and you can use the standard `<PROVIDER>_API_KEY` environment variable to set the API key. When a provider has its own provider class, you can use the `Agent("<provider>:<model>")` shorthand, e.g. `Agent("deepseek:deepseek-chat")` or `Agent("moonshotai:kimi-k2-0711-preview")`, instead of building the `OpenAIChatModel` explicitly. Similarly, you can pass the provider name as a string to the `provider` argument on `OpenAIChatModel` instead of instantiating the provider class explicitly. ### Model Profile Sometimes, the provider or model you're using will have slightly different requirements than OpenAI's API or models, like having different restrictions on JSON schemas for tool definitions, or not supporting tool definitions to be marked as strict. When using an alternative provider class provided by Pydantic AI, an appropriate model profile is typically selected automatically based on the model name.
If the model you're using is not working correctly out of the box, you can tweak various aspects of how model requests are constructed by providing your own [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) (for behaviors shared among all model classes) or [`OpenAIModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile) (for behaviors specific to `OpenAIChatModel`): ```py from pydantic_ai import Agent, InlineDefsJsonSchemaTransformer from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.profiles.openai import OpenAIModelProfile from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'model_name', provider=OpenAIProvider( base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key' ), profile=OpenAIModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, # Supported by any model class on a plain ModelProfile openai_supports_strict_tool_definition=False # Supported by OpenAIChatModel only, requires OpenAIModelProfile ) ) agent = Agent(model) ``` ### DeepSeek To use the [DeepSeek](https://deepseek.com) provider, first create an API key by following the [Quick Start guide](https://api-docs.deepseek.com/). You can then set the `DEEPSEEK_API_KEY` environment variable and use [`DeepSeekProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.deepseek.DeepSeekProvider) by name: ```python from pydantic_ai import Agent agent = Agent('deepseek:deepseek-chat') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider(api_key='your-deepseek-api-key'), ) agent = Agent(model) ... ``` You can also customize any provider with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider custom_http_client = AsyncClient(timeout=30) model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider( api_key='your-deepseek-api-key', http_client=custom_http_client ), ) agent = Agent(model) ... ``` ### Alibaba Cloud Model Studio (DashScope) To use Qwen models via [Alibaba Cloud Model Studio (DashScope)](https://www.alibabacloud.com/en/product/modelstudio), you can set the `ALIBABA_API_KEY` (or `DASHSCOPE_API_KEY`) environment variable and use [`AlibabaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.alibaba.AlibabaProvider) by name: ```python from pydantic_ai import Agent agent = Agent('alibaba:qwen-max') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.alibaba import AlibabaProvider model = OpenAIChatModel( 'qwen-max', provider=AlibabaProvider(api_key='your-api-key'), ) agent = Agent(model) ... ``` The `AlibabaProvider` uses the international DashScope compatible endpoint `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` by default.
You can override this by passing a custom `base_url`: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.alibaba import AlibabaProvider model = OpenAIChatModel( 'qwen-max', provider=AlibabaProvider( api_key='your-api-key', base_url='https://dashscope.aliyuncs.com/compatible-mode/v1', # China region ), ) agent = Agent(model) ... ``` ### Ollama Pydantic AI supports both self-hosted [Ollama](https://ollama.com/) servers (running locally or remotely) and [Ollama Cloud](https://ollama.com/cloud). For servers running locally, use the `http://localhost:11434/v1` base URL. For Ollama Cloud, use `https://ollama.com/v1` and ensure an API key is set. You can set the `OLLAMA_BASE_URL` and (optionally) `OLLAMA_API_KEY` environment variables and use [`OllamaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ollama.OllamaProvider) by name: ```python from pydantic_ai import Agent agent = Agent('ollama:gpt-oss:20b') ... ``` Or initialise the model and provider directly: ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.ollama import OllamaProvider class CityLocation(BaseModel): city: str country: str ollama_model = OpenAIChatModel( model_name='gpt-oss:20b', provider=OllamaProvider(base_url='http://localhost:11434/v1'), # points at a local Ollama server ) agent = Agent(ollama_model, output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` For Ollama Cloud, use `base_url='https://ollama.com/v1'` and set the `OLLAMA_API_KEY` environment variable. ### Azure AI Foundry To use [Azure AI Foundry](https://ai.azure.com/) as your provider, set `AZURE_OPENAI_ENDPOINT` to a URL whose path ends in `/v1` (for example `https://<your-resource>.openai.azure.com/openai/v1/` or `https://<your-resource>.services.ai.azure.com/openai/v1/`), set `AZURE_OPENAI_API_KEY`, and use [`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider) by name: ```python from pydantic_ai import Agent agent = Agent('azure:gpt-5.2') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.azure import AzureProvider model = OpenAIChatModel( 'gpt-5.2', provider=AzureProvider( azure_endpoint='https://your-resource.openai.azure.com/openai/v1/', api_key='your-api-key', ), ) agent = Agent(model) ... ``` This targets the [Azure OpenAI v1 API](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle), which Microsoft recommends for all new projects. It also pairs naturally with the Responses API -- see [Using Azure with the Responses API](#using-azure-with-the-responses-api) below. [`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider) also recognises [Azure AI Foundry serverless model deployments](https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/concepts/endpoints) at `https://<deployment>.<region>.models.ai.azure.com` and connects to them the same way.
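As a minimal sketch (the deployment URL and model name below are hypothetical placeholders), connecting to such a serverless deployment looks the same as above:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

# Hypothetical serverless deployment endpoint; substitute your own
model = OpenAIChatModel(
    'your-deployed-model',
    provider=AzureProvider(
        azure_endpoint='https://my-deployment.eastus2.models.ai.azure.com',
        api_key='your-api-key',
    ),
)
agent = Agent(model)
...
```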
#### Connecting to an existing `api-version`\-based deployment If your resource still uses the dated `api-version` API, pass `api_version` (or set the `OPENAI_API_VERSION` environment variable) and point `azure_endpoint` at the resource root instead: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.azure import AzureProvider model = OpenAIChatModel( 'gpt-5.2', provider=AzureProvider( azure_endpoint='https://your-resource.openai.azure.com/', api_version='2024-12-01-preview', api_key='your-api-key', ), ) agent = Agent(model) ... ``` #### Using Azure with the Responses API Azure AI Foundry also supports the OpenAI Responses API through [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel). This is particularly recommended when working with document inputs ([`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) and [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)), as Azure's Chat Completions API does not support these input types. Document processing with Azure using Responses API ```python from pydantic_ai import Agent, BinaryContent from pydantic_ai.models.openai import OpenAIResponsesModel from pydantic_ai.providers.azure import AzureProvider pdf_bytes = b'%PDF-1.4 ...' # Your PDF content model = OpenAIResponsesModel( 'gpt-5.2', provider=AzureProvider( azure_endpoint='https://your-resource.openai.azure.com/openai/v1/', api_key='your-api-key', ), ) agent = Agent(model) result = agent.run_sync([ 'Summarize this document', BinaryContent(data=pdf_bytes, media_type='application/pdf'), ]) ``` ### Vercel AI Gateway To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token. You can set the `VERCEL_AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` environment variables and use [`VercelProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.vercel.VercelProvider) by name: ```python from pydantic_ai import Agent agent = Agent('vercel:anthropic/claude-sonnet-4-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.vercel import VercelProvider model = OpenAIChatModel( 'anthropic/claude-sonnet-4-5', provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'), ) agent = Agent(model) ... ``` ### MoonshotAI Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console). You can set the `MOONSHOTAI_API_KEY` environment variable and use [`MoonshotAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.moonshotai.MoonshotAIProvider) by name: ```python from pydantic_ai import Agent agent = Agent('moonshotai:kimi-k2-0711-preview') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.moonshotai import MoonshotAIProvider model = OpenAIChatModel( 'kimi-k2-0711-preview', provider=MoonshotAIProvider(api_key='your-moonshot-api-key'), ) agent = Agent(model) ... ``` ### GitHub Models To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission. 
You can set the `GITHUB_API_KEY` environment variable and use [`GitHubProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.github.GitHubProvider) by name: ```python from pydantic_ai import Agent agent = Agent('github:xai/grok-3-mini') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.github import GitHubProvider model = OpenAIChatModel( 'xai/grok-3-mini', # GitHub Models uses prefixed model names provider=GitHubProvider(api_key='your-github-token'), ) agent = Agent(model) ... ``` GitHub Models supports various model families with different prefixes. You can see the full list on the [GitHub Marketplace](https://github.com/marketplace?type=models) or the public [catalog endpoint](https://models.github.ai/catalog/models). ### Perplexity Follow the Perplexity [getting started](https://docs.perplexity.ai/guides/getting-started) guide to create an API key, then initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'sonar-pro', provider=OpenAIProvider( base_url='https://api.perplexity.ai', api_key='your-perplexity-api-key', ), ) agent = Agent(model) ... ``` ### Fireworks AI Go to [Fireworks.AI](https://fireworks.ai/) and create an API key in your account settings. You can set the `FIREWORKS_API_KEY` environment variable and use [`FireworksProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.fireworks.FireworksProvider) by name: ```python from pydantic_ai import Agent agent = Agent('fireworks:accounts/fireworks/models/qwq-32b') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.fireworks import FireworksProvider model = OpenAIChatModel( 'accounts/fireworks/models/qwq-32b', # model library available at https://fireworks.ai/models provider=FireworksProvider(api_key='your-fireworks-api-key'), ) agent = Agent(model) ... ``` ### Together AI Go to [Together.ai](https://www.together.ai/) and create an API key in your account settings. You can set the `TOGETHER_API_KEY` environment variable and use [`TogetherProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.together.TogetherProvider) by name: ```python from pydantic_ai import Agent agent = Agent('together:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.together import TogetherProvider model = OpenAIChatModel( 'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free', # model library available at https://www.together.ai/models provider=TogetherProvider(api_key='your-together-api-key'), ) agent = Agent(model) ... ``` ### Heroku AI To use [Heroku AI](https://www.heroku.com/ai), first create an API key. You can set the `HEROKU_INFERENCE_KEY` and (optionally) `HEROKU_INFERENCE_URL` environment variables and use [`HerokuProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.heroku.HerokuProvider) by name: ```python from pydantic_ai import Agent agent = Agent('heroku:claude-sonnet-4-5') ... 
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.heroku import HerokuProvider

model = OpenAIChatModel(
    'claude-sonnet-4-5',
    provider=HerokuProvider(api_key='your-heroku-inference-key'),
)
agent = Agent(model)
...
```

### LiteLLM

To use [LiteLLM](https://www.litellm.ai/), set the configs as outlined in the [docs](https://docs.litellm.ai/docs/set_keys). In `LiteLLMProvider`, you can pass `api_base` and `api_key`; their values depend on your setup. For example, if you are using OpenAI models, you need to pass `https://api.openai.com/v1` as the `api_base` and your OpenAI API key as the `api_key`. If you are using a LiteLLM proxy server running on your local machine, you need to pass `http://localhost:<port>` as the `api_base` and your LiteLLM API key (or a placeholder) as the `api_key`.

To use custom LLMs, use the `custom/` prefix in the model name.

Once you have the configs, use the [`LiteLLMProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.litellm.LiteLLMProvider) as follows:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

model = OpenAIChatModel(
    'openai/gpt-5.2',
    provider=LiteLLMProvider(
        api_base='your-api-base-url',
        api_key='your-api-key',
    ),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### Nebius AI Studio

Go to [Nebius AI Studio](https://studio.nebius.com/) and create an API key. You can set the `NEBIUS_API_KEY` environment variable and use [`NebiusProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.nebius.NebiusProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('nebius:Qwen/Qwen3-32B-fast')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.nebius import NebiusProvider

model = OpenAIChatModel(
    'Qwen/Qwen3-32B-fast',
    provider=NebiusProvider(api_key='your-nebius-api-key'),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### OVHcloud AI Endpoints

To use OVHcloud AI Endpoints, you need to create a new API key. To do so, go to the [OVHcloud manager](https://ovh.com/manager), then to Public Cloud > AI Endpoints > API keys. Click on `Create a new API key` and copy your new key. You can explore the [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to find which models are available.

You can set the `OVHCLOUD_API_KEY` environment variable and use [`OVHcloudProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ovhcloud.OVHcloudProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('ovhcloud:gpt-oss-120b')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
``` If you need to configure the provider, you can use the [`OVHcloudProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ovhcloud.OVHcloudProvider) class: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.ovhcloud import OVHcloudProvider model = OpenAIChatModel( 'gpt-oss-120b', provider=OVHcloudProvider(api_key='your-api-key'), ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` ### SambaNova To use [SambaNova Cloud](https://cloud.sambanova.ai/), you need to obtain an API key from the [SambaNova Cloud dashboard](https://cloud.sambanova.ai/dashboard). SambaNova provides access to multiple model families including Meta Llama, DeepSeek, Qwen, and Mistral models with fast inference speeds. You can set the `SAMBANOVA_API_KEY` environment variable and use [`SambaNovaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.sambanova.SambaNovaProvider) by name: ```python from pydantic_ai import Agent agent = Agent('sambanova:Meta-Llama-3.1-8B-Instruct') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.sambanova import SambaNovaProvider model = OpenAIChatModel( 'Meta-Llama-3.1-8B-Instruct', provider=SambaNovaProvider(api_key='your-api-key'), ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` For a complete list of available models, see the [SambaNova supported models documentation](https://docs.sambanova.ai/docs/en/models/sambacloud-models). You can customize the base URL if needed: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.sambanova import SambaNovaProvider model = OpenAIChatModel( 'DeepSeek-R1-0528', provider=SambaNovaProvider( api_key='your-api-key', base_url='https://custom.endpoint.com/v1', ), ) agent = Agent(model) ... ``` --- # [OpenRouter](https://pydantic.dev/docs/ai/models/openrouter/) # OpenRouter ## Install To use `OpenRouterModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openrouter` optional group: - [pip](#tab-panel-78) - [uv](#tab-panel-79) Terminal ```bash pip install "pydantic-ai-slim[openrouter]" ``` Terminal ```bash uv add "pydantic-ai-slim[openrouter]" ``` ## Configuration To use [OpenRouter](https://openrouter.ai), first create an API key at [openrouter.ai/keys](https://openrouter.ai/keys). You can set the `OPENROUTER_API_KEY` environment variable and use [`OpenRouterProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openrouter.OpenRouterProvider) by name: ```python from pydantic_ai import Agent agent = Agent('openrouter:anthropic/claude-sonnet-4-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openrouter import OpenRouterModel from pydantic_ai.providers.openrouter import OpenRouterProvider model = OpenRouterModel( 'anthropic/claude-sonnet-4-5', provider=OpenRouterProvider(api_key='your-openrouter-api-key'), ) agent = Agent(model) ... 
```

## App Attribution

OpenRouter has an [app attribution](https://openrouter.ai/docs/app-attribution) feature to track your application in their public ranking and analytics. You can pass an `app_url` and `app_title` when initializing the provider to enable app attribution:

```python
from pydantic_ai.providers.openrouter import OpenRouterProvider

provider = OpenRouterProvider(
    api_key='your-openrouter-api-key',
    app_url='https://your-app.com',
    app_title='Your App',
)
...
```

## Model Settings

You can customize model behavior using [`OpenRouterModelSettings`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModelSettings):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

settings = OpenRouterModelSettings(
    openrouter_reasoning={
        'effort': 'high',
    },
    openrouter_usage={
        'include': True,
    },
)
model = OpenRouterModel('openai/gpt-5.2')
agent = Agent(model, model_settings=settings)
...
```

## Web Search

OpenRouter supports web search via its [plugins](https://openrouter.ai/docs/guides/features/plugins/web-search). You can enable it using the [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool).

### Web Search Parameters

You can customize the web search behavior using the `search_context_size` parameter on [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool):

```python
from pydantic_ai import Agent
from pydantic_ai.builtin_tools import WebSearchTool
from pydantic_ai.models.openrouter import OpenRouterModel

tool = WebSearchTool(search_context_size='high')
model = OpenRouterModel('openai/gpt-4.1')
agent = Agent(
    model,
    builtin_tools=[tool],
)
result = agent.run_sync('What is the latest news in AI?')
```

---

# [Outlines](https://pydantic.dev/docs/ai/models/outlines/)

# Outlines

## Install

As Outlines is a library for running models from various providers, it does not include the necessary dependencies for any provider by default. To use the [`OutlinesModel`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel), you must therefore install `pydantic-ai-slim` with an optional group composed of `outlines`, a dash, and the name of the specific model provider you want to use through Outlines. For instance:

- [pip](#tab-panel-80)
- [uv](#tab-panel-81)

Terminal

```bash
pip install "pydantic-ai-slim[outlines-transformers]"
```

Terminal

```bash
uv add "pydantic-ai-slim[outlines-transformers]"
```

Or

- [pip](#tab-panel-82)
- [uv](#tab-panel-83)

Terminal

```bash
pip install "pydantic-ai-slim[outlines-mlxlm]"
```

Terminal

```bash
uv add "pydantic-ai-slim[outlines-mlxlm]"
```

There are 5 optional groups for the 5 model providers supported through Outlines:

- `outlines-transformers`
- `outlines-llamacpp`
- `outlines-mlxlm`
- `outlines-sglang`
- `outlines-vllm-offline`

## Model Initialization

As Outlines is not an inference provider, but a library for running both local and API-based models, instantiating the model is a bit different from the other models available in Pydantic AI. To initialize the `OutlinesModel` through the `__init__` method, the first argument must be an `outlines.Model` or an `outlines.AsyncModel` instance.
For instance:

```python
import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai.models.outlines import OutlinesModel

outlines_model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained('erwanf/gpt2-mini'),
    AutoTokenizer.from_pretrained('erwanf/gpt2-mini'),
)
model = OutlinesModel(outlines_model)
```

As you are already providing an Outlines model instance, there is no need to provide an `OutlinesProvider` yourself.

### Model Loading Methods

Alternatively, you can use the `OutlinesModel` class methods made to load a specific type of Outlines model directly. To do so, provide the same arguments you would have given to the associated Outlines model loading function (except in the case of SGLang).

There are methods for the 5 Outlines models that are officially supported in the Pydantic AI integration:

- [`from_transformers`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_transformers)
- [`from_llamacpp`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_llamacpp)
- [`from_mlxlm`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_mlxlm)
- [`from_sglang`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_sglang)
- [`from_vllm_offline`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_vllm_offline)

#### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_transformers(
    AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
    AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
)
```

#### LlamaCpp

```python
from llama_cpp import Llama

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_llamacpp(
    Llama.from_pretrained(
        repo_id='TheBloke/Mistral-7B-Instruct-v0.2-GGUF',
        filename='mistral-7b-instruct-v0.2.Q5_K_M.gguf',
    )
)
```

#### MLXLM

```python
from mlx_lm import load

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_mlxlm(
    *load('mlx-community/TinyLlama-1.1B-Chat-v1.0-4bit')
)
```

#### SGLang

```python
from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_sglang(
    'http://localhost:11434',
    'api_key',
    'meta-llama/Llama-3.1-8B',
)
```

#### vLLM Offline

```python
from vllm import LLM

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_vllm_offline(
    LLM('microsoft/Phi-3-mini-4k-instruct')
)
```

## Running the model

Once you have initialized an `OutlinesModel`, you can use it with an Agent as with all other Pydantic AI models. As Outlines is focused on structured output, this provider supports the `output_type` component through the [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) format. There is no need to include information on the required output format in your prompt; instructions based on the `output_type` will be included automatically.
```python
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai import Agent, ModelSettings
from pydantic_ai.models.outlines import OutlinesModel

class Box(BaseModel):
    """Class representing a box"""

    width: int
    height: int
    depth: int
    units: str

model = OutlinesModel.from_transformers(
    AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
    AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
)
agent = Agent(model, output_type=Box)

result = agent.run_sync(
    'Give me the dimensions of a box',
    model_settings=ModelSettings(extra_body={'max_new_tokens': 100}),
)
print(result.output)
# width=20 height=30 depth=40 units='cm'
```

Outlines does not support tools yet, but support for that feature will be added in the near future.

## Multimodal models

If the model you are running through Outlines and the selected provider support it, you can include images in your prompts using [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl) or [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage). In that case, the prompt you provide when running the agent should be a list containing a string and one or several images. See the [input documentation](/docs/ai/advanced-features/input) for details and examples on using assets in model inputs.

This feature is supported in Outlines for the `SGLang` and `Transformers` models. If you want to run a multimodal model through `transformers`, you must provide a processor instead of a tokenizer as the second argument when initializing the model with the `OutlinesModel.from_transformers` method.

```python
from datetime import date
from typing import Literal

import torch
from pydantic import BaseModel
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

from pydantic_ai import Agent, ModelSettings
from pydantic_ai.messages import ImageUrl
from pydantic_ai.models.outlines import OutlinesModel

MODEL_NAME = 'Qwen/Qwen2-VL-7B-Instruct'

class Item(BaseModel):
    name: str
    quantity: int | None
    price_per_unit: float | None
    total_price: float | None

class ReceiptSummary(BaseModel):
    store_name: str
    store_address: str
    store_number: int | None
    items: list[Item]
    tax: float | None
    total: float | None
    date: date
    payment_method: Literal['cash', 'credit', 'debit', 'check', 'other']

tf_model = Qwen2VLForConditionalGeneration.from_pretrained(
    MODEL_NAME, device_map='auto', dtype=torch.bfloat16
)
tf_processor = AutoProcessor.from_pretrained(MODEL_NAME, device_map='auto')
model = OutlinesModel.from_transformers(tf_model, tf_processor)
agent = Agent(model, output_type=ReceiptSummary)

result = agent.run_sync(
    [
        'You are an expert at extracting information from receipts. '
        'Please extract the information from the receipt. '
        'Be as detailed as possible, do not miss any information.',
        ImageUrl('https://raw.githubusercontent.com/dottxt-ai/outlines/refs/heads/main/docs/examples/images/trader-joes-receipt.jpg'),
    ],
    model_settings=ModelSettings(extra_body={'max_new_tokens': 1000}),
)
print(result.output)
# store_name="Trader Joe's"
# store_address='401 Bay Street, San Francisco, CA 94133'
# store_number=0
# items=[
#     Item(name='BANANA EACH', quantity=7, price_per_unit=0.23, total_price=1.61),
#     Item(name='BAREBELLS CHOCOLATE DOUG', quantity=1, price_per_unit=2.29, total_price=2.29),
#     Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29),
#     Item(name='BAREBELLS CHOCOLATE DOUG', quantity=1, price_per_unit=2.29, total_price=2.29),
#     Item(name='BAREBELLS CARAMEL CASHEW', quantity=2, price_per_unit=2.29, total_price=4.58),
#     Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29),
#     Item(name='T SPINDRIFT ORANGE MANGO 8', quantity=1, price_per_unit=7.49, total_price=7.49),
#     Item(name='T Bottle Deposit', quantity=8, price_per_unit=0.05, total_price=0.4),
#     Item(name='MILK ORGANIC GALLON WHOL', quantity=1, price_per_unit=6.79, total_price=6.79),
#     Item(name='CLASSIC GREEK SALAD', quantity=1, price_per_unit=3.49, total_price=3.49),
#     Item(name='COBB SALAD', quantity=1, price_per_unit=5.99, total_price=5.99),
#     Item(name='PEPPER BELL RED XL EACH', quantity=1, price_per_unit=1.29, total_price=1.29),
#     Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25),
#     Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25)]
# tax=7.89
# total=41.98
# date='2023-04-01'
# payment_method='credit'
```

---

# [Overview](https://pydantic.dev/docs/ai/models/overview/)

# Overview

Pydantic AI is model-agnostic and has built-in support for multiple model providers:

- [OpenAI](/docs/ai/models/openai)
- [Anthropic](/docs/ai/models/anthropic)
- [Gemini](/docs/ai/models/google) (via two different APIs: Generative Language API and VertexAI API)
- [xAI](/docs/ai/models/xai)
- [Bedrock](/docs/ai/models/bedrock)
- [Cerebras](/docs/ai/models/cerebras)
- [Cohere](/docs/ai/models/cohere)
- [Groq](/docs/ai/models/groq)
- [Hugging Face](/docs/ai/models/huggingface)
- [Mistral](/docs/ai/models/mistral)
- [OpenRouter](/docs/ai/models/openrouter)
- [Outlines](/docs/ai/models/outlines)

## OpenAI-compatible Providers

In addition, many providers are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI:

- [Alibaba Cloud Model Studio (DashScope)](/docs/ai/models/openai#alibaba-cloud-model-studio-dashscope)
- [Azure AI Foundry](/docs/ai/models/openai#azure-ai-foundry)
- [DeepSeek](/docs/ai/models/openai#deepseek)
- [Fireworks AI](/docs/ai/models/openai#fireworks-ai)
- [GitHub Models](/docs/ai/models/openai#github-models)
- [Heroku](/docs/ai/models/openai#heroku-ai)
- [LiteLLM](/docs/ai/models/openai#litellm)
- [Nebius AI Studio](/docs/ai/models/openai#nebius-ai-studio)
- [Ollama](/docs/ai/models/openai#ollama)
- [OVHcloud AI Endpoints](/docs/ai/models/openai#ovhcloud-ai-endpoints)
- [Perplexity](/docs/ai/models/openai#perplexity)
- [SambaNova](/docs/ai/models/openai#sambanova)
- [Together AI](/docs/ai/models/openai#together-ai)
- [Vercel AI Gateway](/docs/ai/models/openai#vercel-ai-gateway)

Pydantic AI also comes with [`TestModel`](/docs/ai/models/test) and [`FunctionModel`](/docs/ai/models/function) for testing and development.
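For example, here's a minimal sketch of swapping in `TestModel` so a unit test never makes a real API call (the printed output is whatever dummy response `TestModel` generates, not a real model answer):

test\_model\_sketch.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# TestModel generates a dummy response locally: no API key,
# no network access, and no provider package required.
agent = Agent(TestModel())

result = agent.run_sync('What is the capital of France?')
print(result.output)  # a canned response produced by TestModel
```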
To use each model provider, you need to configure your local environment and make sure you have the right packages installed. If you try to use the model without having done so, you'll be told what to install.

## Models and Providers

Pydantic AI uses a few key terms to describe how it interacts with different LLMs:

- **Model**: This refers to the Pydantic AI class used to make requests following a specific LLM API (generally by wrapping a vendor-provided SDK, like the `openai` python SDK). These classes implement a vendor-SDK-agnostic API, ensuring a single Pydantic AI agent is portable to different LLM vendors without any other code changes, just by swapping out the Model it uses. Model classes are named roughly in the format `<VendorSdk>Model`; for example, we have `OpenAIChatModel`, `AnthropicModel`, `GoogleModel`, etc. When using a Model class, you specify the actual LLM model name (e.g., `gpt-5`, `claude-sonnet-4-5`, `gemini-3-flash-preview`) as a parameter.
- **Provider**: This refers to provider-specific classes which handle the authentication and connections to an LLM vendor. Passing a non-default _Provider_ as a parameter to a Model is how you can ensure that your agent will make requests to a specific endpoint, or make use of a specific approach to authentication (e.g., you can use Azure auth with the `OpenAIChatModel` by way of the `AzureProvider`). In particular, this is how you can make use of an AI gateway, or an LLM vendor that offers API compatibility with the vendor SDK used by an existing Model (such as `OpenAIChatModel`).
- **Profile**: This refers to a description of how requests to a specific model or family of models need to be constructed to get the best results, independent of the model and provider classes used. For example, different models have different restrictions on the JSON schemas that can be used for tools, and the same schema transformer needs to be used for Gemini models whether you're using `GoogleModel` with model name `gemini-3-pro-preview`, or `OpenAIChatModel` with `OpenRouterProvider` and model name `google/gemini-3-pro-preview`.

When you instantiate an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) with just a name formatted as `<provider>:<model>`, e.g. `openai:gpt-5.2` or `openrouter:google/gemini-3-pro-preview`, Pydantic AI will automatically select the appropriate model class, provider, and profile. If you want to use a different provider or profile, you can instantiate a model class directly and pass in `provider` and/or `profile` arguments.

## HTTP Client Lifecycle

When a [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) creates its own HTTP client (i.e. you don't pass a custom `http_client`), it owns that client's lifecycle. Using the [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) as an async context manager ensures the HTTP client is closed cleanly on exit:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent:
        result = await agent.run('What is the capital of France?')
        print(result.output)
        #> The capital of France is Paris.
```

You can also use a [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) or [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) directly as an async context manager for the same effect. If you provide your own `http_client`, you are responsible for closing it yourself.
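For instance, here's a minimal sketch of supplying and then closing your own client (the `timeout` value is illustrative, and the OpenAI API key is assumed to come from the `OPENAI_API_KEY` environment variable):

custom\_http\_client.py

```python
import httpx

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

async def main():
    # Since we created this client ourselves, we own its lifecycle
    # and must close it; `async with` does that on exit.
    async with httpx.AsyncClient(timeout=30) as http_client:
        model = OpenAIChatModel(
            'gpt-5.2',
            provider=OpenAIProvider(http_client=http_client),
        )
        agent = Agent(model)
        result = await agent.run('What is the capital of France?')
        print(result.output)
        #> The capital of France is Paris.
```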
## Custom Models Note If a model API is compatible with the OpenAI API, you do not need a custom model class and can provide your own [custom provider](/docs/ai/models/openai#openai-compatible-models) instead. To implement support for a model API that's not already supported, you will need to subclass the [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) abstract base class. For streaming, you'll also need to implement the [`StreamedResponse`](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse) abstract base class. The best place to start is to review the source code for existing implementations, e.g. [`OpenAIChatModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py). For details on when we'll accept contributions adding new models to Pydantic AI, see the [contributing guidelines](/docs/ai/project/contributing#new-model-rules). ## HTTP Request Concurrency You can limit the number of concurrent HTTP requests to a model using the [`ConcurrencyLimitedModel`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimitedModel) wrapper. This is useful for respecting rate limits or managing resource usage when running many agents in parallel. model\_concurrency.py ```python import asyncio from pydantic_ai import Agent, ConcurrencyLimitedModel # Wrap a model with concurrency limiting model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5) # Multiple agents can share this rate-limited model agent = Agent(model) async def main(): # These will be rate-limited to 5 concurrent HTTP requests results = await asyncio.gather( *[agent.run(f'Question {i}') for i in range(20)] ) print(len(results)) #> 20 ``` The `limiter` parameter accepts: - An integer for simple limiting (e.g., `limiter=5`) - A [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) for advanced configuration with backpressure control - A [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) for sharing limits across multiple models ### Shared Concurrency Limits To share a concurrency limit across multiple models (e.g., different models from the same provider), you can create a [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) and pass it to multiple `ConcurrencyLimitedModel` instances: shared\_concurrency.py ```python import asyncio from pydantic_ai import Agent, ConcurrencyLimitedModel, ConcurrencyLimiter # Create a shared limiter with a descriptive name shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool') # Both models share the same concurrency limit model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter) model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter) agent1 = Agent(model1) agent2 = Agent(model2) async def main(): # Total concurrent requests across both agents limited to 10 results = await asyncio.gather( *[agent1.run(f'Question {i}') for i in range(10)], *[agent2.run(f'Question {i}') for i in range(10)], ) print(len(results)) #> 20 ``` When instrumentation is enabled, requests waiting for a concurrency slot appear as spans with attributes showing the queue depth and configured limits. The `name` parameter on `ConcurrencyLimiter` helps identify shared limiters in traces. ## Fallback Model You can use [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel) to attempt multiple models in sequence until one succeeds. 
Pydantic AI can switch to the next model when the current model raises an exception (like a 4xx/5xx API error) **or** when the response content indicates a semantic failure (like a truncated response or a failed built-in tool call).

By default, fallback triggers on [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError) (4xx/5xx API errors), so you don't need to configure anything for the most common use case. This behavior is controlled by the `fallback_on` parameter (see [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel)), which accepts exception types, exception handlers, and response handlers -- all of which can be sync or async.

Note

The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating. When using `FallbackModel`, it's recommended to disable provider SDK retries to ensure immediate fallback, for example by setting `max_retries=0` on a [custom OpenAI client](/docs/ai/models/openai#custom-openai-client).

In the following example, the agent first makes a request to the OpenAI model (which fails due to an invalid API key), and then falls back to the Anthropic model.

fallback\_model.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-5.2')
anthropic_model = AnthropicModel('claude-sonnet-4-5')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
response = agent.run_sync('What is the capital of France?')
print(response.output)
#> Paris

print(response.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='What is the capital of France?',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[TextPart(content='Paris', part_kind='text')],
        model_name='claude-sonnet-4-5',
        timestamp=datetime.datetime(...),
        kind='response',
        provider_response_id=None,
    ),
]
"""
```

The `ModelResponse` message above indicates in the `model_name` field that the output was returned by the Anthropic model, which is the second model specified in the `FallbackModel`.

Note

Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`.

### Per-Model Settings

You can configure different [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) for each model in a fallback chain by passing the `settings` parameter when creating each model.
This is particularly useful when different providers have different optimal configurations: fallback\_model\_per\_settings.py ```python from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel # Configure each model with provider-specific optimal settings openai_model = OpenAIChatModel( 'gpt-5.2', settings=ModelSettings(temperature=0.7, max_tokens=1000) # Higher creativity for OpenAI ) anthropic_model = AnthropicModel( 'claude-sonnet-4-5', settings=ModelSettings(temperature=0.2, max_tokens=1000) # Lower temperature for consistency ) fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) result = agent.run_sync('Write a creative story about space exploration') print(result.output) """ In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it. """ ``` In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request. ### Exception Handling The next example demonstrates the exception-handling capabilities of `FallbackModel`. If all models fail, a [`FallbackExceptionGroup`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.FallbackExceptionGroup) is raised, which contains all the exceptions encountered during the `run` execution. 
- [Python >=3.11](#tab-panel-54) - [Python <3.11](#tab-panel-55) fallback\_model\_failure.py ```python from pydantic_ai import Agent, ModelAPIError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel openai_model = OpenAIChatModel('gpt-5.2') anthropic_model = AnthropicModel('claude-sonnet-4-5') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) try: response = agent.run_sync('What is the capital of France?') except* ModelAPIError as exc_group: for exc in exc_group.exceptions: print(exc) ``` Since [`except*`](https://docs.python.org/3/reference/compound_stmts.html#except-star) is only supported in Python 3.11+, we use the [`exceptiongroup`](https://github.com/agronholm/exceptiongroup) backport package for earlier Python versions: fallback\_model\_failure.py ```python from exceptiongroup import catch from pydantic_ai import Agent, ModelAPIError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel def model_status_error_handler(exc_group: BaseExceptionGroup) -> None: for exc in exc_group.exceptions: print(exc) openai_model = OpenAIChatModel('gpt-5.2') anthropic_model = AnthropicModel('claude-sonnet-4-5') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) with catch({ModelAPIError: model_status_error_handler}): response = agent.run_sync('What is the capital of France?') ``` By default, the `FallbackModel` only moves on to the next model if the current model raises a [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError), which includes [`ModelHTTPError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelHTTPError). You can customize this behavior by passing a custom `fallback_on` argument to the `FallbackModel` constructor. Note Validation errors (from [structured output](/docs/ai/core-concepts/output#structured-output) or [tool parameters](/docs/ai/tools-toolsets/tools)) do **not** trigger fallback. These errors use the [retry mechanism](/docs/ai/core-concepts/agent#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request. ### Response-Based Fallback In addition to exception-based fallback, you can also trigger fallback based on the **content** of a model's response. This is useful when a model returns a successful HTTP response (no exception), but the response content indicates a semantic failure -- for example, an unexpected finish reason or a built-in tool reporting failure. Non-streaming only Response-based fallback currently only works with non-streaming requests (`agent.run()` and `agent.run_sync()`). For streaming requests (`agent.run_stream()`), only exception-based fallback is supported. 
The `fallback_on` parameter accepts: - A tuple of exception types: `(ModelAPIError, ModelHTTPError)` - An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)` - A response handler (sync or async): `def check(r: ModelResponse) -> bool` - A list mixing all of the above: `[ModelAPIError, exc_handler, response_handler]` Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse), it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler. #### Finish Reason Example A simple use case is checking the model's finish reason -- for example, falling back if the response was truncated due to length limits: fallback\_on\_finish\_reason.py ```python from pydantic_ai import Agent from pydantic_ai.messages import FinishReason, ModelResponse from pydantic_ai.models.fallback import FallbackModel def bad_finish_reason(response: ModelResponse) -> bool: """Fallback if the model stopped due to length limit, content filter, or error.""" reason: FinishReason | None = response.finish_reason # Trigger fallback for problematic finish reasons return reason in ('length', 'content_filter', 'error') fallback_model = FallbackModel( 'openai:gpt-5.2', 'anthropic:claude-sonnet-4-5', fallback_on=bad_finish_reason, ) agent = Agent(fallback_model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` Solo response handlers replace default exception fallback When you pass a single response handler as `fallback_on` (as above), it **replaces** the default `(ModelAPIError,)` exception fallback entirely. This means API errors (4xx/5xx) will propagate as exceptions instead of triggering fallback to the next model. To keep exception-based fallback alongside a response handler, pass them together as a list -- see the [mixed example below](#combining-handlers). Note Note that Pydantic AI already handles some finish reasons automatically in the [agent loop](/docs/ai/core-concepts/agent): responses with a `'length'` or `'content_filter'` finish reason raise exceptions (which `FallbackModel` catches by default), and empty responses are retried. A response handler is useful for custom checks beyond these built-in behaviors. #### Built-in Tool Failure Example A more complex use case is when using built-in tools like web search or URL fetching. 
For example, Google's [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool) may return a successful response with a status indicating the URL fetch failed: fallback\_on\_builtin\_tool.py ```python from pydantic_ai import Agent from pydantic_ai.messages import ModelResponse from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.google import GoogleModel def web_fetch_failed(response: ModelResponse) -> bool: """Check if a web_fetch built-in tool failed to retrieve content.""" for call, result in response.builtin_tool_calls: if call.tool_name != 'web_fetch': continue if not isinstance(result.content, list): continue for item in result.content: if isinstance(item, dict): status = item.get('url_retrieval_status', '') if status and status != 'URL_RETRIEVAL_STATUS_SUCCESS': return True return False google_model = GoogleModel('gemini-2.5-flash') anthropic_model = AnthropicModel('claude-sonnet-4-5') # Auto-detected as response handler via type hint fallback_model = FallbackModel( google_model, anthropic_model, fallback_on=web_fetch_failed, ) agent = Agent(fallback_model) # If Google's web_fetch fails, automatically falls back to Anthropic result = agent.run_sync('Summarize https://ai.pydantic.dev') print(result.output) """ Pydantic AI is a Python agent framework for building production-grade LLM applications. """ ``` Response handlers receive the [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) returned by the model and should return `True` to trigger fallback to the next model, or `False` to accept the response. #### Combining Handlers You can combine exception types, exception handlers, and response handlers in a single list: fallback\_on\_mixed.py ```python from pydantic_ai.exceptions import ModelAPIError from pydantic_ai.models.fallback import FallbackModel from fallback_on_builtin_tool import anthropic_model, google_model, web_fetch_failed fallback_model = FallbackModel( google_model, anthropic_model, fallback_on=[ ModelAPIError, # Exception type lambda exc: 'rate limit' in str(exc).lower(), # Exception handler (untyped lambda) web_fetch_failed, # Response handler (auto-detected via type hint) ], ) ``` ### Exception Handling in Middleware and Decorators When using `FallbackModel`, it's important to understand that [`FallbackExceptionGroup`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.FallbackExceptionGroup) inherits from Python's [`ExceptionGroup`](https://docs.python.org/3/library/exceptions.html#ExceptionGroup). This means that existing exception handling code that catches specific exceptions (like `ModelAPIError`) won't automatically catch the individual exceptions wrapped inside the group. For example, if you have middleware or a decorator that catches `ModelAPIError`: middleware\_without\_fallback.py ```python from collections.abc import Callable from functools import wraps from typing import TypeVar from pydantic_ai import ModelAPIError T = TypeVar('T') # This handler will NOT catch ModelAPIError when using FallbackModel! 
def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]: @wraps(func) def wrapper(*args, **kwargs) -> T: try: return func(*args, **kwargs) except ModelAPIError as e: # Won't catch FallbackExceptionGroup print(f'API error: {e}') raise return wrapper ``` This decorator will miss `ModelAPIError` exceptions when using `FallbackModel`, because they're wrapped in a `FallbackExceptionGroup` containing one exception per failed model, in the order the models were tried. To handle both cases, you can use Python 3.11+ `except*` syntax, which catches matching exceptions from exception groups as well as bare exceptions. Note that `except*` always delivers the caught exceptions as an `ExceptionGroup` (even if the original was a bare exception), so re-raising will propagate an `ExceptionGroup` rather than the original exception type: - [Python >=3.11](#tab-panel-56) - [Python <3.11](#tab-panel-57) middleware\_with\_fallback.py ```python from collections.abc import Callable from functools import wraps from typing import TypeVar from pydantic_ai import ModelAPIError T = TypeVar('T') def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]: @wraps(func) def wrapper(*args, **kwargs) -> T: try: return func(*args, **kwargs) except* ModelAPIError as exc_group: for exc in exc_group.exceptions: print(f'API error: {exc}') raise return wrapper ``` middleware\_with\_fallback.py ```python from collections.abc import Callable from functools import wraps from typing import TypeVar from pydantic_ai import FallbackExceptionGroup, ModelAPIError T = TypeVar('T') def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]: @wraps(func) def wrapper(*args, **kwargs) -> T: try: return func(*args, **kwargs) except FallbackExceptionGroup as exc_group: for exc in exc_group.exceptions: if isinstance(exc, ModelAPIError): print(f'API error from fallback: {exc}') raise except ModelAPIError as e: print(f'API error: {e}') raise return wrapper ``` You can also catch `FallbackExceptionGroup` directly if you want to handle it specifically: catch\_fallback\_exception\_group.py ```python from pydantic_ai import Agent, FallbackExceptionGroup from pydantic_ai.models.fallback import FallbackModel agent = Agent(FallbackModel('openai:gpt-5-mini', 'anthropic:claude-sonnet-4-6')) try: response = agent.run_sync('What is the capital of France?') except FallbackExceptionGroup as exc_group: print(f'All {len(exc_group.exceptions)} models failed:') for exc in exc_group.exceptions: print(f' - {exc}') ``` --- # [xAI](https://pydantic.dev/docs/ai/models/xai/) # xAI ## Install To use [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel), you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `xai` optional group: - [pip](#tab-panel-64) - [uv](#tab-panel-65) Terminal ```bash pip install "pydantic-ai-slim[xai]" ``` Terminal ```bash uv add "pydantic-ai-slim[xai]" ``` ## Configuration To use xAI models from [xAI](https://x.ai/api) through their API, go to [console.x.ai](https://console.x.ai/team/default/api-keys) to create an API key. [docs.x.ai](https://docs.x.ai/docs/models) contains a list of available xAI models. ## Environment variable Once you have the API key, you can set it as an environment variable: Terminal ```bash export XAI_API_KEY='your-api-key' ``` You can then use [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) by name: ```python from pydantic_ai import Agent agent = Agent('xai:grok-4-1-fast-non-reasoning') ... 
``` Or initialise the model directly: ```python from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel # Uses XAI_API_KEY environment variable model = XaiModel('grok-4-1-fast-non-reasoning') agent = Agent(model) ... ``` You can also customize the [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) with a custom provider: ```python from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel from pydantic_ai.providers.xai import XaiProvider # Custom API key provider = XaiProvider(api_key='your-api-key') model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider) agent = Agent(model) ... ``` Or with a custom `xai_sdk.AsyncClient`: ```python from xai_sdk import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel from pydantic_ai.providers.xai import XaiProvider xai_client = AsyncClient(api_key='your-api-key') provider = XaiProvider(xai_client=xai_client) model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider) agent = Agent(model) ... ``` --- # [Coding Agent Skills](https://pydantic.dev/docs/ai/overview/coding-agent-skills/) # Coding Agent Skills If you're building Pydantic AI applications with a coding agent, you can install the Pydantic AI skill from the [`pydantic/skills`](https://github.com/pydantic/skills) repository to give your agent up-to-date framework knowledge. [Agent skills](https://agentskills.io) are packages of instructions and reference material that coding agents load on demand. With the skill installed, coding agents have access to Pydantic AI patterns, architecture guidance, and common task references covering [tools](/docs/ai/tools-toolsets/tools), [capabilities](/docs/ai/core-concepts/capabilities), [structured output](/docs/ai/core-concepts/output), [streaming](/docs/ai/core-concepts/agent#streaming-events-and-final-output), [testing](/docs/ai/guides/testing), [multi-agent delegation](/docs/ai/guides/multi-agent-applications), [hooks](/docs/ai/core-concepts/hooks), and [agent specs](/docs/ai/core-concepts/agent-spec). Note If you want to build agent skills for your Pydantic AI agent, see the [Agent Skills](/docs/ai/core-concepts/capabilities#agent-skills) entry in the Third-party capabilities section on the Capabilities page. ## Installation ### Claude Code Add the Pydantic marketplace and install the plugin: Terminal ```bash claude plugin marketplace add pydantic/skills claude plugin install ai@pydantic-skills ``` ### Cross-Agent (agentskills.io) Install the Pydantic AI skill using the [skills CLI](https://github.com/vercel-labs/skills): Terminal ```bash npx skills add pydantic/skills ``` This works with 30+ agents via the [agentskills.io](https://agentskills.io) standard, including Claude Code, Codex, Cursor, and Gemini CLI. ## See Also - [`pydantic/skills`](https://github.com/pydantic/skills): source repository - [agentskills.io](https://agentskills.io): the open standard for agent skills --- # [Pydantic AI Gateway](https://pydantic.dev/docs/ai/overview/gateway/) # Pydantic AI Gateway **[Pydantic AI Gateway](https://logfire.pydantic.dev/)** is a unified interface for accessing multiple AI providers with a single key, managed through [Pydantic Logfire](https://logfire.pydantic.dev/). Features include built-in OpenTelemetry observability, real-time cost monitoring, failover management, and native integration with the other tools in the [Pydantic stack](https://pydantic.dev/). 
Migrated to Pydantic Logfire

The AI Gateway has moved from `gateway.pydantic.dev` to [Pydantic Logfire](https://logfire.pydantic.dev/). If you were using the standalone gateway, see [Pydantic AI Gateway is Moving to Pydantic Logfire](https://logfire.pydantic.dev/docs/gateway-migration/). Sign up at [logfire.pydantic.dev](https://logfire.pydantic.dev/).

Questions?

For questions and feedback, contact us on [Slack](https://logfire.pydantic.dev/docs/join-slack/).

## Documentation Integration

To help you get started with Pydantic AI Gateway, some code examples in the Pydantic AI documentation include a "Via Pydantic AI Gateway" tab, alongside a "Direct to Provider API" tab with the standard Pydantic AI model string. The main difference between them is that when using Gateway, model strings use the `gateway/` prefix.

## Key features

- **API key management**: Access multiple LLM providers with a single Gateway key.
- **Cost Limits**: Set spending limits at project, user, and API key levels with daily, weekly, and monthly caps.
- **BYOK and managed providers:** Bring your own API keys (BYOK) from LLM providers, or pay for inference directly through the platform.
- **Multi-provider support:** Access models from OpenAI, Anthropic, Google Vertex, Groq, and AWS Bedrock. _More providers coming soon_.
- **Routing groups:** Configure [routing groups](#routing-groups) to fail over between providers serving the same model, or load-balance traffic across them by weight.
- **Backend observability:** Log every request through [Pydantic Logfire](https://pydantic.dev/logfire) or any OpenTelemetry backend (_coming soon_).
- **Zero translation**: Unlike traditional AI gateways that translate everything to one common schema, **Pydantic AI Gateway** allows requests to flow through directly in each provider's native format. This gives you immediate access to new model features as soon as they are released.
- **Enterprise ready**: Inherits Logfire's enterprise features -- including SSO, custom roles and permissions.

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('gateway/openai:gpt-5.2')
result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

## Quick Start

This section contains instructions on how to set up your account and run your app with Pydantic AI Gateway credentials.

### Create an account

1. Sign up at [logfire.pydantic.dev](https://logfire.pydantic.dev/).
2. Choose a region and create an account.
3. Activate the gateway in your organization's settings.

### Create Gateway API keys

Go to your organization's Gateway settings in Logfire and create an API key.

## Usage

After setting up your account with the instructions above, you will be able to make an AI model request with the Pydantic AI Gateway. The code snippets below show how you can use Pydantic AI Gateway with different frameworks and SDKs. To use different models, change the model string `gateway/<provider>:<model>` to other models offered by the supported providers.
Examples of providers and models that can be used are:

| Provider | API Format | Example Model |
| --- | --- | --- |
| OpenAI | `openai` | `gateway/openai:gpt-5.2` |
| Anthropic | `anthropic` | `gateway/anthropic:claude-sonnet-4-6` |
| Google Vertex | `google-vertex` | `gateway/google-vertex:gemini-3-flash-preview` |
| Groq | `groq` | `gateway/groq:openai/gpt-oss-120b` |
| AWS Bedrock | `bedrock` | `gateway/bedrock:amazon.nova-micro-v1:0` |

### Pydantic AI

Before you start, make sure you are on version 1.16 or later of `pydantic-ai`. To update to the latest version run:

- [uv](#tab-panel-92)
- [pip](#tab-panel-93)

Terminal

```bash
uv sync -P pydantic-ai
```

Terminal

```bash
pip install -U pydantic-ai
```

Set the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable to your Gateway API key:

Terminal

```bash
export PYDANTIC_AI_GATEWAY_API_KEY="pylf_v..."
```

You can access multiple models with the same API key, as shown in the code snippet below.

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('gateway/openai:gpt-5.2')
result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

#### Passing API Key directly

Pass your API key directly using the [`gateway_provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.gateway.gateway_provider):

passing\_api\_key.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.gateway import gateway_provider

provider = gateway_provider('openai', api_key='pylf_v...')
model = OpenAIChatModel('gpt-5.2', provider=provider)
agent = Agent(model)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

#### Using a different upstream provider

To use an alternate provider or routing group, you can specify it in the `route` parameter:

routing\_via\_provider.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.gateway import gateway_provider

provider = gateway_provider(
    'openai',
    api_key='pylf_v...',
    route='builtin-openai',
)
model = OpenAIChatModel('gpt-5.2', provider=provider)
agent = Agent(model)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

### Claude Code

Before you start, log out of Claude Code using `/logout`.

Set your gateway credentials as environment variables, using the base URL that matches your Logfire region:

- [US](#tab-panel-94)
- [EU](#tab-panel-95)

Terminal

```bash
export ANTHROPIC_BASE_URL="https://gateway-us.pydantic.dev/proxy/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_GATEWAY_API_KEY"
```

Terminal

```bash
export ANTHROPIC_BASE_URL="https://gateway-eu.pydantic.dev/proxy/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_GATEWAY_API_KEY"
```

Replace `YOUR_GATEWAY_API_KEY` with the API key from your Logfire organization's Gateway settings. Launch Claude Code by typing `claude`. All requests will now route through the Pydantic AI Gateway.

### Codex

Codex uses the OpenAI Responses API, so it should use the Gateway's `openai-responses` route.
Set your gateway API key as an environment variable: Terminal ```bash export PYDANTIC_AI_GATEWAY_API_KEY="YOUR_GATEWAY_API_KEY" ``` Then add the following configuration to `~/.codex/config.toml`, using the base URL that matches your Logfire region: - [US](#tab-panel-96) - [EU](#tab-panel-97) ```toml model = "gpt-5.4" model_provider = "pydantic_gateway" [model_providers.pydantic_gateway] name = "Pydantic AI Gateway" base_url = "https://gateway-us.pydantic.dev/proxy/openai-responses" env_key = "PYDANTIC_AI_GATEWAY_API_KEY" env_key_instructions = "Create a Gateway API key in your Logfire organization's Gateway settings." wire_api = "responses" ``` ```toml model = "gpt-5.4" model_provider = "pydantic_gateway" [model_providers.pydantic_gateway] name = "Pydantic AI Gateway" base_url = "https://gateway-eu.pydantic.dev/proxy/openai-responses" env_key = "PYDANTIC_AI_GATEWAY_API_KEY" env_key_instructions = "Create a Gateway API key in your Logfire organization's Gateway settings." wire_api = "responses" ``` For more details on configuring custom providers in Codex, see the [Codex custom model providers docs](https://developers.openai.com/codex/config-advanced#custom-model-providers) and the [Codex configuration reference](https://developers.openai.com/codex/config-reference/). If you already have a `~/.codex/config.toml`, add the `[model_providers.pydantic_gateway]` block and update `model_provider` instead of replacing the whole file. Replace `gpt-5.4` with whichever OpenAI Responses model you want Codex to use. Launch Codex by typing `codex`. All requests will now route through the Pydantic AI Gateway. ### SDKs #### OpenAI SDK Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`). - [US](#tab-panel-98) - [EU](#tab-panel-99) openai\_sdk.py ```python import openai client = openai.Client( base_url='https://gateway-us.pydantic.dev/proxy/chat/', api_key='pylf_v...', ) response = client.chat.completions.create( model='gpt-5.2', messages=[{'role': 'user', 'content': 'Hello world'}], ) print(response.choices[0].message.content) #> Hello user ``` openai\_sdk.py ```python import openai client = openai.Client( base_url='https://gateway-eu.pydantic.dev/proxy/chat/', api_key='pylf_v...', ) response = client.chat.completions.create( model='gpt-5.2', messages=[{'role': 'user', 'content': 'Hello world'}], ) print(response.choices[0].message.content) #> Hello user ``` #### Anthropic SDK Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`). - [US](#tab-panel-100) - [EU](#tab-panel-101) anthropic\_sdk.py ```python import anthropic client = anthropic.Anthropic( base_url='https://gateway-us.pydantic.dev/proxy/anthropic/', auth_token='pylf_v...', ) response = client.messages.create( max_tokens=1000, model='claude-sonnet-4-5', messages=[{'role': 'user', 'content': 'Hello world'}], ) print(response.content[0].text) #> Hello user ``` anthropic\_sdk.py ```python import anthropic client = anthropic.Anthropic( base_url='https://gateway-eu.pydantic.dev/proxy/anthropic/', auth_token='pylf_v...', ) response = client.messages.create( max_tokens=1000, model='claude-sonnet-4-5', messages=[{'role': 'user', 'content': 'Hello world'}], ) print(response.content[0].text) #> Hello user ``` #### Vercel AI SDK The [Vercel AI SDK](https://ai-sdk.dev/) can route through the Gateway by pointing each provider's `baseURL` at the matching proxy path (e.g. `/proxy/openai` or `/proxy/anthropic`). Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`). 
- [US](#tab-panel-102) - [EU](#tab-panel-103) ```typescript import { createOpenAI } from "@ai-sdk/openai"; import { generateText } from "ai"; const apiKey = process.env.PYDANTIC_AI_GATEWAY_API_KEY; if (!apiKey) throw new Error("set PYDANTIC_AI_GATEWAY_API_KEY"); const openai = createOpenAI({ apiKey, baseURL: "https://gateway-us.pydantic.dev/proxy/openai", }); async function main() { const openaiResult = await generateText({ model: openai("gpt-5.2"), prompt: "what color is the sky? reply concisely", }); console.log("openai:", openaiResult.text); } main().catch((err) => { console.error(err); process.exit(1); }); ``` ```typescript import { createOpenAI } from "@ai-sdk/openai"; import { generateText } from "ai"; const apiKey = process.env.PYDANTIC_AI_GATEWAY_API_KEY; if (!apiKey) throw new Error("set PYDANTIC_AI_GATEWAY_API_KEY"); const openai = createOpenAI({ apiKey, baseURL: "https://gateway-eu.pydantic.dev/proxy/openai", }); async function main() { const openaiResult = await generateText({ model: openai("gpt-5.2"), prompt: "what color is the sky? reply concisely", }); console.log("openai:", openaiResult.text); } main().catch((err) => { console.error(err); process.exit(1); }); ``` ## Routing groups A **routing group** is a named collection of providers that all serve the same model. Each member has a **priority**, a **weight**, and an **active** flag, and those three values together let a single group express two different routing strategies: - **Failover / fallback**: Assign members different priorities. The Gateway always tries the highest-priority active member first, and only falls through to a lower-priority member when the higher one is unavailable (for example if it is down, rate-limited, or returns an error). - **Load balancing**: Assign two or more members the same priority and give each a weight. The Gateway splits traffic across those members in proportion to their weights. The two strategies compose: you can have, for example, a top priority tier with two providers load-balanced 70/30, and a second priority tier that only receives traffic when both top-tier providers fail. ### Creating a routing group Routing groups are managed from your organization's Gateway settings in Logfire: 1. Open **Gateway -> Routing Groups** and click **Add Routing Group**. 2. Give the group a slug (e.g. `anthropic-routing`) and an optional description. 3. Open the group's **Members** page and add one or more providers. For each member set: - **Priority** - higher values are tried first. Use different priorities across members for failover. - **Weight** - load-balancing weight used between members that share the same priority. - **Active** - inactive members are skipped during routing. ### Using a routing group Point the Gateway provider at the group via the `route` parameter (the group's slug): routing\_group.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.gateway import gateway_provider provider = gateway_provider( 'anthropic', api_key='pylf_v...', route='anthropic-routing', # (1) ) model = AnthropicModel('claude-sonnet-4-6', provider=provider) agent = Agent(model) result = agent.run_sync('Where does "hello world" come from?') print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` The slug of the routing group you created in Logfire. 
## Troubleshooting

### Unable to calculate spend

The gateway needs to know the cost of each request in order to provide insights about spend and to enforce spending limits. If it's unable to calculate the cost, it will return a 400 error with the message "Unable to calculate spend".

When configuring a provider, you need to decide whether you want the gateway to block the API key if it's unable to calculate the cost. If you choose to block the API key, any further requests using that API key will fail.

We are actively working on supporting more providers and models. If there is a specific provider you would like to see supported, please let us know on [Slack](https://logfire.pydantic.dev/docs/join-slack/) or [open an issue on `genai-prices`](https://github.com/pydantic/genai-prices/issues/new).

---

# [Getting Help](https://pydantic.dev/docs/ai/overview/help/)

# Getting Help

If you need help getting started with Pydantic AI or with advanced usage, the following sources may be useful.

## Slack

Join the `#pydantic-ai` channel in the [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) to ask questions, get help, and chat about Pydantic AI. There are also channels for Pydantic, Logfire, and FastUI.

If you're on a [Logfire](https://pydantic.dev/logfire) Pro plan, you can also get a dedicated private Slack collab channel with us.

## GitHub Issues

The [Pydantic AI GitHub Issues](https://github.com/pydantic/pydantic-ai/issues) are a great place to ask questions and give us feedback.

---

# [Pydantic AI](https://pydantic.dev/docs/ai/overview/)

# Pydantic AI

![Pydantic AI](/docs/ai/img/pydantic-ai-dark.svg) ![Pydantic AI](/docs/ai/img/pydantic-ai-light.svg)

_GenAI Agent Framework, the Pydantic way_

[![CI](https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml/badge.svg?event=push)](https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml?query=branch%3Amain)[![Coverage](https://coverage-badge.samuelcolvin.workers.dev/pydantic/pydantic-ai.svg)](https://coverage-badge.samuelcolvin.workers.dev/redirect/pydantic/pydantic-ai)[![PyPI](https://img.shields.io/pypi/v/pydantic-ai.svg)](https://pypi.python.org/pypi/pydantic-ai)[![versions](https://img.shields.io/pypi/pyversions/pydantic-ai.svg)](https://github.com/pydantic/pydantic-ai)[![license](https://img.shields.io/github/license/pydantic/pydantic-ai.svg)](https://github.com/pydantic/pydantic-ai/blob/main/LICENSE)[![Join Slack](https://img.shields.io/badge/Slack-Join%20Slack-4A154B?logo=slack)](https://logfire.pydantic.dev/docs/join-slack/)

Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI.

FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of [Pydantic Validation](https://docs.pydantic.dev) and modern Python features like type hints. Yet despite virtually every Python agent framework and LLM library using Pydantic Validation, when we began to use LLMs in [Pydantic Logfire](https://pydantic.dev/logfire), we couldn't find anything that gave us the same feeling.

We built Pydantic AI with one simple aim: to bring that FastAPI feeling to GenAI app and agent development.

## Why use Pydantic AI

1. **Built by the Pydantic Team**: [Pydantic Validation](https://docs.pydantic.dev/latest/) is the validation layer of the OpenAI SDK, the Google ADK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more.
    _Why use the derivative when you can go straight to the source?_ 😃
2. **Model-agnostic**: Supports virtually every [model](/docs/ai/models/overview) and provider: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, and Perplexity; Azure AI Foundry, Amazon Bedrock, Google Vertex AI, Ollama, LiteLLM, Groq, OpenRouter, Together AI, Fireworks AI, Cerebras, Hugging Face, GitHub, Heroku, Vercel, Nebius, OVHcloud, Alibaba Cloud, SambaNova, and Outlines. If your favorite model or provider is not listed, you can easily implement a [custom model](/docs/ai/models/overview#custom-models).
3. **Seamless Observability**: Tightly [integrates](/docs/ai/integrations/logfire) with [Pydantic Logfire](https://pydantic.dev/logfire), our general-purpose OpenTelemetry observability platform, for real-time debugging, evals-based performance monitoring, behavior tracing, and cost tracking. If you already have an observability platform that supports OTel, you can [use that too](/docs/ai/integrations/logfire#alternative-observability-backends).
4. **Fully Type-safe**: Designed to give your IDE or AI coding agent as much context as possible for auto-completion and [type checking](/docs/ai/core-concepts/agent#static-type-checking), moving entire classes of errors from runtime to write-time for a bit of that Rust "if it compiles, it works" feel.
5. **Powerful Evals**: Enables you to systematically test and [evaluate](/docs/ai/evals/) the performance and accuracy of the agentic systems you build, and monitor the performance over time in Pydantic Logfire.
6. **Extensible by Design**: Build agents from composable [capabilities](/docs/ai/core-concepts/capabilities) that bundle tools, hooks, instructions, and model settings into reusable units. Use built-in capabilities for [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [thinking](/docs/ai/core-concepts/capabilities#thinking), and [MCP](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), pick from the [Pydantic AI Harness](/docs/ai/harness/overview) capability library, build your own, or install [third-party capability packages](/docs/ai/guides/extensibility). Define agents entirely in [YAML/JSON](/docs/ai/core-concepts/agent-spec) -- no code required.
7. **MCP, A2A, and UI**: Integrates the [Model Context Protocol](/docs/ai/mcp/overview), [Agent2Agent](/docs/ai/integrations/a2a), and various [UI event stream](/docs/ai/integrations/ui/overview) standards to give your agent access to external tools and data, let it interoperate with other agents, and build interactive applications with streaming event-based communication.
8. **Human-in-the-Loop Tool Approval**: Easily lets you flag that certain tool calls [require approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) before they can proceed, possibly depending on tool call arguments, conversation history, or user preferences.
9. **Durable Execution**: Enables you to build [durable agents](/docs/ai/integrations/durable_execution/overview) that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability.
10. **Streamed Outputs**: Provides the ability to [stream](/docs/ai/core-concepts/output#streamed-results) structured output continuously, with immediate validation, ensuring real time access to generated data.
11. **Graph Support**: Provides a powerful way to define [graphs](/docs/ai/graph/) using type hints, for use in complex applications where standard control flow can degrade to spaghetti code.

Realistically though, no list is going to be as convincing as [giving it a try](#next-steps) and seeing how it makes you feel!

**Sign up for our newsletter, _The Pydantic Stack_, with updates & tutorials on Pydantic AI, Logfire, and Pydantic:**

## Hello World Example

Here's a minimal example of Pydantic AI:

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent(  # (1)
    'anthropic:claude-sonnet-4-6',
    instructions='Be concise, reply with one sentence.',  # (2)
)

result = agent.run_sync('Where does "hello world" come from?')  # (3)
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

1. We configure the agent to use [Anthropic's Claude Sonnet 4.6](/docs/ai/models/anthropic) model, but you can also set the model when running the agent.
2. Register static [instructions](/docs/ai/core-concepts/agent#instructions) using a keyword argument to the agent.
3. [Run the agent](/docs/ai/core-concepts/agent#running-agents) synchronously, starting a conversation with the LLM.

_(This example is complete, it can be run "as is", assuming you've [installed the `pydantic_ai` package](/docs/ai/overview/install))_

The exchange will be very short: Pydantic AI will send the instructions and the user prompt to the LLM, and the model will return a text response. Not very interesting yet, but we can easily add [tools](/docs/ai/tools-toolsets/tools), [dynamic instructions](/docs/ai/core-concepts/agent#instructions), [structured outputs](/docs/ai/core-concepts/output), or composable [capabilities](/docs/ai/core-concepts/capabilities) to build more powerful agents.

Here's the same agent with [thinking](/docs/ai/core-concepts/capabilities#thinking) and [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools) capabilities:

hello\_world\_capabilities.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking, WebSearch

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Be concise, reply with one sentence.',
    capabilities=[Thinking(), WebSearch()],
)

result = agent.run_sync('What was the mass of the largest meteorite found this year?')
print(result.output)
"""
The largest meteorite recovered this year weighed approximately 7.6 kg, found in the Sahara Desert in January.
"""
```

## Tools & Dependency Injection Example

Here is a concise example using Pydantic AI to build a support agent for a bank:

bank\_support.py

```python
from dataclasses import dataclass

from pydantic import BaseModel, Field

from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn


@dataclass
class SupportDependencies:  # (3)
    customer_id: int
    db: DatabaseConn  # (12)


class SupportOutput(BaseModel):  # (13)
    support_advice: str = Field(description='Advice returned to the customer')
    block_card: bool = Field(description="Whether to block the customer's card")
    risk: int = Field(description='Risk level of query', ge=0, le=10)


support_agent = Agent(  # (1)
    'openai:gpt-5.2',  # (2)
    deps_type=SupportDependencies,
    output_type=SupportOutput,  # (9)
    instructions=(  # (4)
        'You are a support agent in our bank, give the '
        'customer support and judge the risk level of their query.'
    ),
)


@support_agent.instructions  # (5)
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
    customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
    return f"The customer's name is {customer_name!r}"


@support_agent.tool  # (6)
async def customer_balance(
    ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
    """Returns the customer's current account balance."""  # (7)
    return await ctx.deps.db.customer_balance(
        id=ctx.deps.customer_id,
        include_pending=include_pending,
    )


...  # (11)


async def main():
    deps = SupportDependencies(customer_id=123, db=DatabaseConn())
    result = await support_agent.run('What is my balance?', deps=deps)  # (8)
    print(result.output)  # (10)
    """
    support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
    """

    result = await support_agent.run('I just lost my card!', deps=deps)
    print(result.output)
    """
    support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
    """
```

This [agent](/docs/ai/core-concepts/agent) will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of output they return. In this case, the support agent has type `Agent[SupportDependencies, SupportOutput]`.

Here we configure the agent to use [OpenAI's GPT-5 model](/docs/ai/models/openai); you can also set the model when running the agent.

The `SupportDependencies` dataclass is used to pass data, connections, and logic into the model that will be needed when running [instructions](/docs/ai/core-concepts/agent#instructions) and [tool](/docs/ai/tools-toolsets/tools) functions. Pydantic AI's system of dependency injection provides a [type-safe](/docs/ai/core-concepts/agent#static-type-checking) way to customise the behavior of your agents, and can be especially useful when running [unit tests](/docs/ai/guides/testing) and evals.

Static [instructions](/docs/ai/core-concepts/agent#instructions) can be registered with the [`instructions` keyword argument](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) to the agent.

Dynamic [instructions](/docs/ai/core-concepts/agent#instructions) can be registered with the [`@agent.instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions) decorator, and can make use of dependency injection. Dependencies are carried via the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) argument, which is parameterized with the `deps_type` from above. If the type annotation here is wrong, static type checkers will catch it.

The [`@agent.tool`](/docs/ai/tools-toolsets/tools) decorator lets you register functions which the LLM may call while responding to a user. Again, dependencies are carried via [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry.

The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are [extracted](/docs/ai/tools-toolsets/tools#function-tools-and-schema) from the docstring and added to the parameter schema sent to the LLM.

[Run the agent](/docs/ai/core-concepts/agent#running-agents) asynchronously, conducting a conversation with the LLM until a final response is reached.
Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve an output.

The response from the agent will be guaranteed to be a `SupportOutput`. If validation fails, [reflection](/docs/ai/core-concepts/agent#reflection-and-self-correction) kicks in and the agent is prompted to try again.

The output will be validated with Pydantic to guarantee it is a `SupportOutput`; since the agent is generic, it'll also be typed as a `SupportOutput` to aid with static type checking.

In a real use case, you'd add more tools and longer instructions to the agent to extend the context it's equipped with and the support it can provide.

This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers.

This [Pydantic](https://docs.pydantic.dev) model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run.

Complete `bank_support.py` example

The code included here is incomplete for the sake of brevity (the definition of `DatabaseConn` is missing); you can find the complete `bank_support.py` example [here](/docs/ai/examples/bank-support).

## Instrumentation with Pydantic Logfire

Even a simple agent with just a handful of tools can result in a lot of back-and-forth with the LLM, making it nearly impossible to be confident of what's going on just from reading the code. To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire.

To do this, we need to [set up Logfire](/docs/ai/integrations/logfire#using-logfire), and add the following to our code:

bank\_support\_with\_logfire.py

```python
...
from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn

import logfire

logfire.configure()  # (1)
logfire.instrument_pydantic_ai()  # (2)
logfire.instrument_sqlite3()  # (3)

...

support_agent = Agent(
    'openai:gpt-5.2',
    deps_type=SupportDependencies,
    output_type=SupportOutput,
    instructions=(
        'You are a support agent in our bank, give the '
        'customer support and judge the risk level of their query.'
    ),
)
```

1. Configure the Logfire SDK; this will fail if a project is not set up.
2. This will instrument all Pydantic AI agents used from here on out. If you want to instrument only a specific agent, you can pass the [`instrument=True` keyword argument](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) to the agent.
3. In our demo, `DatabaseConn` uses [`sqlite3`](https://docs.python.org/3/library/sqlite3.html#module-sqlite3) to connect to a SQLite database, so [`logfire.instrument_sqlite3()`](https://logfire.pydantic.dev/docs/integrations/databases/sqlite3/) is used to log the database queries.

That's enough to get the following view of your agent in action:

Logfire instrumentation for the bank agent -- [View in Logfire](https://logfire-eu.pydantic.dev/public-trace/a2957caa-b7b7-4883-a529-777742649004?spanId=31aade41ab896144)

See [Monitoring and Performance](/docs/ai/integrations/logfire) to learn more.

## `llms.txt`

The Pydantic AI documentation is available in the [llms.txt](https://llmstxt.org/) format. This format is defined in Markdown and is suited for LLMs and AI coding assistants and agents.
Two formats are available:

- [`llms.txt`](https://ai.pydantic.dev/llms.txt): a file containing a brief description of the project, along with links to the different sections of the documentation. The structure of this file is described in detail [here](https://llmstxt.org/#format).
- [`llms-full.txt`](https://ai.pydantic.dev/llms-full.txt): Similar to the `llms.txt` file, but with the content of every link included. Note that this file may be too large for some LLMs.

As of today, these files are not automatically leveraged by IDEs or coding agents, but they will use them if you provide a link or the full text.

## Next Steps

To try Pydantic AI for yourself, [install it](/docs/ai/overview/install) and follow the instructions [in the examples](/docs/ai/examples/setup).

Read the [docs](/docs/ai/core-concepts/agent) to learn more about building applications with Pydantic AI.

Read the [API Reference](/docs/ai/api/pydantic-ai/agent) to understand Pydantic AI's interface.

Join [Slack](https://logfire.pydantic.dev/docs/join-slack/) or file an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues) if you have any questions.

---

# [Installation](https://pydantic.dev/docs/ai/overview/install/)

# Installation

Pydantic AI is available on PyPI as [`pydantic-ai`](https://pypi.org/project/pydantic-ai/) so installation is as simple as:

**pip**

Terminal

```bash
pip install pydantic-ai
```

**uv**

Terminal

```bash
uv add pydantic-ai
```

(Requires Python 3.10+)

This installs the `pydantic_ai` package, core dependencies, and libraries required to use all the models included in Pydantic AI. If you want to install only those dependencies required to use a specific model, you can install the ["slim"](#slim-install) version of Pydantic AI.

## Use with Pydantic Logfire

Pydantic AI has an excellent (but completely optional) integration with [Pydantic Logfire](https://pydantic.dev/logfire) to help you view and understand agent runs. Logfire comes included with `pydantic-ai` (but not the ["slim" version](#slim-install)), so you can typically start using it immediately by following the [Logfire setup docs](/docs/ai/integrations/logfire#using-logfire).

## Running Examples

We distribute the [`pydantic_ai_examples`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples) directory as a separate PyPI package ([`pydantic-ai-examples`](https://pypi.org/project/pydantic-ai-examples/)) to make examples extremely easy to customize and run.

To install examples, use the `examples` optional group:

**pip**

Terminal

```bash
pip install "pydantic-ai[examples]"
```

**uv**

Terminal

```bash
uv add "pydantic-ai[examples]"
```

To run the examples, follow instructions in the [examples docs](/docs/ai/examples/setup).

## Slim Install

If you know which model you're going to use and want to avoid installing superfluous packages, you can use the [`pydantic-ai-slim`](https://pypi.org/project/pydantic-ai-slim/) package.
For example, if you're using just [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), you would run:

**pip**

Terminal

```bash
pip install "pydantic-ai-slim[openai]"
```

**uv**

Terminal

```bash
uv add "pydantic-ai-slim[openai]"
```

`pydantic-ai-slim` has the following optional groups:

- `logfire` -- installs [Pydantic Logfire](/docs/ai/integrations/logfire) dependency `logfire` [PyPI ↗](https://pypi.org/project/logfire)
- `evals` -- installs [Pydantic Evals](/docs/ai/evals/) dependency `pydantic-evals` [PyPI ↗](https://pypi.org/project/pydantic-evals)
- `openai` -- installs [OpenAI Model](/docs/ai/models/openai) dependency `openai` [PyPI ↗](https://pypi.org/project/openai)
- `vertexai` -- installs `GoogleVertexProvider` dependencies `google-auth` [PyPI ↗](https://pypi.org/project/google-auth) and `requests` [PyPI ↗](https://pypi.org/project/requests)
- `google` -- installs [Google Model](/docs/ai/models/google) dependency `google-genai` [PyPI ↗](https://pypi.org/project/google-genai)
- `anthropic` -- installs [Anthropic Model](/docs/ai/models/anthropic) dependency `anthropic` [PyPI ↗](https://pypi.org/project/anthropic)
- `groq` -- installs [Groq Model](/docs/ai/models/groq) dependency `groq` [PyPI ↗](https://pypi.org/project/groq)
- `mistral` -- installs [Mistral Model](/docs/ai/models/mistral) dependency `mistralai` [PyPI ↗](https://pypi.org/project/mistralai)
- `cohere` -- installs [Cohere Model](/docs/ai/models/cohere) dependency `cohere` [PyPI ↗](https://pypi.org/project/cohere)
- `bedrock` -- installs [Bedrock Model](/docs/ai/models/bedrock) dependency `boto3` [PyPI ↗](https://pypi.org/project/boto3)
- `huggingface` -- installs [Hugging Face Model](/docs/ai/models/huggingface) dependency `huggingface-hub` [PyPI ↗](https://pypi.org/project/huggingface-hub)
- `outlines-transformers` -- installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[transformers]` [PyPI ↗](https://pypi.org/project/outlines)
- `outlines-llamacpp` -- installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[llamacpp]` [PyPI ↗](https://pypi.org/project/outlines)
- `outlines-mlxlm` -- installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[mlxlm]` [PyPI ↗](https://pypi.org/project/outlines)
- `outlines-sglang` -- installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[sglang]` [PyPI ↗](https://pypi.org/project/outlines)
- `outlines-vllm-offline` -- installs [Outlines Model](/docs/ai/models/outlines) dependencies `outlines` [PyPI ↗](https://pypi.org/project/outlines) and `vllm` [PyPI ↗](https://pypi.org/project/vllm)
- `duckduckgo` -- installs [DuckDuckGo Search Tool](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool) dependency `ddgs` [PyPI ↗](https://pypi.org/project/ddgs)
- `tavily` -- installs [Tavily Search Tool](/docs/ai/tools-toolsets/common-tools#tavily-search-tool) dependency `tavily-python` [PyPI ↗](https://pypi.org/project/tavily-python)
- `exa` -- installs [Exa Search Tool](/docs/ai/tools-toolsets/common-tools#exa-search-tool) dependency `exa-py` [PyPI ↗](https://pypi.org/project/exa-py)
- `web-fetch` -- installs [Web Fetch Tool](/docs/ai/tools-toolsets/common-tools#web-fetch-tool) dependency `markdownify` [PyPI ↗](https://pypi.org/project/markdownify)
- `cli` -- installs [CLI](/docs/ai/integrations/cli) dependencies `rich` [PyPI ↗](https://pypi.org/project/rich), `prompt-toolkit` [PyPI ↗](https://pypi.org/project/prompt-toolkit), and `argcomplete` [PyPI ↗](https://pypi.org/project/argcomplete)
- `mcp` -- installs [MCP](/docs/ai/mcp/client) dependency `mcp` [PyPI ↗](https://pypi.org/project/mcp)
- `fastmcp` -- installs [FastMCP](/docs/ai/mcp/fastmcp-client) dependency `fastmcp` [PyPI ↗](https://pypi.org/project/fastmcp)
- `a2a` -- installs [A2A](/docs/ai/integrations/a2a) dependency `fasta2a` [PyPI ↗](https://pypi.org/project/fasta2a)
- `ui` -- installs [UI Event Streams](/docs/ai/integrations/ui/overview) dependency `starlette` [PyPI ↗](https://pypi.org/project/starlette)
- `ag-ui` -- installs [AG-UI Event Stream Protocol](/docs/ai/integrations/ui/ag-ui) dependencies `ag-ui-protocol` [PyPI ↗](https://pypi.org/project/ag-ui-protocol) and `starlette` [PyPI ↗](https://pypi.org/project/starlette)
- `dbos` -- installs [DBOS Durable Execution](/docs/ai/integrations/durable_execution/dbos) dependency `dbos` [PyPI ↗](https://pypi.org/project/dbos)
- `prefect` -- installs [Prefect Durable Execution](/docs/ai/integrations/durable_execution/prefect) dependency `prefect` [PyPI ↗](https://pypi.org/project/prefect)

You can also install dependencies for multiple models and use cases, for example:

**pip**

Terminal

```bash
pip install "pydantic-ai-slim[openai,google,logfire]"
```

**uv**

Terminal

```bash
uv add "pydantic-ai-slim[openai,google,logfire]"
```

---

# [Troubleshooting](https://pydantic.dev/docs/ai/overview/troubleshooting/)

# Troubleshooting

Below are suggestions on how to fix some common errors you might encounter while using Pydantic AI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the [Pydantic Slack](/docs/ai/overview/help) or create an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues).

## Jupyter Notebook Errors

### `RuntimeError: This event loop is already running`

**Modern Jupyter/IPython (7.0+)**: This environment supports top-level `await` natively. You can use `Agent.run()` directly in notebook cells without additional setup:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
result = await agent.run('Who let the dogs out?')
```

**Legacy environments or specific integrations**: If you encounter event loop conflicts, use [`nest-asyncio`](https://pypi.org/project/nest-asyncio/):

```python
import nest_asyncio

from pydantic_ai import Agent

nest_asyncio.apply()

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('Who let the dogs out?')
```

**Note**: This also applies to Google Colab and [Marimo](https://github.com/marimo-team/marimo) environments.

## API Key Configuration

### `UserError: API key must be provided or set in the [MODEL]_API_KEY environment variable`

If you're running into issues with setting the API key for your model, visit the [Models](/docs/ai/models/overview) page to learn more about how to set an environment variable and/or pass in an `api_key` argument.

## Monitoring HTTPX Requests

You can use custom `httpx` clients in your models in order to access specific requests, responses, and headers at runtime. It's particularly helpful to use `logfire`'s [HTTPX integration](/docs/ai/integrations/logfire#monitoring-http-requests) to monitor these requests.

---

# [Upgrade Guide](https://pydantic.dev/docs/ai/project/changelog/)

# Upgrade Guide

In September 2025, Pydantic AI reached V1, which means we're committed to API stability: we will not introduce changes that break your code until V2. For more information, review our [Version Policy](/docs/ai/project/version-policy).
## Breaking Changes

Here's a filtered list of the breaking changes for each version to help you upgrade Pydantic AI.

### v1.0.1 (2025-09-05)

The following breaking change was accidentally left out of v1.0.0:

- See [#2808](https://github.com/pydantic/pydantic-ai/pull/2808) - Remove `Python` evaluator from `pydantic_evals` for security reasons

### v1.0.0 (2025-09-04)

- See [#2725](https://github.com/pydantic/pydantic-ai/pull/2725) - Drop support for Python 3.9
- See [#2738](https://github.com/pydantic/pydantic-ai/pull/2738) - Make many dataclasses require keyword arguments
- See [#2715](https://github.com/pydantic/pydantic-ai/pull/2715) - Remove `cases` and `averages` attributes from `pydantic_evals` spans
- See [#2798](https://github.com/pydantic/pydantic-ai/pull/2798) - Change `ModelRequest.parts` and `ModelResponse.parts` types from `list` to `Sequence`
- See [#2726](https://github.com/pydantic/pydantic-ai/pull/2726) - Default `InstrumentationSettings` version to 2
- See [#2717](https://github.com/pydantic/pydantic-ai/pull/2717) - Remove errors when passing `AsyncRetrying` or `Retrying` object to `AsyncTenacityTransport` or `TenacityTransport` instead of `RetryConfig`

### v0.x.x

Before V1, minor versions were used to introduce breaking changes:

**v0.8.0 (2025-08-26)**

- See [#2689](https://github.com/pydantic/pydantic-ai/pull/2689) - `AgentStreamEvent` was expanded to be a union of `ModelResponseStreamEvent` and `HandleResponseEvent`, simplifying the `event_stream_handler` function signature. Existing code accepting `AgentStreamEvent | HandleResponseEvent` will continue to work.

**v0.7.6 (2025-08-26)**

The following breaking change was inadvertently released in a patch version rather than a minor version:

- See [#2670](https://github.com/pydantic/pydantic-ai/pull/2670) - `TenacityTransport` and `AsyncTenacityTransport` now require the use of `pydantic_ai.retries.RetryConfig` (which is just a `TypedDict` containing the kwargs to `tenacity.retry`) instead of `tenacity.Retrying` or `tenacity.AsyncRetrying`.

**v0.7.0 (2025-08-12)**

- See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now yields a `FinalResultEvent` along with the existing `PartStartEvent` and `PartDeltaEvent`. If you're using `pydantic_ai.direct.model_request_stream` or `pydantic_ai.direct.model_request_stream_sync`, you may need to update your code to account for this.
- See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.Model.request_stream` now receives a `run_context` argument. If you've implemented a custom `Model` subclass, you will need to account for this.
- See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now requires a `model_request_parameters` field and constructor argument. If you've implemented a custom `Model` subclass and implemented `request_stream`, you will need to account for this.

**v0.6.0 (2025-08-06)**

This release was meant to clean up some old deprecated code, so we could get a step closer to V1.

- See [#2440](https://github.com/pydantic/pydantic-ai/pull/2440) - The `next` method was removed from the `Graph` class. Use `async with graph.iter(...) as run: run.next()` instead.
- See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_type`, `result_tool_name` and `result_tool_description` arguments were removed from the `Agent` class. Use `output_type` instead.
- See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_retries` argument was also removed from the `Agent` class. Use `output_retries` instead.
- See [#2443](https://github.com/pydantic/pydantic-ai/pull/2443) - The `data` property was removed from the `FinalResult` class. Use `output` instead.
- See [#2445](https://github.com/pydantic/pydantic-ai/pull/2445) - The `get_data` and `validate_structured_result` methods were removed from the `StreamedRunResult` class. Use `get_output` and `validate_structured_output` instead.
- See [#2446](https://github.com/pydantic/pydantic-ai/pull/2446) - The `format_as_xml` function was moved to the `pydantic_ai.format_as_xml` module. Import it via `from pydantic_ai import format_as_xml` instead.
- See [#2451](https://github.com/pydantic/pydantic-ai/pull/2451) - Removed deprecated `Agent.result_validator` method, `Agent.last_run_messages` property, `AgentRunResult.data` property, and `result_tool_return_content` parameters from result classes.

**v0.5.0 (2025-08-04)**

- See [#2388](https://github.com/pydantic/pydantic-ai/pull/2388) - The `source` field of an `EvaluationResult` is now of type `EvaluatorSpec` rather than the actual source `Evaluator` instance, to help with serialization/deserialization.
- See [#2163](https://github.com/pydantic/pydantic-ai/pull/2163) - The `EvaluationReport.print` and `EvaluationReport.console_table` methods now require most arguments to be passed by keyword.

**v0.4.0 (2025-07-08)**

- See [#1799](https://github.com/pydantic/pydantic-ai/pull/1799) - Pydantic Evals `EvaluationReport` and `ReportCase` are now generic dataclasses instead of Pydantic models. If you were serializing them using `model_dump()`, you will now need to use the `EvaluationReportAdapter` and `ReportCaseAdapter` type adapters instead.
- See [#1507](https://github.com/pydantic/pydantic-ai/pull/1507) - The `ToolDefinition` `description` argument is now optional and the order of positional arguments has changed from `name, description, parameters_json_schema, ...` to `name, parameters_json_schema, description, ...` to account for this.

**v0.3.0 (2025-06-18)**

- See [#1142](https://github.com/pydantic/pydantic-ai/pull/1142) -- Adds support for thinking parts. We now convert the thinking blocks (`<think>...</think>`) in provider specific text parts to Pydantic AI `ThinkingPart`s. Also, as part of this release, we made the choice to not send back the `ThinkingPart`s to the provider - the idea is to save costs on behalf of the user. In the future, we intend to add a setting to customize this behavior.

**v0.2.0 (2025-05-12)**

- See [#1647](https://github.com/pydantic/pydantic-ai/pull/1647) -- usage makes sense as part of `ModelResponse`, and could be really useful in "messages" (really a sequence of requests and responses). In this PR:
  - Adds `usage` to `ModelResponse` (the field has a default factory of `Usage()` so it'll work to load data that doesn't have usage)
  - Changes the return type of `Model.request` to just `ModelResponse` instead of `tuple[ModelResponse, Usage]`

**v0.1.0 (2025-04-15)**

- See [#1248](https://github.com/pydantic/pydantic-ai/pull/1248) -- the attribute/parameter name `result` was renamed to `output` in many places. Hopefully all changes keep a deprecated attribute or parameter with the old name, so you should get many deprecation warnings.
- See [#1484](https://github.com/pydantic/pydantic-ai/pull/1484) -- `format_as_xml` was moved and made available to import from the package root, e.g. `from pydantic_ai import format_as_xml`.
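As a concrete illustration of the `result` -> `output` rename, and of how the deprecated aliases were later dropped, here's a minimal before/after sketch (the prompt is our own invention):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('Where does "hello world" come from?')

# Before v0.1.0 the run output was accessed as `result.data`; the old name was
# kept as a deprecated alias until it was removed in v0.6.0 (#2451).
# print(result.data)  # deprecated, then removed -- fails on current versions

print(result.output)  # the current attribute
```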
## Full Changelog

For the full changelog, see [GitHub Releases](https://github.com/pydantic/pydantic-ai/releases).

---

# [Contributing](https://pydantic.dev/docs/ai/project/contributing/)

# Contributing

We'd love you to contribute to Pydantic AI!

## How we work -- the short version

Pydantic AI is maintained by a small team. We set our own priorities based on what benefits the most users, and we work through issues and PRs in that order -- not in the order they arrive.

- **Found a bug?** Open an issue with a clear description and a minimal reproducible example. Including a [Logfire](https://logfire.pydantic.dev/) trace link helps us debug dramatically faster.
- **Want a feature or API change?** Open an issue describing the problem you're solving. Do not start with code.
- **Want to help build a feature?** Comment on the issue explaining why you need it and what context you bring. We call this being a "champion" -- more on that below.
- **Have a fix or code to share?** Make sure a maintainer has agreed to the approach on the issue and assigned you. Then open a PR.

The rest of this page explains why we work this way and what to expect.

## Before you write code

For anything non-trivial, align with a maintainer on the approach before writing code. A pre-aligned PR is much faster to land than one we're seeing cold.

### Trivial fixes

Typos, broken links, small doc improvements, obvious one-line fixes: just open a PR. No issue needed.

### Bug fixes

If the fix could reasonably go more than one way, or you're unsure it's actually a bug: open an issue first. Include a minimal reproducible example and ideally a [Logfire trace link](https://logfire.pydantic.dev/) showing the problem.

For well-scoped bugs, we may generate a fix internally -- the most valuable thing you can do is file a clear report and then validate that the fix works for your use case.

### Features, integrations, or API changes

Before writing code, ask whether the change needs to live in core at all. Most new agent behaviors belong in [**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness), the official capability library -- not in this repo. Pydantic AI core is for the agent loop, model providers, and capabilities that require model-specific support or are fundamental to the agent experience. Standalone capabilities -- guardrails, memory, context management, file system access, etc. -- belong in the harness, where they can iterate faster. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full distinction.

**If your idea is a capability**, open an issue on [pydantic-ai-harness](https://github.com/pydantic/pydantic-ai-harness/issues) instead. You can also publish capabilities as your own package using the `pydantic-ai-` convention -- see [Publishing capability packages](/docs/ai/guides/extensibility#publishing-capability-packages). Once a capability has real users and a stable shape, we can talk about upstreaming to harness or core.

If it does belong in core:

1. **Search first.** If an existing issue covers your need, comment there. If the closest match is only related, open a new issue and link it.
2. **Describe the problem, not just the solution.** Tell us what you're building, what's blocking you, and what you've tried. This context matters more than code.
3. **Propose a plan before building.** Post the shape of the solution on the issue, or open a draft PR with just a `PLAN.md`.
    For larger features, we do short video calls with contributors to iterate on the design -- a 20-minute call often saves weeks of async review cycles.
4. **Wait for assignment.** A maintainer needs to agree on the approach and assign the issue to you before you open a PR. Unassigned PRs may be auto-closed.

Caution

Writing a large feature PR without prior alignment is the most common way for a contribution to stall or be closed.

## Champions

A "champion" is someone who needs a feature, has context on the problem, and is willing to invest time to help us get it right. If you want to champion a feature:

- Comment on the issue explaining: what you're building, why you need this, and what you can contribute (domain knowledge, testing, validation).
- We prioritize features where one or more champions with production use cases have stepped up. A feature with no champion stays in the backlog until either we prioritize it ourselves or someone with real context shows up.
- Being a champion doesn't mean writing the code. It means shaping the plan and validating the result. For significant features, we'll set up a call to iterate on the design together.

Champions are credited as co-authors when the feature ships.

## What to expect during review

### We review PRs in our priority order, not submission order

We do not automatically triage every new PR. PRs on issues we have not pre-aligned on are not in our review queue, regardless of how well written they are. If no maintainer has agreed to the change on an issue and assigned it to you, assume we have not seen your PR.

Even for PRs with code we've previously engaged with: we treat all contributed code as a starting point, not a finished product. We review and prioritize PRs based on the feature's importance to the project, not on how much effort went into the code. This is a change from how open source traditionally worked, and we'd rather be honest about it than leave PRs sitting with no signal.

**If you want to know where your PR stands**, the best thing to do is ping `#pydantic-ai` on [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/).

### We may rewrite or supersede your code

We treat contributed code as illustrative: a starting point that shows the shape of the change and proves the approach works, not the final form we merge. The most useful thing you can give us for a non-trivial change is a plan plus a working example -- not a polished, merge-ready implementation.

On any PR, we may push commits to your branch, open a follow-up PR that supersedes yours, or rewrite from scratch. For security reasons, we lean toward rewriting contributed code rather than merging as-is. You will still be credited as the original author.

Please don't spend effort chasing green CI, addressing every automated review comment, or rebasing for merge conflicts on a PR we haven't pre-aligned on. If we take the change forward, that polish gets thrown away when we rewrite. Get the approach working, then stop and ping us on Slack.

### Automated review is advisory, not a gate

PRs are automatically reviewed by Devin and our own tooling. These reviews are advisory:

- A bot approval does not mean your PR is ready to merge. Only a human maintainer's review counts.
- A bot finding does not mean you must act on it. If you disagree, say so.
- If automated review is generating noise on your PR, tell us. We use that feedback to retune the tooling.

### Priority

We receive far more contributions than we can review, and we focus our attention where it has the most impact.
We cannot promise to get to every PR, even good ones, and we'd rather say so up front than leave your work open indefinitely with no signal.

How we weigh priorities:

- **User demand** -- features that more users need get priority. Champion-backed features with production use cases outrank speculative additions.
- **Provider significance** -- work that affects frontier providers (Anthropic, OpenAI, Google) or providers we know are heavily used gets priority. A model integration for a niche provider will wait; a fix for Anthropic won't.
- **Roadmap alignment** -- features that align with our current focus areas get priority. Right now that includes the capabilities/hooks API, provider-adaptive tools, and the [Pydantic AI Harness](https://github.com/pydantic/pydantic-ai-harness) capability library.
- **Capabilities over core** -- features that could live as a [capability](/docs/ai/core-concepts/capabilities) should go to [Pydantic AI Harness](https://github.com/pydantic/pydantic-ai-harness) or ship as your own package -- that's often the fastest path. Once it has traction, come back and we can talk about upstreaming.

## If your PR or issue has gone quiet

1. Ping `#pydantic-ai` on [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) with a link.
2. Say what you need: "Can you take a look?", "I'm blocked -- is this on your radar?", or "Should I close this?" are all fine.
3. If you've been waiting weeks without any human response, flag it. That's a process failure on our side and we want to know.

## Installation and Setup

Clone your fork and cd into the repo directory:

Terminal

```bash
git clone git@github.com:<your-username>/pydantic-ai.git
cd pydantic-ai
```

Install `uv` (version 0.4.30 or later) and `pre-commit`:

- [`uv` install docs](https://docs.astral.sh/uv/getting-started/installation/)
- [`pre-commit` install docs](https://pre-commit.com/#install)

To install `pre-commit` you can run the following command:

Terminal

```bash
uv tool install pre-commit
```

Install `pydantic-ai`, all dependencies, and pre-commit hooks:

Terminal

```bash
make install
```

## Running Tests etc.

We use `make` to manage most commands you'll need to run.

For details on available commands, run:

Terminal

```bash
make help
```

To run code formatting, linting, static type checks, and tests with coverage report generation, run:

Terminal

```bash
make
```

## Documentation Changes

To run the documentation page locally, run:

Terminal

```bash
uv run mkdocs serve
```

## Rules for adding new models to Pydantic AI

To avoid an excessive workload for the maintainers of Pydantic AI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work.
- To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more
- To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
- For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use
- For any other model that requires more logic, we recommend you release your own Python package `pydantic-ai-xxx`, which depends on [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) and implements a model that inherits from our [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) ABC

If you're unsure about adding a model, please [create an issue](https://github.com/pydantic/pydantic-ai/issues).

---

# [Version Policy](https://pydantic.dev/docs/ai/project/version-policy/)

# Version Policy

## Version Policy

We will not intentionally make breaking changes in minor releases of V1.

V2 will be released in April 2026 at the earliest, 6 months after the release of V1 in September 2025. Once we release V2, we'll continue to provide security fixes for V1 for another 6 months minimum, so you have time to upgrade your applications.

Functionality marked as deprecated will not be removed until V2.

Of course, some apparently safe changes and bug fixes will inevitably break some users' code -- obligatory link to [xkcd](https://xkcd.com/1172/).

The following changes will **NOT** be considered breaking changes, and may occur in minor releases:

- Bug fixes that may result in existing code breaking, provided that such code was relying on undocumented features/constructs/assumptions.
- Adding new [message parts](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages), [stream events](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent), or optional fields (including fields with default values) on existing message (part) and event types. Always code defensively when consuming message parts or event streams, and use the [`ModelMessagesTypeAdapter`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessagesTypeAdapter) to (de)serialize message histories.
- Changing OpenTelemetry span attributes. Because different [observability platforms](/docs/ai/integrations/logfire#using-opentelemetry) support different versions of the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), Pydantic AI lets you configure the [instrumentation version](/docs/ai/integrations/logfire#configuring-data-format), but the default version may change in a minor release. Span attributes for [Pydantic Evals](/docs/ai/evals/) may also change as we iterate on Evals support in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/).
- Changing how `__repr__` behaves, even of public classes.

In all cases we will aim to minimize churn and do so only when justified by an increase in the quality of Pydantic AI for users.

## Beta Features

At Pydantic, we like to move quickly and innovate! To that end, minor releases may introduce beta features (indicated by a `beta` module) that are active works in progress. While in its beta phase, a feature's API and behaviors may not be stable, and it's very possible that changes made to the feature will not be backward-compatible.
We aim to move beta features out of beta within a few months after initial release, once users have had a chance to provide feedback and test the feature in production.

## Support for Python versions

Pydantic AI will drop support for a Python version when the following conditions are met:

- The Python version has reached its [expected end of life](https://devguide.python.org/versions/).
- Less than 5% of downloads of the most recent minor release are using that version.

---

# [Built-in Tools](https://pydantic.dev/docs/ai/tools-toolsets/builtin-tools/)

# Built-in Tools

Built-in tools are native tools provided by LLM providers that can be used to enhance your agent's capabilities. Unlike [common tools](/docs/ai/tools-toolsets/common-tools), which are custom implementations that Pydantic AI executes, built-in tools are executed directly by the model provider.

Pydantic AI supports the following built-in tools:

- **[`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool)**: Allows agents to search the web
- **[`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool)**: Enables agents to execute code in a secure environment
- **[`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool)**: Enables agents to generate images
- **[`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool)**: Enables agents to fetch web pages
- **[`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool)**: Enables agents to use memory
- **[`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool)**: Enables agents to use remote MCP servers with communication handled by the model provider
- **[`FileSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.FileSearchTool)**: Enables agents to search through uploaded files using vector search (RAG)

These tools are passed to the agent via the `builtin_tools` parameter and are executed by the model provider's infrastructure.

Provider Support

Not all model providers support built-in tools. If you use a built-in tool with an unsupported provider, Pydantic AI will raise a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) when you try to run the agent. If a provider supports a built-in tool that is not currently supported by Pydantic AI, please file an issue.

Provider-adaptive capabilities

For a higher-level, model-agnostic approach, consider the [provider-adaptive tool capabilities](/docs/ai/core-concepts/capabilities#provider-adaptive-tools): [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration), and [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP). These automatically use the model's native tool when supported and fall back to a local implementation, so your agent works across providers without code changes.

## Dynamic Configuration

Sometimes you need to configure a built-in tool dynamically based on the [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g., user dependencies), or conditionally omit it.
You can achieve this by passing a function to `builtin_tools` that takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as an argument and returns an [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) or `None`. This is particularly useful for tools like [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool) where you might want to set the user's location based on the current request, or disable the tool if the user provides no location.

dynamic\_builtin\_tool.py

```python
from pydantic_ai import Agent, RunContext, WebSearchTool


async def prepared_web_search(ctx: RunContext[dict]) -> WebSearchTool | None:
    if not ctx.deps.get('location'):
        return None
    return WebSearchTool(
        user_location={'city': ctx.deps['location']},
    )


agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[prepared_web_search],
    deps_type=dict,
)

# Run with location
result = agent.run_sync(
    'What is the weather like?',
    deps={'location': 'London'},
)
print(result.output)
#> It's currently raining in London.

# Run without location (tool will be omitted)
result = agent.run_sync(
    'What is the capital of France?',
    deps={'location': None},
)
print(result.output)
#> The capital of France is Paris.
```

## Web Search Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool) allows your agent to search the web, making it ideal for queries that require up-to-date data.

### Provider Support

| Provider | Supported | Notes |
| --- | --- | --- |
| OpenAI Responses | ✅ | Full feature support. To include search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_web_search_sources`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_web_search_sources) [model setting](/docs/ai/core-concepts/agent#model-run-settings). |
| Anthropic | ✅ | Full feature support |
| Google | ✅ | No parameter support. No [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) or [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) is generated when streaming. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead. |
| xAI | ✅ | Supports `blocked_domains` and `allowed_domains` parameters. |
| Groq | ✅ | Limited parameter support. To use web search capabilities with Groq, you need to use the [compound models](https://console.groq.com/docs/compound). |
| OpenRouter | ✅ | Web search via [plugins](https://openrouter.ai/docs/features/web-search). Supports `search_context_size`. Uses native search for supported providers (OpenAI, Anthropic, Perplexity, xAI), Exa for others. |
| OpenAI Chat Completions | ❌ | Not supported |
| Bedrock | ❌ | Not supported |
| Mistral | ❌ | Not supported |
| Cohere | ❌ | Not supported |
| HuggingFace | ❌ | Not supported |
| Outlines | ❌ | Not supported |

### Usage

web\_search\_anthropic.py

```python
from pydantic_ai import Agent, WebSearchTool

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[WebSearchTool()])
result = agent.run_sync('Give me a sentence with the biggest news in AI this week.')
print(result.output)
#> Scientists have developed a universal AI detector that can identify deepfake videos.
```

_(This example is complete, it can be run "as is")_

With OpenAI, you must use their Responses API to access the web search tool.

web\_search\_openai.py

```python
from pydantic_ai import Agent, WebSearchTool

agent = Agent('openai-responses:gpt-5.2', builtin_tools=[WebSearchTool()])
result = agent.run_sync('Give me a sentence with the biggest news in AI this week.')
print(result.output)
#> Scientists have developed a universal AI detector that can identify deepfake videos.
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `WebSearchTool` supports several configuration parameters:

web\_search\_configured.py

```python
from pydantic_ai import Agent, WebSearchTool, WebSearchUserLocation

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        WebSearchTool(
            search_context_size='high',
            user_location=WebSearchUserLocation(
                city='San Francisco',
                country='US',
                region='CA',
                timezone='America/Los_Angeles',
            ),
            blocked_domains=['example.com', 'spam-site.net'],
            allowed_domains=None,  # Cannot use both blocked_domains and allowed_domains with Anthropic
            max_uses=5,  # Anthropic only: limit tool usage
        )
    ],
)
result = agent.run_sync('Use the web to get the current time.')
print(result.output)
#> In San Francisco, it's 8:21:41 pm PDT on Wednesday, August 6, 2025.
```

_(This example is complete, it can be run "as is")_

#### Provider Support

| Parameter | OpenAI | Anthropic | xAI | Groq | OpenRouter |
| --- | --- | --- | --- | --- | --- |
| `search_context_size` | ✅ | ❌ | ❌ | ❌ | ✅ |
| `user_location` | ✅ | ✅ | ❌ | ❌ | ❌ |
| `blocked_domains` | ❌ | ✅ | ✅ | ✅ | ❌ |
| `allowed_domains` | ✅ | ✅ | ✅ | ✅ | ❌ |
| `max_uses` | ❌ | ✅ | ❌ | ❌ | ❌ |

Anthropic Domain Filtering

With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both.

## Code Execution Tool

The [`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool) enables your agent to execute code in a secure environment, making it perfect for computational tasks, data analysis, and mathematical operations.

### Provider Support

| Provider | Supported | Notes |
| --- | --- | --- |
| OpenAI | ✅ | To include code execution output on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_code_execution_outputs`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_code_execution_outputs) [model setting](/docs/ai/core-concepts/agent#model-run-settings). If the code execution generated images, like charts, they will be available on [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) as [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage) objects. The generated image can also be used as [image output](/docs/ai/core-concepts/output#image-output) for the agent run. |
| Google | ✅ | Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead. |
| Anthropic | ✅ | |
| xAI | ✅ | Full feature support. |
| Groq | ❌ | |
| Bedrock | ✅ | Only available for Nova 2.0 models. |
| Mistral | ❌ | |
| Cohere | ❌ | |
| HuggingFace | ❌ | |
| Outlines | ❌ | |

### Usage

code_execution_basic.py

```python
from pydantic_ai import Agent, CodeExecutionTool

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[CodeExecutionTool()])
result = agent.run_sync('Calculate the factorial of 15.')
print(result.output)
#> The factorial of 15 is **1,307,674,368,000**.

print(result.response.builtin_tool_calls)
"""
[
    (
        BuiltinToolCallPart(
            tool_name='code_execution',
            args={
                'code': 'import math\n\n# Calculate factorial of 15\nresult = math.factorial(15)\nprint(f"15! = {result}")\n\n# Let\'s also show it in a more readable format with commas\nprint(f"15! = {result:,}")'
            },
            tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ',
            provider_name='anthropic',
        ),
        BuiltinToolReturnPart(
            tool_name='code_execution',
            content={
                'content': [],
                'return_code': 0,
                'stderr': '',
                'stdout': '15! = 1307674368000\n15! = 1,307,674,368,000',
                'type': 'code_execution_result',
            },
            tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ',
            timestamp=datetime.datetime(...),
            provider_name='anthropic',
        ),
    )
]
"""
```

_(This example is complete, it can be run "as is")_

In addition to text output, code execution with OpenAI can generate images as part of their response. Accessing this image via [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) or [image output](/docs/ai/core-concepts/output#image-output) requires the [`OpenAIResponsesModelSettings.openai_include_code_execution_outputs`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_code_execution_outputs) [model setting](/docs/ai/core-concepts/agent#model-run-settings) to be enabled.

code_execution_openai.py

```python
from pydantic_ai import Agent, BinaryImage, CodeExecutionTool
from pydantic_ai.models.openai import OpenAIResponsesModelSettings

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[CodeExecutionTool()],
    output_type=BinaryImage,
    model_settings=OpenAIResponsesModelSettings(openai_include_code_execution_outputs=True),
)
result = agent.run_sync('Generate a chart of y=x^2 for x=-5 to 5.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

## Image Generation Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) enables your agent to generate images.

### Provider Support

| Provider | Supported | Notes |
|---|---|---|
| OpenAI Responses | ✅ | Full feature support. Only supported by models newer than `gpt-5.2`. Metadata about the generated image, like the [`revised_prompt`](https://platform.openai.com/docs/guides/tools-image-generation#revised-prompt) sent to the underlying image model, is available on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls). |
| Google | ✅ | Limited parameter support. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-3-pro-image-preview`. These models do not support [function tools](/docs/ai/tools-toolsets/tools) and will always have the option of generating images, even if this built-in tool is not explicitly specified. |
| Anthropic | ❌ | |
| xAI | ❌ | |
| Groq | ❌ | |
| Bedrock | ❌ | |
| Mistral | ❌ | |
| Cohere | ❌ | |
| HuggingFace | ❌ | |

### Usage

Generated images are available on [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) as [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage) objects:

image_generation_openai.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent('openai-responses:gpt-5.2', builtin_tools=[ImageGenerationTool()])
result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.')
print(result.output)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore.
One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""
assert isinstance(result.response.images[0], BinaryImage)
```

_(This example is complete, it can be run "as is")_

Image generation with Google [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) does not require the `ImageGenerationTool` built-in tool to be explicitly specified:

image_generation_google.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('google-gla:gemini-3-pro-image-preview')
result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.')
print(result.output)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore.
One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""
assert isinstance(result.response.images[0], BinaryImage)
```

_(This example is complete, it can be run "as is")_

The `ImageGenerationTool` can be used together with `output_type=BinaryImage` to get [image output](/docs/ai/core-concepts/output#image-output).
If the `ImageGenerationTool` built-in tool is not explicitly specified, it will be enabled automatically:

image_generation_output.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage)
result = agent.run_sync('Generate an image of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `ImageGenerationTool` supports several configuration parameters:

image_generation_configured.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        ImageGenerationTool(
            background='transparent',
            input_fidelity='high',
            moderation='low',
            output_compression=100,
            output_format='png',
            partial_images=3,
            quality='high',
            size='1024x1024',
        )
    ],
    output_type=BinaryImage,
)
result = agent.run_sync('Generate an image of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

OpenAI Responses models also respect the `aspect_ratio` parameter. Because the OpenAI API only exposes discrete image sizes, Pydantic AI maps `'1:1'` -> `1024x1024`, `'2:3'` -> `1024x1536`, and `'3:2'` -> `1536x1024`. Providing any other aspect ratio results in an error, and if you also set `size` it must match the computed value.

To control the aspect ratio when using Gemini image models, include the `ImageGenerationTool` explicitly:

image_generation_google_aspect_ratio.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'google-gla:gemini-3-pro-image-preview',
    builtin_tools=[ImageGenerationTool(aspect_ratio='16:9')],
    output_type=BinaryImage,
)
result = agent.run_sync('Generate a wide illustration of an axolotl city skyline.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

To control the image resolution with Google image generation models (starting with Gemini 3 Pro Image), use the `size` parameter:

image_generation_google_resolution.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'google-gla:gemini-3-pro-image-preview',
    builtin_tools=[ImageGenerationTool(aspect_ratio='16:9', size='4K')],
    output_type=BinaryImage,
)
result = agent.run_sync('Generate a high-resolution wide landscape illustration of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

For more details, check the [API documentation](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool).

#### Provider Support

| Parameter | OpenAI | Google |
|---|---|---|
| `background` | ✅ | ❌ |
| `input_fidelity` | ✅ | ❌ |
| `moderation` | ✅ | ❌ |
| `output_compression` | ✅ (100 (default), jpeg or webp only) | ✅ (75 (default), jpeg only, Vertex AI only) |
| `output_format` | ✅ | ✅ (Vertex AI only) |
| `partial_images` | ✅ | ❌ |
| `quality` | ✅ | ❌ |
| `size` | ✅ (auto (default), 1024x1024, 1024x1536, 1536x1024) | ✅ (512, 1K (default), 2K, 4K) |
| `aspect_ratio` | ✅ (1:1, 2:3, 3:2) | ✅ (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9) |

Notes

- **OpenAI**: `auto` lets the model select the value.
- **Google (Vertex AI)**: Setting `output_compression` will default `output_format` to `jpeg` if not specified.

## Web Fetch Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).
The [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool) enables your agent to fetch URL contents into its context, letting it use up-to-date information from the web.

### Provider Support

| Provider | Supported | Notes |
|---|---|---|
| Anthropic | ✅ | Full feature support. Uses Anthropic's [Web Fetch Tool](https://docs.claude.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) internally to retrieve URL contents. |
| Google | ✅ | No parameter support. The limits are fixed at 20 URLs per request with a maximum of 34MB per URL. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead. |
| xAI | ❌ | Web browsing is implemented as part of [`WebSearchTool`](#web-search-tool) with xAI. |
| OpenAI | ❌ | |
| Groq | ❌ | |
| Bedrock | ❌ | |
| Mistral | ❌ | |
| Cohere | ❌ | |
| HuggingFace | ❌ | |
| Outlines | ❌ | |

### Usage

web_fetch_basic.py

```python
from pydantic_ai import Agent, WebFetchTool

agent = Agent('google-gla:gemini-3-flash-preview', builtin_tools=[WebFetchTool()])
result = agent.run_sync('What is this? https://ai.pydantic.dev')
print(result.output)
#> A Python agent framework for building Generative AI applications.
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `WebFetchTool` supports several configuration parameters:

web_fetch_configured.py

```python
from pydantic_ai import Agent, WebFetchTool

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        WebFetchTool(
            allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'],
            max_uses=10,
            enable_citations=True,
            max_content_tokens=50000,
        )
    ],
)
result = agent.run_sync(
    'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev'
)
print(result.output)
"""
Both sites provide comprehensive documentation for Pydantic projects.
ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents,
while docs.pydantic.dev covers Pydantic, the data validation library.
They share similar documentation styles and both emphasize type safety and developer experience.
"""
```

_(This example is complete, it can be run "as is")_

#### Provider Support

| Parameter | Anthropic | Google |
|---|---|---|
| `max_uses` | ✅ | ❌ |
| `allowed_domains` | ✅ | ❌ |
| `blocked_domains` | ✅ | ❌ |
| `enable_citations` | ✅ | ❌ |
| `max_content_tokens` | ✅ | ❌ |

Anthropic Domain Filtering

With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both.

## Memory Tool

The [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) enables your agent to use memory.

### Provider Support

| Provider | Supported | Notes |
|---|---|---|
| Anthropic | ✅ | Requires a tool named `memory` to be defined that implements [specific sub-commands](https://docs.claude.com/en/docs/agents-and-tools/tool-use/memory-tool#tool-commands). You can use a subclass of [`anthropic.lib.tools.BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) as documented below. |
| Google | ❌ | |
| OpenAI | ❌ | |
| Groq | ❌ | |
| Bedrock | ❌ | |
| Mistral | ❌ | |
| Cohere | ❌ | |
| HuggingFace | ❌ | |

### Usage

The Anthropic SDK provides an abstract [`BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) class that you can subclass to create your own memory storage solution (e.g., database, cloud storage, encrypted files, etc.).
Their [`LocalFilesystemMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py) example can serve as a starting point. The following example uses a subclass that hard-codes a specific memory. The bits specific to Pydantic AI are the `MemoryTool` built-in tool and the `memory` tool definition that forwards commands to the `call` method of the `BetaAbstractMemoryTool` subclass.

anthropic_memory.py

```python
from typing import Any

from anthropic.lib.tools import BetaAbstractMemoryTool
from anthropic.types.beta import (
    BetaMemoryTool20250818CreateCommand,
    BetaMemoryTool20250818DeleteCommand,
    BetaMemoryTool20250818InsertCommand,
    BetaMemoryTool20250818RenameCommand,
    BetaMemoryTool20250818StrReplaceCommand,
    BetaMemoryTool20250818ViewCommand,
)

from pydantic_ai import Agent, MemoryTool


class FakeMemoryTool(BetaAbstractMemoryTool):
    def view(self, command: BetaMemoryTool20250818ViewCommand) -> str:
        return 'The user lives in Mexico City.'

    def create(self, command: BetaMemoryTool20250818CreateCommand) -> str:
        return f'File created successfully at {command.path}'

    def str_replace(self, command: BetaMemoryTool20250818StrReplaceCommand) -> str:
        return f'File {command.path} has been edited'

    def insert(self, command: BetaMemoryTool20250818InsertCommand) -> str:
        return f'Text inserted at line {command.insert_line} in {command.path}'

    def delete(self, command: BetaMemoryTool20250818DeleteCommand) -> str:
        return f'File deleted: {command.path}'

    def rename(self, command: BetaMemoryTool20250818RenameCommand) -> str:
        return f'Renamed {command.old_path} to {command.new_path}'

    def clear_all_memory(self) -> str:
        return 'All memory cleared'


fake_memory = FakeMemoryTool()

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[MemoryTool()])


@agent.tool_plain
def memory(**command: Any) -> Any:
    return fake_memory.call(command)


result = agent.run_sync('Remember that I live in Mexico City')
print(result.output)
"""
Got it! I've recorded that you live in Mexico City. I'll remember this for future reference.
"""

result = agent.run_sync('Where do I live?')
print(result.output)
#> You live in Mexico City.
```

_(This example is complete, it can be run "as is")_

## MCP Server Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool) allows your agent to use remote MCP servers with communication handled by the model provider. This requires the MCP server to live at a public URL the provider can reach and does not support many of the advanced features of Pydantic AI's agent-side [MCP support](/docs/ai/mcp/client), but can result in optimized context use and caching, and faster performance due to the lack of a round-trip back to Pydantic AI.

### Provider Support

| Provider | Supported | Notes |
|---|---|---|
| OpenAI Responses | ✅ | Full feature support. [Connectors](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) can be used by specifying a special `x-openai-connector:` URL. |
| Anthropic | ✅ | Full feature support |
| xAI | ✅ | Full feature support |
| Google | ❌ | Not supported |
| Groq | ❌ | Not supported |
| OpenAI Chat Completions | ❌ | Not supported |
| Bedrock | ❌ | Not supported |
| Mistral | ❌ | Not supported |
| Cohere | ❌ | Not supported |
| HuggingFace | ❌ | Not supported |

### Usage

mcp_server_anthropic.py

```python
from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        MCPServerTool(
            id='deepwiki',
            url='https://mcp.deepwiki.com/mcp',  # (1)
        )
    ],
)
result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization.

_(This example is complete, it can be run "as is")_

With OpenAI, you must use their Responses API to access the MCP server tool:

mcp_server_openai.py

```python
from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='deepwiki',
            url='https://mcp.deepwiki.com/mcp',  # (1)
        )
    ],
)
result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization.

_(This example is complete, it can be run "as is")_

### Configuration Options

The `MCPServerTool` supports several configuration parameters for custom MCP servers:

mcp_server_configured_url.py

```python
import os

from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='github',
            url='https://api.githubcopilot.com/mcp/',
            authorization_token=os.getenv('GITHUB_ACCESS_TOKEN', 'mock-access-token'),  # (1)
            allowed_tools=['search_repositories', 'list_commits'],
            description='GitHub MCP server',
            headers={'X-Custom-Header': 'custom-value'},
        )
    ],
)
result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1. The [GitHub MCP server](https://github.com/github/github-mcp-server) requires an authorization token.

_(This example is complete, it can be run "as is")_

For OpenAI Responses, you can use a [connector](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) by specifying a special `x-openai-connector:` URL:

mcp_server_configured_connector_id.py

```python
import os

from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='google-calendar',
            url='x-openai-connector:connector_googlecalendar',
            authorization_token=os.getenv('GOOGLE_API_KEY', 'mock-api-key'),  # (1)
        )
    ],
)
result = agent.run_sync('What do I have on my calendar today?')
print(result.output)
#> You're going to spend all day playing with Pydantic AI.
```

1. OpenAI's Google Calendar connector requires an [authorization token](https://platform.openai.com/docs/guides/tools-connectors-mcp#authorizing-a-connector).
_(This example is complete, it can be run "as is")_

#### Provider Support

| Parameter | OpenAI | Anthropic | xAI |
|---|---|---|---|
| `authorization_token` | ✅ | ✅ | ✅ |
| `allowed_tools` | ✅ | ✅ | ✅ |
| `description` | ✅ | ❌ | ✅ |
| `headers` | ✅ | ❌ | ✅ |

## File Search Tool

The [`FileSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.FileSearchTool) enables your agent to search through uploaded files using vector search, providing a fully managed Retrieval-Augmented Generation (RAG) system. This tool handles file storage, chunking, embedding generation, and context injection into prompts.

### Provider Support

| Provider | Supported | Notes |
|---|---|---|
| OpenAI Responses | ✅ | Full feature support. Requires files to be uploaded to vector stores via the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files). To include search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_file_search_results`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_file_search_results) [model setting](/docs/ai/core-concepts/agent#model-run-settings). |
| Google (Gemini) | ✅ | Requires files to be uploaded via the [Gemini Files API](https://ai.google.dev/gemini-api/docs/files). Files are automatically deleted after 48 hours. Supports up to 2 GB per file and 20 GB per project. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead. |
| Google (Vertex AI) | ❌ | |
| Anthropic | ❌ | Not supported |
| Groq | ❌ | Not supported |
| OpenAI Chat Completions | ❌ | Not supported |
| Bedrock | ❌ | Not supported |
| Mistral | ❌ | Not supported |
| Cohere | ❌ | Not supported |
| HuggingFace | ❌ | Not supported |
| Outlines | ❌ | Not supported |

### Usage

#### OpenAI Responses

With OpenAI, you need to first [upload files to a vector store](https://platform.openai.com/docs/assistants/tools/file-search), then reference the vector store IDs when using the `FileSearchTool`.

file_search_openai_upload.py

```py
import asyncio

from pydantic_ai import Agent, FileSearchTool
from pydantic_ai.models.openai import OpenAIResponsesModel


async def main():
    model = OpenAIResponsesModel('gpt-5.2')

    with open('my_document.txt', 'rb') as f:
        file = await model.client.files.create(file=f, purpose='assistants')

    vector_store = await model.client.vector_stores.create(name='my-docs')
    await model.client.vector_stores.files.create(
        vector_store_id=vector_store.id, file_id=file.id
    )

    agent = Agent(
        model, builtin_tools=[FileSearchTool(file_store_ids=[vector_store.id])]
    )

    result = await agent.run('What information is in my documents about pydantic?')
    print(result.output)
    #> Based on your documents, Pydantic is a data validation library for Python...


asyncio.run(main())
```

#### Google (Gemini)

With Gemini, you need to first [create a file search store via the Files API](https://ai.google.dev/gemini-api/docs/files), then reference the file search store names.
file_search_google_upload.py

```py
import asyncio

from pydantic_ai import Agent, FileSearchTool
from pydantic_ai.models.google import GoogleModel


async def main():
    model = GoogleModel('gemini-3-flash-preview')

    store = await model.client.aio.file_search_stores.create(
        config={'display_name': 'my-docs'}
    )

    with open('my_document.txt', 'rb') as f:
        await model.client.aio.file_search_stores.upload_to_file_search_store(
            file_search_store_name=store.name,
            file=f,
            config={'mime_type': 'text/plain'},
        )

    agent = Agent(
        model, builtin_tools=[FileSearchTool(file_store_ids=[store.name])]
    )

    result = await agent.run('Summarize the key points from my uploaded documents.')
    print(result.output)
    #> The documents discuss the following key points: ...


asyncio.run(main())
```

## API Reference

For complete API documentation, see the [API Reference](/docs/ai/api/pydantic-ai/builtin_tools).

---

# [Common Tools](https://pydantic.dev/docs/ai/tools-toolsets/common-tools/)

# Common Tools

Pydantic AI ships with native tools that can be used to enhance your agent's capabilities.

## DuckDuckGo Search Tool

The DuckDuckGo search tool allows you to search the web for information. It is built on top of the [DuckDuckGo API](https://github.com/deedy5/ddgs).

### Installation

To use [`duckduckgo_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.duckduckgo.duckduckgo_search_tool), you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `duckduckgo` optional group:

pip:

```bash
pip install "pydantic-ai-slim[duckduckgo]"
```

uv:

```bash
uv add "pydantic-ai-slim[duckduckgo]"
```

### Usage

Here's an example of how you can use the DuckDuckGo search tool with an agent:

duckduckgo_search.py

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool

agent = Agent(
    'openai:gpt-5.2',
    tools=[duckduckgo_search_tool()],
    instructions='Search DuckDuckGo for the given query and return the results.',
)

result = agent.run_sync(
    'Can you list the top five highest-grossing animated films of 2025?'
)
print(result.output)
"""
I looked into several sources on animated box-office performance in 2025, and while detailed rankings can shift as more money is tallied, multiple independent reports have already highlighted a couple of record-breaking shows. For example:

• Ne Zha 2 - News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others) have reported that this Chinese title not only became the highest-grossing animated film of 2025 but also broke records as the highest-grossing non-English animated film ever. One article noted its run exceeded US$1.7 billion.

• Inside Out 2 - According to data shared on Statista and in industry news, this Pixar sequel has been on pace to set new records (with some sources even noting it as the highest-grossing animated film ever, as of January 2025).

Beyond those two, some entertainment trade sites (for example, a Just Jared article titled "Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader top-10. Although full consolidated figures can sometimes differ by source and are updated daily during a box-office run, many of the industry trackers have begun to single out five films as the biggest earners so far in 2025.
Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a single, universally accepted list with final numbers that names the complete top five. (Box-office rankings, especially mid-year, can be fluid as films continue to add to their totals.)

Based on what several sources note so far, the two undisputed leaders are:

1. Ne Zha 2
2. Inside Out 2

The remaining top spots (3-5) are reported by some outlets in their "Top-10 Animated Films" lists for 2025 but the titles and order can vary depending on the source and the exact cut-off date of the data.

For the most up-to-date and detailed ranking (including the 3rd, 4th, and 5th highest-grossing films), I recommend checking resources like:

• Wikipedia's "List of animated feature films of 2025" page
• Box-office tracking sites (such as Box Office Mojo or The Numbers)
• Trade articles like the one on Just Jared

To summarize with what is clear from the current reporting:

1. Ne Zha 2
2. Inside Out 2
3-5. Other animated films (yet to be definitively finalized across all reporting outlets)

If you're looking for a final, consensus list of the top five, it may be best to wait until the 2025 year-end box-office tallies are in or to consult a regularly updated entertainment industry source.

Would you like help finding a current source or additional details on where to look for the complete updated list?
"""
```

## Web Fetch Tool

The web fetch tool allows your agent to fetch the content of web pages and convert them to markdown. It includes protection against [server-side request forgery (SSRF)](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery) attacks.

### Installation

To use `web_fetch_tool`, you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `web-fetch` optional group:

pip:

```bash
pip install "pydantic-ai-slim[web-fetch]"
```

uv:

```bash
uv add "pydantic-ai-slim[web-fetch]"
```

### Usage

Here's an example of how you can use the web fetch tool with an agent:

web_fetch.py

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.web_fetch import web_fetch_tool

agent = Agent(
    'openai:gpt-5.2',
    tools=[web_fetch_tool()],
    instructions='Fetch web pages and summarize their content.',
)

result = agent.run_sync('What is on https://ai.pydantic.dev?')
print(result.output)
```

Automatic fallback via WebFetch capability

You don't need to use `web_fetch_tool` directly -- the [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) capability automatically uses it as a local fallback when the model doesn't support builtin URL fetching.

## Tavily Search Tool

Note

Tavily is a paid service, but they have free credits to explore their product. You need to [sign up for an account](https://app.tavily.com/home) and get an API key to use the Tavily search tool.

The Tavily search tool allows you to search the web for information. It is built on top of the [Tavily API](https://tavily.com/).
### Installation

To use [`tavily_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.tavily.tavily_search_tool), you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `tavily` optional group:

pip:

```bash
pip install "pydantic-ai-slim[tavily]"
```

uv:

```bash
uv add "pydantic-ai-slim[tavily]"
```

### Usage

Here's an example of how you can use the Tavily search tool with an agent:

tavily_search.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.tavily import tavily_search_tool

api_key = os.getenv('TAVILY_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[tavily_search_tool(api_key)],
    instructions='Search Tavily for the given query and return the results.',
)

result = agent.run_sync('Tell me the top news in the GenAI world, give me links.')
print(result.output)
"""
Here are some of the top recent news articles related to GenAI:

1. How CLEAR users can improve risk analysis with GenAI - Thomson Reuters
   Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/
   (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.)

2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think - FT.com
   Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1
   (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.)

3. The Essential Guide to Generative AI - Virtualization Review
   Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx
   (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.)

Feel free to click on the links to dive deeper into each story!
"""
```

### Configuring Parameters

The `tavily_search_tool` factory accepts optional parameters that control search behavior. `max_results` is always developer-controlled and never appears in the LLM tool schema. Other parameters, when provided, are fixed for all searches and hidden from the LLM's tool schema. Parameters left unset remain available for the LLM to set per-call.

For example, you can lock in `max_results` and `include_domains` at tool creation time while still letting the LLM control `exclude_domains`:

tavily_domain_filtering.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.tavily import tavily_search_tool

api_key = os.getenv('TAVILY_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[tavily_search_tool(api_key, max_results=5, include_domains=['arxiv.org'])],
    instructions='Search for information and return the results.',
)

result = agent.run_sync(
    'Find recent papers about transformer architectures'
)
print(result.output)
"""
Here are some recent papers about transformer architectures from arxiv.org:

1. "Attention Is All You Need" - The foundational paper on the Transformer model.
2. "FlashAttention: Fast and Memory-Efficient Exact Attention" - Proposes an IO-aware attention algorithm.
"""
```

## Exa Search Tool

Note

Exa is a paid service with free credits to explore their product.
You need to [sign up for an account](https://dashboard.exa.ai) and get an API key to use the Exa tools.

Exa is a neural search engine that finds high-quality, relevant results across billions of web pages. It provides several tools including web search, finding similar pages, content retrieval, and AI-powered answers.

### Installation

To use Exa tools, you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `exa` optional group:

pip:

```bash
pip install "pydantic-ai-slim[exa]"
```

uv:

```bash
uv add "pydantic-ai-slim[exa]"
```

### Usage

You can use Exa tools individually or as a toolset. The following tools are available:

- [`exa_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_search_tool): Search the web with various search types (auto, keyword, neural, fast, deep)
- [`exa_find_similar_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_find_similar_tool): Find pages similar to a given URL
- [`exa_get_contents_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_get_contents_tool): Get full text content from URLs
- [`exa_answer_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_answer_tool): Get AI-powered answers with citations

#### Using Individual Tools

exa_search.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import exa_search_tool

api_key = os.getenv('EXA_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[exa_search_tool(api_key, num_results=5, max_characters=1000)],
    system_prompt='Search the web for information using Exa.',
)

result = agent.run_sync('What are the latest developments in quantum computing?')
print(result.output)
```

#### Using ExaToolset

For better efficiency when using multiple Exa tools, use [`ExaToolset`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.ExaToolset) which shares a single API client across all tools.
You can configure which tools to include:

exa_toolset.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import ExaToolset

api_key = os.getenv('EXA_API_KEY')
assert api_key is not None

toolset = ExaToolset(
    api_key,
    num_results=5,
    max_characters=1000,  # Limit text content to control token usage
    include_search=True,  # Include the search tool (default: True)
    include_find_similar=True,  # Include the find_similar tool (default: True)
    include_get_contents=False,  # Exclude the get_contents tool
    include_answer=True,  # Include the answer tool (default: True)
)

agent = Agent(
    'openai:gpt-5.2',
    toolsets=[toolset],
    system_prompt='You have access to Exa search tools to find information on the web.',
)

result = agent.run_sync('Find recent AI research papers and summarize the key findings.')
print(result.output)
```

---

# [Deferred Tools](https://pydantic.dev/docs/ai/tools-toolsets/deferred-tools/)

# Deferred Tools

There are a few scenarios where the model should be able to call a tool that should not or cannot be executed during the same agent run inside the same Python process:

- it may need to be approved by the user first
- it may depend on an upstream service, frontend, or user to provide the result
- the result could take longer to generate than it's reasonable to keep the agent process running

To support these use cases, Pydantic AI provides the concept of deferred tools, which come in two flavors documented below:

- tools that [require approval](#human-in-the-loop-tool-approval)
- tools that are [executed externally](#external-tool-execution)

When the model calls a deferred tool, the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object containing information about the deferred tool calls. Once the approvals and/or results are ready, a new agent run can then be started with the original run's [message history](/docs/ai/core-concepts/message-history) plus a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object holding results for each tool call in `DeferredToolRequests`, which will continue the original run where it left off.

Note that handling deferred tool calls requires `DeferredToolRequests` to be in the `Agent`'s [`output_type`](/docs/ai/core-concepts/output#structured-output) so that the possible types of the agent run output are correctly inferred. If your agent can also be used in a context where no deferred tools are available and you don't want to deal with that type everywhere you use the agent, you can instead pass the `output_type` argument when you run the agent using [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). Note that the run-time `output_type` overrides the one specified at construction time (for type inference reasons), so you'll need to include the original output type explicitly.
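As a minimal sketch of that pattern (the prompt and the `str` output type here are illustrative, not taken from the examples below), the agent is declared with a plain output type and an individual run opts into deferred tool handling:

```python
from pydantic_ai import Agent, DeferredToolRequests

# A minimal sketch: the agent itself is declared without DeferredToolRequests,
# so code paths that never use deferred tools can treat the output as a plain str.
agent = Agent('openai:gpt-5.2', output_type=str)

# A run that may hit a deferred tool opts in explicitly. The run-time output_type
# replaces (rather than extends) the construction-time one, so the original `str`
# is listed again alongside DeferredToolRequests.
result = agent.run_sync(
    'A prompt that may trigger a deferred tool',  # illustrative prompt
    output_type=[str, DeferredToolRequests],
)
if isinstance(result.output, DeferredToolRequests):
    ...  # gather approvals/results, then start a follow-up run
```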
## Human-in-the-Loop Tool Approval

If a tool function always requires approval, you can pass the `requires_approval=True` argument to the [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator, [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator, [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) class, [`FunctionToolset.tool`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool) decorator, or [`FunctionToolset.add_function()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_function) method. Inside the function, you can then assume that the tool call has been approved.

If whether a tool function requires approval depends on the tool call arguments or the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g. [dependencies](/docs/ai/core-concepts/dependencies) or message history), you can raise the [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired) exception from the tool function. The [`RunContext.tool_call_approved`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_approved) property will be `True` if the tool call has already been approved.

To require approval for calls to tools provided by a [toolset](/docs/ai/tools-toolsets/toolsets) (like an [MCP server](/docs/ai/mcp/client)), see the [`ApprovalRequiredToolset` documentation](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval).

When the model calls a tool that requires approval, the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with an `approvals` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID.

Once you've gathered the user's approvals or denials, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with an `approvals` dictionary that maps each tool call ID to a boolean, a [`ToolApproved`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolApproved) object (with optional `override_args`), or a [`ToolDenied`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDenied) object (with an optional custom `message` to provide to the model). You can also provide a `metadata` dictionary on `DeferredToolResults` that maps each tool call ID to a dictionary of metadata that will be available in the tool's [`RunContext.tool_call_metadata`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_metadata) attribute. This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history).
Here's an example that shows how to require approval for all file deletions, and for updates of specific protected files:

tool_requires_approval.py

```python
from pydantic_ai import (
    Agent,
    ApprovalRequired,
    DeferredToolRequests,
    DeferredToolResults,
    RunContext,
    ToolDenied,
)

agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests])

PROTECTED_FILES = {'.env'}


@agent.tool
def update_file(ctx: RunContext, path: str, content: str) -> str:
    if path in PROTECTED_FILES and not ctx.tool_call_approved:
        raise ApprovalRequired(metadata={'reason': 'protected'})  # (1)
    return f'File {path!r} updated: {content!r}'


@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
    return f'File {path!r} deleted'


result = agent.run_sync('Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`')
messages = result.all_messages()

assert isinstance(result.output, DeferredToolRequests)
requests = result.output
print(requests)
"""
DeferredToolRequests(
    calls=[],
    approvals=[
        ToolCallPart(
            tool_name='update_file',
            args={'path': '.env', 'content': ''},
            tool_call_id='update_file_dotenv',
        ),
        ToolCallPart(
            tool_name='delete_file',
            args={'path': '__init__.py'},
            tool_call_id='delete_file',
        ),
    ],
    metadata={'update_file_dotenv': {'reason': 'protected'}},
)
"""

results = DeferredToolResults()
for call in requests.approvals:
    result = False
    if call.tool_name == 'update_file':
        # Approve all updates
        result = True
    elif call.tool_name == 'delete_file':
        # Deny all deletes
        result = ToolDenied('Deleting files is not allowed')

    results.approvals[call.tool_call_id] = result

result = agent.run_sync(
    'Now create a backup of README.md',  # (2)
    message_history=messages,
    deferred_tool_results=results,
)
print(result.output)
"""
Here's what I've done:
- Attempted to delete __init__.py, but deletion is not allowed.
- Updated README.md with: Hello, world!
- Cleared .env (set to empty).
- Created a backup at README.md.bak containing: Hello, world!

If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know.
""" print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ToolCallPart( tool_name='update_file', args={'path': 'README.md', 'content': 'Hello, world!'}, tool_call_id='update_file_readme', ), ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ], usage=RequestUsage(input_tokens=63, output_tokens=21), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md' updated: 'Hello, world!'", tool_call_id='update_file_readme', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File '.env' updated: ''", tool_call_id='update_file_dotenv', timestamp=datetime.datetime(...), ), ToolReturnPart( tool_name='delete_file', content='Deleting files is not allowed', tool_call_id='delete_file', timestamp=datetime.datetime(...), outcome='denied', ), UserPromptPart( content='Now create a backup of README.md', timestamp=datetime.datetime(...), ), ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='update_file', args={'path': 'README.md.bak', 'content': 'Hello, world!'}, tool_call_id='update_file_backup', ) ], usage=RequestUsage(input_tokens=86, output_tokens=31), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md.bak' updated: 'Hello, world!'", tool_call_id='update_file_backup', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ TextPart( content="Here's what I've done: - Attempted to delete __init__.py, but deletion is not allowed. - Updated README.md with: Hello, world! - Cleared .env (set to empty). - Created a backup at README.md.bak containing: Hello, world! If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know." ) ], usage=RequestUsage(input_tokens=93, output_tokens=89), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` The optional `metadata` parameter can attach arbitrary context to deferred tool calls, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`. This second agent run continues from where the first run left off, providing the tool approval results and optionally a new `user_prompt` to give the model additional instructions alongside the deferred results. _(This example is complete, it can be run "as is")_ ## External Tool Execution When the result of a tool call cannot be generated inside the same agent run in which it was called, the tool is considered to be external. Examples of external tools are client-side tools implemented by a web or app frontend, and slow tasks that are passed off to a background worker or external service instead of keeping the agent process running. If whether a tool call should be executed externally depends on the tool call arguments, the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g. 
[dependencies](/docs/ai/core-concepts/dependencies) or message history), or how long the task is expected to take, you can define a tool function and conditionally raise the [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) exception. Before raising the exception, the tool function would typically schedule some background task and pass along the [`RunContext.tool_call_id`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_id) so that the result can be matched to the deferred tool call later.

If a tool is always executed externally and its definition is provided to your code along with a JSON schema for its arguments, you can use an [`ExternalToolset`](/docs/ai/tools-toolsets/toolsets#external-toolset). If the external tools are known up front and you don't have the arguments JSON schema handy, you can also define a tool function with the appropriate signature that does nothing but raise the [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) exception.

When the model calls an external tool, the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with a `calls` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID.

Once the tool call results are ready, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object, or a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception in case the tool call failed and the model should [try again](/docs/ai/tools-toolsets/tools-advanced#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history).
Here's an example that shows how to move a task that takes a while to complete to the background and return the result to the model once the task is complete:

external_tool.py

```python
import asyncio
from dataclasses import dataclass
from typing import Any

from pydantic_ai import (
    Agent,
    CallDeferred,
    DeferredToolRequests,
    DeferredToolResults,
    ModelRetry,
    RunContext,
)


@dataclass
class TaskResult:
    task_id: str
    result: Any


async def calculate_answer_task(task_id: str, question: str) -> TaskResult:
    await asyncio.sleep(1)
    return TaskResult(task_id=task_id, result=42)


agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests])

tasks: list[asyncio.Task[TaskResult]] = []


@agent.tool
async def calculate_answer(ctx: RunContext, question: str) -> str:
    task_id = f'task_{len(tasks)}'  # (1)
    task = asyncio.create_task(calculate_answer_task(task_id, question))
    tasks.append(task)
    raise CallDeferred(metadata={'task_id': task_id})  # (2)


async def main():
    result = await agent.run('Calculate the answer to the ultimate question of life, the universe, and everything')
    messages = result.all_messages()

    assert isinstance(result.output, DeferredToolRequests)
    requests = result.output
    print(requests)
    """
    DeferredToolRequests(
        calls=[
            ToolCallPart(
                tool_name='calculate_answer',
                args={
                    'question': 'the ultimate question of life, the universe, and everything'
                },
                tool_call_id='pyd_ai_tool_call_id',
            )
        ],
        approvals=[],
        metadata={'pyd_ai_tool_call_id': {'task_id': 'task_0'}},
    )
    """

    done, _ = await asyncio.wait(tasks)  # (3)
    task_results = [task.result() for task in done]
    task_results_by_task_id = {result.task_id: result.result for result in task_results}

    results = DeferredToolResults()
    for call in requests.calls:
        try:
            task_id = requests.metadata[call.tool_call_id]['task_id']
            result = task_results_by_task_id[task_id]
        except KeyError:
            result = ModelRetry('No result for this tool call was found.')

        results.calls[call.tool_call_id] = result

    result = await agent.run(message_history=messages, deferred_tool_results=results)
    print(result.output)
    #> The answer to the ultimate question of life, the universe, and everything is 42.

    print(result.all_messages())
    """
    [
        ModelRequest(
            parts=[
                UserPromptPart(
                    content='Calculate the answer to the ultimate question of life, the universe, and everything',
                    timestamp=datetime.datetime(...),
                )
            ],
            timestamp=datetime.datetime(...),
            run_id='...',
        ),
        ModelResponse(
            parts=[
                ToolCallPart(
                    tool_name='calculate_answer',
                    args={
                        'question': 'the ultimate question of life, the universe, and everything'
                    },
                    tool_call_id='pyd_ai_tool_call_id',
                )
            ],
            usage=RequestUsage(input_tokens=63, output_tokens=13),
            model_name='gpt-5.2',
            timestamp=datetime.datetime(...),
            run_id='...',
        ),
        ModelRequest(
            parts=[
                ToolReturnPart(
                    tool_name='calculate_answer',
                    content=42,
                    tool_call_id='pyd_ai_tool_call_id',
                    timestamp=datetime.datetime(...),
                )
            ],
            timestamp=datetime.datetime(...),
            run_id='...',
        ),
        ModelResponse(
            parts=[
                TextPart(
                    content='The answer to the ultimate question of life, the universe, and everything is 42.'
                )
            ],
            usage=RequestUsage(input_tokens=64, output_tokens=28),
            model_name='gpt-5.2',
            timestamp=datetime.datetime(...),
            run_id='...',
        ),
    ]
    """
```

1. Generate a task ID that can be tracked independently of the tool call ID.
2. The optional `metadata` parameter passes the `task_id` so it can be matched with results later, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`.
3. In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## See Also

- [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration
- [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced) - Custom schemas, dynamic tools, and execution details
- [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools, including `ExternalToolset` for external tools
- [Message History](/docs/ai/core-concepts/message-history) - Understanding how to work with message history for deferred tools

---

# [Third-Party Tools](https://pydantic.dev/docs/ai/tools-toolsets/third-party-tools/)

# Third-Party Tools

Pydantic AI supports integration with various third-party tool libraries, allowing you to leverage existing tool ecosystems in your agents. Third-party tools are also available as [capabilities](/docs/ai/core-concepts/capabilities#third-party-capabilities) -- see [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem.

## MCP Tools

See the [MCP Client](/docs/ai/mcp/client) documentation for how to use MCP servers with Pydantic AI as [toolsets](/docs/ai/tools-toolsets/toolsets).

## LangChain Tools

If you'd like to use a tool from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the [`tool_from_langchain`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.tool_from_langchain) convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid.

You will need to install the `langchain-community` package and any others required by the tool in question.

Here is how you can use the LangChain `DuckDuckGoSearchRun` tool, which requires the `ddgs` package:

```python
from langchain_community.tools import DuckDuckGoSearchRun

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import tool_from_langchain

search = DuckDuckGoSearchRun()
search_tool = tool_from_langchain(search)

agent = Agent(
    'google-gla:gemini-3-flash-preview',
    tools=[search_tool],
)

result = agent.run_sync('What is the release date of Elden Ring Nightreign?')  # (1)
print(result.output)
#> Elden Ring Nightreign is planned to be released on May 30, 2025.
```

1. The release date of this game is the 30th of May 2025, which is after the model's training data cutoff.

If you'd like to use multiple LangChain tools or a LangChain [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits), you can use the [`LangChainToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.LangChainToolset) [toolset](/docs/ai/tools-toolsets/toolsets) which takes a list of LangChain tools:

```python
from langchain_community.agent_toolkits import SlackToolkit

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import LangChainToolset

toolkit = SlackToolkit()
toolset = LangChainToolset(toolkit.get_tools())

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
# ...
```

## ACI.dev Tools

If you'd like to use a tool from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the [`tool_from_aci`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.tool_from_aci) convenience method.
Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid.

You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function.

Here is how you can use the ACI.dev `TAVILY__SEARCH` tool:

```python
import os

from pydantic_ai import Agent
from pydantic_ai.ext.aci import tool_from_aci

tavily_search = tool_from_aci(
    'TAVILY__SEARCH',
    linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'),
)

agent = Agent(
    'google-gla:gemini-3-flash-preview',
    tools=[tavily_search],
)

result = agent.run_sync('What is the release date of Elden Ring Nightreign?')  # (1)
print(result.output)
#> Elden Ring Nightreign is planned to be released on May 30, 2025.
```

1. The release date of this game is the 30th of May 2025, which is after the model's training data cutoff.

If you'd like to use multiple ACI.dev tools, you can use the [`ACIToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.ACIToolset) [toolset](/docs/ai/tools-toolsets/toolsets) which takes a list of ACI tool names as well as the `linked_account_owner_id`:

```python
import os

from pydantic_ai import Agent
from pydantic_ai.ext.aci import ACIToolset

toolset = ACIToolset(
    [
        'OPEN_WEATHER_MAP__CURRENT_WEATHER',
        'OPEN_WEATHER_MAP__FORECAST',
    ],
    linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'),
)

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```

## See Also

- [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration
- [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools
- [MCP Client](/docs/ai/mcp/client) - Using MCP servers with Pydantic AI
- [LangChain Toolsets](/docs/ai/tools-toolsets/toolsets#langchain-tools) - Using LangChain toolsets
- [ACI.dev Toolsets](/docs/ai/tools-toolsets/toolsets#aci-tools) - Using ACI.dev toolsets

---

# [Function Tools](https://pydantic.dev/docs/ai/tools-toolsets/tools/)

# Function Tools

Function tools provide a mechanism for models to perform actions and retrieve extra information to help them generate a response.

They're useful when you want to enable the model to take some action and use the result, when it is impractical or impossible to put all the context an agent might need into the instructions, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool.

If you want a model to be able to call a function as its final action, without the result being sent back to the model, you can use an [output function](/docs/ai/core-concepts/output#output-functions) instead.
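As a minimal sketch of that alternative (the function name, its placeholder logic, and the prompt are illustrative, not from the docs): when a function is passed as the `output_type`, the model calling it ends the run, and the function's return value becomes the run's output rather than being sent back to the model:

```python
from pydantic_ai import Agent


def send_reply(message: str) -> str:
    """Deliver a final reply to the user (placeholder logic for illustration)."""
    return f'Sent: {message}'


# Using the function as the output type makes calling it the run's final action;
# its return value becomes result.output and no further model round-trip happens.
agent = Agent('openai:gpt-5.2', output_type=send_reply)

result = agent.run_sync('Reply to the user with a short greeting.')
print(result.output)
# e.g. 'Sent: Hello!'
```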
There are a number of ways to register tools with an agent:

- via the [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator -- for tools that need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
- via the [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator -- for tools that do not need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
- via the [`tools`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent` which can take either plain functions, or instances of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)

For more advanced use cases, the [toolsets](/docs/ai/tools-toolsets/toolsets) feature lets you manage collections of tools (built by you or provided by an [MCP server](/docs/ai/mcp/client) or other [third party](/docs/ai/tools-toolsets/third-party-tools#third-party-tools)) and register them with an agent in one go via the [`toolsets`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent`. Internally, all `tools` and `toolsets` are gathered into a single [combined toolset](/docs/ai/tools-toolsets/toolsets#combining-toolsets) that's made available to the model.

Function tools vs. RAG

Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) -- they augment what the model can do by letting it request extra information. The main semantic difference between Pydantic AI Tools and RAG is that RAG is synonymous with vector search, while Pydantic AI tools are more general-purpose. For vector search, you can use our [embeddings](/docs/ai/guides/embeddings) support to generate embeddings across multiple providers.

Function Tools vs. Structured Outputs

As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for [structured output](/docs/ai/core-concepts/output) when using the default [tool output mode](/docs/ai/core-concepts/output#tool-output), thus a model might have access to many tools, some of which call function tools while others end the run and produce a final output.

## Registering via Decorator

`@agent.tool` is considered the default decorator since in the majority of cases tools will need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext).

Here's an example using both:

dice\_game.py

```python
import random

from pydantic_ai import Agent, RunContext

agent = Agent(
    'google-gla:gemini-3-flash-preview',  # (1)
    deps_type=str,  # (2)
    instructions=(
        "You're a dice game, you should roll the die and see if the number "
        "you get back matches the user's guess. If so, tell them they're a winner. "
        "Use the player's name in the response."
    ),
)


@agent.tool_plain  # (3)
def roll_dice() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))


@agent.tool  # (4)
def get_player_name(ctx: RunContext[str]) -> str:
    """Get the player's name."""
    return ctx.deps


dice_result = agent.run_sync('My guess is 4', deps='Anne')  # (5)
print(dice_result.output)
#> Congratulations Anne, you guessed correctly! You're a winner!
```

1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model.
2. We pass the user's name as the dependency; to keep things simple, the dependency is just the name as a string.
3. This tool doesn't need any context; it just returns a random number. You could probably use dynamic instructions in this case.
4. This tool needs the player's name, so it uses `RunContext` to access dependencies, which are just the player's name in this case.
5. Run the agent, passing the player's name as the dependency.

_(This example is complete, it can be run "as is")_

Let's print the messages from that game to see what happened:

dice\_game\_messages.py

```python
from dice_game import dice_result

print(dice_result.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='My guess is 4',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id'
            )
        ],
        usage=RequestUsage(input_tokens=54, output_tokens=2),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='roll_dice',
                content='4',
                tool_call_id='pyd_ai_tool_call_id',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id'
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=4),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='get_player_name',
                content='Anne',
                tool_call_id='pyd_ai_tool_call_id',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content="Congratulations Anne, you guessed correctly! You're a winner!"
            )
        ],
        usage=RequestUsage(input_tokens=56, output_tokens=12),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
    ),
]
"""
```

We can represent this with a diagram:
User: "My guess is 4" activate LLM Note over LLM: LLM decides to use
a tool LLM ->> Agent: Call tool
roll_dice() deactivate LLM activate Agent Note over Agent: Rolls a six-sided die Agent -->> LLM: ToolReturn
"4" deactivate Agent activate LLM Note over LLM: LLM decides to use
another tool LLM ->> Agent: Call tool
get_player_name() deactivate LLM activate Agent Note over Agent: Retrieves player name Agent -->> LLM: ToolReturn
"Anne" deactivate Agent activate LLM Note over LLM: LLM constructs final response LLM ->> Agent: ModelResponse
"Congratulations Anne, ..." deactivate LLM Note over Agent: Game session complete ``` ## Registering via Agent Argument As well as using the decorators, we can register tools via the `tools` argument to the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__). This is useful when you want to reuse tools, and can also give more fine-grained control over the tools. dice\_game\_tool\_kwarg.py ```python import random from pydantic_ai import Agent, RunContext, Tool instructions = """You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response. """ def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps agent_a = Agent( 'google-gla:gemini-3-flash-preview', deps_type=str, tools=[roll_dice, get_player_name], # (1) instructions=instructions, ) agent_b = Agent( 'google-gla:gemini-3-flash-preview', deps_type=str, tools=[ # (2) Tool(roll_dice, takes_ctx=False), Tool(get_player_name, takes_ctx=True), ], instructions=instructions, ) dice_result = {} dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar') dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne') print(dice_result['a'].output) #> Tough luck, Yashar, you rolled a 4. Better luck next time. print(dice_result['b'].output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). `agent_a` and `agent_b` are identical -- but we can use [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom [`prepare`](/docs/ai/tools-toolsets/tools-advanced#tool-prepare) method. _(This example is complete, it can be run "as is")_ ## Tool Output Tools can return anything that Pydantic can serialize to JSON. For advanced output options including multi-modal content and metadata, see [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced#function-tool-output). ## Tool Schema Function parameters are extracted from the function signature, and all parameters except `RunContext` are used to build the schema for that tool call. Even better, Pydantic AI extracts the docstring from functions and (thanks to [griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions from the docstring and adds them to the schema. [Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy`, and `sphinx` style docstrings. Pydantic AI will infer the format to use based on the docstring, but you can explicitly set it using [`docstring_format`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). You can also enforce parameter requirements by setting `require_parameter_descriptions=True`. This will raise a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) if a parameter description is missing. 
To demonstrate a tool's schema, here we use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to print the schema a model would receive: tool\_schema.py ```python from pydantic_ai import Agent, ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import AgentInfo, FunctionModel agent = Agent() @agent.tool_plain(docstring_format='google', require_parameter_descriptions=True) def foobar(a: int, b: str, c: dict[str, list[float]]) -> str: """Get me foobar. Args: a: apple pie b: banana cake c: carrot smoothie """ return f'{a} {b} {c}' def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: tool = info.function_tools[0] print(tool.description) #> Get me foobar. print(tool.parameters_json_schema) """ { 'additionalProperties': False, 'properties': { 'a': {'description': 'apple pie', 'type': 'integer'}, 'b': {'description': 'banana cake', 'type': 'string'}, 'c': { 'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'}, 'description': 'carrot smoothie', 'type': 'object', }, }, 'required': ['a', 'b', 'c'], 'type': 'object', } """ return ModelResponse(parts=[TextPart('foobar')]) agent.run_sync('hello', model=FunctionModel(print_schema)) ``` _(This example is complete, it can be run "as is")_ If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object. Here's an example where we use [`TestModel.last_model_request_parameters`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel.last_model_request_parameters) to inspect the tool schema that would be passed to the model. single\_parameter\_tool.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.test import TestModel agent = Agent() class Foobar(BaseModel): """This is a Foobar""" x: int y: str z: float = 3.14 @agent.tool_plain def foobar(f: Foobar) -> str: return str(f) test_model = TestModel() result = agent.run_sync('hello', model=test_model) print(result.output) #> {"foobar":"x=0 y='a' z=3.14"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='foobar', parameters_json_schema={ 'properties': { 'x': {'type': 'integer'}, 'y': {'type': 'string'}, 'z': {'default': 3.14, 'type': 'number'}, }, 'required': ['x', 'y'], 'title': 'Foobar', 'type': 'object', }, description='This is a Foobar', ) ] """ ``` _(This example is complete, it can be run "as is")_ Debugging Tool Calls Understanding tool behavior is crucial for agent development. By instrumenting your agent with [Logfire](/docs/ai/integrations/logfire), you can see: - What arguments were passed to each tool - What each tool returned - How long each tool took to execute - Any errors that occurred This visibility helps you understand why an agent made specific decisions and identify issues in tool implementations. 
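As a sketch, the instrumentation itself is typically just a couple of lines; this assumes the `logfire` package is installed and configured for your project (see the [Logfire integration](/docs/ai/integrations/logfire) docs for the full setup):

```python
import logfire

logfire.configure()  # assumes a configured Logfire project
logfire.instrument_pydantic_ai()  # records agent runs, tool calls, and their results
```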
## See Also For more tool features and integrations, see: - [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced) - Custom schemas, dynamic tools, tool execution and retries - [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools - [Builtin Tools](/docs/ai/tools-toolsets/builtin-tools) - Native tools provided by LLM providers - [Common Tools](/docs/ai/tools-toolsets/common-tools) - Ready-to-use tool implementations - [Third-Party Tools](/docs/ai/tools-toolsets/third-party-tools) - Integrations with MCP, LangChain, ACI.dev and other tool libraries - [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools) - Tools requiring approval or external execution --- # [Advanced Tool Features](https://pydantic.dev/docs/ai/tools-toolsets/tools-advanced/) # Advanced Tool Features This page covers advanced features for function tools in Pydantic AI. For basic tool usage, see the [Function Tools](/docs/ai/tools-toolsets/tools) documentation. ## Tool Output Tools can return anything that Pydantic can serialize to JSON, as well as audio, video, image or document content depending on the types of [multi-modal input](/docs/ai/advanced-features/input) the model supports: function\_tool\_output.py ```python from datetime import datetime from pydantic import BaseModel from pydantic_ai import Agent, DocumentUrl, ImageUrl from pydantic_ai.models.openai import OpenAIResponsesModel class User(BaseModel): name: str age: int agent = Agent(model=OpenAIResponsesModel('gpt-5.2')) @agent.tool_plain def get_current_time() -> datetime: return datetime.now() @agent.tool_plain def get_user() -> User: return User(name='John', age=30) @agent.tool_plain def get_company_logo() -> ImageUrl: return ImageUrl(url='https://iili.io/3Hs4FMg.png') @agent.tool_plain def get_document() -> DocumentUrl: return DocumentUrl(url='https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf') result = agent.run_sync('What time is it?') print(result.output) #> The current time is 10:45 PM on April 17, 2025. result = agent.run_sync('What is the user name?') print(result.output) #> The user's name is John. result = agent.run_sync('What is the company name in the logo?') print(result.output) #> The company name in the logo is "Pydantic." result = agent.run_sync('What is the main content of the document?') print(result.output) #> The document contains just the text "Dummy PDF file." ``` _(This example is complete, it can be run "as is")_ Some models (e.g. Gemini) natively support semi-structured return values, while some expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON. ### Advanced Tool Returns For scenarios where you need more control over both the tool's return value and the content sent to the model, you can use [`ToolReturn`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolReturn). 
This is particularly useful when you want to:

- Separate the structured return value from additional content sent to the model
- Explicitly send content as a separate user message (rather than in the tool result)
- Include additional metadata that shouldn't be sent to the LLM

Here's an example of a computer automation tool that captures screenshots and provides visual feedback:

advanced\_tool\_return.py

```python
from pydantic_ai import Agent, BinaryContent, ToolReturn
from pydantic_ai.models.test import TestModel

agent = Agent(TestModel())


@agent.tool_plain
def click_and_capture(x: int, y: int) -> ToolReturn:
    """Click at coordinates and show before/after screenshots."""
    before_screenshot = BinaryContent(data=b'\x89PNG', media_type='image/png')
    # perform_click(x, y)
    after_screenshot = BinaryContent(data=b'\x89PNG', media_type='image/png')

    return ToolReturn(
        return_value=f'Successfully clicked at ({x}, {y})',
        content=[
            'Before:',
            before_screenshot,
            'After:',
            after_screenshot,
        ],
        metadata={
            'coordinates': {'x': x, 'y': y},
            'action_type': 'click_and_capture',
        },
    )


# The model receives the rich visual content for analysis
# while your application can access the structured return_value and metadata
result = agent.run_sync('Click on the submit button and tell me what happened')
print(result.output)
#> {"click_and_capture":"Successfully clicked at (0, 0)"}
```

- **`return_value`**: The actual return value used in the tool response. This is what gets serialized and sent back to the model as the tool's result. Can include multimodal content directly (see [Tool Output](#function-tool-output) above).
- **`content`**: Content sent as a **separate user message** after the tool result. Use this when you explicitly want content to appear outside the tool result, or when combining structured return values with rich content.
- **`metadata`**: Optional metadata that your application can access but is not sent to the LLM. Useful for logging, debugging, or additional processing. Some other AI frameworks call this feature 'artifacts'.

This separation allows you to provide rich context to the model while maintaining clean, structured return values for your application logic. For multimodal content that should be sent natively in the tool result (when supported by the model), return it directly from the tool function or include it in `return_value` (see [Tool Output](#function-tool-output) above).

## Custom Tool Schema

If you have a function that lacks appropriate documentation (e.g. it is poorly named, has no type information or a poor docstring, or uses `*args` or `**kwargs`), you can still turn it into a tool that can be effectively used by the agent with the [`Tool.from_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool.from_schema) function.
With this you directly provide the function's name, description, and JSON schema, and specify whether it takes a `RunContext`:

```python
from pydantic_ai import Agent, Tool
from pydantic_ai.models.test import TestModel


def foobar(**kwargs) -> str:
    return kwargs['a'] + kwargs['b']


tool = Tool.from_schema(
    function=foobar,
    name='sum',
    description='Sum two numbers.',
    json_schema={
        'additionalProperties': False,
        'properties': {
            'a': {'description': 'the first number', 'type': 'integer'},
            'b': {'description': 'the second number', 'type': 'integer'},
        },
        'required': ['a', 'b'],
        'type': 'object',
    },
    takes_ctx=False,
)

test_model = TestModel()
agent = Agent(test_model, tools=[tool])

result = agent.run_sync('testing...')
print(result.output)
#> {"sum":0}
```

Note that no validation of the tool arguments is performed; all arguments are passed to the function as keyword arguments.

## Dynamic Tools

Tools can optionally be defined with another function: `prepare`, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step. A `prepare` method can be registered via the `prepare` kwarg to any of the tool registration mechanisms:

- [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator
- [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator
- [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass

The `prepare` method should be of type [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc), a function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a pre-built [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and should either return that `ToolDefinition` with or without modifying it, return a new `ToolDefinition`, or return `None` to indicate this tool should not be registered for that step.

Here's a simple `prepare` method that only includes the tool if the value of the dependency is `42`. As with the previous example, we use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) to demonstrate the behavior without calling a real model.

tool\_only\_if\_42.py

```python
from pydantic_ai import Agent, RunContext, ToolDefinition

agent = Agent('test')


async def only_if_42(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> ToolDefinition | None:
    if ctx.deps == 42:
        return tool_def


@agent.tool(prepare=only_if_42)
def hitchhiker(ctx: RunContext[int], answer: str) -> str:
    return f'{ctx.deps} {answer}'


result = agent.run_sync('testing...', deps=41)
print(result.output)
#> success (no tool calls)
result = agent.run_sync('testing...', deps=42)
print(result.output)
#> {"hitchhiker":"42 a"}
```

_(This example is complete, it can be run "as is")_

Here's a more complex example where we change the description of the `name` parameter based on the value of `deps`. For the sake of variation, we create this tool using the [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass.
customize\_name.py ```python from __future__ import annotations from typing import Literal from pydantic_ai import Agent, RunContext, Tool, ToolDefinition from pydantic_ai.models.test import TestModel def greet(name: str) -> str: return f'hello {name}' async def prepare_greet( ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition ) -> ToolDefinition | None: d = f'Name of the {ctx.deps} to greet.' tool_def.parameters_json_schema['properties']['name']['description'] = d return tool_def greet_tool = Tool(greet, prepare=prepare_greet) test_model = TestModel() agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine']) result = agent.run_sync('testing...', deps='human') print(result.output) #> {"greet":"hello a"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='greet', parameters_json_schema={ 'additionalProperties': False, 'properties': { 'name': {'type': 'string', 'description': 'Name of the human to greet.'} }, 'required': ['name'], 'type': 'object', }, ) ] """ ``` _(This example is complete, it can be run "as is")_ ### Agent-wide Dynamic Tools In addition to per-tool `prepare` methods, you can also define an agent-wide `prepare_tools` function. This function is called at each step of a run and allows you to filter or modify the list of all tool definitions available to the agent for that step. This is especially useful if you want to enable or disable multiple tools at once, or apply global logic based on the current context. The `prepare_tools` function should be of type [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc), which takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and returns a new list of tool definitions (or `None` to disable all tools for that step). Note The list of tool definitions passed to `prepare_tools` includes both regular function tools and tools from any [toolsets](/docs/ai/tools-toolsets/toolsets) registered on the agent, but not [output tools](/docs/ai/core-concepts/output#tool-output). To modify output tools, you can set a `prepare_output_tools` function instead. 
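As a sketch, a `prepare_output_tools` function has the same shape as `prepare_tools`; this minimal example assumes it receives the output tool definitions and may return them modified (or `None`):

```python
from pydantic_ai import Agent, RunContext, ToolDefinition


async def inspect_output_tools(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    # Modify or filter the output tool definitions here; returning them
    # unchanged leaves the agent's structured output behavior as-is.
    return tool_defs


agent = Agent('test', prepare_output_tools=inspect_output_tools)
```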
Here's an example that makes all tools strict if the model is an OpenAI model: agent\_prepare\_tools\_customize.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs test_model = TestModel() agent = Agent(test_model, prepare_tools=turn_on_strict_if_openai) @agent.tool_plain def echo(message: str) -> str: return message agent.run_sync('testing...') assert test_model.last_model_request_parameters.function_tools[0].strict is None # Set the system attribute of the test_model to 'openai' test_model._system = 'openai' agent.run_sync('testing with openai...') assert test_model.last_model_request_parameters.function_tools[0].strict ``` _(This example is complete, it can be run "as is")_ Here's another example that conditionally filters out the tools by name if the dependency (`ctx.deps`) is `True`: agent\_prepare\_tools\_filter\_out.py ```python from pydantic_ai import Agent, RunContext, Tool, ToolDefinition def launch_potato(target: str) -> str: return f'Potato launched at {target}!' async def filter_out_tools_by_name( ctx: RunContext[bool], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.deps: return [tool_def for tool_def in tool_defs if tool_def.name != 'launch_potato'] return tool_defs agent = Agent( 'test', tools=[Tool(launch_potato)], prepare_tools=filter_out_tools_by_name, deps_type=bool, ) result = agent.run_sync('testing...', deps=False) print(result.output) #> {"launch_potato":"Potato launched at a!"} result = agent.run_sync('testing...', deps=True) print(result.output) #> success (no tool calls) ``` _(This example is complete, it can be run "as is")_ You can use `prepare_tools` to: - Dynamically enable or disable tools based on the current model, dependencies, or other context - Modify tool definitions globally (e.g., set all tools to strict mode, change descriptions, etc.) If both per-tool `prepare` and agent-wide `prepare_tools` are used, the per-tool `prepare` is applied first to each tool, and then `prepare_tools` is called with the resulting list of tool definitions. ## Tool Execution and Retries When a tool is executed, its arguments (provided by the LLM) are first validated against the function's signature using Pydantic (with optional [validation context](/docs/ai/core-concepts/output#validation-context)). If validation fails (e.g., due to incorrect types or missing required arguments), a `ValidationError` is raised, and the framework automatically generates a [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart) containing the validation details. This prompt is sent back to the LLM, informing it of the error and allowing it to correct the parameters and retry the tool call. Beyond automatic validation errors, the tool's own internal logic can also explicitly request a retry by raising the [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception. This is useful for situations where the parameters were technically valid, but an issue occurred during execution (like a transient network error, or the tool determining the initial attempt needs modification). 
```python
from pydantic_ai import ModelRetry


def my_flaky_tool(query: str) -> str:
    if query == 'bad':
        # Tell the LLM the query was bad and it should try again
        raise ModelRetry("The query 'bad' is not allowed. Please provide a different query.")
    # ... process query ...
    return 'Success!'
```

Raising `ModelRetry` also generates a `RetryPromptPart` containing the exception message, which is sent back to the LLM to guide its next attempt. Both `ValidationError` and `ModelRetry` respect the `retries` setting configured on the `Tool` or `Agent`.

### Tool Timeout

You can set a timeout for tool execution to prevent tools from running indefinitely. If a tool exceeds its timeout, it is treated as a failure and a retry prompt is sent to the model (counting towards the retry limit).

```python
import asyncio

from pydantic_ai import Agent

# Set a default timeout for all tools on the agent
agent = Agent('test', tool_timeout=30)


@agent.tool_plain
async def slow_tool() -> str:
    """This tool will use the agent's default timeout (30 seconds)."""
    await asyncio.sleep(10)
    return 'Done'


@agent.tool_plain(timeout=5)
async def fast_tool() -> str:
    """This tool has its own timeout (5 seconds) that overrides the agent default."""
    await asyncio.sleep(1)
    return 'Done'
```

- **Agent-level timeout**: Set `tool_timeout` on the [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) to apply a default timeout to all tools.
- **Per-tool timeout**: Set `timeout` on individual tools via [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain), or the [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass. This overrides the agent-level default.

When a timeout occurs, the tool is considered to have failed and the model receives a retry prompt with the message `"Timed out after {timeout} seconds."`. This counts towards the tool's retry limit just like validation errors or explicit [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exceptions.

### Custom Args Validator

The `args_validator` parameter lets you define custom validation that runs after Pydantic schema validation but before the tool executes. This is useful for business logic validation, cross-field validation, or validating arguments before requesting [human approval](/docs/ai/tools-toolsets/deferred-tools) for deferred tools.

The validator receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument, followed by the same parameters as the tool function. Return `None` on success, or raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on failure.

args\_validator\_approval.py

```python
from pydantic_ai import Agent, DeferredToolRequests, ModelRetry, RunContext

agent = Agent('test', deps_type=int, output_type=[str, DeferredToolRequests])


def validate_sum_limit(ctx: RunContext[int], x: int, y: int) -> None:
    """Validate that the sum doesn't exceed the limit from deps."""
    if x + y > ctx.deps:
        raise ModelRetry(f'Sum of x and y must not exceed {ctx.deps}')


# Validation runs *before* approval is requested, so the model can
# fix bad args without bothering the user.
@agent.tool(requires_approval=True, args_validator=validate_sum_limit)
def add_numbers(ctx: RunContext[int], x: int, y: int) -> int:
    """Add two numbers (sum must not exceed the configured limit)."""
    return x + y


result = agent.run_sync('add 5 and 3', deps=100)
assert isinstance(result.output, DeferredToolRequests)
# The validated args are ready for the user to approve
print(result.output.approvals[0].args)
#> {'x': 0, 'y': 0}
```

_(This example is complete, it can be run "as is")_

When validation fails, the error message is sent back to the LLM as a retry prompt. This respects the `retries` setting on the tool. For [deferred tools](/docs/ai/tools-toolsets/deferred-tools), validation runs at deferral time -- only tool calls with valid arguments are deferred, while failed validation triggers a retry just like regular tools.

The `args_validator` parameter is available on [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain), [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool), [`Tool.from_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool.from_schema), and [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset). Validators can be sync or async functions.

The validation result is exposed via the `args_valid` field on [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent). This reflects all validation -- both schema validation and custom `args_validator` validation (if configured): `True` means all validation passed, `False` means validation failed, and `None` means validation was not performed (e.g. tool calls skipped due to the `'early'` end strategy, or deferred tool calls resolved without execution).

### Parallel tool calls & concurrency

When a model returns multiple tool calls in one response, Pydantic AI schedules them concurrently using `asyncio.create_task`. If a tool requires sequential/serial execution, you can pass the [`sequential`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.sequential) flag when registering the tool, or wrap the agent run in the [`with agent.parallel_tool_call_execution_mode('sequential')`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.parallel_tool_call_execution_mode) context manager, as shown in the sketch below.

Async functions are run on the event loop, while sync functions are offloaded to threads. To get the best performance, _always_ use an async function _unless_ you're doing blocking I/O (and there's no way to use a non-blocking library instead) or CPU-bound work (like `numpy` or `scikit-learn` operations), so that simple functions are not offloaded to threads unnecessarily.
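Here's a minimal sketch of both options; it assumes the `sequential` kwarg on the tool decorators mirrors [`ToolDefinition.sequential`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.sequential), as described above:

```python
from pydantic_ai import Agent

agent = Agent('test')
counter = {'value': 0}


@agent.tool_plain(sequential=True)  # assumed kwarg, mirroring ToolDefinition.sequential
def increment_counter() -> int:
    """Increment shared state; should not run concurrently with other tools."""
    counter['value'] += 1
    return counter['value']


# Alternatively, force sequential execution of all parallel tool calls for a run:
with agent.parallel_tool_call_execution_mode('sequential'):
    agent.run_sync('Increment the counter twice')
```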
#### Thread executor for long-running servers

By default, sync functions are offloaded to threads using `anyio.to_thread.run_sync`, which creates ephemeral threads on demand. In long-running servers (e.g. FastAPI), these threads can accumulate under sustained traffic, leading to memory growth.

To control thread lifecycle, provide a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) using the [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability (per-agent) or the [`Agent.using_thread_executor()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.using_thread_executor) context manager (global):

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import asynccontextmanager

from pydantic_ai import Agent
from pydantic_ai.capabilities import ThreadExecutor

# Per-agent: pass as a capability
executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker')
agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)])


# Global: wrap your server lifespan
@asynccontextmanager
async def lifespan(app):
    executor = ThreadPoolExecutor(max_workers=16)
    with Agent.using_thread_executor(executor):
        yield
    executor.shutdown(wait=True)
```

Limiting tool executions

You can cap tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](/docs/ai/core-concepts/agent#usage-limits). The counter increments only after a successful tool invocation. Output tools (used for [structured output](/docs/ai/core-concepts/output)) are not counted in the `tool_calls` metric.

#### Output Tool Calls

When a model calls an [output tool](/docs/ai/core-concepts/output#tool-output) in parallel with other tools, the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy) parameter controls how these tool calls are executed. The `'exhaustive'` strategy ensures all tools are executed even after a final result is found, which is useful when tools have side effects (like logging, sending notifications, or updating metrics) that should always execute. For more information on how `end_strategy` works with both function tools and output tools, see the [Output Tool](/docs/ai/core-concepts/output#parallel-output-tool-calls) docs.

## Tool Search

Agents with many tools (e.g. [MCP servers](/docs/ai/mcp/client) exposing dozens of endpoints) can suffer from context bloat and degraded tool selection. Marking tools for deferred loading hides them from the model's initial context; a `search_tools` tool is automatically injected so the model can discover hidden tools by keyword when it needs them. This is inspired by Anthropic's [Tool Search Tool](https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool#limits-and-best-practices) for managing large tool collections. Tool search is implemented on the Pydantic AI side and works with any model. Native provider support is planned in [#4167](https://github.com/pydantic/pydantic-ai/issues/4167).

For individual tools, set `defer_loading=True` on [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool), [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), or [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain). For entire toolsets (including [MCP servers](/docs/ai/mcp/client) and [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset)), use the [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) method -- pass a list of tool names to hide only specific tools, or `None` to hide all.
tool\_search.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') @agent.tool_plain(defer_loading=True) def mortgage_calculator(principal: float, rate: float, years: int) -> str: """Calculate monthly mortgage payment for a home loan.""" monthly_rate = rate / 100 / 12 n_payments = years * 12 payment = principal * (monthly_rate * (1 + monthly_rate) ** n_payments) / ((1 + monthly_rate) ** n_payments - 1) return f'${payment:.2f}/month' ``` For MCP servers, use [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) to hide all tools behind search: tool\_search\_mcp.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP mcp = MCPServerHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[mcp.defer_loading()]) ``` Tool discovery and message history Discovered tools are tracked via metadata in the [message history](/docs/ai/core-concepts/message-history). If a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) truncates messages containing discovery metadata, previously discovered tools will require re-discovery. See [`ToolDefinition.defer_loading`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.defer_loading) and [Deferred Loading](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more details. ## See Also - [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration - [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools - [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools) - Tools requiring approval or external execution - [Third-Party Tools](/docs/ai/tools-toolsets/third-party-tools) - Integrations with external tool libraries --- # [Toolsets](https://pydantic.dev/docs/ai/tools-toolsets/toolsets/) # Toolsets A toolset represents a collection of [tools](/docs/ai/tools-toolsets/tools) that can be registered with an agent in one go. They can be reused by different agents, swapped out at runtime or during testing, and composed in order to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. A toolset can contain locally defined functions, depend on an external service to provide them, or implement custom logic to list available tools and handle them being called. Toolsets can also be provided via [capabilities](/docs/ai/core-concepts/capabilities), which bundle tools with hooks, instructions, and model settings. Toolsets are used (among many other things) to define [MCP servers](/docs/ai/mcp/client) available to an agent. Pydantic AI includes many kinds of toolsets which are described below, and you can define a [custom toolset](#building-a-custom-toolset) by inheriting from the [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) class. 
The toolsets that will be available during an agent run can be specified in four different ways:

- at agent construction time, via the [`toolsets`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent`, which takes toolset instances as well as functions that generate toolsets [dynamically](#dynamically-building-a-toolset) based on the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
- at agent run time, via the `toolsets` keyword argument to [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). These toolsets will be additional to those registered on the `Agent`
- [dynamically](#dynamically-building-a-toolset), via the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator which lets you build a toolset based on the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
- as a contextual override, via the `toolsets` keyword argument to the [`agent.override()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.override) context manager. These toolsets will replace those provided at agent construction or run time during the life of the context manager

toolsets.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel


def agent_tool():
    return "I'm registered directly on the agent"


def extra_tool():
    return "I'm passed as an extra tool for a specific run"


def override_tool():
    return 'I override all other tools'


agent_toolset = FunctionToolset(tools=[agent_tool])  # (1)
extra_toolset = FunctionToolset(tools=[extra_tool])
override_toolset = FunctionToolset(tools=[override_tool])

test_model = TestModel()  # (2)
agent = Agent(test_model, toolsets=[agent_toolset])

result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['agent_tool']

result = agent.run_sync('What tools are available?', toolsets=[extra_toolset])
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['agent_tool', 'extra_tool']

with agent.override(toolsets=[override_toolset]):
    result = agent.run_sync('What tools are available?', toolsets=[extra_toolset])  # (3)
    print([t.name for t in test_model.last_model_request_parameters.function_tools])
    #> ['override_tool']
```

1. The [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) will be explained in detail in the next section.
2. We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.
3. This `extra_toolset` will be ignored because we're inside an override context.

_(This example is complete, it can be run "as is")_

## Function Toolset

As the name suggests, a [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) makes locally defined functions available as tools.
Functions can be added as tools in four different ways: - via the [`@toolset.tool`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool) decorator -- for tools that need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) - via the [`@toolset.tool_plain`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool_plain) decorator -- for tools that do not need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) - via the [`tools`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.__init__) keyword argument to the constructor which can take either plain functions, or instances of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) - via the [`toolset.add_function()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_function) and [`toolset.add_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_tool) methods which can take a plain function or an instance of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) respectively The `add_function()` and `add_tool()` methods can also be used from a tool function to dynamically register new tools during a run to be available in future run steps. function\_toolset.py ```python from datetime import datetime from pydantic_ai import Agent, FunctionToolset, RunContext from pydantic_ai.models.test import TestModel def temperature_celsius(city: str) -> float: return 21.0 def temperature_fahrenheit(city: str) -> float: return 69.8 weather_toolset = FunctionToolset(tools=[temperature_celsius, temperature_fahrenheit]) @weather_toolset.tool def conditions(ctx: RunContext, city: str) -> str: if ctx.run_step % 2 == 0: return "It's sunny" else: return "It's raining" datetime_toolset = FunctionToolset() datetime_toolset.add_function(lambda: datetime.now(), name='now') test_model = TestModel() # (1) agent = Agent(test_model) result = agent.run_sync('What tools are available?', toolsets=[weather_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions'] result = agent.run_sync('What tools are available?', toolsets=[datetime_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['now'] ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. _(This example is complete, it can be run "as is")_ ### Toolset Instructions A [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) can provide instructions that are automatically included in the model request. This lets each toolset carry its own usage guidance alongside its tools, so you don't need to duplicate instructions on every agent that uses the toolset. 
Instructions can be provided as strings, functions (sync or async, with or without [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)), or a mix of both: toolset\_instructions.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel search_toolset = FunctionToolset( instructions='Always use the search tool before answering factual questions.', ) @search_toolset.tool_plain def search(query: str) -> str: """Search for information.""" return f'Results for: {query}' test_model = TestModel() agent = Agent(test_model, toolsets=[search_toolset]) result = agent.run_sync('What is the capital of France?') print(result.all_messages()[0].instructions) #> Always use the search tool before answering factual questions. ``` _(This example is complete, it can be run "as is")_ You can also use the [`@toolset.instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.instructions) decorator to register dynamic instruction functions that can access the run context: toolset\_instructions\_decorator.py ```python from pydantic_ai import Agent, FunctionToolset, RunContext from pydantic_ai.models.test import TestModel math_toolset = FunctionToolset[str]() @math_toolset.instructions def math_instructions(ctx: RunContext[str]) -> str: return f'You are helping: {ctx.deps}. Always show your work when using the calculator.' @math_toolset.tool_plain def calculator(expression: str) -> str: """Evaluate a math expression.""" return '4' test_model = TestModel() agent = Agent(test_model, toolsets=[math_toolset], deps_type=str) result = agent.run_sync('What is 2+2?', deps='Alice') print(result.all_messages()[0].instructions) #> You are helping: Alice. Always show your work when using the calculator. ``` _(This example is complete, it can be run "as is")_ When a toolset with instructions is used alongside agent-level [`instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), the toolset instructions are appended after the agent instructions: toolset\_instructions\_combined.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel toolset = FunctionToolset(instructions='Use the greeting tool for all greetings.') @toolset.tool_plain def greeting(name: str) -> str: """Greet someone.""" return f'Hello, {name}!' test_model = TestModel() agent = Agent( test_model, instructions='You are a friendly assistant.', toolsets=[toolset], ) result = agent.run_sync('Hi there!') print(result.all_messages()[0].instructions) """ You are a friendly assistant. Use the greeting tool for all greetings. 
""" ``` _(This example is complete, it can be run "as is")_ When multiple toolsets with instructions are registered on an agent, all their instructions are combined: toolset\_instructions\_multiple.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel weather_toolset = FunctionToolset(instructions='Use weather tools for forecasts.') @weather_toolset.tool_plain def forecast(city: str) -> str: """Get weather forecast.""" return 'Sunny' calendar_toolset = FunctionToolset(instructions='Use calendar tools for scheduling.') @calendar_toolset.tool_plain def schedule(event: str) -> str: """Schedule an event.""" return 'Scheduled' test_model = TestModel() agent = Agent(test_model, toolsets=[weather_toolset, calendar_toolset]) result = agent.run_sync('Plan my day') print(result.all_messages()[0].instructions) """ Use weather tools for forecasts. Use calendar tools for scheduling. """ ``` _(This example is complete, it can be run "as is")_ ## Toolset Composition Toolsets can be composed to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. Multiple toolsets can also be combined into one. ### Combining Toolsets [`CombinedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.CombinedToolset) takes a list of toolsets and lets them be used as one. combined\_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset([weather_toolset, datetime_toolset]) test_model = TestModel() # (1) agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions', 'now'] ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. _(This example is complete, it can be run "as is")_ ### Filtering Tools [`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset) wraps a toolset and filters available tools ahead of each step of the run based on a user-defined function that is passed the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and each tool's [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) and returns a boolean to indicate whether or not a given tool should be available. To easily chain different modifications, you can also call [`filtered()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.filtered) on any toolset instead of directly constructing a `FilteredToolset`. 
filtered\_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset filtered_toolset = combined_toolset.filtered(lambda ctx, tool_def: 'fahrenheit' not in tool_def.name) test_model = TestModel() # (1) agent = Agent(test_model, toolsets=[filtered_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['weather_temperature_celsius', 'weather_conditions', 'datetime_now'] ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. _(This example is complete, it can be run "as is")_ ### Prefixing Tool Names [`PrefixedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PrefixedToolset) wraps a toolset and adds a prefix to each tool name to prevent tool name conflicts between different toolsets. To easily chain different modifications, you can also call [`prefixed()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.prefixed) on any toolset instead of directly constructing a `PrefixedToolset`. combined\_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset( [ weather_toolset.prefixed('weather'), datetime_toolset.prefixed('datetime') ] ) test_model = TestModel() # (1) agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ [ 'weather_temperature_celsius', 'weather_temperature_fahrenheit', 'weather_conditions', 'datetime_now', ] """ ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. _(This example is complete, it can be run "as is")_ ### Renaming Tools [`RenamedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.RenamedToolset) wraps a toolset and lets you rename tools using a dictionary mapping new names to original names. This is useful when the names provided by a toolset are ambiguous or would conflict with tools defined by other toolsets, but [prefixing them](#prefixing-tool-names) creates a name that is unnecessarily long or could be confusing to the model. To easily chain different modifications, you can also call [`renamed()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.renamed) on any toolset instead of directly constructing a `RenamedToolset`. 
renamed\_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset renamed_toolset = combined_toolset.renamed( { 'current_time': 'datetime_now', 'temperature_celsius': 'weather_temperature_celsius', 'temperature_fahrenheit': 'weather_temperature_fahrenheit' } ) test_model = TestModel() # (1) agent = Agent(test_model, toolsets=[renamed_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ ['temperature_celsius', 'temperature_fahrenheit', 'weather_conditions', 'current_time'] """ ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. _(This example is complete, it can be run "as is")_ ### Dynamic Tool Definitions [`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset) lets you modify the entire list of available tools ahead of each step of the agent run using a user-defined function that takes the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`s](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) and returns a list of modified `ToolDefinition`s. This is the toolset-specific equivalent of the [`prepare_tools`](/docs/ai/tools-toolsets/tools-advanced#prepare-tools) argument to `Agent` that prepares all tool definitions registered on an agent across toolsets. Note that it is not possible to add or rename tools using `PreparedToolset`. Instead, you can use [`FunctionToolset.add_function()`](#function-toolset) or [`RenamedToolset`](#renaming-tools). To easily chain different modifications, you can also call [`prepared()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.prepared) on any toolset instead of directly constructing a `PreparedToolset`. 
prepared\_toolset.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel from renamed_toolset import renamed_toolset descriptions = { 'temperature_celsius': 'Get the temperature in degrees Celsius', 'temperature_fahrenheit': 'Get the temperature in degrees Fahrenheit', 'weather_conditions': 'Get the current weather conditions', 'current_time': 'Get the current time', } async def add_descriptions(ctx: RunContext, tool_defs: list[ToolDefinition]) -> list[ToolDefinition] | None: return [ replace(tool_def, description=description) if (description := descriptions.get(tool_def.name, None)) else tool_def for tool_def in tool_defs ] prepared_toolset = renamed_toolset.prepared(add_descriptions) test_model = TestModel() # (1) agent = Agent(test_model, toolsets=[prepared_toolset]) result = agent.run_sync('What tools are available?') print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='temperature_celsius', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Celsius', ), ToolDefinition( name='temperature_fahrenheit', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Fahrenheit', ), ToolDefinition( name='weather_conditions', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the current weather conditions', ), ToolDefinition( name='current_time', parameters_json_schema={ 'additionalProperties': False, 'properties': {}, 'type': 'object', }, description='Get the current time', ), ] """ ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. ### Requiring Tool Approval [`ApprovalRequiredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ApprovalRequiredToolset) wraps a toolset and lets you dynamically [require approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for a given tool call based on a user-defined function that is passed the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), the tool's [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and the validated tool call arguments. If no function is provided, all tool calls will require approval. To easily chain different modifications, you can also call [`approval_required()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.approval_required) on any toolset instead of directly constructing an `ApprovalRequiredToolset`. See the [Human-in-the-Loop Tool Approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) documentation for more information on how to handle agent runs that call tools that require approval and how to pass in the results.
approval\_required\_toolset.py ```python from pydantic_ai import Agent, DeferredToolRequests, DeferredToolResults from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset approval_required_toolset = prepared_toolset.approval_required(lambda ctx, tool_def, tool_args: tool_def.name.startswith('temperature')) test_model = TestModel(call_tools=['temperature_celsius', 'temperature_fahrenheit']) # (1) agent = Agent( test_model, toolsets=[approval_required_toolset], output_type=[str, DeferredToolRequests], ) result = agent.run_sync('Call the temperature tools') messages = result.all_messages() print(result.output) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='temperature_celsius', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_celsius', ), ToolCallPart( tool_name='temperature_fahrenheit', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_fahrenheit', ), ], metadata={}, ) """ result = agent.run_sync( message_history=messages, deferred_tool_results=DeferredToolResults( approvals={ 'pyd_ai_tool_call_id__temperature_celsius': True, 'pyd_ai_tool_call_id__temperature_fahrenheit': False, } ) ) print(result.output) #> {"temperature_celsius":21.0,"temperature_fahrenheit":"The tool call was denied."} ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to specify which tools to call. _(This example is complete, it can be run "as is")_ ### Deferred Loading [`DeferredLoadingToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.DeferredLoadingToolset) wraps a toolset and marks its tools for deferred loading, hiding them from the model until discovered via [tool search](/docs/ai/tools-toolsets/tools-advanced#tool-search). This is useful for large toolsets (e.g. MCP servers with many endpoints) where loading all tool definitions into the model's context would be wasteful. [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) also accepts `defer_loading=True` in its constructor to mark all tools for deferred loading. For other toolsets, call [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) -- pass a list of tool names to hide only specific tools, or `None` (the default) to hide all. deferred\_loading\_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP mcp = MCPServerHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[mcp.defer_loading()]) ``` ### Including Return Schemas [`IncludeReturnSchemasToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.IncludeReturnSchemasToolset) wraps a toolset and sets `include_return_schema=True` on all its tools, causing the model to receive return type information. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured API field. For other models, it is injected into the tool description as JSON text. To easily chain different modifications, you can also call [`.include_return_schemas()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.include_return_schemas) on any toolset instead of directly constructing an `IncludeReturnSchemasToolset`. 
include\_return\_schemas\_toolset.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel def get_temperature(city: str) -> float: """Get the temperature for a city.""" return 21.0 toolset = FunctionToolset(tools=[get_temperature]) test_model = TestModel() agent = Agent(test_model, toolsets=[toolset.include_return_schemas()]) result = agent.run_sync('What is the temperature?') params = test_model.last_model_request_parameters assert params is not None assert params.function_tools[0].include_return_schema is True ``` _(This example is complete, it can be run "as is")_ This is the toolset-level equivalent of the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability, which applies across all toolsets or a selected subset. ### Setting Tool Metadata [`SetMetadataToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.SetMetadataToolset) wraps a toolset and merges metadata key-value pairs onto all its tools. This is useful for tagging tools with configuration that other capabilities or custom logic can inspect. To easily chain different modifications, you can also call [`.with_metadata()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.with_metadata) on any toolset instead of directly constructing a `SetMetadataToolset`. set\_metadata\_toolset.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel def search(query: str) -> str: """Search for information.""" return f'Results for: {query}' toolset = FunctionToolset(tools=[search]) test_model = TestModel() agent = Agent(test_model, toolsets=[toolset.with_metadata(sensitive=True)]) result = agent.run_sync('Search for something') params = test_model.last_model_request_parameters assert params is not None assert params.function_tools[0].metadata is not None assert params.function_tools[0].metadata['sensitive'] is True ``` _(This example is complete, it can be run "as is")_ This is the toolset-level equivalent of the [`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata) capability, which applies across all toolsets or a selected subset. ### Changing Tool Execution [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) wraps another toolset and delegates all responsibility to it. It is a no-op by default, but you can subclass `WrapperToolset` to change the wrapped toolset's tool execution behavior by overriding the [`call_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.call_tool) method. 
logging\_toolset.py ```python import asyncio from typing_extensions import Any from pydantic_ai import Agent, RunContext, ToolsetTool, WrapperToolset from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset LOG = [] class LoggingToolset(WrapperToolset): async def call_tool(self, name: str, tool_args: dict[str, Any], ctx: RunContext, tool: ToolsetTool) -> Any: LOG.append(f'Calling tool {name!r} with args: {tool_args!r}') try: await asyncio.sleep(0.1 * len(LOG)) # (1) result = await super().call_tool(name, tool_args, ctx, tool) LOG.append(f'Finished calling tool {name!r} with result: {result!r}') except Exception as e: LOG.append(f'Error calling tool {name!r}: {e}') raise e else: return result logging_toolset = LoggingToolset(prepared_toolset) agent = Agent(TestModel(), toolsets=[logging_toolset]) # (2) result = agent.run_sync('Call all the tools') print(LOG) """ [ "Calling tool 'temperature_celsius' with args: {'city': 'a'}", "Calling tool 'temperature_fahrenheit' with args: {'city': 'a'}", "Calling tool 'weather_conditions' with args: {'city': 'a'}", "Calling tool 'current_time' with args: {}", "Finished calling tool 'temperature_celsius' with result: 21.0", "Finished calling tool 'temperature_fahrenheit' with result: 69.8", 'Finished calling tool \'weather_conditions\' with result: "It\'s raining"', "Finished calling tool 'current_time' with result: datetime.datetime(...)", ] """ ``` All docs examples are tested in CI and their output is verified, so we need `LOG` to always have the same order whenever this code is run. Since the tools could finish in any order, we sleep for an increasing amount of time based on how many calls have already been logged, ensuring that the tools finish (and log) in the same order they were called in. We use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here as it will automatically call each tool. _(This example is complete, it can be run "as is")_ ## External Toolset If your agent needs to be able to call [external tools](/docs/ai/tools-toolsets/deferred-tools#external-tool-execution) that are provided and executed by an upstream service or frontend, you can build an [`ExternalToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ExternalToolset) from a list of [`ToolDefinition`s](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) containing the tool names, arguments JSON schemas, and descriptions. When the model calls an external tool, the call is considered to be ["deferred"](/docs/ai/tools-toolsets/deferred-tools#deferred-tools), and the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with a `calls` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID, which are expected to be passed to the upstream service or frontend that will produce the results.
When the tool call results are received from the upstream service or frontend, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object, or a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception in case the tool call failed and the model should [try again](/docs/ai/tools-toolsets/tools-advanced#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history). Note that you need to add `DeferredToolRequests` to the `Agent`'s or `agent.run()`'s [`output_type`](/docs/ai/core-concepts/output#structured-output) so that the possible types of the agent run output are correctly inferred. For more information, see the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation. To demonstrate, let us first define a simple agent _without_ deferred tools: deferred\_toolset\_agent.py ```python from pydantic import BaseModel from pydantic_ai import Agent, FunctionToolset toolset = FunctionToolset() @toolset.tool_plain def get_default_language(): return 'en-US' @toolset.tool_plain def get_user_name(): return 'David' class PersonalizedGreeting(BaseModel): greeting: str language_code: str agent = Agent('openai:gpt-5.2', toolsets=[toolset], output_type=PersonalizedGreeting) result = agent.run_sync('Greet the user in a personalized way') print(repr(result.output)) #> PersonalizedGreeting(greeting='Hello, David!', language_code='en-US') ``` Next, let's define a function that represents a hypothetical "run agent" API endpoint that can be called by the frontend and takes a list of messages to send to the model, a list of frontend tool definitions, and optional deferred tool results. This is where `ExternalToolset`, `DeferredToolRequests`, and `DeferredToolResults` come in: deferred\_toolset\_api.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ExternalToolset, ModelMessage, ToolDefinition, ) from deferred_toolset_agent import PersonalizedGreeting, agent def run_agent( messages: list[ModelMessage] = [], frontend_tools: list[ToolDefinition] = [], deferred_tool_results: DeferredToolResults | None = None, ) -> tuple[PersonalizedGreeting | DeferredToolRequests, list[ModelMessage]]: deferred_toolset = ExternalToolset(frontend_tools) result = agent.run_sync( toolsets=[deferred_toolset], # (1) output_type=[agent.output_type, DeferredToolRequests], # (2) message_history=messages, # (3) deferred_tool_results=deferred_tool_results, ) return result.output, result.new_messages() ``` As mentioned in the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation, these `toolsets` are additional to those provided to the `Agent` constructor. As mentioned in the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation, this `output_type` overrides the one provided to the `Agent` constructor, so we have to make sure not to lose it. We don't include a `user_prompt` keyword argument, as we expect the frontend to provide it via `messages`. Now, imagine that the code below is implemented on the frontend, and `run_agent` stands in for an API call to the backend that runs the agent.
This is where we actually execute the deferred tool calls and start a new run with the new results included: deferred\_tools.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ModelMessage, ModelRequest, ModelRetry, ToolDefinition, UserPromptPart, ) from deferred_toolset_api import run_agent frontend_tool_definitions = [ ToolDefinition( name='get_preferred_language', parameters_json_schema={'type': 'object', 'properties': {'default_language': {'type': 'string'}}}, description="Get the user's preferred language from their browser", ) ] def get_preferred_language(default_language: str) -> str: return 'es-MX' # (1) frontend_tool_functions = {'get_preferred_language': get_preferred_language} messages: list[ModelMessage] = [ ModelRequest( parts=[ UserPromptPart(content='Greet the user in a personalized way') ] ) ] deferred_tool_results: DeferredToolResults | None = None final_output = None while True: output, new_messages = run_agent(messages, frontend_tool_definitions, deferred_tool_results) messages += new_messages if not isinstance(output, DeferredToolRequests): final_output = output break print(output.calls) """ [ ToolCallPart( tool_name='get_preferred_language', args={'default_language': 'en-US'}, tool_call_id='pyd_ai_tool_call_id', ) ] """ deferred_tool_results = DeferredToolResults() for tool_call in output.calls: if function := frontend_tool_functions.get(tool_call.tool_name): result = function(**tool_call.args_as_dict()) else: result = ModelRetry(f'Unknown tool {tool_call.tool_name!r}') deferred_tool_results.calls[tool_call.tool_call_id] = result print(repr(final_output)) """ PersonalizedGreeting(greeting='Hola, David! Espero que tengas un gran día!', language_code='es-MX') """ ``` Imagine that this returns the frontend [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/language). _(This example is complete, it can be run "as is")_ ## Dynamically Building a Toolset Toolsets can be built dynamically ahead of each agent run or run step using a function that takes the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns a toolset or `None`. This is useful when a toolset (like an MCP server) depends on information specific to an agent run, like its [dependencies](/docs/ai/core-concepts/dependencies). To register a dynamic toolset, you can pass a function that takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) to the `toolsets` argument of the `Agent` constructor, or you can apply the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator to a function with the same signature. By default, the function will be called again ahead of each agent run step. If you are using the decorator, you can optionally provide a `per_run_step=False` argument to indicate that the toolset only needs to be built once for the entire run.
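For instance, here's a minimal sketch of a per-run dynamic toolset. The `weather_tools` function, the boolean dependency, and the `temperature_celsius` tool are illustrative, and we assume the decorator accepts `per_run_step` as a keyword argument, as described above:

```python
from pydantic_ai import Agent, FunctionToolset, RunContext


def temperature_celsius(city: str) -> float:
    """Get the temperature in degrees Celsius."""
    return 21.0


weather_toolset = FunctionToolset(tools=[temperature_celsius])

agent = Agent('openai:gpt-5.2', deps_type=bool)


@agent.toolset(per_run_step=False)  # build the toolset once per run, not before every step
def weather_tools(ctx: RunContext[bool]) -> FunctionToolset | None:
    # `ctx.deps` is an illustrative boolean flag: True enables the weather tools for this run
    return weather_toolset if ctx.deps else None
```

The fuller example below uses the default behavior instead, rebuilding its toolset ahead of each run step: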
dynamic\_toolset.py ```python from dataclasses import dataclass from typing import Literal from pydantic_ai import Agent, RunContext from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset @dataclass class ToggleableDeps: active: Literal['weather', 'datetime'] def toggle(self): if self.active == 'weather': self.active = 'datetime' else: self.active = 'weather' test_model = TestModel() # (1) agent = Agent( test_model, deps_type=ToggleableDeps # (2) ) @agent.toolset def toggleable_toolset(ctx: RunContext[ToggleableDeps]): if ctx.deps.active == 'weather': return weather_toolset else: return datetime_toolset @agent.tool def toggle(ctx: RunContext[ToggleableDeps]): ctx.deps.toggle() deps = ToggleableDeps('weather') result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) # (3) #> ['toggle', 'now'] result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['toggle', 'temperature_celsius', 'temperature_fahrenheit', 'conditions'] ``` We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run. We're using the agent's dependencies to give the `toggle` tool access to the `active` field via the `RunContext` argument. This shows the available tools _after_ the `toggle` tool was executed, as the "last model request" was the one that returned the `toggle` tool result to the model. _(This example is complete, it can be run "as is")_ ## Building a Custom Toolset To define a fully custom toolset with its own logic for listing the available tools and handling tool calls, you can subclass [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) and implement the [`get_tools()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_tools) and [`call_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.call_tool) methods. You can also override the [`get_instructions()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_instructions) method to provide a description of how to use the toolset's tools. This will be injected into the agent's instructions and is useful for helping the model understand how to effectively use your toolset's tools. Tip If your toolset also needs to provide model settings or hooks, consider building a [custom capability](/docs/ai/core-concepts/capabilities#building-custom-capabilities) instead. The toolset lifecycle provides hooks for managing state at different scopes: - [`for_run()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run): Called once before each agent run. Return a fresh instance for per-run state isolation (e.g. resetting counters, creating a new session). The framework enters and exits the returned instance. - [`for_run_step()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run_step): Called at the start of each run step. Return a modified instance for per-step state transitions. If managing inner toolset transitions (e.g. swapping one toolset for another), you are responsible for the inner lifecycle (exiting the old, entering the new).
- [`__aenter__()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.__aenter__) and [`__aexit__()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.__aexit__): Set up and tear down resources (e.g. network connections) that should live for the duration of the agent run. Both `for_run()` and `for_run_step()` return `self` by default, and `for_run()` is called once per run before `__aenter__()`. ## Third-Party Toolsets Third-party toolsets can also be wrapped as [capabilities](/docs/ai/core-concepts/capabilities), which bundle tools with hooks, instructions, and model settings. See [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem. ### MCP Servers Pydantic AI provides two toolsets that allow an agent to connect to and call tools on local and remote MCP servers: 1. `MCPServer`: the [MCP SDK-based Client](/docs/ai/mcp/client), which offers more direct control by leveraging the MCP SDK directly. 2. `FastMCPToolset`: the [FastMCP-based Client](/docs/ai/mcp/fastmcp-client), which offers additional capabilities like Tool Transformation, simpler OAuth configuration, and more. ### Agent Skills Toolsets that implement [Agent Skills](https://agentskills.io) support let agents efficiently discover and perform specific tasks: - [`pydantic-ai-skills`](https://github.com/DougTrajano/pydantic-ai-skills) - `SkillsToolset` implements Agent Skills support with progressive disclosure (load skills on-demand to reduce tokens). Supports filesystem and programmatic skills; compatible with [agentskills.io](https://agentskills.io). ### Task Management Toolsets for task planning and progress tracking help agents organize complex work and provide visibility into their progress: - [`pydantic-ai-todo`](https://github.com/vstorm-co/pydantic-ai-todo) - `TodoToolset` with `read_todos` and `write_todos` tools. Included in the third-party [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) [deep agent](/docs/ai/guides/multi-agent-applications#deep-agents) framework. ### File Operations Toolsets for file operations help agents read, write, and edit files: - [`pydantic-ai-filesystem-sandbox`](https://github.com/zby/pydantic-ai-filesystem-sandbox) - `FileSystemToolset` with a sandbox and LLM-friendly errors - [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) -- Deep agent framework that includes a `FilesystemToolset` with multiple backends (in-memory, real filesystem, Docker sandbox). ### Code Execution Toolsets for code execution help agents run code in a sandboxed environment: - [`mcp-run-python`](https://github.com/pydantic/mcp-run-python) - MCP server by the Pydantic team that runs Python code in a sandboxed environment. Can be used as `MCPServerStdio('uv', args=['run', 'mcp-run-python', 'stdio'])`.
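For example, here's a minimal sketch wiring `mcp-run-python` into an agent using the command shown above; it assumes `uv` is available on your `PATH`, and the prompt is illustrative:

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Launch the sandboxed Python runner as a local MCP server over stdio
run_python = MCPServerStdio('uv', args=['run', 'mcp-run-python', 'stdio'])

agent = Agent('openai:gpt-5.2', toolsets=[run_python])
# result = agent.run_sync('Use Python to compute the 10th Fibonacci number.')
```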
### LangChain Tools If you'd like to use tools or a [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits) from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the [`LangChainToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.LangChainToolset), which takes a list of LangChain tools. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid. You will need to install the `langchain-community` package and any others required by the tools in question. ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) # ... ``` ### ACI.dev Tools If you'd like to use tools from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the [`ACIToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.ACIToolset), which takes a list of ACI tool names as well as the `linked_account_owner_id`. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid. You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the toolset as `linked_account_owner_id`. ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ```