# Image, Audio, Video & Document Input

## Image Input

Note

Some models do not support image input. Please check the model's documentation to confirm whether it supports image input.

If you have a direct URL for the image, you can use [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl):

image\_input.py

```python
from pydantic_ai import Agent, ImageUrl

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
    [
        'What company is this logo from?',
        ImageUrl(url='https://iili.io/3Hs4FMg.png'),
    ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

If you have the image locally, you can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent):

local\_image\_input.py

```python
import httpx

from pydantic_ai import Agent, BinaryContent

image_response = httpx.get('https://iili.io/3Hs4FMg.png')  # Pydantic logo

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
  [
      'What company is this logo from?',
      BinaryContent(data=image_response.content, media_type='image/png'),  # (1)
  ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents.

## Audio Input

Note

Some models do not support audio input. Please check the model's documentation to confirm whether it supports audio input.

You can provide audio input using either [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Video Input

Note

Some models do not support video input. Please check the model's documentation to confirm whether it supports video input.

You can provide video input using either [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Document Input

Note

Some models do not support document input. Please check the model's documentation to confirm whether it supports document input.

You can provide document input using either [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is similar to the examples above.

If you have a direct URL for the document, you can use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl):

document\_input.py

```python
from pydantic_ai import Agent, DocumentUrl

agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'),
    ]
)
print(result.output)
#> This document is the technical report introducing Gemini 1.5, Google's latest large language model...
```

The supported document formats vary by model.

You can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) to pass document data directly:

binary\_content\_input.py

```python
from pathlib import Path
from pydantic_ai import Agent, BinaryContent

pdf_path = Path('document.pdf')
agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'),
    ]
)
print(result.output)
#> The document discusses...
```

Tip

If neither `DocumentUrl` nor `BinaryContent` is suitable for your use case (e.g., the model doesn't support `DocumentUrl`, or you want to provide a document in a non-binary format), you can still provide document content as text input by extracting the text yourself and passing it as a string or [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent).

## Text Input

You can use [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent) to provide text input with additional metadata:

text\_content\_input.py

```python
from pydantic_ai import Agent, TextContent

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync([
    'Summarize the key points from this text.',
    TextContent(
        content=(
            'Pydantic AI is a Python agent framework. '
            'It supports text, image, audio, video, and document input.'
        ),
        metadata={'source': 'pydantic_ai_inputs.txt'},
    ),
])
```

This is equivalent to passing the text as a `str`, but allows you to include additional `metadata` that can be accessed programmatically in your agent logic.

Note

The `content` field is treated as input to the model, but the `metadata` is **not sent to the model**. It is preserved in messages for programmatic access.

## User-side download vs. direct file URL

When using one of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI will default to sending the URL to the model provider, so the file is downloaded on their side.

Support for file URLs varies depending on type and provider:

Model

Send URL directly

Download and send bytes

Unsupported

[`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel)

`ImageUrl`

`AudioUrl`, `DocumentUrl`

`VideoUrl`. `DocumentUrl` [not supported with `AzureProvider`](/docs/ai/models/openai#using-azure-with-the-responses-api)

[`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel)

`ImageUrl`, `AudioUrl`, `DocumentUrl`

--

`VideoUrl`

[`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel)

`ImageUrl`, `DocumentUrl` (PDF)

`DocumentUrl` (`text/plain`)

`AudioUrl`, `VideoUrl`

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (Google Cloud)

All URL types

--

--

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (Gemini API)

[YouTube](/docs/ai/models/google#document-image-audio-and-video-input), [Files API](/docs/ai/models/google#document-image-audio-and-video-input)

All other URLs

--

[`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel)

`ImageUrl`

`DocumentUrl`

`AudioUrl`, `VideoUrl`

[`MistralModel`](/docs/ai/api/models/mistral/#pydantic_ai.models.mistral.MistralModel)

`ImageUrl`, `DocumentUrl` (PDF)

--

`AudioUrl`, `VideoUrl`, `DocumentUrl` (non-PDF)

[`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel)

S3 URLs (`s3://`)

`ImageUrl`, `DocumentUrl`, `VideoUrl`

`AudioUrl`

[`OpenRouterModel`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModel)

`ImageUrl`, `DocumentUrl`, `VideoUrl`

`AudioUrl`

--

A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) on Google Cloud limits YouTube video URLs to one URL per request.

In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object:

force\_download.py

```py
from pydantic_ai import ImageUrl, AudioUrl, VideoUrl, DocumentUrl

ImageUrl(url='https://example.com/image.png', force_download=True)
AudioUrl(url='https://example.com/audio.mp3', force_download=True)
VideoUrl(url='https://example.com/video.mp4', force_download=True)
DocumentUrl(url='https://example.com/doc.pdf', force_download=True)
```

Trust model for file URLs

When URLs are forwarded to the provider, the provider fetches them under its own credentials. For cloud-storage schemes like `s3://` (Bedrock) and `gs://` (Google Cloud), those credentials are your server's IAM role or service account, so whoever controls the URL effectively controls what the provider can read on your behalf.

Don't construct [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl), [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl), or [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) from untrusted user input without validating the scheme and scope. For frontend-initiated uploads to cloud storage, convert references like `s3://bucket/key` into pre-signed `https://` URLs server-side before constructing the file URL part. `force_download=True` only works for `http(s)://` URLs (it routes through the library's HTTP client and applies SSRF protection); cloud-storage schemes like `s3://` and `gs://` aren't supported by the local download path and are forwarded to the provider as-is. Only use `force_download='allow-local'` for server-authored URLs, since it allows local network access.

The [UI adapters](/docs/ai/integrations/ui/overview) apply this sanitization automatically to client-submitted messages via [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes) and [`UIAdapter.allowed_file_url_force_download`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_force_download).

## Uploaded Files

Some model providers have their own file storage APIs where you can upload files and reference them by ID or URL.

Use [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) to reference files that have been uploaded to a provider's file storage API.

Tip

For providers that return a file URL (like Google Files API or S3 URLs for Bedrock), you can also use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl), [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), or [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) directly. However, we recommend using `UploadedFile` for a unified API across providers and consistent provider name validation.

### Supported Models

Model

Support

[`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel)

✅ via [Anthropic Files API](https://docs.anthropic.com/en/docs/build-with-claude/files)

[`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel)

✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files)

[`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel)

✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files)

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel)

✅ via [Google Files API](https://ai.google.dev/gemini-api/docs/files)

[`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel)

✅ via S3 URLs (`s3://bucket/key`)

[`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel)

✅ via [xAI Files API](https://docs.x.ai/docs/guides/files)

Other models

❌ Not supported

### Provider Name Requirement

When using [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) you must set the `provider_name`. Uploaded files are specific to the system they are uploaded to and are not transferable across providers. Trying to use a message that contains an `UploadedFile` with a different provider will result in an error.

Getting the provider name

Use [`model.system`](/docs/ai/api/models/base/#pydantic_ai.models.Model.system) to get the correct provider name dynamically. This ensures your code works correctly even if the provider name changes. All examples below demonstrate this pattern.

If you want to introduce portability into your agent logic to allow the same prompt history to work with different provider backends, you can use a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) to remove or rewrite `UploadedFile` parts from messages before sending them to a provider that does not support them. Be aware that stripping out `UploadedFile` instances might confuse the model, especially if references to those files remain in the text.

### Media Type Inference

The `media_type` parameter is optional for [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile). If not specified, Pydantic AI will attempt to infer it from the `file_id`:

1.  If `file_id` is a URL or path with a recognizable file extension (e.g., `.pdf`, `.png`), the media type is inferred automatically
2.  For opaque file IDs (e.g., `'file-abc123'`), the media type defaults to `'application/octet-stream'`

Tip

While `media_type` is optional, we recommend explicitly setting it when known to ensure correct handling by the model provider.

### Anthropic

Follow the [Anthropic Files API docs](https://docs.anthropic.com/en/docs/build-with-claude/files) to upload files. You can access the underlying Anthropic client via `provider.client`.

Beta Feature

The Anthropic Files API is currently in beta. You need to include the beta header `anthropic-beta: files-api-2025-04-14` when making requests.

uploaded\_file\_anthropic.py

```py
import asyncio

from pydantic_ai import Agent, ModelSettings, UploadedFile
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider


async def main():
    provider = AnthropicProvider()
    model = AnthropicModel('claude-sonnet-4-5', provider=provider)

    # Upload a file using the provider's client (Anthropic client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.beta.files.upload(file=f)

    # Reference the uploaded file, including the required beta header
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ],
        model_settings=ModelSettings(extra_headers={'anthropic-beta': 'files-api-2025-04-14'}),
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### OpenAI

Follow the [OpenAI Files API docs](https://platform.openai.com/docs/api-reference/files/create) to upload files. You can access the underlying OpenAI client via `provider.client`.

uploaded\_file\_openai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider()
    model = OpenAIChatModel('gpt-5', provider=provider)

    # Upload a file using the provider's client (OpenAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.create(file=f, purpose='user_data')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Google

Follow the [Google Files API docs](https://ai.google.dev/gemini-api/docs/files) to upload files. You can access the underlying Google GenAI client via `provider.client`.

uploaded\_file\_google.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider


async def main():
    provider = GoogleProvider()
    model = GoogleModel('gemini-2.5-flash', provider=provider)

    # Upload a file using the provider's client (Google GenAI client)
    with open('document.pdf', 'rb') as f:
        file = await provider.client.aio.files.upload(file=f)
        assert file.uri is not None

    # Reference the uploaded file by URI (media_type is optional for Google)
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=file.uri, media_type=file.mime_type, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Bedrock (S3)

For Bedrock, files must be uploaded to S3 separately (e.g., using [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object.html)). The assumed role must have `s3:GetObject` permission on the bucket.

`media_type` may be required

Bedrock requires `media_type` when the file extension is ambiguous or missing. For S3 URLs with clear extensions like `.pdf`, `.png`, etc., it can be inferred automatically.

uploaded\_file\_bedrock.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.bedrock import BedrockConverseModel


async def main():
    model = BedrockConverseModel('us.anthropic.claude-sonnet-4-20250514-v1:0')

    agent = Agent(model)
    result = await agent.run([
        'Summarize this document',
        UploadedFile(
            file_id='s3://my-bucket/document.pdf',
            provider_name=model.system,  # 'bedrock'
            media_type='application/pdf',  # Optional for .pdf, but recommended
        ),
    ])
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

Note

You can optionally specify a `bucketOwner` query parameter if the bucket is not owned by the account making the request: `s3://my-bucket/document.pdf?bucketOwner=123456789012`

### xAI

Follow the [xAI Files API docs](https://docs.x.ai/docs/guides/files) to upload files. You can access the underlying xAI client via `provider.client`.

uploaded\_file\_xai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.xai import XaiModel
from pydantic_ai.providers.xai import XaiProvider


async def main():
    provider = XaiProvider()
    model = XaiModel('grok-4.3', provider=provider)

    # Upload a file using the provider's client (xAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.upload(f, filename='document.pdf')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```