pydantic_ai.models.anthropic

Setup

For details on how to set up authentication with this model, see model configuration for Anthropic.

AnthropicModelSettings

Bases: ModelSettings

Settings used for an Anthropic model request.

Attributes

anthropic_metadata

An object describing metadata about the request.

Contains user_id, an external identifier for the user who is associated with the request.

Type: BetaMetadataParam

anthropic_thinking

Determines whether the model should generate a thinking block.

See the Anthropic docs for more information.

Type: BetaThinkingConfigParam

anthropic_cache_tool_definitions

Whether to add cache_control to the last tool definition.

When enabled, the last tool in the tools array will have cache_control set, allowing Anthropic to cache tool definitions and reduce costs. If True, uses TTL='5m'. You can also specify '5m' or '1h' directly. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information.

Type: bool | Literal['5m', '1h']

anthropic_service_tier

The service tier to use for the model request.

See https://docs.anthropic.com/en/docs/build-with-claude/latency-and-throughput for more information.

Type: Literal['auto', 'standard_only']

anthropic_cache_instructions

Whether to add cache_control to the last system prompt block.

When enabled, the last system prompt will have cache_control set, allowing Anthropic to cache system instructions and reduce costs. If True, uses TTL='5m'. You can also specify '5m' or '1h' directly. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information.

Type: bool | Literal['5m', '1h']

anthropic_cache_messages

Whether to add cache_control to the last message content block.

This is an alternative to anthropic_cache for Anthropic-compatible gateways and proxies that accept the Anthropic message format but don’t support the top-level automatic caching parameter.

If True, uses TTL='5m'. You can also specify '5m' or '1h' directly. Cannot be combined with anthropic_cache.

Type: bool | Literal['5m', '1h']

anthropic_cache

Enable prompt caching for multi-turn conversations.

Passes a top-level cache_control parameter so the server automatically applies a cache breakpoint to the last cacheable block and moves it forward as conversations grow.

On Bedrock and Vertex, automatic caching is not yet supported, so this falls back to per-block caching on the last user message. If the last content block already has cache_control from an explicit CachePoint, it is preserved.

If True, uses TTL='5m'. You can also specify '5m' or '1h' directly.

This can be combined with explicit cache breakpoints (anthropic_cache_instructions, anthropic_cache_tool_definitions, CachePoint). The automatic breakpoint counts as 1 of Anthropic’s 4 cache point slots; we automatically trim excess explicit breakpoints. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching for more information.

Type: bool | Literal['5m', '1h']

anthropic_effort

The effort level for the model to use when generating a response.

See the Anthropic docs for more information.

Type: Literal['low', 'medium', 'high', 'xhigh', 'max'] | None

anthropic_task_budget

Task budget configuration for Claude Opus 4.7 beta requests.

Maps to output_config.task_budget. This setting is currently only supported on claude-opus-4-7, and Pydantic AI automatically enables Anthropic’s required task-budget beta when it is present.

Omit 'remaining' unless you are intentionally carrying a budget across compaction or other rewritten context.

Type: AnthropicTaskBudget

anthropic_container

Container configuration for multi-turn conversations.

By default, if previous messages contain a container_id (from a prior response), it will be reused automatically.

Set to False to force a fresh container (ignore any container_id from history). Set to a container id string (e.g. 'container_xxx') to explicitly reuse a container, or to a BetaContainerParams dict (e.g. {'skills': [...]} or {'id': 'container_xxx', 'skills': [...]}) when passing Skills to the Anthropic Skills beta.

Type: BetaContainerParams | str | Literal[False]

anthropic_code_execution_tool_version

Which Anthropic code execution tool version to send for CodeExecutionTool.

Defaults to 'auto', which uses the default version from the model profile: '20260120' for Sonnet 4.5+ and Opus 4.5+, otherwise '20250825'. Set a concrete version to force that tool version; a UserError is raised if the selected model profile does not support that version.

Type: AnthropicCodeExecutionToolVersion | Literal['auto']

anthropic_eager_input_streaming

Whether to enable eager input streaming on tool definitions.

When enabled, all tool definitions will have eager_input_streaming set to True, allowing Anthropic to stream tool call arguments incrementally instead of buffering the entire JSON before streaming. This reduces latency for tool calls with large inputs. See https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming for more information.

Type: bool

anthropic_betas

List of Anthropic beta features to enable for API requests.

Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. for builtin tools) and any betas from extra_headers['anthropic-beta']. See the Anthropic docs for available beta features.

Type: list[AnthropicBetaParam]

anthropic_speed

The inference speed mode for this request.

'fast' enables high output-tokens-per-second inference for supported models (currently Claude Opus 4.6 only). On unsupported models or clients, anthropic_speed='fast' is ignored with a UserWarning. Fast mode is a research preview and only available on the direct Anthropic API (not Bedrock, Vertex, or Foundry); see the Anthropic docs for details. Note: switching between 'fast' and 'standard' invalidates the prompt cache.

Type: Literal['standard', 'fast']

anthropic_context_management

Context management configuration for automatic compaction.

When configured, Anthropic will automatically compact older context when the input token count exceeds the configured threshold. The compaction produces a summary that replaces the compacted messages.

See the Anthropic docs for more details.

Type: BetaContextManagementConfigParam

AnthropicModel

Bases: Model[AsyncAnthropicClient]

A model that uses the Anthropic API.

Internally, this uses the Anthropic Python client to interact with the API.

Apart from __init__, all methods are private or match those of the base class.

Attributes

model_name

The model name.

Type: AnthropicModelName

system

The model provider.

Type: str

Methods

__init__
def __init__(
    model_name: AnthropicModelName,
    provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)

Initialize an Anthropic model.

Parameters

model_name : AnthropicModelName

The name of the Anthropic model to use. List of model names available here.

provider : Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] Default: 'anthropic'

The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of Provider[AsyncAnthropicClient]. Defaults to 'anthropic'.

profile : ModelProfileSpec | None Default: None

The model profile to use. Defaults to a profile picked by the provider based on the model name. The default 'anthropic' provider uses anthropic_model_profile from pydantic_ai.profiles.anthropic.

settings : ModelSettings | None Default: None

Default model settings for this model instance.

supported_native_tools

@classmethod
def supported_native_tools(cls) -> frozenset[type[AbstractNativeTool]]

The set of builtin tool types this model can handle.

Returns

frozenset[type[AbstractNativeTool]]

AnthropicCompaction

Bases: AbstractCapability[AgentDepsT]

Compaction capability for Anthropic models.

Configures automatic context management via Anthropic’s context_management API parameter. Compaction triggers server-side when input tokens exceed the configured threshold.

Example usage:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicCompaction

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[AnthropicCompaction(token_threshold=100_000)],
)

Methods

__init__
def __init__(
    token_threshold: int = 150000,
    instructions: str | None = None,
    pause_after_compaction: bool = False,
) -> None

Initialize the Anthropic compaction capability.

Returns

None

Parameters

token_threshold : int Default: 150000

Compact when input tokens exceed this threshold. Minimum 50,000.

instructions : str | None Default: None

Custom instructions for the compaction summarization.

pause_after_compaction : bool Default: False

If True, the response will stop after the compaction block with stop_reason='compaction', allowing explicit handling.

AnthropicStreamedResponse

Bases: StreamedResponse

Implementation of StreamedResponse for Anthropic models.

Attributes

model_name

Get the model name of the response.

Type: AnthropicModelName

provider_name

Get the provider name.

Type: str

provider_url

Get the provider base URL.

Type: str

timestamp

Get the timestamp of the response.

Type: datetime

LatestAnthropicModelNames

Anthropic model names from the installed SDK.

Default: ModelParam

AnthropicModelName

Possible Anthropic model names.

The installed Anthropic SDK exposes the current literal set and still allows arbitrary string model names. See the Anthropic docs for a full list.

Default: LatestAnthropicModelNames

AnthropicTaskBudget

Anthropic task budget payload for output_config.task_budget.

Type: TypeAlias Default: BetaTokenTaskBudgetParam