pydantic_ai.models.anthropic
For details on how to set up authentication with this model, see model configuration for Anthropic.
Bases: ModelSettings
Settings used for an Anthropic model request.
An object describing metadata about the request.
Contains user_id, an external identifier for the user who is associated with the request.
Type: BetaMetadataParam
Whether the model should generate a thinking block.
See the Anthropic docs for more information.
Type: BetaThinkingConfigParam
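Since the settings class is a TypedDict, the two fields above can be written as plain dict entries. A minimal sketch, assuming the setting keys are anthropic_metadata and anthropic_thinking (inferred from the BetaMetadataParam and BetaThinkingConfigParam field types; check them against the installed pydantic_ai version):

```python
# Sketch only: settings as a plain dict. The key names 'anthropic_metadata'
# and 'anthropic_thinking' are assumptions inferred from the field types.
settings = {
    # Request metadata: an external identifier for the end user.
    'anthropic_metadata': {'user_id': 'user-1234'},
    # Ask the model to emit a thinking block with a token budget.
    'anthropic_thinking': {'type': 'enabled', 'budget_tokens': 1024},
}
```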
Whether to add cache_control to the last tool definition.
When enabled, the last tool in the tools array will have cache_control set,
allowing Anthropic to cache tool definitions and reduce costs.
If True, uses TTL='5m'. You can also specify '5m' or '1h' directly.
See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information.
Type: bool | Literal['5m', '1h']
Whether to add cache_control to the last system prompt block.
When enabled, the last system prompt will have cache_control set,
allowing Anthropic to cache system instructions and reduce costs.
If True, uses TTL='5m'. You can also specify '5m' or '1h' directly.
See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information.
Type: bool | Literal['5m', '1h']
Deprecated: use anthropic_cache instead.
Behaves the same as anthropic_cache: uses automatic caching where supported,
falls back to per-block caching on Bedrock and Vertex. Emits a deprecation warning.
Cannot be combined with anthropic_cache.
Type: bool | Literal['5m', '1h']
Enable prompt caching for multi-turn conversations.
Passes a top-level cache_control parameter so the server automatically applies a
cache breakpoint to the last cacheable block and moves it forward as conversations grow.
On Bedrock and Vertex, automatic caching is not yet supported, so this falls back to
per-block caching on the last user message. If the last content block already has
cache_control from an explicit CachePoint, it is preserved.
If True, uses TTL='5m'. You can also specify '5m' or '1h' directly.
This can be combined with explicit cache breakpoints (anthropic_cache_instructions,
anthropic_cache_tool_definitions, CachePoint). The automatic breakpoint counts as
1 of Anthropic’s 4 cache point slots; we automatically trim excess explicit breakpoints.
See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching
for more information.
Type: bool | Literal['5m', '1h']
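A caching-settings sketch combining the automatic breakpoint with the explicit ones described above (anthropic_cache, anthropic_cache_instructions, anthropic_cache_tool_definitions); True means the default 5m TTL, or pass '5m'/'1h' directly:

```python
# Prompt-caching settings sketch, written as a plain dict (the settings
# class is a TypedDict).
settings = {
    'anthropic_cache': True,                   # automatic breakpoint, 5m TTL
    'anthropic_cache_instructions': '1h',      # cache the system prompt for 1h
    'anthropic_cache_tool_definitions': True,  # cache tool definitions, 5m TTL
}
```

The automatic breakpoint plus the two explicit ones here use 3 of Anthropic's 4 cache point slots.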
The effort level for the model to use when generating a response.
See the Anthropic docs for more information.
Type: Literal['low', 'medium', 'high', 'xhigh', 'max'] | None
Container configuration for multi-turn conversations.
By default, if previous messages contain a container_id (from a prior response), it will be reused automatically.
Set to False to force a fresh container (ignore any container_id from history).
Set to a container id string (e.g. 'container_xxx') to explicitly reuse a container,
or to a BetaContainerParams dict (e.g. {'skills': [...]} or
{'id': 'container_xxx', 'skills': [...]}) when passing Skills to the Anthropic
Skills beta.
Type: BetaContainerParams | str | Literal[False]
Whether to enable eager input streaming on tool definitions.
When enabled, all tool definitions will have eager_input_streaming set to True,
allowing Anthropic to stream tool call arguments incrementally instead of buffering
the entire JSON before streaming. This reduces latency for tool calls with large inputs.
See https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming for more information.
Type: bool
List of Anthropic beta features to enable for API requests.
Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. builtin tools) and any betas from extra_headers['anthropic-beta']. See the Anthropic docs for available beta features.
Type: list[AnthropicBetaParam]
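A sketch of the beta list as a plain dict entry; the key name anthropic_betas is an assumption inferred from the list[AnthropicBetaParam] type above and should be checked against the installed pydantic_ai version:

```python
# Beta-features sketch. NOTE: the key name 'anthropic_betas' is an
# assumption inferred from the field type, not confirmed by these docs.
settings = {
    'anthropic_betas': [
        'interleaved-thinking-2025-05-14',  # known beta name from the docs
        'my-custom-beta',                   # arbitrary strings are allowed
    ],
}
```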
Context management configuration for automatic compaction.
When configured, Anthropic will automatically compact older context when the input token count exceeds the configured threshold. The compaction produces a summary that replaces the compacted messages.
See the Anthropic docs for more details.
Type: BetaContextManagementConfigParam
Bases: Model[AsyncAnthropicClient]
A model that uses the Anthropic API.
Internally, this uses the Anthropic Python client to interact with the API.
Apart from __init__, all methods are private or match those of the base class.
The model name.
Type: AnthropicModelName
The model provider.
Type: str
def __init__(
model_name: AnthropicModelName,
provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic',
profile: ModelProfileSpec | None = None,
settings: ModelSettings | None = None,
)
Initialize an Anthropic model.
model_name : AnthropicModelName
The name of the Anthropic model to use. A list of available model names can be found in the Anthropic docs.
provider : Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] Default: 'anthropic'
The provider to use for the Anthropic API. Can be either the string 'anthropic' or an
instance of Provider[AsyncAnthropicClient]. Defaults to 'anthropic'.
profile : ModelProfileSpec | None Default: None
The model profile to use. Defaults to a profile picked by the provider based on the model name.
The default 'anthropic' provider uses anthropic_model_profile from pydantic_ai.profiles.anthropic.
settings : ModelSettings | None Default: None
Default model settings for this model instance.
@classmethod
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
The set of builtin tool types this model can handle.
frozenset[type[AbstractBuiltinTool]]
Bases: AbstractCapability[AgentDepsT]
Compaction capability for Anthropic models.
Configures automatic context management via Anthropic’s context_management
API parameter. Compaction triggers server-side when input tokens exceed
the configured threshold.
Example usage::

    from pydantic_ai import Agent
    from pydantic_ai.models.anthropic import AnthropicCompaction

    agent = Agent(
        'anthropic:claude-sonnet-4-6',
        capabilities=[AnthropicCompaction(token_threshold=100_000)],
    )
def __init__(
token_threshold: int = 150000,
instructions: str | None = None,
pause_after_compaction: bool = False,
) -> None
Initialize the Anthropic compaction capability.
token_threshold : int Default: 150000
Compact when input tokens exceed this threshold. Minimum 50,000.
instructions : str | None Default: None
Custom instructions for the compaction summarization.
pause_after_compaction : bool Default: False
If True, the response will stop after the compaction block
with stop_reason='compaction', allowing explicit handling.
Bases: StreamedResponse
Implementation of StreamedResponse for Anthropic models.
Get the model name of the response.
Type: AnthropicModelName
Get the provider name.
Type: str
Get the provider base URL.
Type: str
Get the timestamp of the response.
Type: datetime
Anthropic model names from the installed SDK.
Default: ModelParam
Possible Anthropic model names.
The installed Anthropic SDK exposes the current literal set and still allows arbitrary string model names. See the Anthropic docs for a full list.
Default: LatestAnthropicModelNames