pydantic_ai.profiles
Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used.
Whether the model supports tools.
Type: bool Default: True
Whether the model natively supports tool return schemas.
When True, the model’s API accepts a structured return schema alongside each tool definition. When False, return schemas are injected as JSON text into tool descriptions as a fallback.
Type: bool Default: False
Whether the model supports JSON schema output.
This is also referred to as ‘native’ support for structured output.
Relates to the NativeOutput output type.
Type: bool Default: False
Whether the model supports a dedicated mode to enforce JSON output, without necessarily sending a schema.
E.g. OpenAI's JSON mode.
Relates to the PromptedOutput output type.
Type: bool Default: False
Whether the model supports image output.
Type: bool Default: False
The default structured output mode to use for the model.
Type: StructuredOutputMode Default: 'tool'
The instructions template to use for prompted structured output. The ‘{schema}’ placeholder will be replaced with the JSON schema for the output.
Type: str Default: dedent("\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n ")
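A minimal sketch of how the '{schema}' placeholder in this template gets filled. The template text is the documented default; the helper and the toy schema are illustrative, not the library's internal code.

```python
import json
from textwrap import dedent

# The documented default prompted-output instructions template.
TEMPLATE = dedent("""
    Always respond with a JSON object that's compatible with this schema:

    {schema}

    Don't include any text or Markdown fencing before or after.
    """)

# A toy output schema standing in for one generated from a Pydantic model.
schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

# Replace the '{schema}' placeholder with the JSON schema for the output.
instructions = TEMPLATE.format(schema=json.dumps(schema))
```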
Whether to add the prompted output template in native structured output mode.
Type: bool Default: False
The transformer to use to make JSON schemas for tools and structured output compatible with the model.
Type: type[JsonSchemaTransformer] | None Default: None
Whether the model supports thinking/reasoning configuration.
When False, the unified thinking setting in ModelSettings is silently ignored.
Type: bool Default: False
Whether the model always uses thinking/reasoning (e.g., OpenAI o-series, DeepSeek R1).
When True, thinking=False is silently ignored since the model cannot disable thinking.
Implies supports_thinking=True.
Type: bool Default: False
The tags used to indicate thinking parts in the model's output. Defaults to ('<think>', '</think>').
Type: tuple[str, str] Default: ('<think>', '</think>')
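To illustrate how tag-delimited thinking content can be separated from the visible answer, here is a hypothetical helper (not the library's parser) that splits a raw response on the default tags:

```python
# Hypothetical helper: split a raw model response into (thinking, answer)
# using the profile's thinking_tags. Assumes at most one thinking block.
def split_thinking(
    text: str, tags: tuple[str, str] = ("<think>", "</think>")
) -> tuple[str, str]:
    start, end = tags
    if start in text and end in text:
        before, _, rest = text.partition(start)
        thinking, _, after = rest.partition(end)
        return thinking.strip(), (before + after).strip()
    return "", text.strip()

thinking, answer = split_thinking("<think>Check the docs first.</think>The answer is 42.")
```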
Whether to ignore leading whitespace when streaming a response.
This is a workaround for models that emit `<think> </think>` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` as a valid `output_type`.
This is currently only used by OpenAIChatModel, HuggingFaceModel, and GroqModel.
Type: bool Default: False
The set of native tool types that this model/profile supports.
Defaults to ALL native tools. Profile functions should explicitly restrict this based on model capabilities.
Type: frozenset[type[AbstractNativeTool]] Default: field(default_factory=(lambda: SUPPORTED_NATIVE_TOOLS))
@classmethod
def from_profile(cls, profile: ModelProfile | None) -> Self
Build a ModelProfile subclass instance from a ModelProfile instance.
def update(profile: ModelProfile | None) -> Self
Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.
Bases: ModelProfile
Profile for models used with OpenAIChatModel.
ALL FIELDS MUST BE openai_ PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
Non-standard field name used by some providers for model thinking content in Chat Completions API responses.
Plenty of providers use custom field names for thinking content. Ollama and newer versions of vLLM use reasoning,
while DeepSeek, older vLLM and some others use reasoning_content.
Notice that the thinking field configured here is currently limited to str type content.
If openai_chat_send_back_thinking_parts is set to 'field', this field must be set to a non-None value.
Type: str | None Default: None
Whether the model includes thinking content in requests.
This can be:
- 'auto' (default): Automatically detects how to send thinking content. If thinking was received in a custom field (tracked via ThinkingPart.id and ThinkingPart.provider_name), it's sent back in that same field. Otherwise, it's sent using tags. Only the reasoning and reasoning_content fields are checked by default when receiving responses. If your provider uses a different field name, you must explicitly set openai_chat_thinking_field to that field name.
- 'tags': The thinking content is included in the main content field, enclosed within thinking tags as specified in the thinking_tags profile option.
- 'field': The thinking content is included in a separate field specified by openai_chat_thinking_field.
- False: No thinking content is sent in the request.
Defaults to 'auto' to ensure thinking is sent back in the format expected by the model/provider.
Type: Literal[‘auto’, ‘tags’, ‘field’, False] Default: 'auto'
This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions.
Type: bool Default: True
Turn off to don’t send sampling settings like temperature and top_p to models that don’t support them, like OpenAI’s o-series reasoning models.
Type: bool Default: True
A list of model settings that are not supported by this model.
Type: Sequence[str] Default: ()
Whether the provider accepts the value tool_choice='required' in the request payload.
Type: bool Default: True
The role to use for the system prompt message. If not provided, defaults to 'system'.
Type: OpenAISystemPromptRole | None Default: None
Whether the Chat Completions API accepts more than one system-role message at the start of the conversation.
OpenAI itself and most compatible providers accept multiple system messages, so this defaults to True.
Set to False for strict OpenAI-compatible backends (e.g. some LiteLLM/vLLM deployments) that require
exactly one initial system message; consecutive system messages at the start will be merged into one
(joined with two newlines) before being sent.
Type: bool Default: True
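The merge fallback described above can be sketched as follows, using a simplified (role, content) message shape rather than the library's actual message types:

```python
# Sketch: when a backend accepts only one initial system message, join
# consecutive leading system messages with two newlines before sending.
def merge_leading_system(messages: list[dict]) -> list[dict]:
    leading: list[str] = []
    rest = list(messages)
    while rest and rest[0]["role"] == "system":
        leading.append(rest.pop(0)["content"])
    if leading:
        return [{"role": "system", "content": "\n\n".join(leading)}] + rest
    return rest

merged = merge_leading_system([
    {"role": "system", "content": "You are helpful."},
    {"role": "system", "content": "Answer in French."},
    {"role": "user", "content": "Bonjour!"},
])
```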
Whether the model supports web search in Chat Completions API.
Type: bool Default: False
The encoding to use for audio input in Chat Completions requests.
- 'base64': Raw base64-encoded string. (Default, used by OpenAI)
- 'uri': Data URI (e.g. data:audio/wav;base64,...).
Type: Literal[‘base64’, ‘uri’] Default: 'base64'
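The difference between the two encodings amounts to whether the base64 payload is wrapped in a data URI. A small sketch (the byte string is a stand-in for real WAV data):

```python
import base64

audio_bytes = b"RIFF....WAVEfmt "  # stand-in for real WAV file bytes
b64 = base64.b64encode(audio_bytes).decode()

raw = b64                             # 'base64': raw base64-encoded string
uri = f"data:audio/wav;base64,{b64}"  # 'uri': data URI wrapping the same payload
```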
Whether the Chat API supports file URLs directly in the file_data field.
OpenAI’s native Chat API only supports base64-encoded data, but some providers like OpenRouter support passing URLs directly.
Type: bool Default: False
Whether the model supports including encrypted reasoning content in the response.
Type: bool Default: False
Whether the model supports reasoning (o-series, GPT-5+).
When True, sampling parameters may need to be dropped depending on reasoning_effort setting.
Type: bool Default: False
Whether the model supports sampling parameters (temperature, top_p, etc.) when reasoning_effort=‘none’.
Models like GPT-5.1 and GPT-5.2 default to reasoning_effort=‘none’ and support sampling params in that mode. When reasoning is enabled (low/medium/high/xhigh), sampling params are not supported.
Type: bool Default: False
Whether the Responses API requires the status field on function tool calls to be None.
This is required by vLLM Responses API versions before https://github.com/vllm-project/vllm/pull/26706. See https://github.com/pydantic/pydantic-ai/issues/3245 for more details.
Type: bool Default: False
Whether the Responses API supports the phase field on assistant messages.
phase labels an assistant message as intermediate commentary or the final_answer. When the model
supports it, OpenAI recommends preserving and sending it back unchanged on every assistant message in
follow-up requests; dropping it can cause preambles to be interpreted as final answers and degrade
behavior in long-running or tool-heavy flows.
Supported by gpt-5.3-codex, gpt-5.4 and later mainline models. The official OpenAI Responses API
silently ignores the field on older models, but defaults to False so we don’t risk sending an
unrecognized field to OpenAI-compatible APIs (vLLM, Bifrost, …) that haven’t been verified to accept it.
Type: bool Default: False
Whether the Chat Completions API supports document content parts (type='file').
Some OpenAI-compatible providers (e.g. Azure) do not support document input via the Chat Completions API.
Type: bool Default: True
Bases: JsonSchemaTransformer
Recursively handle the schema to make it compatible with OpenAI strict mode.
See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires:
- additionalProperties must be set to false for each object in the parameters
- all fields in properties must be marked as required
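A minimal recursive sketch of those two requirements. This is illustrative only, not the library's transformer, and it ignores combinators like anyOf/$ref that a real implementation must also handle:

```python
# Recursively apply OpenAI strict-mode rules to a JSON schema:
# objects get additionalProperties=false and all properties marked required.
def strictify(schema: dict) -> dict:
    out = dict(schema)
    if out.get("type") == "object":
        props = out.get("properties", {})
        out["additionalProperties"] = False
        out["required"] = list(props)
        out["properties"] = {k: strictify(v) for k, v in props.items()}
    elif out.get("type") == "array" and isinstance(out.get("items"), dict):
        out["items"] = strictify(out["items"])
    return out

strict = strictify({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
})
```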
def openai_model_profile(model_name: str) -> ModelProfile
Get the model profile for an OpenAI model.
Maps unified thinking values to OpenAI reasoning_effort strings.
Type: dict[ThinkingLevel, str] Default: {True: 'medium', False: 'none', 'minimal': 'minimal', 'low': 'low', 'medium': 'medium', 'high': 'high', 'xhigh': 'xhigh'}
Sampling parameter names that are incompatible with reasoning.
These parameters are not supported when reasoning is enabled (reasoning_effort != ‘none’). See https://platform.openai.com/docs/guides/reasoning for details.
Default: ('temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'openai_logprobs', 'openai_top_logprobs')
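The filtering this implies can be sketched as follows; the parameter tuple is a subset of the documented list, and the helper is hypothetical:

```python
# Sketch: when reasoning is enabled (reasoning_effort != 'none'), drop the
# sampling parameters documented as incompatible with reasoning.
INCOMPATIBLE_WITH_REASONING = (
    "temperature", "top_p", "presence_penalty", "frequency_penalty", "logit_bias",
)

def filter_settings(settings: dict, reasoning_effort: str) -> dict:
    if reasoning_effort == "none":
        return dict(settings)
    return {k: v for k, v in settings.items() if k not in INCOMPATIBLE_WITH_REASONING}

filtered = filter_settings({"temperature": 0.2, "max_tokens": 100}, "high")
```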
Bases: ModelProfile
Profile for models used with AnthropicModel.
ALL FIELDS MUST BE anthropic_ PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
Whether the model supports fast inference speed (anthropic_speed='fast').
Currently only Claude Opus 4.6 supports fast mode. See the Anthropic docs for the latest list.
Type: bool Default: False
Whether the model supports adaptive thinking (Sonnet 4.6+, Opus 4.6+).
When True, unified thinking translates to {'type': 'adaptive'}.
When False, it translates to {'type': 'enabled', 'budget_tokens': N}.
Type: bool Default: False
Whether the model supports the effort parameter in output_config (Opus 4.5+, Sonnet 4.6+).
When True and the unified thinking level is a string (e.g. ‘high’), it is also
mapped to output_config.effort.
Type: bool Default: False
Whether the model supports the xhigh effort value in output_config.
Claude Opus 4.7 adds xhigh; older Anthropic models should use max instead.
Type: bool Default: False
Whether the model rejects budget-based thinking settings.
Claude Opus 4.7+ requires adaptive thinking and returns a 400 for
{'type': 'enabled', 'budget_tokens': ...}.
Type: bool Default: False
Whether the model rejects sampling settings like temperature and top_p.
Claude Opus 4.7+ requires these settings to be omitted from request payloads.
Type: bool Default: False
The Anthropic code execution tool version used when anthropic_code_execution_tool_version='auto'.
Type: AnthropicCodeExecutionToolVersion Default: '20250825'
The Anthropic code execution tool versions supported by the model.
Type: tuple[AnthropicCodeExecutionToolVersion, …] Default: ('20250825',)
Whether the model supports output_config.task_budget.
Anthropic currently documents task budgets as a Claude Opus 4.7 beta feature.
Type: bool Default: False
def anthropic_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for an Anthropic model.
Concrete Anthropic code execution tool version to send for CodeExecutionTool.
Type: TypeAlias Default: Literal['20250825', '20260120']
Maps unified thinking values to Anthropic budget_tokens for non-adaptive models.
Type: dict[ThinkingLevel, int] Default: {True: 10000, 'minimal': 1024, 'low': 2048, 'medium': 10000, 'high': 16384, 'xhigh': 32768}
Bases: ModelProfile
Profile for models used with GoogleModel.
ALL FIELDS MUST BE google_ PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
Whether the model supports combining function declarations with native tools and response_schema.
Gemini 3+ supports all tool combinations:
- function_declarations + native_tools
- output_tools (function declarations) + native_tools
- response_schema (NativeOutput) + function_declarations See https://ai.google.dev/gemini-api/docs/tool-combination
Type: bool Default: False
Whether the model accepts the include_server_side_tool_invocations tool-config field.
When enabled, Gemini emits explicit tool_call/tool_response parts for server-side native tools (Google Search, URL Context, File Search) that we round-trip through NativeToolCallPart / NativeToolReturnPart. Pre-Gemini-3 models reject the field with 'Tool call context circulation is not enabled'.
Distinct from google_supports_tool_combination even though both currently flip on for Gemini 3+: the former gates the SDK request field, the latter gates which combinations of native / function / output tools are allowed in the same request.
Type: bool Default: False
Deprecated: use google_supports_tool_combination instead.
Type: bool | None Default: None
MIME types supported in native FunctionResponseDict.parts. See https://ai.google.dev/gemini-api/docs/function-calling#multimodal-function-responses
Type: tuple[str, …] Default: ()
Whether the model uses thinking_level (enum: LOW/MEDIUM/HIGH) instead of thinking_budget (int).
Gemini 3+ models use thinking_level; Gemini 2.5 uses thinking_budget.
Type: bool Default: False
Bases: JsonSchemaTransformer
Transforms the JSON Schema from Pydantic to be suitable for Gemini.
Gemini supports a subset of OpenAPI v3.0.3.
def google_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a Google model.
def meta_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a Meta model.
def amazon_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for an Amazon model.
def deepseek_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a DeepSeek model.
Bases: ModelProfile
Profile for Grok models (used with both GrokProvider and XaiProvider).
ALL FIELDS MUST BE grok_ PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
Whether the model supports builtin tools (web_search, x_search, code_execution, mcp).
Type: bool Default: False
Whether the provider accepts the value tool_choice='required' in the request payload.
Type: bool Default: True
def grok_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a Grok model.
def mistral_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a Mistral model.
def qwen_model_profile(model_name: str) -> ModelProfile | None
Get the model profile for a Qwen model.