pydantic_ai.models.ollama
For details on how to set up authentication with this model, see model configuration for Ollama.
Ollama model implementation using OpenAI-compatible API.
Bases: OpenAIChatModel
A model that uses Ollama’s OpenAI-compatible Chat Completions API.
Self-hosted Ollama (v0.5.0+) honors response_format with json_schema via
llama.cpp’s grammar-constrained decoder, so NativeOutput produces
schema-valid output at generation time.
Ollama Cloud currently accepts response_format with json_schema without
error but does not enforce the schema upstream (see
pydantic-ai#4917 and
ollama/ollama#12362). When
this model detects a Cloud path — either a base_url on ollama.com or a
model name ending in -cloud — it disables supports_json_schema_output
on the resolved profile. With that flag off,
NativeOutput raises a clear
UserError so users pick a mode that
actually works on Cloud (ToolOutput —
the default — and PromptedOutput are
both verified to work).
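The Cloud-path detection described above can be sketched as a small standalone predicate. This is a hedged illustration, not the module's actual code; `is_ollama_cloud` is a hypothetical name introduced here:

```python
from urllib.parse import urlparse

def is_ollama_cloud(base_url: str, model_name: str) -> bool:
    """Mirror the documented heuristic: a base_url on ollama.com,
    or a model name ending in '-cloud', indicates Ollama Cloud."""
    host = urlparse(base_url).hostname or ''
    on_cloud_host = host == 'ollama.com' or host.endswith('.ollama.com')
    return on_cloud_host or model_name.endswith('-cloud')
```

Under this heuristic, `is_ollama_cloud('http://localhost:11434/v1', 'qwen3')` is false (self-hosted, schema enforcement available), while either a `-cloud` model name or an ollama.com base URL flags the Cloud path and disables `supports_json_schema_output`.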
Apart from __init__, all methods are inherited from the base class.
def __init__(
    self,
    model_name: str,
    provider: Literal['ollama'] | Provider[AsyncOpenAI] = 'ollama',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
Initialize an Ollama model.
model_name : str
The name of the Ollama model to use (e.g. 'qwen3', 'llama3.2').
provider : Literal['ollama'] | Provider[AsyncOpenAI] Default: 'ollama'
The provider to use. Defaults to 'ollama'.
profile : ModelProfileSpec | None Default: None
The model profile to use. Defaults to a profile picked by the provider based on the model name,
adjusted to disable supports_json_schema_output when the request routes through Ollama Cloud.
settings : ModelSettings | None Default: None
Model-specific settings that will be used as defaults for this model.
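A minimal usage sketch, assuming the class documented here is exported as `OllamaModel` and combined with a Pydantic output type via an `Agent` (the `CityInfo` model is invented for illustration):

```python
from pydantic import BaseModel

from pydantic_ai import Agent
# Assumed export name for the class documented on this page.
from pydantic_ai.models.ollama import OllamaModel

class CityInfo(BaseModel):
    city: str
    country: str

# Self-hosted Ollama: NativeOutput works, since llama.cpp enforces the
# schema at generation time. ToolOutput (the default) also works.
model = OllamaModel('qwen3')
agent = Agent(model, output_type=CityInfo)

# Ollama Cloud: the '-cloud' suffix disables supports_json_schema_output
# on the resolved profile, so use the default ToolOutput or PromptedOutput;
# NativeOutput would raise UserError.
cloud_model = OllamaModel('qwen3-cloud')
cloud_agent = Agent(cloud_model, output_type=CityInfo)
```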