pydantic_ai.models.google

Interface that uses the google-genai package under the hood to access Google’s Gemini models via both the Generative Language API and Vertex AI.

Setup

For details on how to set up authentication with this model, see model configuration for Google.

GoogleModelSettings

Bases: ModelSettings

Settings used for a Gemini model request.

Attributes

google_safety_settings

The safety settings to use for the model.

See https://ai.google.dev/gemini-api/docs/safety-settings for more information.

Type: list[SafetySettingDict]

google_thinking_config

The thinking configuration to use for the model.

See https://ai.google.dev/gemini-api/docs/thinking for more information.

Type: ThinkingConfigDict

google_labels

User-defined metadata to break down billed charges. Only supported by the Vertex AI API.

See the Gemini API docs for use cases and limitations.

Type: dict[str, str]

google_video_resolution

The video resolution to use for the model.

See https://ai.google.dev/api/generate-content#MediaResolution for more information.

Type: MediaResolution

google_cached_content

The name of the cached content to use for the model.

See https://ai.google.dev/gemini-api/docs/caching for more information.

Type: str

google_logprobs

Include log probabilities in the response.

See https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens for more information.

Note: Only supported for Vertex AI and non-streaming requests.

These will be included in ModelResponse.provider_details['logprobs'].

Type: bool

google_top_logprobs

Include log probabilities of the top n tokens in the response.

See https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens for more information.

Note: Only supported for Vertex AI and non-streaming requests.

These will be included in ModelResponse.provider_details['logprobs'].

Type: int

google_service_tier

Vertex AI routing for Provisioned Throughput and Flex PayGo. Defaults to 'pt_then_on_demand'.

See GoogleServiceTier for all values, headers sent, and links to Google docs.

Type: GoogleServiceTier
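
Since ModelSettings is a TypedDict, GoogleModelSettings values can be written as plain dicts. A minimal sketch combining a few of the fields above (the concrete values are illustrative, not defaults):

```python
# Illustrative sketch: GoogleModelSettings is a TypedDict (Bases: ModelSettings),
# so a plain dict with these keys is a valid settings value.
settings = {
    'temperature': 0.2,                                   # inherited from ModelSettings
    'google_thinking_config': {'thinking_budget': 1024},  # see ThinkingConfigDict
    'google_labels': {'team': 'research'},                # Vertex AI only
    'google_service_tier': 'on_demand',                   # see GoogleServiceTier below
}
```
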

GoogleModel

Bases: Model

A model that uses Gemini via the generativelanguage.googleapis.com API.

This is implemented from scratch rather than using a dedicated SDK; good API documentation is available here.

Apart from __init__, all methods are private or match those of the base class.

Attributes

model_name

The model name.

Type: GoogleModelName

system

The model provider.

Type: str

Methods

__init__
def __init__(
    model_name: GoogleModelName,
    provider: Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] = 'google-gla',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)

Initialize a Gemini model.

Parameters

model_name : GoogleModelName

The name of the model to use.

provider : Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] Default: 'google-gla'

The provider to use for authentication and API access. Can be one of the strings 'google-gla', 'google-vertex', or 'gateway', or an instance of Provider[google.genai.AsyncClient]. Defaults to 'google-gla'.

profile : ModelProfileSpec | None Default: None

The model profile to use. Defaults to a profile picked by the provider based on the model name.

settings : ModelSettings | None Default: None

The model settings to use. Defaults to None.

supported_builtin_tools

@classmethod
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]

Return the set of builtin tool types this model can handle.

Returns

frozenset[type[AbstractBuiltinTool]]

GeminiStreamedResponse

Bases: StreamedResponse

Implementation of StreamedResponse for the Gemini model.

Attributes

model_name

Get the model name of the response.

Type: GoogleModelName

provider_name

Get the provider name.

Type: str

provider_url

Get the provider base URL.

Type: str

timestamp

Get the timestamp of the response.

Type: datetime

LatestGoogleModelNames

Latest Gemini models.

Default: Literal['gemini-flash-latest', 'gemini-flash-lite-latest', 'gemini-2.0-flash', 'gemini-2.0-flash-lite', 'gemini-2.5-flash', 'gemini-2.5-flash-preview-09-2025', 'gemini-2.5-flash-image', 'gemini-2.5-flash-lite', 'gemini-2.5-flash-lite-preview-09-2025', 'gemini-2.5-pro', 'gemini-3-flash-preview', 'gemini-3-pro-image-preview', 'gemini-3-pro-preview', 'gemini-3.1-flash-image-preview', 'gemini-3.1-flash-lite-preview', 'gemini-3.1-pro-preview']

GoogleModelName

Possible Gemini model names.

Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See the Gemini API docs for a full list.

Default: str | LatestGoogleModelNames

GoogleServiceTier

Values for the google_service_tier field on GoogleModelSettings.

Controls Vertex AI HTTP headers for Provisioned Throughput (PT) and Flex PayGo. Only applies when using the Vertex AI API.

Values:

  • 'pt_then_on_demand' (default): PT when quota allows, then standard on-demand spillover. No headers sent.
  • 'pt_only': PT only (X-Vertex-AI-LLM-Request-Type: dedicated). No on-demand spillover; returns 429 when over quota.
  • 'pt_then_flex': PT when quota allows, then Flex PayGo spillover (X-Vertex-AI-LLM-Shared-Request-Type: flex).
  • 'on_demand': Standard on-demand only (X-Vertex-AI-LLM-Request-Type: shared). Bypasses PT for this request.
  • 'flex_only': Flex PayGo only (X-Vertex-AI-LLM-Request-Type: shared and X-Vertex-AI-LLM-Shared-Request-Type: flex). Bypasses PT.

Not every model or region supports every value; see the linked Google docs.

Note: these headers only affect Vertex AI. When using the GLA API they are silently ignored.

Default: Literal['pt_then_on_demand', 'pt_only', 'pt_then_flex', 'on_demand', 'flex_only']
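
The header behaviour listed above can be summarised as a mapping from tier value to the HTTP headers sent. This is an illustrative sketch of the documented behaviour, not code from the library:

```python
# Sketch: which Vertex AI headers each google_service_tier value sends,
# per the value descriptions above. 'pt_then_on_demand' (the default)
# sends no headers.
SERVICE_TIER_HEADERS = {
    'pt_then_on_demand': {},
    'pt_only': {'X-Vertex-AI-LLM-Request-Type': 'dedicated'},
    'pt_then_flex': {'X-Vertex-AI-LLM-Shared-Request-Type': 'flex'},
    'on_demand': {'X-Vertex-AI-LLM-Request-Type': 'shared'},
    'flex_only': {
        'X-Vertex-AI-LLM-Request-Type': 'shared',
        'X-Vertex-AI-LLM-Shared-Request-Type': 'flex',
    },
}
```
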