
Pydantic v2.8

Sydney Runkle
2024/07/01

Pydantic v2.8 is now available! You can install it via PyPI or your favorite package manager:

pip install --upgrade pydantic

This release features the work of over 50 contributors! In this post, we'll cover the highlights of the release. You can see the full changelog on GitHub.

This release focused especially on typing-related bug fixes and improvements, as well as some opt-in performance-enhancing features.

Pydantic v2.8 introduces a new feature called fail fast validation. This is currently available for a limited number of sequence types including list, tuple, set, and frozenset. When you use FailFast validation, Pydantic will stop validation as soon as it encounters an error.

This feature is useful when you care more about the validity of your data than the thoroughness of the validation errors. For many folks, this tradeoff in error specificity is well worth the performance gain.

You can either use FailFast() as a type annotation or specify the fail_fast parameter in the Field constructor.

For example:

from typing import List

from typing_extensions import Annotated

from pydantic import BaseModel, FailFast, Field, ValidationError


class Model(BaseModel):
    x: Annotated[List[int], FailFast()]
    y: List[int] = Field(..., fail_fast=True)


# This will raise a single error for the first invalid value in each list
# At which point, validation for said field will stop
try:
    obj = Model(x=[1, 2, 'a', 4, 5, 'b', 7, 8, 9, 'c'], y=[1, 2, 'a', 4, 5, 'b', 7, 8, 9, 'c'])
except ValidationError as e:
    print(e)
    """
    2 validation errors for Model
    x.2
    Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
        For further information visit https://errors.pydantic.dev/2.8/v/int_parsing
    y.2
    Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
        For further information visit https://errors.pydantic.dev/2.8/v/int_parsing
    """

We plan on extending this feature to other types in the future!

You can read more about FailFast in the API reference.
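To make the tradeoff concrete, here is a minimal sketch that validates the same list with and without FailFast through a TypeAdapter and compares how many errors are reported. The `count_errors` helper and the sample data are made up for this illustration.

```python
from typing import List

from typing_extensions import Annotated

from pydantic import FailFast, TypeAdapter, ValidationError

# Hypothetical sample data: three of the five items cannot be parsed as int
data = [1, 'a', 3, 'b', 'c']


def count_errors(adapter: TypeAdapter, value) -> int:
    """Illustrative helper: return how many errors validation reports."""
    try:
        adapter.validate_python(value)
    except ValidationError as exc:
        return exc.error_count()
    return 0


thorough = TypeAdapter(List[int])
fail_fast = TypeAdapter(Annotated[List[int], FailFast()])

thorough_errors = count_errors(thorough, data)
fail_fast_errors = count_errors(fail_fast, data)

print(thorough_errors, fail_fast_errors)
#> 3 1
```

The default validator reports every invalid item, while the fail-fast one stops at the first.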

In v2.7, we introduced the ability to deprecate models and fields. In v2.8, we've extended this feature to include deprecation information in the JSON schema.

from typing_extensions import deprecated

from pydantic import BaseModel, Field

@deprecated('DeprecatedModel is... sadly deprecated')
class DeprecatedModel(BaseModel):
    deprecated_field: str = Field(..., deprecated=True)

json_schema = DeprecatedModel.model_json_schema()
assert json_schema['deprecated'] is True
assert json_schema['properties']['deprecated_field']['deprecated'] is True
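One thing this could enable is tooling that reads the deprecation flags back out of the generated schema. The sketch below is only an illustration: the `User` model and its fields are hypothetical, and it simply collects the names of fields whose schema entry is marked as deprecated.

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    # Hypothetical model used only for this illustration
    name: str
    nickname: str = Field(default='', deprecated='Use `name` instead')


schema = User.model_json_schema()

# Collect the names of properties flagged as deprecated in the JSON schema
deprecated_fields = [
    field_name
    for field_name, field_schema in schema['properties'].items()
    if field_schema.get('deprecated')
]

print(deprecated_fields)
#> ['nickname']
```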

Pydantic v2.8 also lets you generate titles for your models and fields with a callable. This is helpful when you want to generate titles dynamically based on the model or field's attributes.

You can share callable title generators between models and fields, which helps keep your code DRY.

import json

from pydantic import BaseModel, ConfigDict, Field


class MyModel(BaseModel):
    foo: str = Field(..., field_title_generator=lambda name, field_info: f'title-{name}-from-field')
    bar: str

    model_config = ConfigDict(
        field_title_generator=lambda name, field_info: f'title-{name}-from-config',
        model_title_generator=lambda cls: f'title-{cls.__name__}-from-config',
    )

print(json.dumps(MyModel.model_json_schema(), indent=2))
"""
{
  "properties": {
    "foo": {
      "title": "title-foo-from-field",
      "type": "string"
    },
    "bar": {
      "title": "title-bar-from-config",
      "type": "string"
    }
  },
  "required": [
    "foo",
    "bar"
  ],
  "title": "title-MyModel-from-config",
  "type": "object"
}
"""

Want to learn more? Check out the JSON schema customization docs.
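As a small sketch of the sharing idea, the example below reuses one title generator across two models through their config. The `shout_title` function and both models are hypothetical names chosen for this illustration.

```python
from pydantic import BaseModel, ConfigDict


def shout_title(name: str, field_info) -> str:
    """Illustrative generator: use the upper-cased field name as the title."""
    return name.upper()


class Order(BaseModel):
    model_config = ConfigDict(field_title_generator=shout_title)

    order_id: int


class Customer(BaseModel):
    model_config = ConfigDict(field_title_generator=shout_title)

    full_name: str


order_title = Order.model_json_schema()['properties']['order_id']['title']
customer_title = Customer.model_json_schema()['properties']['full_name']['title']

print(order_title, customer_title)
#> ORDER_ID FULL_NAME
```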

In v2.7 we added support for passing context to serializers for BaseModels. In v2.8, we've extended that support to TypeAdapter's serialization methods.

Here's a simple example, where we use a unit provided in the context to convert a distance field:

from typing_extensions import Annotated

from pydantic import SerializationInfo, TypeAdapter, PlainSerializer


def serialize_distance(v: float, info: SerializationInfo) -> float:
    """We assume a distance is provided in meters, but we can convert it to other units if a context is provided."""
    context = info.context
    if context and 'unit' in context:
        if context['unit'] == 'km':
            v /= 1000  # convert to kilometers
        elif context['unit'] == 'cm':
            v *= 100  # convert to centimeters
    return v


distance_adapter = TypeAdapter(Annotated[float, PlainSerializer(serialize_distance)])

print(distance_adapter.dump_python(500))  # no context, dumps in meters
# > 500.0

print(distance_adapter.dump_python(500, context={'unit': 'km'}))  # with context, dumps in kilometers
# > 0.5

print(distance_adapter.dump_python(500, context={'unit': 'cm'}))  # with context, dumps in centimeters
# > 50000
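The context should also reach serializers defined on models nested inside the adapted type. The following is a sketch under that assumption; the `Trip` model and its `distance_m` field are invented for this illustration.

```python
from typing import List

from pydantic import BaseModel, SerializationInfo, TypeAdapter, field_serializer


class Trip(BaseModel):
    # Hypothetical model: distance is stored in meters
    distance_m: float

    @field_serializer('distance_m')
    def convert_distance(self, v: float, info: SerializationInfo) -> float:
        # Convert to kilometers only when the caller asks for it via context
        context = info.context or {}
        return v / 1000 if context.get('unit') == 'km' else v


trips_adapter = TypeAdapter(List[Trip])
trips = trips_adapter.validate_python([{'distance_m': 1500}, {'distance_m': 250}])

in_meters = trips_adapter.dump_python(trips)
in_km = trips_adapter.dump_python(trips, context={'unit': 'km'})

print(in_meters)
#> [{'distance_m': 1500.0}, {'distance_m': 250.0}]
print(in_km)
#> [{'distance_m': 1.5}, {'distance_m': 0.25}]
```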

In v2.8.0, we introduced a new pattern for introducing experimental features and settings. We've added a section to our version policy explaining how we'll handle experimental features moving forward.

You can find documentation for our new experimental features in the experimental features section.

Experimental features will either be:

  • Located in the experimental module, so you can import them via from pydantic.experimental import ...
  • Prefixed with experimental, so you can use them like some_func(experimental_param=...) or some_model.experimental_method(...)

When you import an experimental feature from the experimental module, you'll see a PydanticExperimentalWarning. You can filter this via:

import warnings

from pydantic import PydanticExperimentalWarning

warnings.filterwarnings('ignore', category=PydanticExperimentalWarning)
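If you would rather not silence the warning for the whole process, one option is to scope the filter to the import itself with the standard library's warnings.catch_warnings. This is a sketch of that approach rather than an official recommendation:

```python
import warnings

from pydantic import PydanticExperimentalWarning

# Suppress the experimental warning only while importing the experimental module;
# the previous warning filters are restored when the `with` block exits.
with warnings.catch_warnings():
    warnings.simplefilter('ignore', category=PydanticExperimentalWarning)
    from pydantic.experimental.pipeline import validate_as

print(callable(validate_as))
#> True
```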

In our version policy, we also touch on the lifecycle of experimental features. It's very possible that experimental features will experience non-backward-compatible changes or be removed entirely in future versions, so please be aware of their volatility when opting to use them.

Pipeline API — Experimental

Pydantic v2.8.0 introduced an experimental "pipeline" API that allows composing of parsing (validation), constraints and transformations in a more type-safe manner than existing APIs.

Generally, the pipeline API is used to define a sequence of steps to apply to incoming data during validation. The below example illustrates how you can use the pipeline API to validate an int first as a string, strip extra whitespace, then parse it as an int, and finally ensure it's greater than or equal to 0.

import warnings

from typing_extensions import Annotated

from pydantic import BaseModel, PydanticExperimentalWarning, ValidationError

warnings.filterwarnings('ignore', category=PydanticExperimentalWarning)

from pydantic.experimental.pipeline import validate_as


class Model(BaseModel):
    data: Annotated[int, validate_as(str).str_strip().validate_as(...).ge(0)]


print(repr(Model(data='  123  ')))
#> Model(data=123)

try:
    Model(data='  -123  ')
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    data
    Input should be greater than or equal to 0 [type=greater_than_equal, input_value='  -123  ', input_type=str]
        For further information visit https://errors.pydantic.dev/2.8/v/greater_than_equal
    """

Note, validate_as(...) is equivalent to validate_as(<type_in_annotation>). So for the above example, validate_as(...) is equivalent to validate_as(int).

We've added some additional examples to the pipeline docs, if you'd like to learn more.
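As one more small sketch of the same idea, the example below chains string transformations on one field and a numeric constraint on another. The `Signup` model and its fields are hypothetical and exist only for this illustration.

```python
import warnings

from typing_extensions import Annotated

from pydantic import BaseModel, PydanticExperimentalWarning, ValidationError

warnings.filterwarnings('ignore', category=PydanticExperimentalWarning)

from pydantic.experimental.pipeline import validate_as


class Signup(BaseModel):
    # Validate as a string, strip whitespace, then lowercase
    username: Annotated[str, validate_as(str).str_strip().str_lower()]
    # Validate as an int, then require a non-negative value
    age: Annotated[int, validate_as(int).ge(0)]


user = Signup(username='  Alice  ', age=30)
print(repr(user))
#> Signup(username='alice', age=30)

try:
    Signup(username='bob', age=-1)
    rejected = False
except ValidationError:
    rejected = True

print(rejected)
#> True
```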

defer_build support for TypeAdapter — Experimental

Pydantic BaseModels currently support a defer_build setting in their configuration, allowing for deferred schema building (until the first validation call). This can help reduce startup time for applications that might have an abundance of complex models that bear a heavy schema-building cost.

In v2.8, we've added experimental support for defer_build in TypeAdapter's configuration.

Here's an example of how you can use this new experimental feature:

from typing import TypeAlias

from pydantic import ConfigDict, TypeAdapter


# in practice, this would be some complex type for which schema building
# takes a while, hence the value of deferring the build
SuperDuperComplexType: TypeAlias = int


ta = TypeAdapter(
    SuperDuperComplexType,
    config=ConfigDict(
        defer_build=True,
        experimental_defer_build_mode=('model', 'type_adapter'),
    ),
)

assert ta._core_schema is None
print(ta.validate_python(0))

# after a call is made that requires the core schema, it's built and cached
assert ta.core_schema == {'type': 'int'}
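For comparison, the existing model-level setting mentioned above can be sketched as follows. The `Settings` model is hypothetical, and the example only shows that validation still works as usual once the deferred schema is built on first use.

```python
from pydantic import BaseModel, ConfigDict


class Settings(BaseModel):
    # Hypothetical model; with defer_build=True the schema is built lazily,
    # on the first call that needs it, instead of at class definition time
    model_config = ConfigDict(defer_build=True)

    retries: int
    timeout_s: float


# The first validation call triggers the deferred schema build
settings = Settings(retries='3', timeout_s=1.5)

print(repr(settings))
#> Settings(retries=3, timeout_s=1.5)
```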

We've been working with the codeflash team to make LLM-driven optimizations to Pydantic's source code. Thus far, we've made some minor optimizations in our internal logic, and we're looking forward to collaborating on more significant improvements in the future.

We also hope to integrate codeflash into our CI/CD pipeline to ensure that we're consistently checking for optimization opportunities.

Pydantic V2 now supports Python 3.13!

A few of our test-oriented dependencies are not yet compatible with Python 3.13, but we plan to upgrade them soon. This means that not all tests can be run against Python 3.13 (there are just a few that aren't yet compatible), but this should only affect contributors.

For users, Pydantic should work as expected with Python 3.13. If you run into any issues, please let us know!

When validating data against a union of types, Pydantic offers a smart mode and a left_to_right mode.

We've made some improvements to the smart mode to improve behavior when validating against a union of types that contain many of the same fields. The following example contrasts the old behavior with the new behavior:

from pydantic import BaseModel, TypeAdapter


class ModelA(BaseModel):
    a: int


class ModelB(ModelA):
    b: str

ta = TypeAdapter(ModelA | ModelB)
print(repr(ta.validate_python({'a': 1, 'b': 'foo'})))
#> old behavior: ModelA(a=1)
#> new behavior: ModelB(a=1, b='foo')

There are many other, more complex cases where the new behavior is more intuitive and correct. The general idea is that we now incorporate the number of valid fields into the scoring algorithm, in addition to the exactness (in terms of strict, lax, etc.) of the match.

We have documented the match scoring algorithm in order to provide transparency and predictability. However, we do reserve the right to modify the smart union algorithm in the future to further improve its behavior. In general, we expect such changes will only impact edge cases, and we'll only make such changes when they near-universally improve the handling of such edge cases.

Better understanding this algorithm can also help you choose which union mode is right for your use case.
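To illustrate how the two modes can differ, here is a minimal sketch with a union of int and str. The model names are made up for this example. In smart mode the exact string match is preferred, while left_to_right accepts the first member that validates, which here means the string is coerced to an int.

```python
from typing import Union

from pydantic import BaseModel, Field


class SmartModel(BaseModel):
    # Default (smart) mode: prefers the member that matches the input most exactly
    value: Union[int, str]


class LeftToRightModel(BaseModel):
    # left_to_right mode: tries members in order and keeps the first success
    value: Union[int, str] = Field(union_mode='left_to_right')


smart_value = SmartModel(value='123').value
ltr_value = LeftToRightModel(value='123').value

print(repr(smart_value), repr(ltr_value))
#> '123' 123
```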

If you're looking for more performant union validation than smart mode provides, we recommend that you use tagged unions.
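A tagged union can be sketched as follows. The `Cat` and `Dog` models and the `kind` discriminator field are hypothetical names for this illustration. With a discriminator, Pydantic can pick the right member directly from the tag rather than trying each member in turn.

```python
from typing import Literal, Union

from typing_extensions import Annotated

from pydantic import BaseModel, Field, TypeAdapter


class Cat(BaseModel):
    kind: Literal['cat']
    lives: int


class Dog(BaseModel):
    kind: Literal['dog']
    good_boy: bool


# The `kind` field acts as the tag that selects which model to validate against
Pet = Annotated[Union[Cat, Dog], Field(discriminator='kind')]

pet = TypeAdapter(Pet).validate_python({'kind': 'dog', 'good_boy': True})

print(repr(pet))
#> Dog(kind='dog', good_boy=True)
```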

Python's re module supports flags that can be passed to regex patterns to change their behavior. For example, the re.IGNORECASE flag makes a pattern case-insensitive. In previous versions of Pydantic, even when using the python-re regex engine, these flags were ignored.

Now, we've improved Pydantic's constrained string validation by:

  1. Not recompiling Python regex patterns (which is more performant)
  2. Respecting the flags passed to a compiled pattern

Note: If you use a compiled regex pattern, the python-re engine will be used regardless of the regex_engine setting. This is so that flags such as re.IGNORECASE are respected.

Here's an example of said flags in action:

import re

from typing_extensions import Annotated

from pydantic import BaseModel, ConfigDict, StringConstraints


class Model(BaseModel):
    a: Annotated[str, StringConstraints(pattern=re.compile(r'[A-Z]+', re.IGNORECASE))]

    model_config = ConfigDict(regex_engine='python-re')


# lowercase letters are accepted because of the re.IGNORECASE flag,
# even though the pattern itself only lists uppercase letters
assert Model(a='abc').a == 'abc'

You can learn more about the regex_engine setting in our model config docs.

With these new features and performance improvements, Pydantic v2.8.0 is the best and most feature-rich version of Pydantic yet. If you have any questions or feedback, please open a GitHub discussion. If you encounter any bugs, please open a GitHub issue.

Thank you to all of our contributors for making this release possible! We would especially like to acknowledge the following individuals for their significant contributions to this release:

If you're enjoying Pydantic, you might really like Pydantic Logfire, a new observability tool built by the team behind Pydantic. You can now try logfire for free during our open beta period. We'd love it if you'd join the Pydantic Logfire Slack and let us know what you think!