Pydantic Logfire vs LangSmith: AI observability that traces your whole app

Karina Ung
7 mins

When an AI agent fails in production, the useful question is rarely "what did the LLM say?" It's "why did the LLM say that?" Maybe the retrieval pipeline returned stale data. Maybe a downstream API timed out. Maybe the vector database query took 12 seconds instead of 200 milliseconds.

LangSmith shows you the LLM call. Pydantic Logfire shows you the LLM call, the database query, the API request, and the background job that triggered it, in one trace.

That difference matters at 2am when the agent breaks.

LangSmith was built by LangChain to trace LangChain applications. It captures LLM calls, chain steps, and tool invocations natively, and extends through SDK integrations to other frameworks.

In production, that scope creates blind spots. AI applications don't run in isolation. An agent calls a model, which triggers a tool, which queries a database, which calls an external API. LangSmith traces the model and the tool. The database query, the API latency, the background worker that queued the request, all of that lives in a separate system, or nowhere.

So when something breaks, engineers bounce between LangSmith for LLM traces, Datadog or Sentry for application errors, and raw logs for everything in between. Debugging becomes a scavenger hunt.

Three patterns push teams off LangSmith:

Cost at scale. LangSmith's Plus plan charges $39 per seat per month with 10,000 base traces included, then $2.50 per 1,000 base traces in overages. Base traces have a 14-day retention window, while extended traces (400-day retention) cost $5.00 per 1,000. The free Developer tier limits you to one user and 5,000 base traces per month. Logfire's Team plan includes 5 seats and 10 million spans, then charges $2.00 per million spans in overages. At a moderate production scale (5 users, 50 million spans per month), Logfire is approximately 40x less expensive than LangSmith. At LangSmith's rates, teams managing production AI traffic face a cost-versus-coverage tradeoff: pay for full ingestion, or sample down to keep bills predictable. For AI applications where failures are probabilistic and context-dependent, sampling away production data is a poor tradeoff.

Proprietary trace format and lock-in. LangSmith stores traces in a proprietary format with a proprietary DSL for querying. Once your historical traces live in LangSmith, moving them out is a project. Logfire stores traces in OpenTelemetry format and queries them with standard PostgreSQL-compatible SQL, so your instrumentation and your data both stay portable.

Query interface limits. When you need to answer a question nobody anticipated, LangSmith's DSL and UI filters constrain what you can ask. SQL doesn't.

Pydantic Logfire takes a fundamentally different approach. Instead of wrapping LLM calls in a proprietary format, Logfire builds on OpenTelemetry (OTel) with 100% GenAI semantic convention alignment. This architectural decision has practical consequences that matter daily.

Logfire traces your entire application in a single timeline: LLM interactions, database queries, API requests, background jobs, and infrastructure. When an agent fails, you see the complete picture without switching tools.

This matters for agentic workflows in particular. A Pydantic AI agent calling an MCP tool that queries PostgreSQL and then hits an external API produces one unified trace in Logfire. In LangSmith, you'd see the agent and tool call but miss the database and API context.

Logfire uses standard PostgreSQL-compatible SQL for querying. Write JOINs, aggregations, CTEs, or any query that answers your question. No proprietary DSL to learn, no predefined filters to work around.
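To make the idea concrete, here is a minimal sketch of the kind of question a CTE plus a JOIN can answer: for slow agent runs, which child span took the longest? It uses Python's built-in SQLite as a local stand-in, and the `records` table, its columns, and the sample rows are simplified assumptions rather than Logfire's actual schema.

```python
import sqlite3

# Hypothetical, simplified span table held in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (trace_id TEXT, span_name TEXT, duration REAL)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("t1", "agent run", 13.0),
        ("t1", "vector search", 12.0),
        ("t1", "chat model", 0.8),
        ("t2", "agent run", 1.2),
        ("t2", "vector search", 0.2),
        ("t2", "chat model", 0.9),
    ],
)

# CTE: find traces whose agent run exceeded 5 seconds.
# JOIN: pull the other spans of those traces and rank them by duration.
SLOWEST_CHILD_SQL = """
WITH slow_traces AS (
    SELECT trace_id
    FROM records
    WHERE span_name = 'agent run' AND duration > 5
)
SELECT r.trace_id, r.span_name, r.duration
FROM records AS r
JOIN slow_traces AS s ON s.trace_id = r.trace_id
WHERE r.span_name <> 'agent run'
ORDER BY r.duration DESC
LIMIT 1
"""

slowest = conn.execute(SLOWEST_CHILD_SQL).fetchone()
```

In this toy data the slow run is explained by the vector search span, which mirrors the 12-second query scenario from the introduction.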

The advantage compounds with AI coding assistants. Point your IDE at Logfire's MCP server, and your coding agent can write SQL queries against your production telemetry. AI assistants write excellent SQL. They struggle with proprietary query languages.

Logfire's live view shows pending spans, spans that render before they complete. You watch requests flow through your system in real time, see where they spend time, and debug performance issues as they happen. This is unique to Logfire and turns debugging from a post-mortem exercise into a live investigation.

Logfire's OpenTelemetry foundation means your instrumentation is portable. Pydantic AI works with any observability backend that supports OTel. You're not locked into Logfire. First-party SDKs cover Python, JavaScript/TypeScript, and Rust, and any OTel-compatible language can send traces to Logfire through the standard collector pipeline.

This is worth emphasizing: Logfire is not Python-only. The deepest integration is with Python (Pydantic, FastAPI, and the broader Python ecosystem), but the OTel foundation means Go, Java, .NET, or any language with OTel support can participate in the same traces.

General Intelligence Company of New York (GIC) builds autonomous agents that run businesses. Their flagship product, Cofounder, manages features, works through Linear backlogs, and oversees infrastructure without human intervention. The architecture orchestrates multiple specialized sub-agents (coding agents, QA agents, etc.), and at that complexity, knowing when an agent deviates becomes essential.

GIC ran on LangSmith before migrating to Logfire. The blocker was query performance. LangSmith's nested data model required fetching parent runs and child runs in separate API calls, then flattening them client-side. Complex agent traces with deep nesting exhausted rate limits before queries could complete.
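The cost of that access pattern can be sketched in a few lines. The run tree and the `fetch_children` helper below are hypothetical stand-ins, not LangSmith's client API; the point is only that walking a nested trace one level at a time needs roughly one request per run, whereas a flat table can be read with a single query.

```python
from collections import deque

# Hypothetical run tree: parent run id -> direct child run ids.
CHILDREN = {
    "root": ["plan", "code", "qa"],
    "plan": ["llm-1"],
    "code": ["llm-2", "tool-1", "tool-2"],
    "qa": ["llm-3"],
}


def fetch_children(run_id, counter):
    """Stand-in for one API request that lists a run's direct children."""
    counter["calls"] += 1
    return CHILDREN.get(run_id, [])


def flatten_trace(root_id):
    """Walk the tree breadth-first, one request per run, flattening client-side."""
    counter = {"calls": 0}
    flat, queue = [], deque([root_id])
    while queue:
        run_id = queue.popleft()
        flat.append(run_id)
        queue.extend(fetch_children(run_id, counter))
    return flat, counter["calls"]


runs, api_calls = flatten_trace("root")
```

Request count grows with the number of runs in the trace, which is how deeply nested agent traces can run into rate limits before a query finishes.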

After migrating to Pydantic Logfire, GIC ran systematic benchmarks. Query latency dropped from 114-145 seconds to under 1 second across all percentiles. The improvement factor ranged from 141x to 161x. Logfire's flat latency curve held even as trace complexity increased.

We migrated from LangSmith to Logfire and the time it took to query our agent traces went down by 96.2%.

— Andrew Pignanelli, Founder and CEO, General Intelligence Company

The performance leap unlocked a capability that wasn't possible before: live evaluation. Because Logfire queries return in sub-second time, GIC's agents can now query their own execution history during a run, examine past violations, and self-correct. Logfire became a tool the agents use, not just a dashboard their engineers open.

I've seen agents get stuck, query their own traces through Pydantic Logfire, look at violations, and use that to inform their next move.

— Abhishyant Khare, Co-founder and CTO, General Intelligence Company

Overjoy, an AI-powered CRM for consumer packaged goods brands, replaced LangChain and LangSmith with Pydantic AI and Logfire. Ishaan Nagpal, founding engineer at Overjoy, had tracing running the same day he built his first agent. Before Logfire, diagnosing a non-trivial production issue could take half a day or more, piecing together logs across multiple systems. Now the team triages most issues in minutes, and engineers who aren't fully onboarded on the backend independently debug customer issues using Logfire's MCP server.

The unified visibility also caught a bug where one agent's usage spiked to 20x normal, silently repeating the same call. Without Logfire's cost tracking, Overjoy would have burned through budget before noticing.

The instrumentation with Logfire has been the most powerful thing. Now we can see exactly what's being called, how an agent thinks, what it costs.

— Ishaan Nagpal, Founding Engineer, Overjoy

| Capability | Pydantic Logfire | LangSmith |
| --- | --- | --- |
| Observability scope | Full-stack (LLM + database + API + infra) | LLM and agent focused |
| Standards foundation | OpenTelemetry native, fully portable | Proprietary first, OTel supported |
| Query interface | PostgreSQL-compatible SQL | Proprietary DSL + UI filters |
| Live debugging | Pending spans (real-time) | Post-hoc trace review |
| Language support | Python, JS/TS, Rust SDKs + any OTel language | Python, JS/TS |
| Free tier | 10 million spans/month | 5,000 base traces/month |
| Pricing model | $49/mo (includes 5 seats and 10M spans) + $2 per million spans in overages | $39/mo (includes 1 seat and 10k base traces) + $2.50 per 1k base traces in overages |
| Data retention | 30 days default | 14 days default, 400 days costs extra |
| Framework coupling | Works with any framework | Best with LangChain/LangGraph |

The cost difference isn't marginal. Here's what LangSmith's per-seat, per-trace pricing looks like compared to Logfire's flat span pricing at three production workloads, based on Logfire Cloud Team/Growth plan pricing and LangSmith Plus plan pricing:

| Workload | LangSmith | Logfire | Savings |
| --- | --- | --- | --- |
| 1 user, 5M spans/month | ~$514 | $0 (free tier) | ~$514 |
| 5 users, 50M spans/month | ~$5,170 | ~$129 | ~40x |
| 20 users, 500M spans/month | ~$50,755 | ~$1,249 | ~41x |

These figures use LangSmith's base trace pricing ($2.50 per 1,000 base traces, 14-day retention) and assume 25 spans per trace. Extended traces with 400-day retention are billed at $5.00 per 1,000, which roughly doubles the LangSmith cost at each tier. Logfire's Team plan ($49/month) includes 5 seats and 10M spans, with $2 per million spans in overages and $25 per additional seat beyond the first 5 (up to 12 seats total). The Growth plan ($249/month) removes the seat cap.
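The LangSmith column and the Team-plan Logfire figure can be reproduced with a short calculation. This is a sketch of the arithmetic under the assumptions stated above (25 spans per trace, 10,000 included base traces per LangSmith bill, base-trace pricing only); the function names are illustrative, and the 500M-span Logfire row is left out because the Growth plan's span allowance isn't stated here.

```python
SPANS_PER_TRACE = 25  # assumption used for the comparison table


def langsmith_plus_cost(seats: int, spans: int) -> float:
    """Monthly LangSmith Plus cost in USD, base traces only."""
    traces = spans / SPANS_PER_TRACE
    overage_traces = max(0, traces - 10_000)       # 10k base traces included
    return seats * 39 + overage_traces / 1_000 * 2.50


def logfire_team_cost(seats: int, spans: int) -> float:
    """Monthly Logfire Team cost in USD (valid for up to 12 seats)."""
    overage_spans = max(0, spans - 10_000_000)     # 10M spans included
    extra_seats = max(0, seats - 5)                # 5 seats included
    return 49 + overage_spans / 1_000_000 * 2 + extra_seats * 25


small = langsmith_plus_cost(seats=1, spans=5_000_000)
medium = langsmith_plus_cost(seats=5, spans=50_000_000)
large = langsmith_plus_cost(seats=20, spans=500_000_000)
medium_logfire = logfire_team_cost(seats=5, spans=50_000_000)
ratio = medium / medium_logfire
```

For the 5-user, 50M-span workload this gives $5,170 against $129, which is where the roughly 40x figure comes from.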

Trace-based billing breaks down under agentic workloads where a single user action can generate dozens of spans. Logfire's flat span pricing aligns cost with application load and does not create incentives to discard telemetry.

Logfire requires minimal setup. For a Pydantic AI application:

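import logfire

# Configure the Logfire SDK for this process
logfire.configure()
# Instrument Pydantic AI agent runs, tool calls, and model requests
logfire.instrument_pydantic_ai()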

That's two lines. Your agent runs, tool calls, and model requests are automatically instrumented. Add logfire.instrument_openai() or logfire.instrument_anthropic() for direct provider calls. Add database and web framework instrumentation through Logfire's integration library and you have full-stack visibility.
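A fuller setup might look like the configuration fragment below. It is a sketch that assumes a FastAPI service talking to PostgreSQL through asyncpg and calling external APIs through httpx, with the matching Logfire extras installed; your stack may need different `instrument_*` calls, so treat the selection here as an example rather than a required set.

```python
import logfire
from fastapi import FastAPI

app = FastAPI()  # assumed web framework for this example

logfire.configure()
logfire.instrument_pydantic_ai()  # agent runs, tool calls, model requests
logfire.instrument_fastapi(app)   # incoming HTTP requests
logfire.instrument_asyncpg()      # PostgreSQL queries made through asyncpg
logfire.instrument_httpx()        # outgoing HTTP calls to external APIs
```

Each call wires one layer of the application into the same trace, so a single request shows up end to end.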

Logfire offers 10 million free spans per month. No credit card required.

Logfire also works with LangChain. Its LangChain integration traces LangChain applications, so you don't have to migrate your agent framework to get Logfire's observability benefits. That said, teams often migrate to Pydantic AI alongside Logfire because the type-safe integration is tighter.

Logfire is not limited to Python. It is Python-first, with the deepest integration across Pydantic, FastAPI, and the broader Python ecosystem, but first-party SDKs also cover JavaScript/TypeScript and Rust, and any language with OpenTelemetry support can send traces to Logfire through the standard OTel collector pipeline. This makes Logfire suitable for polyglot stacks where Python handles the AI layer and other languages run surrounding services.

Most teams have traces in Logfire within a day. Overjoy's founding engineer had tracing running the same day he built his first agent. If you're also migrating from LangChain to Pydantic AI, the Pydantic AI documentation covers the transition patterns.

Logfire supports self-hosting for teams with data sovereignty requirements. The Logfire SDK is MIT-licensed and open source. Enterprise self-hosting options are available for organizations that need on-premises deployment.

If you're debugging AI applications by bouncing between LangSmith, your APM tool, and raw logs, you're spending time on tooling instead of fixing problems. Logfire puts everything in one trace with SQL-powered querying, real-time debugging, and pricing that doesn't penalize scale.

Book a demo to see how Logfire fits your stack, or start free with 10 million spans per month.