
Logfire MCP: AI-Native Debugging for your Applications

Chris Samiullah
5 mins

If you've built anything non-trivial, you've felt the gap: your app is producing traces, logs, and exceptions, but your AI assistant can't see any of it. You end up pasting stack traces into chat, losing context, and playing telephone with the truth.

The Logfire MCP server closes that gap.

It exposes your OpenTelemetry data in Logfire to an LLM through the Model Context Protocol (MCP), allowing the model to answer questions like:

  • "What actually caused the 500 error on the checkout endpoint?"
  • "Which file is throwing exceptions most often this week?"
  • "Show me the trace context around this specific database failure."
  • "How many times did this failure happen after the last deploy?"

And it does so with structured tools, not vague UI scraping. This is a pragmatic way to turn "we have observability" into "we can interrogate observability quickly, accurately, and repeatably."

MCP is a standard interface that lets a client (such as Cursor, Claude Code, or other AI coding tools) call tools provided by a server. In this case, the server is logfire-mcp, and the tools let you query Logfire telemetry directly.

Instead of the model "guessing" what happened, it can:

  • Fetch recent exceptions grouped by file.
  • Pull exception stack traces for a specific module.
  • Run SQL against trace/log records.
  • Generate a link to the Logfire UI for a specific trace.

In other words: grounding.

The server is basically a thin, efficient wrapper around Logfire's API:

  1. It authenticates with a Logfire read token.
  2. It runs queries against Logfire's query engine (which speaks DataFusion-like SQL).
  3. It returns results as structured JSON for the client/LLM to interpret.

The important part: this keeps you in the telemetry domain. No re-instrumentation, no custom export pipeline. You're querying the same traces you already trust.


Logfire MCP is intentionally small: four tools cover the vast majority of diagnostic workflows.

schema_reference

Use it when: You want to write SQL and need the table/column names.

It returns the schema for the Logfire "records" store (and metrics). Think of it as "show me the shape of the data." This is critical because most failed AI queries are simply "hallucinated column name" problems. Schema-first avoids wasted cycles.

If you're querying JSON attributes (including OpenTelemetry and gen_ai.* attributes), the Logfire SQL reference on attributes is the canonical guide (including -> / ->> operators).
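A minimal sketch of the difference (the attribute name here is just illustrative): -> returns the value as JSON, while ->> returns it as text, which is usually what you want for filtering and display:

SELECT span_name,
       attributes->'gen_ai.usage.input_tokens' AS input_tokens_json,   -- JSON value
       attributes->>'gen_ai.usage.input_tokens' AS input_tokens_text   -- text value
FROM records
WHERE attributes->>'gen_ai.usage.input_tokens' IS NOT NULL
LIMIT 10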

arbitrary_query(query: str, age: int)

Use it when: You want precision.

This runs SQL against the Logfire records database.

It's the most powerful tool in the set because it turns "telemetry" into a queryable dataset: you can slice, aggregate, and correlate across traces in a way that mirrors how senior engineers actually debug (and how teams build dashboards/SLOs).

You can filter by:

  • Time window (the age parameter: how many minutes back to look)
  • trace_id
  • service_name
  • http_route, status codes
  • Exception fields (is_exception, exception_type, etc.)
  • Span/log metadata (span_name, message)
  • Nested JSON in attributes via -> / ->> operators
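Here's a sketch that combines a few of the filters above (the service and route names are placeholders; the time window comes from the age parameter rather than the SQL itself):

SELECT created_at,
       trace_id,
       span_name,
       message,
       exception_type
FROM records
WHERE is_exception = true
  AND service_name = 'checkout-api'   -- placeholder service name
  AND http_route = '/api/checkout'    -- placeholder route
ORDER BY created_at DESC
LIMIT 20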

Where it really shines is when you need answers like:

  • "Did failures increase after the last deploy, and is it tied to one model/provider?"
  • "Which LLM operation is slowest, and what's the latency by model?"
  • "Are token counts (or cost proxies) spiking for a specific route, user cohort, or prompt template?"
  • "What's the top exception type for LLM spans over the last day?"

All GenAI semantic convention attributes are queryable via SQL, too: anything you emit under gen_ai.* ends up in attributes and can be filtered/selected with ->> just like any other span attribute.

For example, you can pull recent LLM-related spans and their GenAI attributes (model, operation, token counts) like this:

SELECT created_at,
       trace_id,
       span_name,
       attributes->>'gen_ai.system' AS gen_ai_system,
       attributes->>'gen_ai.model' AS gen_ai_model,
       attributes->>'gen_ai.operation.name' AS gen_ai_operation,
       attributes->>'gen_ai.usage.input_tokens' AS input_tokens,
       attributes->>'gen_ai.usage.output_tokens' AS output_tokens,
       message
FROM records
WHERE attributes->>'gen_ai.model' IS NOT NULL
ORDER BY created_at DESC
LIMIT 50

You can also do incident-style breakdowns. For example, recent exception volume by model and exception type:

SELECT attributes->>'gen_ai.model' AS gen_ai_model,
       exception_type,
       COUNT(*) AS exceptions
FROM records
WHERE is_exception = true
  AND attributes->>'gen_ai.model' IS NOT NULL
ORDER BY exceptions DESC
LIMIT 50

Or a simple latency comparison by model and operation (using the duration column):

SELECT attributes->>'gen_ai.model' AS gen_ai_model,
       attributes->>'gen_ai.operation.name' AS gen_ai_operation,
       COUNT(*) AS spans,
       AVG(duration) AS avg_duration
FROM records
WHERE attributes->>'gen_ai.model' IS NOT NULL
GROUP BY gen_ai_model, gen_ai_operation
ORDER BY avg_duration DESC
LIMIT 50

find_exceptions_in_file(filepath: str, age: int)

Use it when: You already have a suspicious file, or you want to start from code.

It returns details about the most recent exceptions associated with that file (including stack traces and trace_ids). This is often the fastest path from "something broke" to "here's the exact line." It helps bridge the gap between "file in editor" and "error in production."
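In spirit, this is a filtered exception query. A rough hand-rolled equivalent via arbitrary_query might look like the following sketch; the code.filepath attribute name is an assumption about how your instrumentation records file paths, not something the tool requires you to know:

SELECT created_at,
       trace_id,
       exception_type,
       exception_message
FROM records
WHERE is_exception = true
  -- 'code.filepath' is an assumed attribute name; the path is a placeholder
  AND attributes->>'code.filepath' = 'backend/app/api/chat.py'
ORDER BY created_at DESC
LIMIT 20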

logfire_link(trace_id: str)

Use it when: You want to hand a human a link.

This generates a Logfire UI link scoped to that trace. The LLM can do the analysis, then hand you a direct link (permalink) so you can confirm in the UI, add annotations, or share with the team.


Here's a robust "LLM-assisted debugging" loop that doesn't rely on luck:

  1. Start broad: "Our chat endpoint is failing or slow after a deploy."
  2. Narrow by code: Run find_exceptions_in_file() for the route handler or the module that calls your LLM provider.
  3. Extract the trace_id from the exception result.
  4. Zoom out: Open the trace with logfire_link(trace_id).
  5. Confirm frequency: Run arbitrary_query() to see if this is new, recurring, or correlated with a specific deployment (and filter on gen_ai.* attributes to scope to LLM spans).

It's the same technique a senior engineer uses (start with a symptom, find a trace, expand context), just with significantly less manual friction.
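
For step 5, the frequency check can be as simple as bucketing exceptions by hour and seeing when the curve changed (the route here is a placeholder):

SELECT date_trunc('hour', created_at) AS hour_bucket,
       exception_type,
       COUNT(*) AS exceptions
FROM records
WHERE is_exception = true
  AND http_route = '/api/chat'   -- placeholder route
GROUP BY date_trunc('hour', created_at), exception_type
ORDER BY hour_bucket DESC
LIMIT 48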


Cursor supports MCP servers via a .cursor/mcp.json file in your project root.

Logfire's docs call out an important limitation: Cursor currently doesn't support an env field in the MCP config, so you must pass the token via an argument instead.

Create .cursor/mcp.json:

{
  "mcpServers": {
    "logfire": {
      "command": "uvx",
      "args": ["logfire-mcp@latest", "--read-token=YOUR-TOKEN"]
    }
  }
}

Notes:

  • uvx will download and run the MCP server package on demand.
  • The read token is project-specific (you can create it in your Logfire project settings).
  • If you're using the EU region, you may need to set a different base URL (check the docs for the specific flag).

Claude Code can register MCP servers via a CLI command, which supports passing environment variables properly. This is cleaner: it keeps secrets out of your committed JSON files.

From the Logfire docs:

claude mcp add logfire -e LOGFIRE_READ_TOKEN="your-token" -- uvx logfire-mcp@latest

For more detail and options, check the Logfire MCP server setup guide.


Imagine an LLM-backed chat endpoint that suddenly starts failing. The frontend just says "request failed," and that's all you have.

A good MCP-driven prompt is:

"What exception caused the failures in our chat endpoint, and which LLM model/operation was involved?"

The assistant might execute a sequence like:

  1. find_exceptions_in_file(filepath="backend/app/api/chat.py", age=1440)
  2. Pull a trace_id from the result.
  3. Call logfire_link(trace_id) to give you a clickable link.
  4. Run a targeted query to check prevalence:
SELECT created_at,
       http_route,
       http_response_status_code,
       message,
       exception_type,
       exception_message,
       trace_id,
       attributes->>'gen_ai.system' AS gen_ai_system,
       attributes->>'gen_ai.model' AS gen_ai_model,
       attributes->>'gen_ai.operation.name' AS gen_ai_operation
FROM records
WHERE is_exception = true
  AND http_route = '/api/chat'
ORDER BY created_at DESC
LIMIT 20

This pattern scales. It works for flaky integrations, "only in prod" bugs, regressions after a deploy, and even performance spikes (querying by duration). Most importantly: it moves you from narrative debugging ("I think it might be X") to evidence-based debugging ("the trace says it's Y").
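
For the performance case, the same pattern works against the duration column. A sketch of a quick "which routes got slow" breakdown (adjust the grouping to whatever dimension matters to you):

SELECT http_route,
       COUNT(*) AS spans,
       AVG(duration) AS avg_duration,
       MAX(duration) AS max_duration
FROM records
WHERE http_route IS NOT NULL
GROUP BY http_route
ORDER BY avg_duration DESC
LIMIT 20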


There are a lot of "AI observability" pitches right now. Most of them fall into one of three buckets:

  1. "We'll summarize your logs."
  2. "We'll build a separate pipeline."
  3. "We'll guess."

Logfire MCP is attractive because it's modest and honest:

  • It exposes the telemetry you already have.
  • It gives the model a small number of sharp tools.
  • It keeps you anchored to trace IDs, spans, and hard facts.

If you already run Logfire, this is one of the fastest upgrades you can make to your debugging workflow, especially when you're iterating quickly and want your assistant to stay grounded in reality.
