Principal DataFusion Engineer

Logfire stores billions of telemetry records and serves them back with sub-second query latency. The engine that makes that possible is fusionfire — a disaggregated analytical database we built in Rust on Apache DataFusion and Apache Arrow.

The data store is one of the largest and most technically demanding parts of our engineering org — roughly a quarter of our engineers work on it — and we're looking for a Principal Engineer to help lead its next phase. You'll work at the level of query planning, columnar storage, distributed execution, and resource scheduling: the parts of the system where one good decision is worth orders of magnitude.

In this role, you will:

  • Push the limits of DataFusion: custom operators, UDFs/UDAFs (e.g. our metric_* time-series aggregates), logical and physical plan rewriting, and predicate pushdown via zone maps and Bloom filters.
  • Design and evolve our storage layout: Parquet on object storage, compaction policy, schema evolution across rolling deploys, and incremental (including aggregating) materialized views.
  • Make distributed query execution fair and fast: multi-pod scheduling (we run a CFS-style fair scheduler coordinated through Redis), time-budget rate limiting, and demand-based autoscaling.
  • Own performance end to end: multi-tier caching (object store → SSD → in-memory), memory profiling and OOM-resistance, and the write-amplification trade-offs in compaction.
  • Set technical direction for the data store team and raise the bar through design review, mentorship, and hands-on code.
  • Monitor system health with traces / logs / metrics — we run Logfire on Logfire.

We expect a candidate for this position to have:

  • Deep experience with Rust and its ecosystem.
  • Substantial experience building analytical or time-series database internals — ideally with Apache DataFusion and Apache Arrow, or comparable engines (ClickHouse, DuckDB, InfluxDB IOx, Polars, Trino/Presto, Spark, etc.).
  • Strong command of query optimization, columnar/Parquet storage, and distributed systems.
  • A track record of leading the design of complex systems and lifting the engineers around you.
  • A commitment to performance, scalability, and developer experience.
  • At least 8 years of software engineering experience, with meaningful time spent deep in database or query-engine internals.

Nice to have, but not required:

  • Contributions to DataFusion, Arrow, Parquet, or another open source database/query engine.
  • Experience with disaggregated (storage/compute separated) architectures backed by object storage.
  • Knowledge of OpenTelemetry and observability data models.
  • Experience with full-text search (we use Tantivy) and high-cardinality aggregation.
  • Python and/or TypeScript for the surrounding stack.

We also need you to:

  • Live and work in a timezone between PT (UTC-8) and CET (UTC+1).
  • Be able to travel to the EU, UK, and US up to 4 times a year to join our off-sites.
  • Be willing to join our on-call rotation, roughly 1 week in every 10.

Pydantic Validation is the data validation library that powers modern Python development: 500 million downloads per month, used by virtually every tech company you've heard of. Why? Because we obsess over developer experience and write code we'd actually want to use ourselves.

We're applying that same engineering mindset to Pydantic Logfire, our observability platform with first-class support for AI engineering. It is built for today's development reality (AI workloads, multi-language environments, and cloud infrastructure) and designed to be straightforward to set up and maintain.

We build with technologies developers actually want to work with:

  • OpenTelemetry for standardized instrumentation
  • SQL for intuitive querying (no proprietary query language to learn)
  • Rust, Python, and TypeScript for performance and productivity
  • Postgres, DataFusion, and object storage for scalable backends

Unlike other companies that pay lip service to open source, we commit over 20% of our engineering team to maintaining and expanding our open source ecosystem. This includes the core Pydantic Validation library and Pydantic AI, our rapidly growing framework that's becoming the standard for AI application development. We're signatories of the Open Source Pledge and build on open standards because we believe in interoperability, not lock-in. Use our OpenTelemetry-based SDK with any compatible backend; we're confident you'll choose us on merit.

We're backed by Sequoia Capital and run a fully remote team across multiple time zones (with regular in-person off-sites; the next one is in London in June 2026).

Join our team of exceptional engineers who value substance over hype, practical approaches over perfectionism, and meaningful progress over busyness. We've built a culture that balances technical ambition with sustainable practices: minimal meetings and respect for your expertise and time. We're creating tools that genuinely improve developers' lives, and we're looking for thoughtful contributors who share our commitment to quality and our passion for elegant solutions.

  • 💰 Compensation: Competitive salary and stock options
  • 🌍 Truly Remote: Work from anywhere within our timezone range, with no office requirement
  • 🌐 Global & Diverse: Join a multi-cultural team of 8+ nationalities
  • 💪 Impact: Direct influence on tools used by millions of developers worldwide
  • 🎯 Focus on Growth: Regular opportunities for learning and professional development
  • 🤝 Team Gatherings: Connect with the team at our regular international off-sites
  • 🏥 Healthcare: Comprehensive health coverage for you and your dependents
  • 🎮 Flexible Hours: Work when you're most productive
  • 💻 Equipment: Budget for your home office setup
  • ⚖️ Work-Life Balance: Flexible working hours and 33 days of PTO, wherever you live (including public holidays, which you can choose to take or not)

To apply, email careers@pydantic.dev with the job title in the subject line. We'd also appreciate a few lines explaining why you think you'd be a good fit for the role and which of your past work demonstrates it.

No recruiters or agencies, please. Unsolicited recruiter emails will be marked as spam.

To make your application stand out, please include examples of your work on database systems, query engines, or performance optimization — including contributions to relevant open source projects.