
OSS in the Age of AI: PyAI Panel discussion

15 mins

On March 10, at San Francisco's Ferry Building, we put four Python open source maintainers on a stage and asked them to stop being polite about what AI is doing to their projects.

Guido van Rossum, creator of Python. Sebastián Ramirez, creator of FastAPI. Jeremiah Lowin, CEO of Prefect and creator of FastMCP. Samuel Colvin, creator of Pydantic. The panel was moderated by me, and part of PyAI, Pydantic's first conference, co-organised with Prefect.

Here's the video of the session and a write-up of what was discussed.

The conversation opened on something every open source maintainer is dealing with now: AI-generated pull requests flooding repositories.

The examples are well documented at this point. curl ended its bug bounty program after AI-generated security reports made triage unsustainable. A matplotlib maintainer was shamed by an autonomous bot after closing its PR. GitHub has now shipped a feature letting maintainers disable PRs entirely. Ashley Wolf, GitHub's Director of Open Source Programs, put it plainly: "the cost to create has dropped, but the cost to review has not."

GitHub's blog called it an Eternal September, a reference to 1993, when AOL began offering Usenet access to its subscribers and brought a continuous flood of new users unfamiliar with community norms. The difference now is that it's not just new users; it's automated noise.

Sebastián was direct: "Way worse." He framed it as a distributed denial of service (DDoS) attack, not on servers, but on maintainer attention.

"The effort that person has to make is tiny. And then this creates a PR that will consume days of reviewing. The imbalance of effort is the actual problem. Not bad intent."

Sebastián Ramirez

Guido pointed out that this isn't entirely new territory. CPython has been dealing with bad-faith PRs for decades. People submitting one-line typo fixes just to put "Python contributor" on a CV.

AI isn't even all that different. The volume is new. The mechanics of abuse are familiar.

Guido van Rossum

What CPython has landed on: you can use AI tools, but "it's not okay to submit a PR that has no sign of a human actually being involved." The test is simple. When maintainers ask a question about the PR and there's no response, it gets labeled pending and closed after two weeks. It works, mostly because it's hard to fake engagement.

Jeremiah shared what FastMCP did on their repo. Requiring an issue before a PR was opened just meant the issue got opened one second before the PR. Requiring maintainer interaction just moved the burden from the PR queue to the issues page. What actually worked: flagging PRs where the description was too long.

"LLMs have this weird insistence on telling you everything they did even though the code is right there."

Jeremiah Lowin

A 10,000-word description for five lines of code: closed. Five lines of description for 10,000 lines of code: they'd look at it. He admitted the heuristic would probably be dead in three months once the models learn shorter descriptions.
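Jeremiah's heuristic is simple enough to sketch. Here is a minimal illustration in Python; the function name and threshold are my own invention, not FastMCP's actual tooling:

```python
def looks_like_slop(description_words: int, diff_lines: int,
                    max_ratio: float = 50.0) -> bool:
    """Flag a PR whose prose wildly outweighs its code.

    A 10,000-word description for a 5-line diff trips the check;
    5 lines of description for a 10,000-line diff does not.
    The threshold is arbitrary: pick one that fits your repo.
    """
    if diff_lines == 0:
        return True  # all talk, no code
    return description_words / diff_lines > max_ratio


# The two examples from the panel:
print(looks_like_slop(10_000, 5))   # True: close it
print(looks_like_slop(5, 10_000))   # False: worth a look
```

As Jeremiah said himself, a static threshold like this probably stops working once models learn to write shorter descriptions.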

Samuel's solution was structural. "GitHub needs to build human and AI identity into the contribution system", he argued. Not perfect detection, but an honesty layer combined with reputation. His idea:

A metric on your GitHub profile tracking how sloppy the code you submit is. You spend a little reputation every time you open a big PR, and get it back when it merges. Reviewers get signal before they ever open the diff.

Samuel Colvin

I called it the "new economy of open source." He was careful to add that federated reputation would be better than centralising it with one company, but Microsoft via GitHub might be better than nothing.
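Samuel's proposal is still just an idea, but the stake-and-refund mechanics are easy to prototype. A toy sketch (every name and number here is hypothetical; nothing like this exists on GitHub):

```python
class ReputationLedger:
    """Toy model of a stake-and-refund contribution system.

    Opening a PR stakes reputation proportional to its size;
    the stake is refunded on merge and forfeited on rejection.
    """

    def __init__(self, starting_rep: int = 100):
        self.balance = starting_rep
        self.staked: dict[int, int] = {}  # pr_id -> staked amount

    def open_pr(self, pr_id: int, diff_lines: int) -> None:
        stake = max(1, diff_lines // 100)  # bigger PR, bigger stake
        if stake > self.balance:
            raise ValueError("not enough reputation for a PR this large")
        self.balance -= stake
        self.staked[pr_id] = stake

    def merged(self, pr_id: int) -> None:
        self.balance += self.staked.pop(pr_id)  # refund the stake

    def rejected(self, pr_id: int) -> None:
        self.staked.pop(pr_id)  # stake is forfeited
```

In this model a reviewer could sort the queue by the submitter's balance before opening a single diff, which is exactly the "signal before the diff" Samuel described.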

A clear worry: any system that raises the bar for contribution will also shut out the well-meaning first-timer. One of the beauties of open source is that it's open for everybody. A reputation score that gates contribution is almost a credit score. You build credibility on one repo before you can contribute to FastAPI. That's a real cost.

Samuel acknowledged Pydantic uses something called Big Cheese internally. It's a Chrome extension that shows who's commenting on issues and gives them a one-to-five score based on their employer and professional profile. The impetus was less flattering: someone from Pydantic's team had closed an issue submitted by one of the OpenAI founders without recognising who it was. "It's pure credentialism, because we don't have the means to go and (...automatically assess) 'how good is this person's code?' but you can imagine that could exist."

Across the ecosystem, Kate Holterhoff at RedMonk has catalogued how 73 open source organisations are now handling AI contributions: from outright bans to required disclosure to still-undecided. Melissa Weber Mendonça's GitHub repository tracks the primary source documents. The landscape is evolving and we all seem to be looking for the thing that works.


There's a live debate about whether code review as a practice is obsolete. Ankit Jain's essay in Latent Space makes the case that reviewers already couldn't keep up even when code was written at human speed. "Move judgment upstream to the plan, not the diff. Review intent, not implementation."

The problem is the data doesn't make that easy to accept. CodeRabbit's analysis of 470 open source PRs found AI-authored PRs contain roughly 1.7 times more issues overall: logic errors are 75% more common, and security issues up to 2.74 times more frequent. The acceleration is real, and so is the error rate. It's worth noting that the "AI-authored" label in the study was inferred, not verified.

Samuel offered four rules for when AI gives you 10 to 20x acceleration rather than 3 to 5x:

  • the internals are known to the model;
  • the external interface is known;
  • unit tests exist or are easy to generate;
  • there's no dispute about what the interface should be.

The Redis clone is his canonical example. Everyone knows what Redis is. The protocol is documented. Tests can verify conformance without needing a human to specify correctness from scratch. "You can go and produce a clone of Redis that would have taken a good engineer years before in a matter of hours now." Monty, Pydantic's Rust-based Python interpreter for LLM-generated code, only exists because those four rules applied. CPython had done the deep work of designing the language; Samuel was building a different implementation of something already well-defined.
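Samuel's four conditions read like a checklist, so here is one, purely as an illustrative sketch (the function is mine, not anything Pydantic ships):

```python
def big_ai_speedup_likely(internals_known_to_model: bool,
                          interface_known: bool,
                          tests_exist_or_easy: bool,
                          interface_undisputed: bool) -> bool:
    """Samuel's four rules: expect 10-20x acceleration only when
    ALL four hold; otherwise plan for the ordinary 3-5x."""
    return all([internals_known_to_model, interface_known,
                tests_exist_or_easy, interface_undisputed])


# The Redis-clone case: every box ticked.
print(big_ai_speedup_likely(True, True, True, True))    # True
# A novel API whose interface is still being invented: no.
print(big_ai_speedup_likely(True, False, True, False))  # False
```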

But those conditions are the exception, not the rule. When the interface is still being invented, when the security implications are subtle, when the project is something hundreds of millions of people depend on, the assessment should be different.

Guido was clear on where he stood on long AI-generated code contributions: "I would much rather sort of co-develop that PR from small beginnings, with occasional bursts of code written by a model but carefully reviewed then." Large PRs are suspicious regardless of how they were written.

"If someone confronts you with 10,000 lines of code, I find it real hard to believe that those are the right 10,000 lines of code."

His point: the problem isn't really AI. It's the PR size. AI just makes it easier to generate the wrong 10,000 lines very quickly.

Jeremiah reframed the whole question. The 10,000-line PR is a useful villain, but most PRs are five or twenty lines. LLM-generated or not, they still impose the same mental burden on the reviewer.

"I think it's about creating an environment where explaining the code is of paramount importance. If the description is clear and the code is six lines long, then I will take the time to review it, and it will go in."

He was explicit that he doesn't care who or what wrote the code. What he cares about is the disproportionate burden. In his view, the problem is that the scale tipped entirely on the code production side, and culture hasn't caught up. "It's about a mutual trying to reduce the burden, which is not new in open source." In a world where a well-scoped issue is something you can hand to an agent and get back a tested implementation, the human's role shifts: write the issue, specify the acceptance criteria, then let the model handle the execution and let automated tests verify correctness. "I will say I love when I get a well-scoped issue that I can send one of my agents with my skills, with my preference to just go implement that plan. That will get merged so much faster than a giant PR."

Sebastián's approach, shared from his own practice: when he uses AI to contribute to someone else's project, he includes a full AI disclaimer in the PR description. Which model, which prompts, the full conversation in a collapsed HTML detail block.

"My intention is not to be a contributor to this thing, to say 'I am a contributor now because like I made the LLM do something'. I want the problem to be solved!"

He's explicit that the maintainer can discard his PR entirely, take the prompt, and run it themselves. If that's more useful than merging what he submitted, great. That reframing of contribution as information-sharing rather than code-claiming was refreshing. Guido's counterpoint: if he were reviewing such a PR, he wouldn't click on the collapsed HTML detail block.

Guido also mentioned he's "slightly old-fashioned on development speed for something as important as Python where if we screw up we introduce a bug into everybody's systems."

But not everybody is CPython. For projects that need to move fast, with no backward compatibility constraints, no millions of downstream users, and no critical security surface, it's a different trade-off. Samuel picked up the thread, noting that the discussion so far had not been about the bleeding edge of what building with AI actually looks like right now; it had been about how to maintain projects used by millions of people with the help of AI. He mentioned that Armin Ronacher told him OpenCode hit 1.4 million lines of code written in a matter of months. "That is, I imagine, massively more than CPython." Guido agreed: "Yep." Those projects have no backward compatibility requirements, no one cares if things change next week, and they're "almost embracing it's insecure by design." Not right or wrong. Just a completely different context.


The final question was the one that tends to stay polite but landed honestly here: how should large companies building on open source, specifically the companies offering LLMs, invest back in the ecosystem?

Guido went first with a concrete example. Anthropic donated $1.5 million to the Python Software Foundation. That money mostly goes to security developers and researchers, he said, ensuring Python is secure not just in the core codebase but in the whole ecosystem around it. Money is one answer.

Samuel made the case for a different kind of investment. He pointed to the Sentry-originated Open Source Pledge: $2,000 per engineer per year donated to open source. "There is no scale at which $2,000 per engineer is impossible if you have the right mindset. But mostly today it is startups doing it." He called out Google and OpenAI specifically, noting he recently introduced Jason Liu from OpenAI to the pledge. Big companies signing on would make a "profound, profound difference."

Jeremiah's take was more hands-on: companies don't have to write cheques. Prefect opened their Washington DC conference facilities and food budget to any Python or data open source meetup in the city, no strings attached. "I remember when I was organising meetups, that was the worst, finding a place and finding food." Not glamorous. Genuinely useful. Guido mentioned that Google used to open its facilities to host events too, but that its physical security team no longer allows it.

Sebastián raised something less obvious: the legal structures around open source contribution are quietly anti-open source at many companies. California law often means that any code you write belongs to your company, so you can't contribute to open source without clearance. "Lawyers don't know about open source. It's super weird." His suggestion: getting lawyers who understand open source would let companies give engineers one day a week to maintain the tools they depend on, which is "much cheaper than waiting for those tools to go stale and deprecated and having to do a massive migration that is going to break production." Guido recommended Mozilla, which has had open source-knowledgeable lawyers for 25 years.

Samuel slipped in that a good way to support open source is to "go buy products like Logfire, built by great open source maintainers." He acknowledged immediately it was a slight ad.


Python has become the AI language. As Guido mentioned: "I haven't seen a single example (at the conference) that wasn't in Python so far."

Python won because it was already everywhere, because its libraries were already the basis for data science, numerical methods, machine learning, and so much more. Easier than Perl, more powerful than shell, and human enough to be grasped easily. Its success was always about accessibility.

"Python is a great first language. It's a scripting language, it's a glue language, it's a community language, a prototyping language, a production language."

(adapted from) Guido van Rossum

Some questions that stuck with me which I did not have a chance to ask:

  • What is Python's role now that natural language is the accessible layer and code production is no longer the bottleneck?
  • How do open source maintainers cut through the noise and decide what to build effectively?

Python Brasil has a theme they return to yearly: people > technology. That's what we kept circling back to, even when we were talking about PRs and reputation systems and legal structures. The common denominator underneath all of it is the same: people.

The session ended with a wish transmitted as a chant:

No more slop. No more slop. No more slop.


If you were at the conference or found this useful, a star on our GitHub repos, a comment on social media, or a subscription to our YouTube channel all help.



PyAI was held on March 10, 2026, at the Ferry Building in San Francisco. The panel was moderated by me, Laís Carvalho. Quotes have been lightly edited for clarity (mostly by hand). This summary was produced combining the panel notes and reference resources, with LLM assistance (Claude Sonnet 4.6) to analyse the transcript. It has been revised but may contain discrepancies.