I've been following and participating in the conversation about applying AI tools to the Drupal core issue queue, and to the broader community. I've been listening, reading, and experimenting quite a bit, in and out of Drupal. It's been a wild ride since last December, and over the past few weeks a few things have started to solidify. The very good video How to (Anti) AI Better and another great series of blog posts definitely helped frame my opinion on what to do. Expect a few hours of listening and reading; that's the cost of nuanced opinions on a complex topic.
TL;DR: I don't believe LLMs will solve the problems they're causing. The main focus of an AI policy for Drupal core should be LLM harm reduction and education, by enforcing a small set of directives for LLM tools and their human operators to follow. I opened a new AGENTS.md core issue to implement this.
Why now? A few issues with LLM use have popped up, and the rate seems to be increasing. Better to deal with it sooner rather than later. The response can't be "shame harder": we have a policy that requires disclosure of LLM use when contributing, and right now people are afraid of mentioning their LLM use, which defeats the policy. If that doesn't improve we'll see even more "hidden" LLM use, or people will leave, or lash out; none of which is helpful. Of course harm reduction methods can be counter-productive in some cases; it's up to us to make sure we're making things better than they would be otherwise.
What I'm proposing aims to make working with people who use LLMs less painful than it is today, on the reviewer side of things. Review is the real bottleneck of Drupal core velocity; anything slowing reviewers down is a problem to be fixed. The full details are available in the Drupal core issue; this is a summary of the rules an LLM must follow when generating content:
- Enforce LLM use disclosure: when the tool generates a comment or a commit, add the disclosure automatically to the generated text.
- Limit text to 180 words: that's enough to convey useful information and short enough that hallucinations are less likely. LLMs are good at following this kind of constraint, so let's use it. It also makes LLMs cheaper to use: fewer output tokens means less money and energy spent generating text.
- Basic code workflow: make sure everything the LLM sends passes the various linters we have configured, and strongly suggest running the tests associated with the code change.
- Git commit etiquette: keep messages short, and don't use git trailers to attribute work to the LLM (the specific format that grants co-author visibility in git systems). Since all commit messages are shown in the issue, this prevents intermediate commits from interfering with the issue discussion.
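As a rough sketch, the directives above could be expressed in an AGENTS.md along these lines. The wording, section names, and disclosure phrasing here are illustrative assumptions of mine, not the actual text proposed in the core issue:

```markdown
# AGENTS.md (illustrative sketch, not the proposed text)

## Disclosure
- Append an LLM-use disclosure line to every comment and commit
  message you generate, automatically.

## Output limits
- Keep any generated comment or summary under 180 words.

## Code workflow
- Run the project's configured linters before proposing a change;
  only submit code that passes.
- Run the tests associated with the changed code whenever possible.

## Git etiquette
- Keep commit messages short.
- Do not add Co-authored-by or similar git trailers crediting the
  LLM tool.
```

AGENTS.md files are plain markdown read by most coding agents, so a small, enforceable list like this is cheap to adopt and easy to audit in review.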
I'm not going to ignore the fact that LLMs are fascist tech; this and other objections to LLMs are discussed in the resources I linked to above. Regardless of where they come from and who designs them, people in our community use LLMs today. Prohibition and abstinence do not work; only education can help us. You can ban drunk driving a lot more effectively than you can ban alcohol consumption, and the majority of people self-enforce that ban because they don't want to die or kill their friends and family.
It doesn't belong in the AGENTS.md issue yet, but we can suggest where LLM use is more or less appropriate. From my experience:
- Any situation where the audience is another human is not particularly suited for LLMs. There needs to be heavy human involvement to make sure we don't waste time managing the LLM instead of doing the work. Examples include handling the "glue work" necessary to make the community function, and credit attribution (since that requires interpretation). When there is no human oversight, the bare minimum is to keep the content short and unopinionated. If there is an opinion to be had, own it.
- When the audience is a mix of computers and humans (clearing out technical debt, moving from one architecture to another), LLMs can be efficient, time-saving even, and brittle. You need some basic skills and knowledge in the problem space to make sure you can review the LLM output correctly. Some of this work is in the realm of "it would take too long"… for a human. If you've got tokens to spend, go for it, but carefully review the output and make sure linters and tests are running and green before sending it out for review.
- When the audience is only computers: this situation doesn't exist for anything you post to Drupal core issues. We read everything that gets into Drupal core; there is always a human involved. Act accordingly.
I found LLM tools particularly helpful for:
- Matching unstructured thoughts or concepts with existing formal theories and methods. Once you know the formal definition of what you're doing, it's very easy to have the LLM help detect anti-patterns and sub-optimal implementations, or compare different approaches and give examples of what each would mean applied to your specific situation. This has helped me a number of times.
- Stress-testing a plan or code assumptions: LLMs tend to detect and address edge cases more easily than most humans would.
- Polishing what you ship. Some of the tedious work can be handled by the LLM, like taking a large number of screenshots after a UI update and updating them in place in the documentation.
As with most things LLM: it depends. They will not solve the problems they're causing, they will not disappear overnight, and people are already using them. We can try to ignore that, or we can help LLM users help us. I'm for making the community welcoming and making the machines work for us, not against us.
Thanks to catch for the review.