I tried Claude Code for a month with Drupal

Large Language Models are a pretty fascinating piece of technology. Back in December I saw some chatter about Claude Code and I wanted to try it out myself. It is really fun, jQuery-in-2010 levels of fun. And, more concerning, it's fun even when it wastes my time. Here are a few things I tried it on in the past 4 weeks:

  1. The very first thing was a Firefox browser extension. I didn't want to spend time learning the whole extension system for my tiny use case. Claude Code spit it out on the first try, I followed its instructions to set it up in my browser, and it just worked. Nice. As a first impression, that's hard to beat.
  2. Then I needed a few bash scripts to keep media files organized and optimized. It was less easy to get working, but it managed with enough prompting. I think we need to thank the countless bash tutorials that were written over the years. My ego thought I could have done it faster; I disagree.
  3. After that I asked it to refactor a legacy part of the code supporting my Drupal bot. It looked good, then 3 weeks later I noticed it had broken something that I didn't catch at the time (the codebase lacks tests). At least now I'm not double-fetching every core issue updated in the last 20 minutes. That's another win.
  4. With these few quick wins I tried something more ambitious: testing how LLMs can help with Drupal credit attribution. When I commit a long issue on Drupal core, it can take up to 15 minutes to figure out who did what and evaluate the activity against the core credit guidelines. How good can an LLM be at assigning credit when you give it our guidelines and the full history of an issue? Not too bad, it turns out. That's a subject for later, when the model choice and fine-tuning are done.
  5. The mass update of GitLab merge requests to point to the new main branch of Drupal core was generated by Claude Code in a few minutes. It used the same architecture as my other commands and got that working faster than I could have.
  6. In the few hours it took to finish this article, I had Claude Code generate a working implementation of the npm CLI in PHP, to test whether we can use it to manage contrib JS dependencies without making Node a hosting requirement of Drupal. It works to manage core dependencies at least!

At this point I was starting to hit some of the weaknesses of this type of tool. It's happy to add duplicate code and put unrelated features in the same file. It gets messy real quick if you don't frame things early on. There are ways to mitigate this, of course, but it's an additional layer to keep in mind.

A few things I liked about working with LLMs:

  • I have a hard time starting things. LLMs just spit out whatever and it's enough to get things started and actually make progress. Less procrastinating is a win.
  • The generated code tends to have a more finished feel to it, like adding a --dry-run option to a bash script. I know it's by necessity, so that the tool can test its work, but it's still nice to have.
  • It implements some edge cases I wouldn't have bothered with at first, not until I actually needed them.
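
To illustrate the --dry-run pattern mentioned above, here is a minimal sketch in Python rather than bash. It is not one of the scripts from this article: the renaming task and the names plan_renames, apply_renames and main are made up for the example. The idea is only that the plan is computed separately from the step that touches the filesystem, so the flag can switch between printing and doing.

```python
import argparse
from pathlib import Path


def plan_renames(directory, suffix=".jpeg", new_suffix=".jpg"):
    """Return (source, target) pairs without touching the filesystem."""
    return [
        (path, path.with_suffix(new_suffix))
        for path in sorted(Path(directory).iterdir())
        if path.is_file() and path.suffix.lower() == suffix
    ]


def apply_renames(pairs, dry_run=True):
    """Rename each pair, or only print the plan when dry_run is set."""
    renamed = []
    for source, target in pairs:
        if dry_run:
            print(f"[dry-run] would rename {source} -> {target}")
        else:
            source.rename(target)
            renamed.append(target)
    return renamed


def main(argv=None):
    # Hypothetical command line: a directory plus an optional --dry-run flag.
    parser = argparse.ArgumentParser(description="Normalize .jpeg extensions")
    parser.add_argument("directory")
    parser.add_argument("--dry-run", action="store_true",
                        help="print what would change without changing it")
    args = parser.parse_args(argv)
    return apply_renames(plan_renames(args.directory), dry_run=args.dry_run)
```

Calling main(["./media", "--dry-run"]) only prints the planned renames; dropping the flag performs them. That is the same property that lets a tool (or a human) check the work before committing to it.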

I have reservations about AI/LLMs: how they were made and trained, and the whole philosophy of the folks with the money in this space. If that's the case for you too, check out the work of the DAIR Institute. For now I think I'll keep cautiously working with this type of tooling. I'll probably move to OpenCode soon to be able to use local models when (not if) AI providers raise prices to unsustainable levels.

WordPress just released a benchmark for AI tools. We could go that way, but I don't think we have the market share to make anyone care about it. There is a risk of being overrun by AI slop code contributions, and as a maintainer and community member I don't want that to happen. What we can do right now to prevent this is add a couple of AGENTS.md files to core (and contrib?). We can't tell people not to use a tool, but we can definitely help them help us.