It's strange to me that this blog post was written in English. If AI is available, why aren't we all communicating in Lojban? [0] It's an obviously superior language. What does it matter that many people already communicate in English and much of computing depends on that language? AI doesn't care about that. Plus, if you ever need to edit Lojban without AI, you should be able to pick it up in a few weeks, right?
I don’t think this holds at all, because the idea with a lot of vibe-code workflows is “humans never need to read the code” which would mean that human dev ergonomics are irrelevant. Here, the blog post is still clearly targeted at humans, so human reader ergonomics are still relevant.
Yeesh, is "never reading the code" really the modus operandi we want from AI?
Microsoft, for all their warts, at least had the compassion to call their AI product "Copilot", suggesting we have some residual agency in whatever it is that it produces.
It's clearly not the MO that capable engineers want, but it's the MO that is getting funded right now.
Reading code carefully is harder than writing code unless the code is written consistently and clearly in a way that is idiomatic to the reader. And there's way more code to review now, but companies aren't scaling up the number of skilled engineers on staff. So in practice, never reading all of the diffs is the MO that will be built into code we depend on.
It’s funny that in your reply “this article is almost certainly intended to be read by humans” you made what is the best case to keep writing code in Python even with AI.
Sure, if you are going to have an AI do all your coding and maintenance you can use whatever language it’s best at. But if you want to participate in the writing, debugging, and maintenance, it has to be in a language that a human can read. I’m not saying that Rust or Go is unreadable, but I know I am better at Python personally and am going to keep using it until the speed penalty matters to my project, and then maybe I’ll let an AI rewrite the whole thing in a faster language.
I did read the article and I’m not arguing against a straw man. If you’re going to let an AI agent do everything for you then go ahead and use Rust (or any language with a strong type system that benefits agents).
But if I’m participating then I’m going to use Python because it’s easier to read.
If there’s anything I’m arguing against, it’s the author’s claim that the ecosystem of libraries (regardless of whether they are wrappers) and readability don’t matter anymore. I’d say that in a lot of smaller teams it still matters. We’re not all using AI to ship slop. A lot of us are using AI to work on our ideas for our hobbies or for research. And it’s not fulfilling unless I get to be involved in the process.
But it's not talking about people like you. It's like getting mad at someone suggesting selling their car for a self-driving car, but you ride a bike everywhere. Take a breather and recognize that not every article is personally meant for you or your situation.
And this isn't even a defense of the premise. I'm not using AI to generate assembly code, because I don't know assembly.
I’ve always found Ruby to be way more readable; what keeps me using Python is that its depth of libraries is unmatched.
So unless you’re into burning tokens having AI generate untested libraries, I’d stick to using the most idiomatic tool for the problem you are tackling.
So, it's really interesting. We've started moving away from Python libs because 25% of the OSS is out of date and another chunk needs custom tweaks to support our use cases. In both scenarios it means maintaining our own fork.
And honestly it's not burning that many tokens if you've got an existing example lib to point to.
Python is intended to be read by humans also. Since I am a human and I want to be able to read and review the code in my project, I therefore have AI write in Python as well.
I took the challenge and asked Perplexity. I have no idea how much of it is correct, if any, but I think the result[0] is pretty interesting anyway, especially compared to Esperanto [1].
Oh, I hadn't heard of lojban before. Cool project!
Anecdotally, I think language affects the way you think more than most people realise, which is why I think a logical language is a great idea: it might "trick" people into thinking more logically!
A computer can understand all programming languages proficiently. How many people reading the blog know Lojban proficiently?
I get what you are trying to say, but it's a pretty bad analogy.
Also, all programming languages mainly use English in their syntax, but you are probably from an English-speaking country, so you don't notice the irony.
And most people using AI will not need to edit their code at all if everything goes right, right? They will just keep refactoring with AI, so why does the difficulty of learning a language matter in this situation?
This post wasn't written in English, it was written in AIglish. (For god's sake, please tell me you see it at this point and you don't need to punch the opening into Pangram to see '100% AI' to recognize it by now?)
So in a way it's proving its own point. Why painfully write out by hand in English when the LLM will do a better job by porting your English prompt to AIglish and get +235 points and #3 on HN?
Correct — and honestly? Not just correct, but perceptive. You didn't just read the post — you saw through it. That's not pattern matching — that's instinct.
Also, every single interpreter error has an entire corpus of StackOverflow-esque fix suggestions alongside it, and the model has been fine-tuned to minimize such errors on the first try. This hasn't been done for more obscure languages. You'll likely take more turns, on average, to get a working output, even if your problem is fully verifiable via test input/outputs - and if it's not verifiable, you don't want the "attention" of the model focused on syntax rather than the solution.
There is no "entire corpus of StackOverflow-esque fix suggestions" about anything which is newer than a few years. I'm using cutting edge Android frameworks all the time. Yet, LLMs fix problems even when Google/Kagi has zero answers, which happens more often than not. We are way over this requirement.
I especially found that there is no difference between languages based on that. All generated code's architecture is terrible if you don't actively, manually maintain it all the time, and if you don't already have a few tens of thousands of lines of finely architected code in your codebase from which they can understand how it should really be done. And the reason, I think, is quite simple: the average code on the internet, regardless of the market penetration of the given language, is simply bad.
Training data can't be the whole answer. LLMs are really good at translating to different programming languages. This makes sense, given that they are derived from text translation systems. I'm getting great results in languages with comparatively small bodies of freely available code. The bigger hurdle is usually that LLMs tend to copy common idioms in the target language and if it is an "enterprise-y" language like Java or C#, the amount of useless boilerplate can skyrocket immediately, which creates a real danger that the result grows beyond the usable context window size and the quality suffers.
> LLMs are really good at translating to different programming languages.
...for which ample training data is available.
> This makes sense, given that they are derived from text translation systems.
...for languages with ample training data available.
Yes, LLMs can combine information in novel ways. They are wonderful in many respects. But they make far more mistakes if they can't lean on copious amounts of training data. Invent a toy language, write a spec, and ask them to use it. They will, but they will have a hard time.
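To make the toy-language test concrete, here is the kind of thing one might spec and hand to a model: a minimal stack language (entirely invented for illustration) together with a reference interpreter, sketched in Python.

```python
def run(program):
    """Interpret a tiny invented stack language.

    Tokens: integers push themselves; 'add' and 'mul' pop two
    values and push the result; 'dup' duplicates the top of stack.
    Returns the value left on top of the stack.
    """
    stack = []
    for token in program.split():
        if token == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif token == "dup":
            stack.append(stack[-1])
        else:
            stack.append(int(token))
    return stack[-1]

print(run("2 3 add dup mul"))  # (2 + 3) squared = 25
```

Hand a model the prose spec (not the interpreter) and ask it to write programs in this language; how many turns it takes to get `25` out of the example above is a decent proxy for how much it leans on training data.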
That might be an argument for not using a novel homebrew programming language. But it's not an argument against, like, any top-100 or even top-1000 programming language, which will be adequately represented in the training data.
It is if more training data results in better performance. In which case, GP will continue to use the language that is likely to have the most training data available.
They are also good at generating plausible code. The kind that has no obvious bugs in it. I wouldn’t be surprised if humans in the loop over report success with these tools. Combined with decision fatigue… it’s not a good recipe for humans making good decisions.
An experienced Rust developer is going to be in a better position to drive an agent to generate useful Rust code than a Python programmer with little or no Rust experience. Not sure I agree with the author that everyone should just generate reams of Rust now.
At least if you get paged at 3am to fix the 300k-line AI-generated Django blog you’ll have a chance at figuring things out. Good luck to you if Claude is down at the same time. But it's still better than if it were in Rust and you have no experience with that language.
I have a language I wrote for processing data pipelines. I’ve used it for years, but I can count the number of users on one hand. I wrote it partially to learn about writing a scripting language, partially because Nextflow didn’t exist yet. I still use it now because it works much better for my way of processing data on HPC clusters.
The only code that exists on the internet for this is test data and a few docs in the github repo. It’s not wildly different from most scripting languages, from a syntax point of view, but it is definitely niche.
Both Codex and Claude figured it out real fast from an example script I was debugging. I was amazed at how well they picked up the minor differences between my script and others. This is basically on next to zero training data.
Would I ask it to produce anything super complex? Definitely not. But I’ve been impressed with how well it handles novel languages for small tasks.
Dimensionality gets bizarre in 1000-D space. Similarity and orthogonality express themselves in strange ways, and each dimension encodes a different semantic meaning.
Therefore, if the training data is highly consistent you are by definition reducing some complexity and/or encoding better similarity.
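The near-orthogonality claim is easy to check empirically: random vectors in high-dimensional space are almost always nearly orthogonal, so the cosine similarity of random pairs concentrates around zero. A quick stdlib-only sketch:

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

random.seed(0)
dims = 1000
# Average |cosine| over random Gaussian vector pairs shrinks
# as the dimension grows (roughly like 1/sqrt(dims)).
pairs = [
    ([random.gauss(0, 1) for _ in range(dims)],
     [random.gauss(0, 1) for _ in range(dims)])
    for _ in range(50)
]
avg = sum(abs(cosine(u, v)) for u, v in pairs) / len(pairs)
print(f"avg |cos| in {dims}-D: {avg:.3f}")  # close to 0: near-orthogonal
```

In 3-D, two random vectors routinely have a |cosine| of 0.5 or more; in 1000-D almost every random pair lands near zero, which is the "bizarre" behavior the comment alludes to.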
Being dynamic is secondary. A language that uses exceptions for errors does not need to surround every call with a try/catch if the code doesn't need to; you can have a top-level handler that catches everything.
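A minimal sketch of that top-level handler pattern (function names are made up for illustration): inner code raises freely without any local try/except, and a single handler at the entry point turns any error into a response.

```python
def transfer(amount):
    # Inner code raises freely; no local try/except needed.
    if amount <= 0:
        raise ValueError("amount must be positive")
    return f"transferred {amount}"

def main(amount):
    # Single top-level handler: one place catches everything.
    try:
        return transfer(amount)
    except Exception as exc:
        return f"error: {exc}"

print(main(10))   # transferred 10
print(main(-5))   # error: amount must be positive
```

The deeper the call stack, the more try/except boilerplate this pattern saves compared to wrapping every call site.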
That would matter if we were asking the AI to generate code open-loop: someone probably already wrote something close to what you asked for in Python. But if the agent generates code, tries to compile it, sees the detailed error messages and acts on those messages to refine the code, it's going to produce a higher quality result. rustc produces really good diagnostics. And there's a lot of Rust code online now, even if there's so much more Python and Javascript/Typescript.
LLMs don't actually semantically parse the error messages. They will generate the most likely sequence resulting from the error message based on their training data, so you're back to the training data argument.
The conclusions seem overly broad. Just because these languages are Turing complete doesn't mean they aren't massively hampered by expressiveness and amount of batteries included. To attribute all of this to training data memorization is premature.
Oh this is a very damning paper. Using simple languages from their definitions alone is a great proxy for studying truly out-of-distribution reasoning. Also just for following simple rules/instructions correctly, because a simple enough language is practically just a grammar. This paper is terrible for anyone who wants to make the case that models can do those things well.
To the extent today's AI can reason, add this to the pile of evidence that you definitely need a harness. Counter to what you hear.. that seems true for SOTA and frontier, not just toy models. Lots of people were saying many years ago someone should test exactly this, because it's obvious. Someone at megacorp probably did try and decided not to publish because they thought it was bad optics.
Well, go on and do the experiment! Perhaps LLMs can write code as well in BF as in Python, but I don't recommend it because hallucinations are really hard to notice in BF.
If you are going to worry about high level computer languages and AI, you are going to have to start with getting to grips with machine code and assemblers and that. Once you know how say some Python code ends up being processed by your laptop CPU(s), then you will know when BF might be best!
> Frontier models score ~90% on Python but only 3.8% on esoteric languages, exposing how current code generation relies on training data memorization rather than genuine programming reasoning.
Just use Go. LLMs have seen a ton of it, they write it well, it compiles practically instantly, and it has all the advantages of a typed compiled language.
I created a big Python codebase using AI, and the LLM constantly guesses arguments or dictionary formats wrong. Unit tests and stuff like pydantic help, but it's better to avoid that whole class of runtime errors altogether.
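The pydantic point can be sketched with a stdlib-only stand-in (this is not pydantic's actual API, just an illustration of the idea): validate field types at construction time, so a wrongly shaped value fails loudly and immediately instead of deep inside the program.

```python
from dataclasses import dataclass, fields

@dataclass
class Job:
    name: str
    retries: int

    def __post_init__(self):
        # Fail at construction if a field has the wrong type,
        # instead of blowing up later when the value is used.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")

Job(name="build", retries=3)           # ok
try:
    Job(name="build", retries="3")     # wrong type, caught immediately
except TypeError as exc:
    print(exc)                         # retries must be int
```

pydantic does this (plus coercion, nested models, and JSON schema), which is exactly the class of LLM mistake the comment describes: the model guessing `"3"` where an `int` was expected surfaces at the boundary rather than as a mystery failure three calls later.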
That’s what I’ve settled on. Python is so flexible that there are a million ways to organize code, pass arguments, etc. If you already have a code base to work from, an LLM can make new code in the style of the old code. But a fresh project? Once you get to a certain level of complexity it quickly can turn into write once, read never code (even if the code is passing tests).
This is where I’ve found that a compiled, strongly typed language (any one really) works well with an LLM. With the little bits of friction that is part of writing a language like Go, the LLM can produce pretty decent (and readable) code.
But what is the selling point for Go? I get that it is allegedly hailed to be a simple language with basically no batteries included, but why is that a selling point? Does Go excel at anything no other language does?
I think that’s sort of the selling point, no? It’s really boring. It has like 25 keywords, compiles insanely fast, and has a concurrency model that’s easy to use and read. LLMs are great at using Go tooling to sanity check along the way. It’s easy to write shitty Go, but it’s really pleasant to work with if you find those things compelling.
haha exactly. I’m coming from Swift, and I don’t want to go back to manually releasing objects like I used to in ObjC, let alone reason about lifetimes.
If performance is the main difference, whatever that means, then basically Go should be reserved for when Rust and other lower level langs cannot be used due to some other constraint? Are we mainly talking about performant Web backends?
Say I am building some app that I know will be CPU-bound, why choose Go over say... Swift?
No batteries!? Go has a huge stable standard library no other language even comes close to. Built in tooling for unit testing, performance testing, debugging, code formatting, package management, etc. And most go binaries can be compiled statically so libc is not even a dependency. Golang is the definition of batteries included.
LLMs use `any` types, `recover`, `init`, and other weird warts of golang
rust is a better language in every way for LLMs: more precise typing, better compiler errors, fewer performance footguns, no data races, clear interface definitions and implementations
golang is easier for humans to quickly get productive, but the language is lacking in helpful features for an LLM
I’m heavy into rust and never really use golang, but one big benefit of go over rust is compile times are significantly quicker, which could be more fun if you’re running CI checks 50 billion times
1. Amount of Rust training data isn’t as much as Go.
2. Golang syntax and style is very verbose yet simple. There aren't as many options, nor as much programming-language-to-domain mapping needed, as in Rust. That means a less sophisticated LLM can spit out Golang successfully and efficiently.
Typed, garbage collected, fast to compile and run, stdlib that includes just enough to work out of the box. I really don't like writing it by hand but for the LLM it's perfect.
Admittedly, I have very little experience with LLM-assisted Python. However, based on the severe degradation in output quality I have seen from an LLM working with plain JavaScript as opposed to TypeScript, I can't imagine choosing to start a project in Python at the moment.
I've been coding in Python for about twenty years (alongside, at different times, a dozen or so other languages).
I find that Claude can write great modern Python more or less out of the box, with minimal style guidance from me. I do have to nudge it from time to time to not do silly things, but overall it's really rather good.
Hah, I was just thinking that Python likely has a vast ocean of training data, but it's likely of lower quality, since much of it is written by beginners and those who aren't primarily programmers.
While recent models are capable of generalizing to any language at this point, I do think there are weights from their pretraining corpus that still leak through into how they create their responses. We observed similar language performance patterns across models from different providers, btw.
All my vibe coded projects (personal) are Go backend services, with Typescript/React frontend. And my thoughts were based on similar things. Like why I wouldn't use PHP for that, either.
Cool to see my hunch backed by data. Python is a scripting language with OOP bolted on. That means there isn't really the styling consistency that other languages have, with things tending to look like PHP: a collection of various scripts that invoke one another.
I would love to see how they do with functional languages and especially Lisps here. I've noticed pretty good performance with Emacs Lisp relative to overall model strength, but I haven't used LLMs to write application code in any such languages.
It would also be interesting to see how Python compares to other languages in its niche (Ruby, Perl, Raku).
Thanks for putting this together! It's interesting.
That's a good idea. Would you rather see Lisp or Scala? Any interest in Prolog? We are trying to be selective to keep the data concentrated, but we will eventually add a couple more, most likely to sample different programming paradigms.
The LLMs are generally still pretty bad at (deductive) reasoning. IME they go along more with the things like variable names and comments than the actual program logic (it would be an interesting experiment to compare LLM's understanding of three identical programs with different identifiers, one with normal identifiers, one with obfuscated identifiers, and one with deliberately misleading identifiers). I also think this particular comparison comes down to typing, which helps to avoid LLM's reasoning go astray.
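The identifier experiment proposed in the parenthetical is easy to set up. Here is a hedged sketch of what the three variants might look like for one tiny function (all names invented for illustration); the logic is identical in all three, so anything leaning on names instead of logic should stumble on the last one:

```python
# Variant 1: honest identifiers.
def count_positive(numbers):
    return sum(1 for n in numbers if n > 0)

# Variant 2: obfuscated identifiers, same logic.
def f1(a1):
    return sum(1 for x2 in a1 if x2 > 0)

# Variant 3: deliberately misleading identifiers, still the same logic.
def count_negative(sorted_strings):
    return sum(1 for n in sorted_strings if n > 0)

data = [3, -1, 4, -1, 5]
# All three count the positives (3 here); a reader or model that
# trusts the names in variant 3 would predict the wrong behavior.
print(count_positive(data), f1(data), count_negative(data))
```

Asking a model "what does `count_negative` return for `data`?" and comparing its answer across the three variants would separate name-pattern-matching from actually tracing the program logic.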
When we reason we need to typically propagate the constraints to arrive at a solution to these constraints. I think the best language to reason in could be something like Lean, which allows both constraints and actual code to be expressed at the same time. Although this might not be the case for current LLMs, as I explain above.
If the training data is basically irrelevant, then an LLM should be able to iteratively improve the programming language it uses, resulting in a custom language optimally designed to maximize its own coding ability. The source code might not even be human readable natively, just translated into pseudocode on an as-needed basis.
> If the training data is basically irrelevant, then an LLM should be able to iteratively improve the programming language it uses, resulting in a custom language optimally designed to maximize its own coding ability.
I won't be surprised if one day they do.
At least in their current form, I don't think they can independently design a language that is so much better than other available ones that it makes sense for them to use it.
There's a very good language for almost every use case already, designing one better than the ones already available is a VERY tall order.
It's almost like these languages aren't designed by morons, but built by teams of geniuses over a decade instead.
It's taken me 6 months of heavily steering an LLM to build a language that is not yet even ready for production use.
Maybe I'm the one slowing the LLM down. But it certainly does not seem that way.
The key to a good language for them - from my experience - is maximum expression plus minimum global complexity.
Anything that makes you manage memory lifetimes & memory safety is inherently unfriendly to LLMs - that's globally complex.
All scripting languages allow spaghetti aliases that let you hack your way into oblivion - and LLMs gladly ride that gravy train to hell.
Rust excels here, because it prevents the worst and is WAY more expressive than most people think.
Go has arguably the best runtime ever built, but it's not very expressive, and it still has a lot of problems from C and scripting languages. I don't think these types of languages will be the ones people choose to write code with for LLMs in the future.
I moved from writing all my code with LLMs in Python to Rust. I've seen absolutely no difference; most of the time I couldn't even tell you which it's writing in.
My programs are faster and more reliable than they’ve ever been.
> Read the first few comments and surprised I didn’t see it, but training data. The voluminous amount of Python in the training data.
That's actually part of the point. Almost no one writes types for Python and has complete type compliance. So all that training data is people just yoloing Python, writing a bunch of poor code in it.
I honestly can't believe any experienced software engineer would decide to build systems in Python these days.
Seems to me these LLMs have a critical mass of Python training data and Rust training data, so there's no advantage for Python there.
So as the article points out, an iterative process that catches the mistakes at compile time is much more suited for an AI than one that catches them at runtime.
People really need to stop assuming more training data is better. This is not how it works. LLMs thrive on consistency.
Go, for example, has significantly less training data than Python, but LLMs are the best at it. Why? Go is often written the same way. You go from project to project and the code all looks the same. There are only a very few ways to write Go.
Most of the article makes sense, but what is this supposed to mean? "Native Rust binaries are hostile to serverless runtimes". I don't think that is true.
Why Python? Because I have written it for 10+ years, know how to debug it and I can smell it within 10 seconds of the agent writing code if it does something that is going to end in a huge foot gun. With any other language, not so much; I would need to relearn a lot. So I am going to be preferring python; where even with the speed that AI crams out code, I still feel somewhat in control. If I did this with Go or Rust, then it would feel more like "vibecoding" than AI assisted programming, just yolo the whole product.
This is what I experienced as well: I can smell BS in AI-generated code within the first few lines it writes in Python, so that's why I keep using Python for most of my projects.
Bit off topic but why in the world are people still posting on medium? The reading experience is abhorrent; I couldn’t even finish reading this article before a full screen popup literally blocked the sentence I was reading.
It seems like it's just the latest evolution of the writer-friendly blogging platform; easier than Wordpress to package into a newsletter, and also easier to monetize with a paid tier.
Insofar as AI is great at accidentally deleting your production and backup Wordpress databases, and forcing you to start from scratch with something else.
Yep, Medium was free and everyone donated content... then it put up reading paywalls and conned everyone, I'm also surprised when I see people writing on there.
Nothing you read in the browser can provide ultimately great and hands-down the best reading experience equally for everybody - the modern web model is inherently at odds with that. A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention. A plain-text protocol under user control is closer to "best reading experience for everybody". The web could be that. It mostly isn't.
I stopped trying to read long articles in the browser. Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead? Where I have control over fonts, colors, navigation, etc. The browser is a delivery mechanism, not a reading environment. Treating it as one is a habit, not a necessity.
Long ago I stopped trying to type anything longer than three words anywhere but my editor. Of course, why wouldn't I? It already has everything I need - spellchecking, thesaurus, etymology lookup, translation, access to all my notes, LLM integration, etc. Try it one day - it's enormously liberating experience. And then maybe you'd stop reading long texts in the browser as well.
> A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention.
They don't ship it because of greed. They only want your attention because of greed. They only infest their website with ads because of greed.
> The browser is a delivery mechanism,
http is a delivery mechanism. The browser is a user agent. It's supposed to display content according to the preferences of the user. If your browser isn't doing that for you it's time to find a new browser or beat the one you have into submission until it behaves. "reader mode" is a useful compromise.
> It's supposed to display content according to the preferences of the user.
That's right, the original idea was exactly about that, but like I said - in practice that is no longer a thing.
Using the editor for reading any content is enormously underrated. Check this out - this entire thread opens in my editor as an outline with nested structure. Meaning that all the regular outline operations are available to me - folding, imenu (interactive TOC), narrowing, quick search, contextual search, pattern-based search, sparse-tree search.
Extracting all the URLs on the page while ignoring HN-internal ones is a single keypress for me - there's a link to a YT video - I can watch it, controlling the playback directly from my editor, I can extract transcript and summarize it with an LLM request - all without opening new tabs, without switching focus.
I can narrow on the sub-thread, or select a region and export only that part to a pdf, gfm, html or LaTeX. The possibilities are virtually unlimited. A web browser - even with three hundred different extensions won't let me have complete and utter control over plain text - it's just not designed for anything like that.
Python has a much more mature ecosystem than Rust, especially for AI/ML stuff. I ran into a rust crate that purported to do a certain ML algorithm but did not do it correctly. I managed to write a replacement with Claude though.
I do think enforcing correctness at the type system level is a good idea for AI, which is why I often choose languages like C# and Rust over Python. However, for some things Python is definitely the correct tool for the job.
Definitively something to be said for AI/ML library support. I find myself going with Rust / TS for a ton of my backend work lately though, even though I'm a huge Django fan for backend.
I almost always pick Rust. Recently I wrote a plugin for something that was written in Go. I could have used Rust, but Go for one felt right because if the thing turned out well, others would surely find more value in having one toolchain.
The main reason is that you’re capable of reading it if you need to. And the recipient ecosystem expects a language. That’s why some data science communities pick R, MATLAB, Julia, Python, or Mojo depending not on what’s superior tech, but on what their peers speak.
Very good static typing, Roslyn analyzers, good tooling and decent hot reload (for a compiled language), really good ORM (EF Core) that implements UoW and reduces a lot of the need for transaction management (simplifying the code), flexible enough and fast enough for various kinds of use cases.
Source generators are underrated as well since they can make the code very terse and legible by generating a lot of standard boilerplate.
I've written this before, but C# is a great language held back by its culture. I'd say that 80% of C# shops I've seen used it because they were started in the late 00s by some IT guy with a surplus HP server and a dream whose whole world was Microsoft products. They were staffed by people with little knowledge of OSS products who self-identify as ".NET developers" instead of software engineers. Almost invariably they seem to have some gnarly legacy monolith that everybody is slowly chipping away at while old-timers continue deploying .NET services to IIS running on Azure VMs because it's a small evolution of what they've been doing for the better part of 20 years.
In the interest of fairness the San Francisco version of this is also a thing: a giant, untyped ball of Rails spaghetti from the same period running on Heroku that everybody has Stockholm Syndrome'd their way into loving because of Ruby's elegance and beauty. The burden is merely shifted from a large Microsoft to a series of small SaaS companies :-)
Exceptions to this rule exist (hence my "80%") and modern .NET is lovely but it seems that the non-Java/Python mindshare is now taken up by the Golangs and Rusts of the world. It's a true shame since I do love C# for basically being a better Java.
Yeah, I mean, if I'm going to step away from the Python ecosystem and let AI manage/polyfill my dependencies, I might as well shift the whole way to OCaml/F# rather than Rust.
I think enforcing the type system is good with AI for a couple of reasons:
- (speculating) typed languages have faster/better LSPs that can be used to more efficiently modify code with tool use.
- when a human DOES need to step in and start investigating/modifying the code, the strong typing makes it much easier to get oriented within their spaghetticode
So he includes Go in a list of languages that apparently make development slow and have “a build system that fought you”, and then says Python was the solution for all that. I think he got it backwards. I have found the Python build system horrific and broken by default, while Go just works.
I think the author misunderstands what is good about Python.
One of the big strengths of Python is legibility: most developers find it easy to read and understand.
If you are planning to have humans verify the code you're using in production, to confirm it implements your intent, the readability of the code you are producing is important.
Performance is valuable, but for a lot of code, performance is less important than correctness and ease of verifying it.
If you are imagining your codebase being one where nobody but Claude reads the code, you might as well do Rust for the better performance. But I don't think a lot of organizations are doing that.
This happened before AI when a guy wrote a key tool in some random language a decade ago and the rest of us were left to maintain it. We somehow managed.
Yet it's not uncommon that such tools are the reason people still use DOS, dial-up internet, or frameworks with more security holes than lines of code, because they have been unmaintained for decades.
You can of course use any language, but here is my advice: use the language you know best, to keep your life as uncomplicated as possible when you need to understand what the LLM has created.
Remember, you are the judge of whether the code is OK. If you use assembler you might get really performant code, but can you trust it?
Of course it might be a good incentive to learn rust or go. Or challenge yourself to learn something really cool like LISP, COBOL, FORTRAN, APL or J. (just kidding...)
Previously in my life as an IC, I wrote a lot of Golang. I worked on one of the larger end-to-end encrypted video calling services.
I hated it. I was dreaming of Rust the entire time to release me from the hell of if err != nil dozens of times per day.
After hours with LLMs I've changed my tune. Five of my clients (who have excellent engineering teams) cannot get coherent results out of LLMs using Python or TypeScript.
I arrived back at Golang being a frustratingly simple, consistent, and low-thrash programming language which inadvertently made itself well represented in the training corpus [1].
My concession is that if you are going to write a median program (reading/writing files, network, db, etc.)...
Pick Golang especially if you've never used it. LLMs are extremely good at it, frustratingly so.
Why not use AI to speed up the Python runtime?
V8 showed what focused engineering can do for JavaScript, and Astral showed how much room there is to improve Python tooling. The same tricks may not apply directly, but AI could definitely accelerate the work.
> You used Python or TypeScript because […] Rust, Go, C++, and many more would give you 10–100x the performance, but you paid for it: […] a build system that fought you.
I would argue I spent more time fighting the TypeScript build system than Rust’s.
But up until recently I only used either just often enough to never remember what magic configuration needed to go in my tsconfig.json and package.json to get TypeScript to work.
Actually, I do use compiled languages for this reason. Even Opus 4.7 and GPT-5.5 will leave unassigned variables lying around in Python code of sufficient size. If you've got sufficient testing you'll exercise all paths, and I imagine a good prompt could ensure tests with coverage get added so that happens. However, I do not (yet) have such a system, and using Go/Rust helps a lot because the compile phase actually detects correctness issues.
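A minimal sketch of the failure mode (hypothetical code): Python accepts a branch that never assigns its variable, and nothing complains until that branch actually runs, whereas a compiled language rejects it outright.

```python
def describe(status: int) -> str:
    if status == 200:
        label = "ok"
    elif status == 404:
        label = "not found"
    # The fallback branch forgets to assign `label`. A compiler would
    # reject this; Python only fails when this branch executes.
    return label

print(describe(200))   # "ok" -- a shallow test suite passes happily
try:
    describe(500)      # only now does the bug surface
except UnboundLocalError as e:
    print("runtime failure:", e)
```

With Go or Rust the equivalent missing assignment is a compile error before any test runs.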
My other problem with most of the other ecosystems (ts/npm, python/uv, rust/cargo) is that they all have build-time scripts, controlled by others, that execute automatically. This is a real problem because the LLM will just install things and proceed to send your home directory through a juicer. I feel a bit paranoid now doing this, but I have a script that launches a podman container with just the source directory and a binary directory mounted (for caching), which compiles everything.
I know there's some sequence of steps I can take to protect myself, but if the LLM accidentally uses pnpm to run dev build scripts when I had the right config on npm (or whatever), I know I'm screwed. So now I do all these shenanigans with Rust (to the extent that I sometimes vendor old deps). The ideal language for me now is one with very few of these footguns and sandtraps and a tight iteration loop.
AI's are really good with Python. Quick turnaround. Easy to read. Tons of training data/examples. Many of the same reasons we wrote Python before.
Another benefit of using Python: if you subscribe to writing/vibing a throwaway version first, a Python version is 100x better than a spec.
(Disclaimer: I teach Python and AI for a living and am doing a tutorial at pycon this week, Beyond vibe coding. Am also using other languages as there are times when Python isn't appropriate)
The problem with Python and other non-strictly-typed languages is that if you let an LLM write some stuff, you cannot truly be confident that nothing has broken, even if your tests all pass. The LLM could have broken some path that only gets run in production in a very specific case. At least with strongly-typed languages you get a compiler error. In big codebases this is non-negotiable.
so it just boils down to strictness even when we're talking LLMs?
I agree with you about fast failure being a nice feature, but I also think that if you're TDDing a bunch of stuff and it fails in some categorical way, well, then the test suite was lazy.
> so it just boils down to strictness even when we're talking LLMs?
The article describes what I've been doing for the past few months - I did small python projects in the past because of the ecosystem: I couldn't possibly write a ton of the stuff required for the things I wanted to do, so I leaned into python because someone already wrote it for me. Quality of deps was mostly ok for the happy paths, but always a chore to patch the broken ones.
Nowadays I tell Claude what I want to build and I always ask it whether rust is a good choice for it. It'll pick up the right crates or choose whether it should DIY, do all the plumbing, nail all the logic, and in ~30m I'll have something very solid that would have taken me 3+ weeks of part-time evening coding in python. I think the article is right and rust is the closest to the "best language" we have for LLM coding at the moment: the strict typing and the tooling dramatically reduce the output space for LLMs, and 99% of errors have a clear, precise explanation that is actionable, and the compiler helps you a lot there too.
I think it also boils down to the fact that you cannot reliably and quickly answer "why is this arg None?" in languages like Python without figuring out the call graph and evaluating possible states and inputs/outputs. Rust makes all that explicit and forces you to handle it, which I feel dramatically cuts the time an LLM needs to spend figuring out why it's broken or what to do next. EDIT: The fact that you get memory safety on top of all this, handled by the compiler, is yet another advantage for LLMs: the logic that gets written is simpler to reason about, because if you try to mutably access the same variable in two different places, the compiler will feed this back to the LLM at build time. In other languages that would be a "code smell" or would require static analysis.
Strictness is a quality for software and a chore for humans, and of course the stricter you are at representing your logic and your state machine, the less ways a program can break. LLMs writing in rust give you the strictness without the chore part, and it's a very good deal from my point of view.
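To make the "why is this arg None?" point concrete, here's a hypothetical Python call chain where the None originates far from where it finally explodes, so answering the question means reconstructing the call graph; Rust's Option forces the answer at the boundary instead.

```python
from typing import Optional

def load_config(path: str) -> Optional[dict]:
    # Pretend the file wasn't found: return None instead of raising.
    return None

def get_timeout(config: dict) -> int:
    # Crashes here ('NoneType' object is not subscriptable)...
    return config["timeout"]

def start(path: str) -> int:
    config = load_config(path)   # ...but the None sneaked in here,
    return get_timeout(config)   # a frame away from the traceback.

# start("app.toml") raises TypeError inside get_timeout, not where
# the None was produced; in Rust, an Option<Config> won't even
# compile until the None case is handled.
```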
My anecdotal (sample size 1) experience is not consistent with this. I code fast. I refactor fast. My stuff doesn't break. But my methodology isn't the same as others'.
Python has had type hinting for quite a while, and adding validation with mypy/pyright/ty as a step in CLAUDE.md (as well as having it as part of your CI pipeline) can emulate static type checking pretty well.
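As a sketch of that setup (hypothetical function names; assumes mypy or pyright is installed and run as a check step, e.g. `mypy .` in CI), the annotations let the checker flag exactly the kind of drift an LLM introduces, even though the code runs without error:

```python
def parse_port(raw: str) -> int:
    return int(raw)

def connect(host: str, port: int) -> str:
    return f"{host}:{port}"

# An LLM "refactor" that swaps the arguments runs fine at runtime
# (f-strings accept anything), but the checker rejects it statically,
# roughly: Argument 1 to "connect" has incompatible type "int"
print(connect(8080, "localhost"))
```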
Dynamically typed languages just add one more type of bug that can’t be caught at compile time. That’s not helpful, sure, but it’s one type of bug among many.
The issue you mention, execution paths not hit by test cases, is made worse by having more complicated code. Duck-typing can help reduce the number of paths.
Static vs dynamic… I don’t see an obvious winner here.
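A small illustration of the duck-typing point (hypothetical): one code path serves lists, tuples, and generators alike, where a design with separate concrete types would multiply the paths to test.

```python
def total(items):
    # One path handles any iterable of numbers; duck typing collapses
    # what could otherwise be several type-specific branches.
    return sum(items)

print(total([1, 2, 3]))                # 6
print(total((1.5, 2.5)))               # 4.0
print(total(x * x for x in range(4)))  # 14
```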
My take is that I can never be confident that anything an LLM produces will not be broken. Since I will have to check everything it produces anyway, why not write it in a human-friendly language, i.e. Python? C and Rust may have better strictness, but the amount of boilerplate needed to set up that system takes up a lot of mental space that could be better used to architect the problem at hand.
Haskell is a good language for LLMs! Claude knows it really well, and the type system catches so many mistakes. Just make sure to tell it to model the domain in the type from the start.
Also, Haskell can be really performant and low level, while still keeping the benefits of typing. With the C foreign function interface you can really do anything in Haskell!
Have you tried writing Rust? I often hear this from people who haven’t tried it. I’ve found absolutely no issues over python and everything works 100x better
Really controversial, but my honest opinion: that's because programming languages, like their natural-language counterparts, are increasingly becoming political tools rather than purely technical ones.
I observed this through the attacks on Rust over the visible presence of LGBT people in its community.
Now, while I'm pretty much straight myself, I don't reject LGBT people and don't want to partake in identity politics.
I just want things that work no matter what background you have, yet some people attack Rust because of its inclusive nature.
And just as Linux is perceived as nerdy, geeky, and "gaming socks ready", the tokenization of things, and the attaching of political meanings to them, is quickly spreading to everything, so perhaps I'm being too general here as well.
Let's say it is not purely political, but meanings are definitely being added on top of its technical origin and nature.
Many here propose replacing Python with more performant but less familiar languages, mostly Rust and Go. But I find the argument that the AI–human interface is the most important one. A simple version of this is “no, stick with Python if that’s what you know”. A more interesting version is “use this new-found AI leeway to move up the abstraction level”, “try something more expressive and human-oriented”, “make a DSL and parser that suits the domain (and focuses the AI)”. Despite being a minority language, Raku is ideal for these aspects (especially with built-in Grammars and general kitchen-sink repartee) and works surprisingly well with most popular LLMs.
I honestly think Mojo is the dark horse in this race. That is assuming all the roadmap goals are fulfilled. We're talking about C++-like performance, Python syntax, complete compatibility with Python, designed from the start to interface with AI, compile-time metaprogramming like Zig, and all kinds of other goodies.
So yes, people can bless Go and Rust all they want. Nothing is wrong with the languages, but I agree that learning them for the sake of AI usage is probably not the best idea if one is competent in a language already.
Disclosure: Lattner is one of my programming heroes, so I might be biased.
I really wanted to like Mojo, but the more I read about it, the clearer it became that it really wasn't Python, even though, starting out, that was a major claim to fame.
There is an excellent chance it will be awesome stuff. But they did themselves a huge disservice with the initial claim about being Python compatible.
1. You don't need compilation, so you run and test faster. Compilers were primarily built to prevent human error, and only very secondarily to guard your business logic.
2. Your validators quite often need to evolve. With Python or JS, this is a pydantic edit + run. Imagine 3–4 iterations of the same in Rust?
3. Composition. The entire cycle of software changes. An agentic system takes orders from a human, reads some kind of cache and snippets, writes/combines snippets, tests it, runs it, and fixes it. This almost pushes you toward snippets the size of a function, which still need to be covered with tests. I can easily build 10 function-sized Python files and write an agent that will mix and match 3 of them into a final result. With a compiled language, you'd need to compile 10 times — or store the binaries and think about what platform they'll execute on, etc.
I love the fact that the author is questioning this. No doubt the market for your favorite language will change. 80% of languages will go away — there is no market anymore for such a big variety of languages.
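On point 2, the edit-and-run loop might look like this (a plain-dataclass stand-in for the pydantic pattern; the model and rule are made up):

```python
from dataclasses import dataclass

@dataclass
class Order:
    sku: str
    quantity: int

    def __post_init__(self):
        # The validation rule lives here; evolving it is a one-line
        # edit followed by a re-run, with no compile step in between.
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")

Order(sku="A-1", quantity=3)     # passes validation
# Order(sku="A-1", quantity=0)   # raises ValueError immediately
```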
This post resonates. I recently built a little web service to scratch an itch I've been having and after discussing the options with Claude we settled on Go, and honestly it's been fantastic. Highly performant, native threading, dead simple to deploy with containers. And I don't even know how to read or write Go.
I did go through the Go tutorial many many years ago, but it's been so long I don't remember anything. I do remember it was an enjoyable process though, and I'd love to pick it up again.
Oh man... I like Go because it is compiled, performant, and strongly and statically typed. But "fun" is not something I would say about it. The ergonomics of error handling, the lack of a ternary operator, and other stuff that 30-year-old compiled languages already had...
That sort of syntactic sugar goes against the Go philosophy. Don't get me wrong, I share your frustration, but I also see the value of consistency in their philosophy.
I'm starting to think all these languages having their own pet "philosophies" that are "totally better than X" is a shitshow: just personal preference masquerading as standards.
I know a couple of languages fairly well: C, Perl, Python, Bash. I never formally learned Go, but as a test of AI coding, I started some vibe-coded projects in Go. It worked very well: the code is minimal, there are few dependencies, and it compiles down to a static app. But most importantly, I can actually read the Go code and understand basically what it's doing. I can also use LLMs to critique the code if I'm uncertain. The big benefit of Go is the simpler language and "batteries included" standard library. This leads to fewer dependencies and fewer lines of code, which improves overall AI output. In theory, AI should be able to write better code faster in Go than in another language like Rust.
Python does have a much larger ecosystem of course, so with Go you have to develop from scratch what already exists in Python. But for smaller projects, you can also have an AI write a clean-room implementation in Go of some project in Python. So you aren't necessarily locked into one ecosystem anymore.
And in my experience, you don't even need to know the language. I have a co-worker who's basically not a programmer, but got multiple implementations of applications working sooner than our dev teams doing it by hand. You should be a coder so you can architect and orchestrate the coding, but 'language' isn't a barrier anymore.
Here's the kicker: The devs spent nearly 5 months on a solution, and it ended up being so crap it was abandoned. The multiple vibe-coded solutions were all better.
Of course language was the barrier, that's part of why it was always hard to hire people. It takes years to get good at a particular language, and most people are idiots from bootcamps who learned a single framework.
This seems sort of like asking whether a chatbot should answer you in English or Japanese. Obviously, it should use whichever language you understand. If you understand Python best, why not write code in Python?
But on the other hand, maybe you could learn some other programming language, particularly with AI help. If that's what you wanted to do anyway, it seems like a good time to learn.
This seems to assume that all there is, is systems software, tools and frameworks. Why ignore the elephant in the room - business / enterprise / line-of-business software? The case for Rust, Go, Gleam and Zig vastly changes for these versus Java or C#.
Yeah, last year I discovered that AI writes better Rust than C, so I switched to Rust, and it quickly produced good code that it couldn't manage in C.
But when I wanted to optimize, edit, and reorganize the code it was difficult, so I did a rewrite in C, and the result was lighter, faster, simpler, and less of a headache.
Cute, interesting take, but I feel like it misses the point. Specifically, this makes sense where performance is necessary. Many projects have been written in suboptimal languages because the writers didn't want to learn lower-level languages.
Still, not ALL projects benefit from such an approach, and there are times when, yes, Python is the right tool: not just for human readability, but for the other qualities that make it really good for small, iterative apps.
My take has never changed. Knowledge is cheaper than ever, but wisdom is as rare as ever. This is a great example of mistaking the former for the latter.
I find if I ask most LLMs to write a self contained script/utility, even in codebases that are 90-100% written in some other language most will default to using python for it, or sometimes bash.
Usually those kinds of utility scripts are one-shotted without any further input from me, and once they're there and doing what I need I usually don't bother converting them to whatever I would have written them in otherwise (bash would be my usual preference for really small scripts, typescript or rust for bigger utilities, I hate writing python but reading it is fine... kind of).
Claude writes java pretty well, and faster than Rust. It's a great middle ground for some projects. I've switched back from Rust to Java for some things.
I don't know why you would use Python at all except for small iterative projects. If you hate java for some reason, there's Go...
Right sorry, that's not in my wheelhouse so I didn't think of that. I should be more specific. For general backend / data processing/pipeline stuff, API servers ...
I had agents code up an app for me in Swift a while back and the entire experience was so much better than your typical Python experience. The agents took full advantage of the compiler and static typing. There were far fewer bugs than expected.
Lately I just have Claude build most things in Rust, and it's really amazing. I tried Go, but I found it wasn't as good: Rust really does feel like Python to me. That said, it still struggles with the same class of errors when building complex systems. I've tried using TLA+, Alloy, and other things but haven't found the trick yet. The best I've found is reimplementing all external systems in memory and e2e testing everything extensively; without reimplementing them the tests become unusably slow, and Claude can rewrite huge surface areas with ease. It's somewhere between mocking and literally just reimplementing the external systems.
As others have said, the main benefit with Python over Rust is library support especially with ML features. The other gap as I see it with Rust is the lack of native flexible UI support. The nice thing about Rust though is it can serve as a very fast and stable core for an app and offload specifics to TS and Python as their strengths allow, so you get the best of all worlds.
My current go-to for desktop apps is Tauri, which gives us a Rust backend and a TS frontend (usually React). Local ML features can easily be loaded as a Python sidecar. Production bundling can be a little challenging, but it seems to work well so far.
Sidenote: Golang is also an amazing language for LLM use, I generally do most of my "infra" stuff in Golang over Rust, but either work fine most of the time.
I assume this is why things like PyO3 are popping up? If so, sort of a fascinating way to compartmentalize new rust code into legacy .py code in lieu of a refactor, or at least, a way to do a staggered refactor and eat the elephant in bites :)
I still use TypeScript because I know it best. When AI makes a mistake, I can find the bug much faster. For me, the speed of writing code doesn't matter as much as the speed of fixing it.
Better question is why use any code? Generate random functions and select based on measuring the distribution of output of these functions against metrics of interest. A pure black box of instruction that is more performant than any verbose code or algorithm we could come up with, because all we select for is performance above all. Directed evolution essentially of the codebase, generated through mutation and selection, just like everything else on planet earth.
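Taken literally, that's random search with selection against a metric. A toy sketch (the target behavior and mutation scheme here are made up):

```python
import random

# Behavior we select for: outputs close to 3x + 7 on sample inputs.
samples = [(x, 3 * x + 7) for x in range(10)]

def fitness(params):
    a, b = params
    # Negative squared error: higher is better, 0 is perfect.
    return -sum((a * x + b - y) ** 2 for x, y in samples)

def mutate(params):
    a, b = params
    return (a + random.gauss(0, 0.5), b + random.gauss(0, 0.5))

random.seed(0)
best = (0.0, 0.0)
for _ in range(2000):
    candidate = mutate(best)
    if fitness(candidate) > fitness(best):
        best = candidate  # pure selection: keep whatever scores better

# `best` drifts toward (3, 7) by mutation and selection alone,
# without anyone ever writing the "algorithm" down.
```

The catch is the same as in nature: you only get the behavior you thought to measure.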
I have been wondering about a similar thing and am looking for feedback:
There are many existing, often mature, third-party software libraries or solutions that a new project could use, but which hide their internals, including how the data is organized behind the scenes*. Vibe-coding against the specific project requirements, instead of using a pre-existing third-party library, is now becoming a feasible option. The vibe-coded version may be simpler (no features beyond the actual need), more flexible (easier to add newly needed features), and the data/model behind it could be more accessible.
Looking for feedback on pros/cons and experiences with this.
* I care about the data, as it can be longer-lived than the code itself.
Isn’t the answer usually: because the same AI said Python was the right language for it?
Honestly, I am in the exact same boat, thinking: why don't I write in C if Claude is writing it? However, I chickened out, thinking that if support for ML models or LLM-based flows doesn't exist in C, it would be time-consuming to fall back to Python.
Assuming you are thinking about software architecture and looking under the hood, you are likely to be reading much more code than before. Python is really nice on the eyes and you can easily get a good grasp of what the code is doing. Plus, it's dynamically but strongly typed, so what you see in the code is usually what you get.
I think the rule of thumb is to use the tool that is right for the job and that you are going to be able to understand the output.
I think it is an interesting question what kind of programming language one needs for an era of agents. It is clear that programming languages designed for humans are not necessarily the best for AI-driven software development. I guess the qualities one would want are formal correctness guarantees and high performance. A question is whether this language is Rust, or whether it is possible to design a better new language.
I'm still not sure. Would love thoughts on this... but in this new AI world we are in, is it better to go full-stack TypeScript, or to go with proven, mature frameworks (.NET, Ruby, Django, etc.)? It seems TS is moving fast, but maybe it's time to not reach for the shiny object and stick with proven tech? Or in 5 years will we regret it?
If you're using GenAI, you should go through the process of selecting an optimal tech stack for each solution, but also take into consideration that Claude and other services probably have the most knowledge of Python, JavaScript, and TypeScript, with Go, Rust, Java, and C# following closely behind. Consider what you're building and which elements of the tech stack are optimal for your problem space.
I don't know rust at all and I've built three applications using it with Claude because it has speed and correctness built-in.
I use Typescript for 90% of the things I build. For web development I've used a number of tools, but mostly react, nextjs, or raw html/css/js. But if I were building an enterprise application I'd consider my team and whether opinionated (Angular) was optimal over flexible (React).
Each project should consider its own optimal tech stack.
Never mind cleaning up; you also have to understand the language just to judge and review the LLM's output. How else are you to separate a good design and implementation from a bad one?
1. Type safety as basic guard rails that LLM output is syntactically and schematically correct
2. Concise since you have to review a lot more code
3. Easy to debug / good observability since you can't rely on your understanding of the code. Something functional where you can observe the state at any moment would be ideal.
4. A very large set of public code examples across various domains so there's enough training data for the LLM to be proficient in that language
5. A large open source ecosystem of libraries to write less code and avoid the tendency for generated code to bloat
It's basically all the same things you look for in general. I think TypeScript scores high here but I'm curious if anyone knows of a language that fits these criteria better.
There are two working LLM axes. Critic strength: how much the language catches before runtime. Sensor strength: how good the empirical feedback loop is. LLMs benefit from both, but the sensor axis is often undervalued.
Type safety is great, but you can't just quietly disregard the benefits some dynamically typed languages provide; that would be completely ignoring that different tasks weight the two axes differently.
Systems code, performance-critical code, code where correctness across all cases matters more than exploration: parsers, compilers, network protocols, data structures - statically typed languages (like Rust) give you an edge here. The compiler's depth pays for the verbosity, and exploration is less of the work because the problem shape is known up front.
For stuff like building a web scraper, or rapidly prototyping, or exploratory scripts, something like Rust would be actively bad. You cannot poke at a live browser (you can with Clojure). Async Rust adds another layer of type complexity. The signal-to-noise for "figure out what is on the page" collapses entirely.
If I were picking a single language for general LLM-assisted work, weighted across task types, it would be Clojure (or Elixir), with OCaml as the most interesting alternative if the ecosystem were stronger.
I'm using Clojure and Elixir, and LLMs are fantastic with both. Sure, if I get to a super-stable situation then maybe I'd consider moving to Rust (or Jank?), but for now I'm just so happy with Clojure and Elixir in this new world. I'm solving new problems with fully bespoke architecture, so the flexibility is key. Clojure for business logic and most DB work. Elixir brings the actor model and hand-holding, as I'm using it for the web layer. I bet Ruby on Rails would also shine for some cases, probably most CRUD for example.
For me, I need to move fast and already knew Phoenix well, LiveView fits my use case, and websockets setup with Phoenix is very clear so switching to a two-language setup seemed better than CLJS. I could have gone CLJS re-frame and all that but it would have been more work and more unknowns. I call LLMs from Elixir also so all of the reconnect, backoffs, papercuts, shenanigans and so on, well I just know how to do this kind of thing better in Elixir. In its way Elixir is a great, like, defensive language. I was able to keep most async in Elixir and Clojure mostly synchronous. There was some pain though with bridge between the two and at times I thought I'd made a mistake. Clojure is fantastic with data and Datalog databases, so no regret. Outside world deals with Elixir, and the temple is in Clojure and Datalog.
Most developers evaluate programming languages by comparing features in isolation, never stepping back to consider the overall experience of using one.
Features are easy to talk about. They're discrete, nameable, and comparable. "Does it have Foo?" is a question you can actually answer. "What's it like to build and maintain a real system in language X for two or three years?" isn't. So people default to what's measurable.
Most devs haven't spent serious time in more than two or three languages in production. Without that contrast, the holistic experience is invisible - you don't know what you're missing, and you don't notice the pain you've learned to live with.
Language communities form around features because features make good rallying points. "We have algebraic types." "We have macros." These become identity markers. The holistic experience doesn't tribalize as cleanly - it's harder to put on a t-shirt.
There's also a sunk-cost angle: devs who've spent years in a language have every incentive to believe its features justify the investment. Honestly evaluating the overall experience might undermine that.
The irony is that the languages with the most devoted communities tend to be loved for exactly these holistic reasons - the ones that are nearly impossible to convey through a feature list. You can rave about Clojure or Elixir all day, but a curious newcomer will land on the homepage, scan the features, and walk away unimpressed: "Meh, it doesn't even have Foo. People say this is great? They clearly don't know what they're talking about."
Well, in a recent project I tried TypeScript, thinking: OK, LLMs, huge training corpus! massive adoption! an API for everything already set up! swim with the current! I tried various frameworks and so on, but for me, reasoning about things and building systems I could adapt and pivot was honestly harder than in niche Elixir and Clojure. It's not like I hate JS; I use it in LiveView all the time. And I don't mean to imply there are no problems in niche-land; you've got to be willing to do more yourself and live in a tiny world. Really, LLMs have kind of tamed Clojure for me, because so far at least they can handle the glue code and stitch libraries together pretty decently, as long as you don't get lazy with architectural choices and stay vigilant. And if I ever hire, it pretty much has to be remote or learn-on-the-job, though again LLMs reduce this pain greatly.
I think there's something key you get at in terms of the combo of dynamic environment + type safety maximising both. With a dynamic environment, the LLM can do a lot of interrogation to understand the problem space on the fly. I've witnessed agents sort out pretty complex issues through `python -c "..."`, `groovy -e "..."`, executing snippets of code with Node etc which is much less accessible if they have to compile it first. They can also inject logging code that interrogates the runtime as well (what type do we really have at line 1003?) etc which works better with runtimes that have deep introspection capabilities.
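For example, here's the sort of throwaway probe an agent can run without a compile step, dropping into the data mid-pipeline to ask the runtime what it actually holds (the rows here are made up):

```python
# Imagine this pasted into `python -c` or a scratch cell mid-debugging.
rows = [{"id": 1, "ts": "2024-01-01"}, {"id": 2, "ts": None}]

# "What type do we really have at line 1003?" -- answered empirically,
# by interrogating live values rather than reading the whole call graph.
suspect = [r for r in rows if not isinstance(r["ts"], str)]
print(type(rows).__name__, len(rows))
print("rows with a bad ts:", suspect)
```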
What you're describing is fast scripting in a dynamic language, which is genuinely useful - I agree it beats 'edit, compile, link, run' for exploration. But a Lisp REPL isn't 'dynamic language plus introspection'. A Lisp REPL is a persistent connection to a running process where the agent evaluates expressions against live state and can redefine code in place. python -c throws the process away every time; a REPL keeps it. The difference is the same as between sending one-off curl requests to reconstruct a session versus having an open SSH shell into the box. Imagine using a Playwright/Puppeteer session where you can navigate to a page and interactively palpate every DOM element, like playing a video game, directly from where the code is. Now imagine giving that power to the LLM - it doesn't need to restart, re-compile or even save anything - it just goes and explores, changing the program behavior on the fly.
The type-safety-plus-dynamism point you make is real and interesting (basically Clojure with Spec/Malli), but it's orthogonal to whether you're using a REPL or just shelling out snippets.
Golang. People trash it for being verbose about errors, but it's an extremely readable language, and it's almost like Bash, only much more strongly typed and with a very rich stdlib (so you're unlikely to need a library for a quick script).
It's more or less a perfect replacement for Python for "one-off programs" and "quick scripts". Many bonus points for not having to fight shell quotation rules and trying to remember differences between sh, bash and zsh.
In a world where AI supposedly can write in any language, Go is a much better choice than TypeScript. Imagine contemplating for more than a few seconds a choice between a simple, fast, cross-compilable language and a TypeScript -> JavaScript -> Interpreter -> JIT stack.
If you don't know Go, it's more efficient to learn it than to waste the hardware resources of thousands to stay within JavaScript.
Well, these days a small CLI program in Java (say, ls) starts up cold, runs, and terminates in ~70ms, not 1500ms, but yeah, sometimes 70ms is too long to wait for a script.
People never believe me when I say it, but I notice scripts needing 75–100ms to start. Modern hardware is ultra fast; I want my programs to make full use of it. I've got no patience for tech or people who keep insisting "it's not much, it won't kill you". Well, duh, obviously it will not, but that's not the point and never was. I want stuff to work between blinks of my eyes, and I have achieved that hundreds of times over the course of my career.
That's perfectly fine, and I totally understand people who don't want to sit and wait 70ms for their script to finish running (that 70ms is not the time it takes to start), but let's not turn a <40ms startup into 1.5s. Now, it is true that if you want to launch a minimal HTTP server in Java you may need to wait ~100ms, which may be too long for you, but is also a far cry from 1.5s.
It is, but I am still quoting what I saw before; it was not a fantasy. I don't deny it's likely better nowadays, but I remain moderately skeptical, because the JVM is still a runtime that needs to boot.
Then again, Golang has one as well, though it seems to manage to start it faster.
This is just Kotlin. Strongly typed, more concise than Java or Go (and probably Typescript), less likely to blow up at runtime than Typescript, epic tooling, plenty of public code, and a library for basically anything because JVM.
Java runs a Hello World, cold, in a packaged JAR, in about 40ms. What you've seen isn't JVM startup but programs that do a lot at initialisation (like MS Word), as many Java programs like to do (because they often expect to run for a long time, so they don't care about startup time).
I have not worked with Java in a long time, but I seem to remember that most Java programs also accrue a good number of dependencies, and some of them have their own init routines.
That adds up, fast. No idea how it is nowadays, admittedly. Maybe a ton of optimization work has been done.
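The startup numbers being traded above are easy to measure directly. A minimal harness, using Python's own interpreter as the subject (the `hello.jar` in the comment is a hypothetical stand-in; swap in whatever binary you're debating):

```python
import subprocess
import sys
import time

def startup_ms(argv):
    """Wall-clock time to launch a process that does nothing and exits."""
    t0 = time.perf_counter()
    subprocess.run(argv, check=True)
    return (time.perf_counter() - t0) * 1000.0

# No-op interpreter run; the same harness works for a packaged JAR
# (["java", "-jar", "hello.jar"]) or a compiled Go binary.
ms = startup_ms([sys.executable, "-c", "pass"])
print(f"no-op interpreter run: {ms:.1f} ms")
```

Numbers vary wildly by machine and filesystem cache state, so run it a few times and compare on your own hardware rather than trusting anyone's quoted figures.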
C. At least with Gemma 4 it does a fine job. Writes good error checking. Writes memory management. Mostly straightforward and easy to read. A lot of libraries. Runs everywhere.
1. Type checking built in
2. More concise and readable than most languages
3. Trivial to inspect while running, ability to change a running program
4. There seems to be a massive amount of lisp that it is inhaling from somewhere
5. Large amount of libraries.
This has the added benefit that even if you publish the code, nobody will be stealing it.
Edit -- I find it very useful to write tests for critical functions. This catches situations where the agent decides some interesting functionality is no longer interesting.
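A sketch of the kind of test being described: a tiny regression test pinned to a critical function, so an agent that decides some behavior is "no longer interesting" fails loudly instead of silently. `slugify` here is a hypothetical example function, not anything from the thread.

```python
# Pin the contract of a critical function with a couple of assertions.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

def test_slugify_preserves_word_order():
    assert slugify("  Hello   Brave World ") == "hello-brave-world"

def test_slugify_is_idempotent():
    # Running it twice must not change the result.
    assert slugify(slugify("Hello World")) == "hello-world"

test_slugify_preserves_word_order()
test_slugify_is_idempotent()
print("ok")
```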
Really agree. Python is popular because it's easy for humans to write. But if the coder becomes an AI, then Rust is preferable for the agent, just as Python is for humans. In addition, it brings better performance.
Great question. And I don’t think that Python, Ruby and PHP have a good answer. Scripting languages cater to human weaknesses. The 10-100x perf cost was never really worth it but now it’s impossible to justify.
For me, whether it's AI or my own handcrafted artisanal code, the choice of language comes down to what has the least friction. This means I turn to vite/react for a lot of frontend requirements, and that the backend will be in nodejs or python, because those are easier for me to debug than writing an equivalent application in C++ or Rust.
The article applies to a narrow case of a totally green field application that's going to be completely vibecoded. This is the only case where you reasonably can be indifferent to what the language is, and so you can abandon familiar Python and go with unfamiliar Rust. (If you _are_ familiar with Rust, the point of the article is moot.)
This "fair weather development" approach feels very risky if that application is going to be exposed to any serious usage. There WILL be a situation when things break and the AI will be powerless to fix it (quickly) without breaking something else in a vicious loop. There WILL be a situation where things work fine and tests pass with 3 concurrent users but grind to a complete halt with 1000 because there is something O(N^2) deep in the code. And you NEED a human to save your day (which also requires a proper architecture for that to be possible in the first place). If you don't plan for this, and just hope for the best, then you are building nothing more than a toy. And if you do plan for this, then it matters again what the language is, and whether your team is proficient in it.
Or maybe I'm too old-fashioned, or too far behind the state of the AI art...
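The "works with 3 users, dies with 1000" failure mode mentioned above is easy to reproduce. A minimal sketch of the classic hidden O(N^2): membership tests against a list inside a loop, which a reviewer (human or AI) can easily wave through.

```python
import time

# Quadratic: `x not in seen` scans the whole list on every element.
def dedupe_quadratic(items):
    seen, out = [], []
    for x in items:
        if x not in seen:          # O(N) scan per element -> O(N^2) overall
            seen.append(x)
            out.append(x)
    return out

# Linear: identical logic, but `seen` is a set with O(1) lookups.
def dedupe_linear(items):
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

data = list(range(3000)) * 2
t0 = time.perf_counter(); slow = dedupe_quadratic(data); t_slow = time.perf_counter() - t0
t0 = time.perf_counter(); fast = dedupe_linear(data);    t_fast = time.perf_counter() - t0
assert slow == fast
print(f"quadratic {t_slow * 1000:.0f} ms vs linear {t_fast * 1000:.0f} ms")
```

Both pass any correctness test on small inputs; only the load test tells them apart.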
For the utilities I write it is faster to iterate without having to compile. When I get to the point where I'm done adding changing features, and performance is an annoyance I can always ask the AI to "rewrite this in Go". (I've never gotten to that point.)
Devs still have to maintain this code. The Python devs can definitely get the LLM to write (some kind of) Rust, but when it goes wrong and you hit a wall with the LLM, they will have to learn Rust, which might take a while. This sounds like a bit of a project risk.
> The strongest argument for Python and JavaScript was never the languages themselves. It was the ecosystems
That's already a glaring mistake. People could say Perl's CPAN is great. Well, it did not save Perl from declining over the last 20 years.
> The Python ecosystem is increasingly a Rust ecosystem wearing a Python hat.
Without statistics to prove this, this claim is useless.
Also, depending on Rust isn't that strange if a language is based on ... C. The only way I would disagree with such an argument is if Python were written in Python. But since it is a layer over C, just like Ruby or Perl, the argument for using Rust here is simply no different from using C. Perhaps Rust is better than C, but it is not fundamentally different. Whether Python is written in Rust or C makes no functional difference here.
As for AI becoming our new Overlord: I honestly do not want to depend on US mega-corporations. I am not disputing the fact that AI has objective use cases. I am objecting to this herd mentality of everyone putting an AI chip into their brain now.
Damn AI slop zombies everywhere - it's like in the old B movie "They Live". But with less entertainment value than that. If they chew bubblegum then it is to slop up everything, not to kick ass.
This point only makes sense if you ship AI code without reviewing it. And if you're shipping AI code without reviewing it, you're going to run into much bigger problems than Python performance limitations.
2) The corpus for the sort of applications I build is likely larger for Python than it is for C++ and Rust. Bigger corpus == more training data == better generated code.
3) The bottlenecks in the applications I run aren't in the execution of the code; they're in database/network latency.
4) I don't get anything extra for pushing Rust or C++ over Python.
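The latency point in 3) can be made concrete with a toy illustration (the 50 ms figure is an arbitrary assumption standing in for a real database/network round trip):

```python
import time

# Toy model of an I/O-bound request handler: if ~50 ms of every request
# is DB/network wait, interpreter overhead is noise in the total.
def handle_request():
    time.sleep(0.05)                          # simulated DB round trip
    return sum(i * i for i in range(10_000))  # the actual "app logic"

t0 = time.perf_counter()
handle_request()
total_ms = (time.perf_counter() - t0) * 1000
print(f"request took {total_ms:.0f} ms, dominated by the simulated I/O wait")
```

Rewriting the `sum` loop in Rust would shave microseconds off a request that spends tens of milliseconds waiting.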
I dislike Go but I have to admit, it's a great language for AI generated code. Simple enough, it compiles quickly and it performs meh-well enough for most applications.
One of the reasons I dislike Go is because it's easy for most engineers to write really low grade code with it. But AI agents would probably not write the best code in any language anyway, so not much is lost.
AI doesn't really write code for me, but I do use it to brainstorm and ask questions. I do not use Python, though; I have never been a fan of the language. I still think Python is a perfectly serviceable language, but it solves no (important) problem I have ever had better than other languages do.
I can see why Python is appealing to many people, and I applaud Guido for all the work and oversight over the years, but Python lacks a lot of the things I like in a language.
A somewhat contrarian/pessimistic view: The hardest thing in any future of LLM generated code is going to be the verification step, and especially types of verification that require humans which are going to be the most expensive.
Therefore the "best" language is going to be whatever makes it easiest for humans to detect bugs, bad design, or that the "wrong thing" has been developed.
LLMs just churn out non-idiomatic slop in any language.
It doesn't matter if the 800-line if statement is able to use pattern matching.
There's been a lot of progress on making coding agents able to solve problems they can easily evaluate in a closed loop; we desperately need something similar for controlling complexity and using relevant abstractions.
The (well-known) Sapir–Whorf hypothesis (if you don't know it, look it up) is often invoked for natural languages, but there's a pretty direct analogue for programming languages: the language you "think in" while solving a problem biases which abstractions and idioms you reach for first.
If you force an LLM to first solve a problem in a highly abstract language (Lisp, APL, Prolog) and only then translate that solution to C++ or Rust, you're effectively changing the intermediate representation the model works in. That IR has very different "affordances", e.g.:
- Lisp pushes you toward recursive tree/list processing, higher‑order functions and macro‑like decomposition. (some nice web frameworks were initially written in LISP, scheme, etc...)
- APL pushes you toward whole‑array transforms, point‑free pipelines and exploiting data parallelism. (banks are still using it because of performance)
- Prolog pushes you toward facts/rules, constraint satisfaction, and backtracking search. (it is a very high abstraction but might suit LLMs very well)
OK, and when you then translate that program into C++/Rust/Python, a lot of this bias leaks through. You often end up with:
- Rule engines, constraint solvers, or table‑driven dispatch code when the starting point was Prolog.
- Iterator/functor pipelines and EDSL‑like combinators when the starting point was Lisp.
- Data‑parallel kernels and "vectorized" loops when the starting point was APL.
In principle, an LLM could generate those idioms directly in C++/Rust. In practice, however, models are heavily shaped by their training distribution and default prompts. If you just say "write in Rust", they tend to regress towards the most common patterns in the corpus (framework‑heavy, imperative, not very aggressively functional or data‑parallel), even when the language would support richer abstractions.
By inserting a "thinking" step in a different paradigm, you bias the search over solution space before you ever get to Rust/C++. That doesn’t magically make the code better, but it does change which regions of the design space the model explores.
Same would also be true for python which is already a multi-idiomatic language.
So it might be a good idea to learn a portfolio of different languages and then try to tackle a problem with a specific language instead of automatically using python/go/rust because of performance.
Something to consider...
p.s. how would a problem be solved if the LLM had to write it first in Erlang? Would it then be automatically distributed?
p.p.s. the "design patterns" of the GoF automatically come to my mind; naming one might be a good hint to give the LLM.
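The multi-idiom point about Python can be made concrete. Here is the same computation in two idioms; which one a model emits depends on the "intermediate representation" it was nudged to think in:

```python
from functools import reduce

# The same computation, two idioms.
words = ["alpha", "beta", "mu", "delta"]

# Imperative default (the bulk of the training corpus looks like this):
total = 0
for w in words:
    if "a" in w:
        total += len(w)

# Lisp-flavored pipeline (what a functional "thinking step" leaks into):
total_fp = reduce(lambda acc, w: acc + len(w),
                  filter(lambda w: "a" in w, words), 0)

assert total == total_fp
print(total)  # 14
```

Both are valid Python; the biasing happens in which region of the design space the model explores first.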
I'm using coding tools to build a complex media-intensive application. The approach I'm taking is to build a _reference implementation_ in Python, which is in its design specifics, constrained to use patterns which transliterate into the actual deployment targets (iPadOS/MacOS/Web).
Why start with Python?
Because I can read it, reason about it, and run it, trivially, which are Good Things for the reference. I intend to have multiple targets; I'd rather relate them to a source of ground truth I am fluent in.
For what I'm doing, there is also a very rich set of prior art and existing libraries for doing various esoteric things—my spidey sense is that I'm benefiting from that. More examples, more discourse.
I'm out of the prediction business and won't say this is either a good model for every new project, or, one I will need in another N months/years.
But for the moment it sure feels like a sweet spot.
Ask me again though, after the reference goes gold and I actually take up the transliteration though... :)
You still need to look at the code one day, so I'd say C++ would still be a preferred target language, even for AI. I know I hear a lot about Rust, but I'm still getting the idea that it's a niche language overall. I know people love it and point out its advantages, but sometimes good enough is good enough (i.e. C++).
Let's go through some of the arguments, in no particular order:
> Klabnik vibe-coded a new language in Rust, therefore Claude + Rust = Good.
I argue the inverse -- Rust, being an ML-family language, is well suited for parsing and language design (I know! Shocker!). In more moderate terms -- ML-style languages are good for parsing, interpreting and compiling code. Claude is not the magic here -- ML is.
I would also add that I've had decent success vibe-coding+human-coding Haskell (contrary to the article). My experience is that if I can hand-write a rich set of types (blessed be IxMonad), I can throw Claude to fill in the blanks for the implementations. If I can design the data structures that make the program tick, bridging them is something Claude is awesome at. Again, no surprise -- it's intern-level work.
The key distinction between C, Zig and Rust is that Rust is designed around types. C and Zig are more memory-oriented -- they really see most of your program as flat memory, and you can kind of shoehorn a little bit of data layout into that flat memory. While this offers a large amount of flexibility, this philosophy isn't well suited for proving out correctness. But again -- this doesn't mean they don't have a spot.
When I was a junior at Tesla, I used to joke that senior staff had VMs in their heads, because that's really how you analyze C programs -- you try to execute them in your head, with interesting inputs, but that's about it. Claude's head-VM is quite fuzzy and often makes errors.
With Rust, if you design your type system, you prevent yourself from making dumb mistakes. Swap out "yourself" with Claude here and it's the same story.
I've yet to see Claude design really nice type systems, fwiw.
But the point is -- Claude is the enemy of beauty and correctness -- it's up to the SWE to design a type system that prevents it from doing damage. To be clear, I obsess over type systems personally, but that's not the only way -- incredibly rich, comprehensive type systems, fuzzing, Antithesis, and proptesting are all things you can do to minimize the impact of slop, and those are all valid things to do.
---
> Code is not written by humans therefore it doesn't matter that you don't know Rust.
Wouldn't say this was explicitly stated, but I definitely smelled this undertone throughout the article. If you don't understand the language you're reading, how can you tell whether the code in front of you is correct? If you have a systems engineer sitting across from you to clean your PRs up, you can pass that responsibility onto them, but what about when they give their two weeks?
If all you know is Python, chances are you're going to make better software in Python than in Rust. Stick an `Arc<Mutex<T>>` everywhere and, as a matter of fact, chances are your code will be slower.
If you want to learn Rust, please join us! But if all you're trying to do is vibe-code better code -- do it in the language you know and can actually debug when shit hits the fan.
---
> Anthropic C Compiler
It is impressive that Claude is awesome at taking existing code and rewriting it, this much is certain, but I'd like to repeat the exact same rhetoric that many have given -- rewriting =/= original authorship. Awesome, we have a C compiler, but we already had one, and we just rewrote it? Seems like a bit of wasted electricity.
To build on top of this, I am really happy that Bun is exploring Rust, and the Claude rewrite is truly impressive, but quite surprising at times, preserving strange anti-patterns (my name being said anti-pattern, teehee): https://github.com/oven-sh/bun/blob/ffa6ce211a0267161ae48b82.... It's hard to determine why Claude decided this -- I assume a really strict input prompt.
Do note that the current stage of that PR is much better than what it was at the state of that commit, and obviously Jarred isn't merging blind slop, but that is still human-driven by someone who has an understanding of their product.
My bet is actually that _rewrites_ of already-functioning, well-tested code, are likely to be more common as time progresses. I think that's what Claude is really awesome at, and I think Claude can often achieve 80-20 improvements through rewrites. Again, Claude alone will not be a silver bullet -- it won't generate data-oriented programs if the source material wasn't data-oriented. It won't optimize for cache coherency, if the source didn't, but moving from Python to Rust alone, with more-or-less the same code structure, you're likely to see improvements by virtue of common operations being memory-coherent and avoiding the GIL and so on.
---
> A C compiler written in Rust used to be a graduate thesis. It isn’t anymore.
Come on, this is disingenuous -- a simple C compiler is a 1-day long project. LLVM is a graduate thesis (and for good reason). Copy-pasting prior-art is academic dishonesty and Claude does a lot of that.
---
For transparency: I work with Noah.
EDIT: Wanted to add that not a single line of my comment was AI generated.
It's strange to me that this blog post was written in English. If AI is available, why aren't we all communicating in Lojban? [0] It's an obviously superior language. What does it matter that many people already communicate in English and much of computing depends on that language? AI doesn't care about that. Plus, if you ever need to edit Lojban without AI, you should be able to pick it up in a few weeks, right?
[0]: https://en.wikipedia.org/wiki/Lojban
I don’t think this holds at all, because the idea with a lot of vibe-code workflows is “humans never need to read the code” which would mean that human dev ergonomics are irrelevant. Here, the blog post is still clearly targeted at humans, so human reader ergonomics are still relevant.
Should've posted to moltbook
Yeesh, is "never reading the code" really the modus operandi we want from AI?
Microsoft, for all their warts, at least had the compassion to call their AI product "Copilot", suggesting we have some residual agency in whatever it is that it produces.
It's clearly not the MO that capable engineers want, but it's the MO that is getting funded right now.
Reading code carefully is harder than writing code unless the code is written consistently and clearly in a way that is idiomatic to the reader. And there's way more code to review now, but companies aren't scaling up the number of skilled engineers on staff. So in practice, never reading all of the diffs is the MO that will be built into code we depend on.
AI has not been trained on Lojban. And furthermore, this article is almost certainly primarily intended to be read by humans directly.
I understand you're being facetious, but I'm not sure what point you're trying to make about programming languages in comparison.
It’s funny that in your reply “this article is almost certainly intended to be read by humans” you made what is the best case to keep writing code in Python even with AI.
Sure, if you are going to have an AI do all your coding and maintenance you can use whatever language it’s best at. But if you want to participate in the writing, debugging, and maintenance, it has to be in a language that a human can read. I’m not saying that Rust or Go is unreadable, but I know I am better at Python personally and am going to keep using it until the speed penalty matters to my project, and then maybe I’ll let an AI rewrite the whole thing in a faster language.
Did you read the article? I think you're arguing against a strawman.
I did read the article and I’m not arguing against a straw man. If you’re going to let an AI agent do everything for you then go ahead and use Rust (or any language with a strong type system that benefits agents).
But if I’m participating then I’m going to use Python because it’s easier to read.
If there’s anything that I’m arguing against is the author’s claim that the ecosystem of libraries (regardless of whether they are a wrapper) and readability don’t matter anymore. I’d say that in a lot of smaller teams it still matters. We’re not all using AI to ship slop. A lot of us are using AI to work on our ideas for our hobbies or for research. And it’s not fulfilling unless I get to be involved in the process.
But it's not talking about people like you. It's like getting mad at someone suggesting selling their car for a self-driving car, but you ride a bike everywhere. Take a breather and recognize that not every article is personally meant for you or your situation.
And this isn't even a defense of the premise. I'm not using AI to generate assembly code, because I don't know assembly.
I'd argue that while Rust has a high barrier to writing code due to lifetimes and other type constraints, its still quite easy to read.
(Kind of the inverse of perl)
While it's a lot easier to read than Perl, it's still not as easy as something like Python.
I’ve always found Ruby to be way more readable, what keeps me using python is the depth of libraries is unmatched.
So unless you’re into burning tokens having AI generate untested libraries, I’d stick to using the most idiomatic tool for the problem you are tackling.
So, it's really interesting. We've started moving away from Python libs because 25% of the OSS is out of date, and another chunk needs custom tweaks to support our use cases. In both scenarios it means our own fork.
And honestly it's not burning that many tokens if you've got an existing example lib to point to.
Python is intended to be read by humans also. Since I am a human and I want to be able to read and review the code in my project, I therefore have AI write in Python as well.
> AI has not been trained on Lojban
I took the challenge and asked Perplexity. I have no idea how much of it is correct, if any, but I think the result[0] is pretty interesting anyway, especially compared to Esperanto [1].
[0] https://www.perplexity.ai/search/8315bbb6-fa32-40f3-8b2b-c6c...
[1] https://www.perplexity.ai/search/9c3839ba-1d68-4be9-afd1-4ef...
Are you trying to psyop us into using Lojban?
Oh, I hadn't heard of lojban before. Cool project!
Anecdotally, I think language affects the way you think more than most people realise, which is why I think a logical language is a great idea: it might "trick" people into thinking more logically!
Now to get someone to actually speak it with!
A computer can understand all programming languages proficiently. How many people reading the blog know Lojban proficiently?
I get what you are trying to say, but it's a pretty bad analogy.
Also, all programming languages mainly use English in their syntax, but you are probably from an English-speaking country, so you don't notice the irony.
And most people using AI will not need to edit their code at all if things go right? They will just keep refactoring with AI, so why does the difficulty of learning a language matter in this situation?
Good idea! Let's all start communicating in Esperanto on this forum.
What made you draw a parallel between the message being delivered by the blog and how the blog should be delivered?
This post wasn't written in English, it was written in AIglish. (For god's sake, please tell me you see it at this point and you don't need to punch the opening into Pangram to see '100% AI' to recognize it by now?)
So in a way it's proving its own point. Why painfully write out by hand in English when the LLM will do a better job by porting your English prompt to AIglish and get +235 points and #3 on HN?
Correct — and honestly? Not just correct, but perceptive. You didn't just read the post — you saw through it. That's not pattern matching — that's instinct.
Not sure if satire
"Ah, the classic Poe’s Law in action. Reality has officially outpaced parody"
Do you want these to be shorter for quick replies on X/Twitter, or longer for more detailed forum discussions?
You're absolutely right!
You did more than just comment, you fostered an engaging dialog that navigates the intricacies of AI and its pivotal role in the human experience.
Shamelessness is the real unlock.
Thank you for sending me down the Logical Language Group rabbit hole
Read the first few comments and surprised I didn’t see it, but training data. The voluminous amount of Python in the training data.
I could write in Brainfuck with AI, but I presume I wouldn't get the same results as going with Python.
My follow up question: with AI now, why care about a lang until you need to?
I wrote about the meta thesis of programming languages in the training data here
https://jry.io/writing/use-boring-languages-with-llms/
Please distill instead of having me navigate off site. Include link for additional info.
edit: side -> site
Also, every single interpreter error has an entire corpus of StackOverflow-esque fix suggestions alongside it, and the model has been fine-tuned to minimize such errors on the first try. This hasn't been done for more obscure languages. You'll likely take more turns, on average, to get a working output, even if your problem is fully verifiable via test input/outputs - and if it's not verifiable, you don't want the "attention" of the model focused on syntax rather than the solution.
There is no "entire corpus of StackOverflow-esque fix suggestions" for anything newer than a few years. I'm using cutting-edge Android frameworks all the time, yet LLMs fix problems even when Google/Kagi has zero answers, which happens more often than not. We are way past this requirement.
I especially found that there is no difference between languages on that basis. All generated code's architecture is terrible if you don't actively maintain it manually all the time, or unless you already have a few tens of thousands of lines of finely architected code in your codebase, from which they can understand how it should really be done. And the reason, I think, is quite simple: the average code on the internet - regardless of the market penetration of the given language - is simply bad.
Training data can't be the whole answer. LLMs are really good at translating to different programming languages. This makes sense, given that they are derived from text translation systems. I'm getting great results in languages with comparatively small bodies of freely available code. The bigger hurdle is usually that LLMs tend to copy common idioms in the target language and if it is an "enterprise-y" language like Java or C#, the amount of useless boilerplate can skyrocket immediately, which creates a real danger that the result grows beyond the usable context window size and the quality suffers.
Very true.
I have to steer models hard for C++. They constantly suggest std::variant :P
is that bad?
Godbolt got a 2x speed improvement switching from what he thought was a good, fast impl to std::variant:
https://www.youtube.com/watch?v=gg4pLJNCV9I
> LLMs are really good at translating to different programming languages.
...for which ample training data is available.
> This makes sense, given that they are derived from text translation systems.
...for languages with ample training data available.
Yes, LLMs can combine information in novel ways. They are wonderful in many respects. But they make far more mistakes if they can't lean on copious amounts of training data. Invent a toy language, write a spec, and ask them to use it. They will, but they will have a hard time.
That might be an argument for not using a novel homebrew programming language. But it's not an argument against, like, any top-100 or even top-1000 programming language, which will be adequately represented in the training data.
It is if more training data results in better performance. In which case, GP will continue to use the language that is likely to have the most training data available.
> It is if more training data results in better performance.
Sure. But given the relation with translation systems, it seems far more likely that there are diminishing returns to larger volumes of training data.
They are also good at generating plausible code -- the kind that has no obvious bugs in it. I wouldn't be surprised if humans in the loop over-report success with these tools. Combined with decision fatigue... it's not a good recipe for humans making good decisions.
An experienced Rust developer is going to be in a better position to drive an agent to generate useful Rust code than a Python programmer with little or no Rust experience. Not sure I agree with the author that everyone should just generate reams of Rust now.
At least if you get paged at 3am to fix the 300k-line AI-generated Django blog, you'll have a chance at figuring things out. Good luck to you if Claude is down at the same time. But still, better than if it were in Rust and you had no experience with that language.
I have a language I wrote for processing data pipelines. I’ve used it for years, but I can count the number of users on one hand. I wrote it partially to learn about writing a scripting language, partially because Nextflow didn’t exist yet. I still use it now because it works much better for my way of processing data on HPC clusters.
The only code that exists on the internet for this is test data and a few docs in the github repo. It’s not wildly different from most scripting languages, from a syntax point of view, but it is definitely niche.
Both Codex and Claude figured it out real fast from an example script I was debugging. I was amazed at how well they picked up the minor differences between my script and others. This is basically on next to zero training data.
Would I ask it to produce anything super complex? Definitely not. But I’ve been impressed with how well it handles novel languages for small tasks.
In higher dimensional vector space, yes it can.
Dimensionality gets bizarre in 1000-D space. Similarity and orthogonality express themselves in strange ways and each dimension codes different semantic meaning.
Therefore, if the training data is highly consistent you are by definition reducing some complexity and/or encoding better similarity.
In Go, a statement like `result, err := doSomething()` is almost always going to be followed by `if err != nil { ... }`. In a highly dynamic language you may not get that error check unless explicitly asked for.
> In a highly dynamic language you may not get
Being dynamic is secondary. A language that uses exceptions for errors doesn't need to surround every call with a try/catch if the code doesn't need to; you can have a top-level handler that catches everything.
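The top-level-handler pattern looks like this in Python (a minimal sketch; `parse_port` is a hypothetical example function):

```python
# One boundary handler covers the whole call tree, so individual
# call sites don't need their own per-call error checks.
def parse_port(raw: str) -> int:
    port = int(raw)  # may raise ValueError; no local check needed
    if not (0 < port < 65536):
        raise ValueError(f"port out of range: {port}")
    return port

def main(argv):
    try:
        print(parse_port(argv[0]))
    except (ValueError, IndexError) as exc:  # single top-level catch
        print(f"error: {exc}")

main(["8080"])   # prints 8080
main(["99999"])  # prints error: port out of range: 99999
```

The trade-off versus Go's explicit `if err != nil` is that nothing in the source marks which calls can fail, which is part of what the parent comment is pointing at.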
It's a little bit old, but challenge you opinions about what matters for LLM agentic coding:
https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
> Training data can't be the whole answer.
Absolutely correct. Anthropic showed that ~250 poisoned examples can backdoor an LLM -- roughly independent of model size.
That would matter if we were asking the AI to generate code open-loop: someone probably already wrote something close to what you asked for in Python. But if the agent generates code, tries to compile it, sees the detailed error messages and acts on those messages to refine the code, it's going to produce a higher quality result. rustc produces really good diagnostics. And there's a lot of Rust code online now, even if there's so much more Python and Javascript/Typescript.
LLMs don't actually semantically parse the error messages. They will generate the most likely sequence resulting from the error message based on their training data, so you're back to the training data argument.
Perhaps the training data about what compiler diagnostics mean is particularly semantically rich training data.
Of course they do, error messages get tokenized and put into the context window just like anything else. This isn't a Markov chain.
No. If that mattered, you'd write everything in HTML and CSS, because those have way more training data.
Those are not programming languages.
WASM then.
> I could write in brainfuck with ai, but I presume, wouldn’t get the same results than if going with python.
https://esolang-bench.vercel.app/
and this sums it up right here.
The conclusions seem overly broad. Just because these languages are Turing complete doesn't mean they aren't massively hampered by expressiveness and amount of batteries included. To attribute all of this to training data memorization is premature.
Oh this is a very damning paper. Using simple languages from their definitions alone is a great proxy for studying truly out-of-distribution reasoning. Also just for following simple rules/instructions correctly, because a simple enough language is practically just a grammar. This paper is terrible for anyone who wants to make the case that models can do those things well.
To the extent today's AI can reason, add this to the pile of evidence that you definitely need a harness. Counter to what you hear, that seems true for SOTA and frontier models, not just toy models. Lots of people were saying many years ago that someone should test exactly this, because it's obvious. Someone at a megacorp probably did try and decided not to publish because they thought it was bad optics.
"I could write in brainfuck with ai"
Well, go on and do the experiment! Perhaps LLMs can write code as well in BF as in Python, but I don't recommend it, because hallucinations are really hard to notice in BF.
If you are going to worry about high-level computer languages and AI, you'll have to start by getting to grips with machine code and assemblers and all that. Once you know how, say, some Python code ends up being processed by your laptop's CPU(s), then you'll know when BF might be best!
> Frontier models score ~90% on Python but only 3.8% on esoteric languages, exposing how current code generation relies on training data memorization rather than genuine programming reasoning.
https://news.ycombinator.com/item?id=48100433#48102985
Just use Go. LLMs have seen a ton of it, they write it well, it compiles practically instantly, and it has all the advantages of a typed compiled language.
I created a big Python codebase using AI, and the LLM constantly guesses arguments or dictionary formats wrong. Unit tests and stuff like pydantic help, but it's better to avoid that whole class of runtime errors altogether.
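As a sketch of what the typed alternative buys you: in Go, a misspelled struct field is a compile error, and a guessed-wrong payload key can be rejected at decode time. (The `Config` type and the typo'd `"prot"` key below are made up for illustration.)

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config is a hypothetical payload. With a struct, referencing a
// misspelled field is a compile error rather than a silent KeyError later.
type Config struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

func parse(raw string) (Config, error) {
	var c Config
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields() // reject guessed/wrong keys up front
	err := dec.Decode(&c)
	return c, err
}

func main() {
	_, err := parse(`{"host": "localhost", "prot": 8080}`) // typo: "prot"
	fmt.Println(err != nil) // true: the bad key is caught at decode time
}
```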
That’s what I’ve settled on. Python is so flexible that there are a million ways to organize code, pass arguments, etc. If you already have a code base to work from, an LLM can make new code in the style of the old code. But a fresh project? Once you get to a certain level of complexity it quickly can turn into write once, read never code (even if the code is passing tests).
This is where I’ve found that a compiled, strongly typed language (any one really) works well with an LLM. With the little bits of friction that is part of writing a language like Go, the LLM can produce pretty decent (and readable) code.
TIMTOWTDI strikes back.
But what is the selling point for Go? I get that it is allegedly hailed to be a simple language with basically no batteries included, but why is that a selling point? Does Go excel at anything no other language does?
I think that’s sort of the selling point, no? It’s really boring. It has like ~10 keywords, compiles insanely fast, and has a concurrency model that’s easy to use and read. LLMs are great at using Go tooling to sanity check along the way. It’s easy to write shitty Go but it’s really pleasant to work with if you find those things compelling.
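The concurrency model really is compact; here's a minimal fan-out/fan-in sketch (the `squareAll` helper is just illustrative):

```go
package main

import "fmt"

// squareAll fans work out to goroutines and collects results
// on a buffered channel -- the whole story fits in a few lines.
func squareAll(nums []int) []int {
	ch := make(chan int, len(nums))
	for _, n := range nums {
		go func(n int) { ch <- n * n }(n)
	}
	out := make([]int, 0, len(nums))
	for range nums {
		out = append(out, <-ch)
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // order may vary across runs
}
```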
don't you worry about garbage collection?
If you were using Python, then probably not.
haha exactly. I’m coming from Swift, and I don’t want to go back to manually releasing objects like I used to in ObjC, let alone reason about lifetimes.
For one thing it’s statically typed and has many fewer foot guns than Python, so the llm-produced code is more likely to do what you expect.
Go is statically typed but the type system leaves much to be desired.
Go’s benefits are primarily around simplicity, readability, and concurrency.
Performance? Second only to rust and other lower level langs. Surely you don't need this spelled out for you...
Not just performance, but static typing and prevalent in the training data/easy for LLMs to reason about.
Of course, your response admits, "second to Rust", which I am guessing is an unspoken question in the grandparent's mind.
If performance is the main difference, whatever that means, then basically Go should be reserved for when Rust and other lower level langs cannot be used due to some other constraint? Are we mainly talking about performant Web backends?
Say I am building some app that I know will be CPU-bound, why choose Go over say... Swift?
Java and C# are there and faster.
No batteries!? Go has a huge stable standard library no other language even comes close to. Built in tooling for unit testing, performance testing, debugging, code formatting, package management, etc. And most go binaries can be compiled statically so libc is not even a dependency. Golang is the definition of batteries included.
Why? Go has a GC, is basically incompatible with C and very limited overall
Go's limited syntax is actually a feature here, because it stops the LLM from trying to be too clever
LLMs use `any` types, `recover`, `init`, and other weird warts of golang
rust is a better language in every way for LLMs: more precise typing, better compiler errors, fewer performance footguns, no race conditions, clear interface definitions and implementations
golang is easier for humans to quickly get productive, but the language is lacking in helpful features for an LLM
CGO exists.
Why use Go when you can use Rust?
Why? I have the same question.
I’m heavy into rust and never really use golang, but one big benefit of go over rust is compile times are significantly quicker, which could be more fun if you’re running CI checks 50 billion times
Doesn't Rust have long compile times? Does Go suffer from the same problem?
1. Amount of Rust training data isn’t as much as Go.
2. Golang syntax and style is very verbose yet simple. There aren't as many options, nor as much programming-language-to-domain mapping needed, as in Rust. That leads to needing a less sophisticated LLM to spit out Golang successfully and efficiently.
In short, compile times and a more full-featured stdlib
Yup, adopting Go is exactly what I've done too.
Typed, garbage collected, fast to compile and run, stdlib that includes just enough to work out of the box. I really don't like writing it by hand but for the LLM it's perfect.
The LLMs are actually worse at generating Python than other langs, hypothesized due to quality of training data lol.
I still read the generated code, so I'm not quite willing to give up on Python yet though.
Admittedly, I have very little experience with LLM-assisted Python. However, based on the severe degradation in output quality I have seen from an LLM working with plain JavaScript as opposed to TypeScript, I can't imagine choosing to start a project in Python at the moment.
It does seem like LLMs write better Python when told to use type annotations, especially when coupled with a linter.
I've been coding in Python for about twenty years (alongside, at different times, a dozen or so other languages).
I find that Claude can write great modern Python more or less out of the box, with minimal style guidance from me. I do have to nudge it from time to time to not do silly things, but overall it's really rather good.
There's enough training data on the other langs.
Surprisingly, LLMs are actually much worse at reasoning in Python than other common programming languages for agentic coding tasks.
Data here: https://gertlabs.com/rankings?mode=agentic_coding
Hah, I was just thinking that Python likely has a vast ocean of training data, but it's likely of lower quality, since much of it is written by beginners and those who aren't primarily programmers.
There's a broken idea that AIs know Python because they're written in Python.
Not how any of it works.
While recent models are capable of generalizing to any language at this point, I do think there are weights from their pretraining corpus that still leak through into how they create their responses. We observed similar language performance patterns across models from different providers, btw.
Reminds me of the time I asked Claude to write some Wordpress code for me. The results were…rough.
That was the hardest part of learning PHP, all the code examples online were just awful.
All my vibe coded projects (personal) are Go backend services, with Typescript/React frontend. And my thoughts were based on similar things. Like why I wouldn't use PHP for that, either.
My standard joke here:
Q: Say, what does this Python code do?
A: Nobody f&%^ing knows.
That’s Perl.
Cool to see my hunch backed up by data. Python is a scripting language with OOP bolted on, which means it doesn't have the stylistic consistency other languages do; things tend to look like PHP: a collection of various scripts that invoke one another.
I would love to see how they do with functional languages and especially Lisps here. I've noticed pretty good performance with Emacs Lisp relative to overall model strength, but I haven't used LLMs to write application code in any such languages.
It would also be interesting to see how Python compares to other languages in its niche (Ruby, Perl, Raku).
Thanks for putting this together! It's interesting.
That's a good idea. Would you rather see Lisp or Scala? Any interest in Prolog? We are trying to be selective to keep the data concentrated, but we will eventually add a couple more, most likely to sample different programming paradigms.
This surprised me, but I can understand it - Python sucks in many ways lol.
Hey they said it had a lot of training data, not necessarily high-quality python code training data.
Huh. This surprises me. Digging in, it looks like it comes down to interpreted + dynamically typed vs. compiled + statically typed.
TIL. If I were to start a truly vibe-coded project, Go would have a significant leg up.
and yet dynamically typed elixir wipes the floor with go.
https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
The LLMs are generally still pretty bad at (deductive) reasoning. IME they go along more with things like variable names and comments than the actual program logic (it would be an interesting experiment to compare an LLM's understanding of three identical programs with different identifiers: one with normal identifiers, one with obfuscated identifiers, and one with deliberately misleading identifiers). I also think this particular comparison comes down to typing, which helps keep the LLM's reasoning from going astray.
When we reason we need to typically propagate the constraints to arrive at a solution to these constraints. I think the best language to reason in could be something like Lean, which allows both constraints and actual code to be expressed at the same time. Although this might not be the case for current LLMs, as I explain above.
I built a programming language, and LLMs can code phenomenally well in it.
I don't think the training set matters that much, since there's no way they have my language in their training set!
Programming languages have a lot in common. Python is kind of odd when it comes to languages.
If the training data is basically irrelevant, then an LLM should be able to iteratively improve the programming language it uses, resulting in a custom language optimally designed to maximize its own coding ability. The source code might not even be human readable natively, just translated into pseudocode on an as-needed basis.
> If the training data is basically irrelevant, then an LLM should be able to iteratively improve the programming language it uses, resulting in a custom language optimally designed to maximize its own coding ability.
I won't be surprised if one day they do.
At least in their current form, I don't think they can independently design a language that is so much better than other available ones that it makes sense for them to use it.
There's a very good language for almost every use case already, designing one better than the ones already available is a VERY tall order.
It's almost like these languages aren't designed by morons, but built by teams of geniuses over a decade instead.
It's taken me 6 months of heavily steering an LLM to build a language that is not yet even ready for production use.
Maybe I'm the one slowing the LLM down. But it certainly does not seem that way.
The key to a good language for them - from my experience - is maximum expression plus minimum global complexity.
Anything that makes you manage memory lifetimes & memory safety is inherently unfriendly to LLMs - that's globally complex.
All scripting languages allow spaghetti aliases that let you hack your way into oblivion - and LLMs gladly ride that gravy train to hell.
Rust excels here, because it prevents the worst and is WAY more expressive than most people think.
Go has arguably the best runtime ever built, but it's not very expressive, and it still has a lot of problems inherited from C and scripting languages. I don't think these types of languages will be the ones people choose to write code with for LLMs in the future.
I moved from writing all my code with LLMs in Python to Rust. I’ve seen absolutely no difference; most of the time I couldn’t even tell you which it’s writing in.
My programs are faster and more reliable than they’ve ever been.
> Read the first few comments and surprised I didn’t see it, but training data. The voluminous amount of Python in the training data.
That's actually part of the point. Almost no one writes types for Python and has complete type compliance. So all that training data is people just yoloing Python, writing a bunch of poor code in it.
I honestly can't believe any experienced software engineer would decide to build systems in Python these days.
Seems to me these LLMs have a critical mass of Python training data and Rust training data, so there's no advantage for Python there.
So as the article points out, an iterative process that catches the mistakes at compile time is much more suited for an AI than one that catches them at runtime.
That's right, we don't need to care about a language, the same way we don't need to care about the map once FSD delivers on its promise of being end-to-end optimal.
I wouldn't say I get worse results with Go than I do with Python.
People really need to stop assuming that more training data is better. This is not how it works. LLMs thrive off consistency.
Go for example has significantly less training data than Python, but LLMs are the best at it. Why? Go is often written the same. You go from project to project and the code looks all the same. There only a very few ways to write Go.
Most of the article makes sense, but what is this supposed to mean: "Native Rust binaries are hostile to serverless runtimes"? I don't think that is true.
If AI writes your articles, why use brain?
^ Underrated reply.
You sneer but the models are much better now than last month and token costs are down! LLMs are just like compilers for the brain!
/s
Why Python? Because I have written it for 10+ years, know how to debug it and I can smell it within 10 seconds of the agent writing code if it does something that is going to end in a huge foot gun. With any other language, not so much; I would need to relearn a lot. So I am going to be preferring python; where even with the speed that AI crams out code, I still feel somewhat in control. If I did this with Go or Rust, then it would feel more like "vibecoding" than AI assisted programming, just yolo the whole product.
This is what I experienced as well. I can smell BS in AI-generated code from just a few lines it writes in Python, so that's why I keep using Python for most of my projects.
Bit off topic but why in the world are people still posting on medium? The reading experience is abhorrent; I couldn’t even finish reading this article before a full screen popup literally blocked the sentence I was reading.
Is there some incentive I’m not seeing?
It seems like it's just the latest evolution of the writer-friendly blogging platform; easier than Wordpress to package into a newsletter, and also easier to monetize with a paid tier.
But don't we have AI to deal with the complexity of Wordpress? :-)
Insofar as AI is great at accidentally deleting your production and backup Wordpress databases, and forcing you to start from scratch with something else.
Yep, Medium was free and everyone donated content... then it put up reading paywalls and conned everyone, I'm also surprised when I see people writing on there.
My best guess is momentum. Some people are very, very brand loyal and have to do things in relation to what/how others do things.
In reality it doesn't matter where something is posted, just give us a url, but some people don't operate that way.
They have made an honest attempt to pay writers. It's a different model than substack, but that's why.
I look at it the same way I look at pay walls for newspapers. I don't like them but I understand why they are there.
> The reading experience is abhorrent
Nothing you read in the browser can provide a hands-down best reading experience equally for everybody - the modern web model is inherently at odds with that. A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention. A plain-text protocol under user control is closer to "best reading experience for everybody". The web could be that. It mostly isn't.
I stopped trying to read long articles in the browser. Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead? Where I have control over fonts, colors, navigation, etc. The browser is a delivery mechanism, not a reading environment. Treating it as one is a habit, not a necessity.
Long ago I stopped trying to type anything longer than three words anywhere but my editor. Of course, why wouldn't I? It already has everything I need - spellchecking, thesaurus, etymology lookup, translation, access to all my notes, LLM integration, etc. Try it one day - it's an enormously liberating experience. And then maybe you'd stop reading long texts in the browser as well.
> A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention.
They don't ship it because of greed. They only want your attention because of greed. They only infest their website with ads because of greed.
> The browser is a delivery mechanism,
http is a delivery mechanism. The browser is a user agent. It's supposed to display content according to the preferences of the user. If your browser isn't doing that for you it's time to find a new browser or beat the one you have into submission until it behaves. "reader mode" is a useful compromise.
> It's supposed to display content according to the preferences of the user.
That's right, the original idea was exactly about that, but like I said - in practice that is no longer a thing.
Using the editor for reading any content is enormously underrated. Check this out - this entire thread opens in my editor as an outline with nested structure. Meaning that all the regular outline operations are available to me - folding, imenu (interactive TOC), narrowing, quick search, contextual search, pattern-based search, sparse-tree search.
Extracting all the URLs on the page while ignoring HN-internal ones is a single keypress for me - there's a link to a YT video - I can watch it, controlling the playback directly from my editor, I can extract transcript and summarize it with an LLM request - all without opening new tabs, without switching focus.
I can narrow on the sub-thread, or select a region and export only that part to a pdf, gfm, html or LaTeX. The possibilities are virtually unlimited. A web browser - even with three hundred different extensions won't let me have complete and utter control over plain text - it's just not designed for anything like that.
> Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead?
Because that’s an enormous pain in the ass. Not scalable at all.
I beg to differ. You clearly misinterpret what I'm talking about. Please expand on "scalable", what do you mean by that?
Check out Scribe, an alternative Medium frontend that's way better: https://scribe.rawbit.ninja/@NMitchem/if-ai-writes-your-code...
https://sr.ht/~edwardloveall/Scribe/ https://libredirect.github.io/
Python has a much more mature ecosystem than Rust, especially for AI/ML stuff. I ran into a rust crate that purported to do a certain ML algorithm but did not do it correctly. I managed to write a replacement with Claude though.
I do think enforcing correctness at the type system level is a good idea for AI, which is why I often choose languages like C# and Rust over Python. However, for some things Python is definitely the correct tool for the job.
Definitely something to be said for AI/ML library support. I find myself going with Rust / TS for a ton of my backend work lately though, even though I'm a huge Django fan for backend.
I almost always pick Rust. Recently I wrote a plugin for something that was written in Go. I could have used Rust, but Go for one felt right because if the thing turned out well, others would surely find more value in having one toolchain.
The main reason is that you’re capable of reading it if you need to. And the recipient ecosystem expects a language. That’s why some data science communities pick R, MatLab, Julia, Python or Mojo not depending on what’s superior tech, but what their peers speak.
What peers are speaking Mojo? I’m not aware of any place it’s penetrated enough to be a “lingua Franca”
C# feels kinda nice because it's a good balance.
Very good static typing, Roslyn analyzers, good tooling and decent hot reload (for a compiled language), really good ORM (EF Core) that implements UoW and reduces a lot of the need for transaction management (simplifying the code), flexible enough and fast enough for various kinds of use cases.
Source generators are underrated as well since they can make the code very terse and legible by generating a lot of standard boilerplate.
I've written this before, but C# is a great language held back by its culture. I'd say that 80% of C# shops I've seen used it because they were started in the late 00s by some IT guy with a surplus HP server and a dream whose whole world was Microsoft products. They were staffed by people with little knowledge of OSS products who self-identify as ".NET developers" instead of software engineers. Almost invariably they seem to have some gnarly legacy monolith that everybody is slowly chipping away at while old-timers continue deploying .NET services to IIS running on Azure VMs because it's a small evolution of what they've been doing for the better part of 20 years.
In the interest of fairness the San Francisco version of this is also a thing: a giant, untyped ball of Rails spaghetti from the same period running on Heroku that everybody has Stockholm Syndrome'd their way into loving because of Ruby's elegance and beauty. The burden is merely shifted from a large Microsoft to a series of small SaaS companies :-)
Exceptions to this rule exist (hence my "80%") and modern .NET is lovely but it seems that the non-Java/Python mindshare is now taken up by the Golangs and Rusts of the world. It's a true shame since I do love C# for basically being a better Java.
Yeah C# is fantastic. I also love EF.
I stopped using it because overall it feels like Microsoft has lost the plot with .NET.
Yeah, I mean, if I'm going to step away from the Python ecosystem and let AI manage/polyfill my dependencies, I might as well shift the whole way to OCaml/F# rather than Rust.
Then I get the benefits of GC and strong typing.
I think enforcing the type system is good with AI for a couple reasons:
- (speculating) typed languages have faster/better LSPs that can be used to more efficiently modify code with tool use.
- when a human DOES need to step in and start investigating/modifying the code, the strong typing makes it much easier to get oriented within their spaghetticode
I think the only use cases are when it wraps low level C++ libs like ML libraries, and yes those are extremely difficult to reproduce
So he includes Go in a list of languages that apparently make development slow and have “a build system that fought you”, and then says Python was the solution for all that. I think he got it backwards. I have found the Python build system horrific and broken by default, while Go just works.
I think the author misunderstands what is good about Python.
One of the big strengths of Python is legibility: most developers find it easy to read and understand.
If you are planning to have humans verify the code you're using in production, to confirm it implements your intent, the readability of the code you are producing is important.
Performance is valuable, but for a lot of code, performance is less important than correctness and ease of verifying it.
If you are imagining your codebase being one where nobody but Claude reads the code, you might as well do Rust for the better performance. But I don't think a lot of organizations are doing that.
> A shipped app, in a language nobody on the team knew
Great! Let's look back on this not too far in the future.
Yeah that's probably the only thing in the world that could be scarier than the electron app they were replacing
Why? Just job hop in 12-18 months, and that will be someone else's problem.
They'll just have an LLM translate it to another language...
This happened before AI when a guy wrote a key tool in some random language a decade ago and the rest of us were left to maintain it. We somehow managed.
Yet, it's not uncommon, that such tools are the reasons to still use DOS, dial-up internet, or frameworks which have more security holes than lines, because they are unmaintained for decades.
You can of course use any language but here is my advice: you should use the language that you know best to make your life as uncomplicated as possible when you want to understand what the LLM was creating.
Remember, you are the judge whether the code is OK and if you use assembler you might get really performant code, but can you trust it?
Of course it might be a good incentive to learn rust or go. Or challenge yourself to learn something really cool like LISP, COBOL, FORTRAN, APL or J. (just kidding...)
just my 2 ct...
Previously in my life as an IC, I wrote a lot of Golang. I worked on a large end-to-end encrypted video calling service.
I hated it. I was dreaming of Rust the entire time to release me from the hell of if err != nil dozens of times per day.
After hours with LLMs I've changed my tune. I've had 5 clients (with excellent engineering teams) who cannot get coherent results out of LLMs using Python or TypeScript.
I arrived back at Golang being a frustratingly simple, consistent, and low-thrash programming language which inadvertently made itself well represented in the training corpus [1].
My concession is that if you are going to write a median program (reading/writing files, network, db, etc.)...
Pick Golang especially if you've never used it. LLMs are extremely good at it, frustratingly so.
[1] https://jry.io/writing/use-boring-languages-with-llms/
Why not use AI to speed up the Python runtime? V8 showed what focused engineering can do for JavaScript, and Astral showed how much room there is to improve Python tooling. The same tricks may not apply directly, but AI could definitely accelerate the work.
> You used Python or TypeScript because[…]. because Rust, Go, C++, and many more would give you 10–100x the performance, but you paid for it: […] a build system that fought you.
I would argue I spent more time fighting the TypeScript build system than Rust’s.
But up until recently I only used either just often enough to never remember what magic configuration needed to go in my tsconfig.json and package.json to get TypeScript to work.
Actually, I do use compiled languages for this reason. Even Opus 4.7 and GPT-5.5 will leave unassigned variables lying around in Python code of sufficient size. If you've got sufficient testing you'll exercise all paths, and I imagine a good prompt would ensure adding testing with coverage to see that it does happen. However, I do not have (yet) such a system but using Go/Rust helps a lot because the compile phase actually helps detect correctness issues.
My other problem with most of the other ecosystems: ts/npm, python/uv, rust/cargo is that they all have build-time scripts that are controlled by others that execute automatically. This is a real problem because the LLM will just install things and proceed to send your home directory through a juicer. I feel a bit of a paranoiac now doing this, but I have a script that launches a podman container with just the source directory and a binary directory loaded (for caching) which compiles everything.
I know there's some sequence of steps I can take to protect myself, but if the LLM accidentally uses pnpm to run dev build scripts when I had the right config on npm or whatever, I know I'm screwed. So now I do all these shenanigans with Rust (to the extent that I vendor old deps sometimes). So the ideal language to me now is one with very few of these footguns and sandtraps which has a tight iteration loop.
AI's are really good with Python. Quick turnaround. Easy to read. Tons of training data/examples. Many of the same reasons we wrote Python before.
Another benefit to using Python, is if you subscribe to writing/vibing a throwaway version first, a Python version is 100x better than a spec.
(Disclaimer: I teach Python and AI for a living and am doing a tutorial at pycon this week, Beyond vibe coding. Am also using other languages as there are times when Python isn't appropriate)
The problem with Python and other loosely typed languages is that if you let an LLM write some stuff, you cannot truly be confident that nothing has broken, even if your tests all pass. The LLM could have broken some path that only gets run in production in a very specific case. At least with strongly-typed languages you get a compiler error. In big codebases this is non-negotiable.
i have bad news
Lay it on. I love to collect other's anecdotes and see where they align (or disagree)
so it just boils down to strictness even when we're talking LLMs?
I agree with you about fast failure being a nice feature, but I also think that if you're TDDing a bunch of stuff and it fails in some categorical way, well then the test suite was lazy.
If you are using TDD with any recent model and even local models (qwen3.5+), you alleviate most of the issues mentioned.
Note that:
Writing code, then tests
Is not equivalent to:
Writing tests, then code
> so it just boils down to strictness even when we're talking LLMs?
The article describes what I've been doing for the past few months - I did small python projects in the past because of the ecosystem: I couldn't possibly write a ton of the stuff required for the things I wanted to do, so I leaned into python because someone already wrote it for me. Quality of deps was mostly ok for the happy paths, but always a chore to patch the broken ones.
Nowadays I tell Claude what I want to build and I always ask it whether rust is a good choice for it. It'll pick up the right crates or choose whether it should DIY, do all the plumbing, nail all the logic, and in ~30m I'll have something very solid that would have taken me 3+ weeks of part-time evening coding in python. I think the article is right and rust is the closest to the "best language" we have for LLM coding at the moment: the strict typing and the tooling dramatically reduce the output space for LLMs, and 99% of errors have a clear, precise explanation that is actionable, and the compiler helps you a lot there too.
I think it also boils down to the fact that you cannot reliably and quickly answer "why is this arg None?" in languages like Python without figuring out the call graph and evaluating possible states and inputs/outputs. Rust makes all that explicit and forces you to handle it, which I feel dramatically cuts the time an LLM needs to spend figuring out why it's broken or what to do next. EDIT: The fact that you get memory safety on top of all this, handled by the compiler, is yet another advantage for LLMs: the logic that gets written is simpler to reason about, because if you try to mutably access the same variable in two different places, the compiler will feed this back to the LLM at build time. In other languages that would be a "code smell" or would require static analysis.
Strictness is a quality for software and a chore for humans, and of course the stricter you are at representing your logic and your state machine, the less ways a program can break. LLMs writing in rust give you the strictness without the chore part, and it's a very good deal from my point of view.
My anecdotal (sample size 1) experience is not consistent with this. I code fast. Refactor fast. My stuff doesn't break. But my methodology isn't the same as others'.
Python has had type hinting for quite a while, and adding validation with mypy/pyright/ty as a step in CLAUDE.md (as well as having it as part of your CI pipeline) can emulate static type checking pretty well.
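As a sketch of what that buys you (assuming mypy is installed and run as a CI step like `mypy src/`): the checker flags type mismatches over hinted code before any test executes.

```python
# A hinted function that a static checker validates before tests run.
def greet(name: str) -> str:
    return "hello " + name

# A checker run (e.g. `mypy app.py`) flags a bad call statically:
#   greet(42)  ->  error: Argument 1 to "greet" has incompatible
#                  type "int"; expected "str"
print(greet("world"))
```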
Dynamically typed languages just add one more type of bug that can’t be caught at compile time. That’s not helpful, sure, but it’s one type of bug among many.
The issue you mention, execution paths not hit by test cases, is made worse by having more complicated code. Duck-typing can help reduce the number of paths.
Static vs dynamic… I don’t see an obvious winner here.
My take is that I can never be confident that anything an LLM produces will not be broken. Since I will have to check everything it produces anyway, why not write it in a human-friendly language, i.e. Python? C and Rust may have better strictness, but the amount of boilerplate to set up that system takes up a lot of mental space that could be better used to architect the problem at hand.
This is why you should use Haskell.
Haskell is a good language for LLMs! Claude knows it really well, and the type system catches so many mistakes. Just make sure to tell it to model the domain in the types from the start.
Also, Haskell can be really performant and low level, while still keeping the benefits of typing. With the C foreign function interface you can really do anything in Haskell!
Perhaps we could do it in Python in the first pass for validation purposes, and then vibe-rewrite it in Haskell.
I figure a big part of it is that SWE-Bench is the target benchmark for programming and it's all python.
Python being the language LLMs are best at predates SWE-Bench by years.
I've found the opposite.
If you want your code to actually work, LLMs are far worse at coding in Python than in something like Rust.
Sure, if you just want your code to pass the one test they wrote and work in the one case they coded for, Python is fine.
A lot of people think this is fine, until they actually do something with what they've built besides just... build it.
Have you tried writing Rust? I often hear this from people who haven't tried it. I've found absolutely no issues compared to Python, and everything works 100x better.
Really controversial, but my honest opinion: that's because programming languages, like their natural-language counterparts, are nowadays increasingly becoming political tools rather than purely technical ones.
I observed this in the attacks on Rust due to the huge presence of LGBT people.
Now while I'm pretty much straight myself, I don't reject LGBT people and don't want to partake in identity politics.
I just want things that work no matter what background you have, yet there are some people attacking Rust because of its inclusive nature.
And just like Linux is perceived as nerdy and geeky and "gaming socks ready", the tokenization of things, and the attaching of political meanings to them, is quickly coming to everything, so perhaps I'm being too general here as well.
Let's say it's not political, but extra meanings are definitely being added on top of its technical origin and nature.
Why do we have discussions about sexual orientation on programming languages? Could this really get any worse?
Because once you leave Python or JS the quality of LLM-produced code degrades catastrophically.
Because the SWE benchmarks for LLM coding are done on python code bases, hence you are likely gonna have superior results
Yeah, this is a big part of it. Labs have been hill climbing on Python for years, plus AI devs are usually most familiar with Python anyways.
Many here propose replacing Python with more performant but less familiar languages, mostly Rust and Go. But I find the argument that the AI-human interface is the most important one. A simple version of this is "no, stick with Python if that's what you know". A more interesting version is "use this newfound AI leeway to move up the abstraction level", "try something more expressive and human-oriented", "make a DSL and parser that suits the domain (and focuses the AI)". Despite being a minority language, Raku is ideal for these aspects (esp. with built-in Grammars and general kitchen-sink repartee) and works surprisingly well with most popular LLMs.
I honestly think Mojo is the dark horse in this race. That is assuming all the roadmap goals are fulfilled. We're talking about C++-like performance, Python syntax, complete compatibility with Python, designed from the start to interface with AI, compile-time metaprogramming like Zig, and all kinds of other goodies.
So yes, people can bless Go and Rust all they want. Nothing is wrong with the languages, but I agree that learning them for the sake of AI usage is probably not the best idea if one is competent in a language already.
Disclosure: Lattner is one of my programming heroes, so I might be biased.
I really wanted to like Mojo, but the more I read about it, the more it really wasn't Python even though, starting out, that was a major claim to fame.
There is an excellent chance it will be awesome stuff. But they did themselves a huge disservice with the initial claim about trying to be Python compatible.
Despite lots of influencers I personally trust (Karpathy among them), the industry is taking the opposite turn for a reason:
https://platform.claude.com/docs/en/agents-and-tools/tool-us... Also Claude Cowork, etc.
1. You don't need compilation, so you run and test faster. Compilers were primarily built to prevent human error, and only very secondarily to guard your business logic.
2. Your validators quite often need to evolve. With Python or JS, this is a pydantic edit + run. Imagine 3–4 iterations of the same in Rust?
3. Composition. The entire cycle of software changes. An agentic system takes orders from a human, reads some kind of cache and snippets, writes/combines snippets, tests it, runs it, and fixes it. This almost pushes you toward snippets the size of a function, which still need to be covered with tests. I can easily build 10 function-sized Python files and write an agent that will mix and match 3 of them into a final result. With a compiled language, you'd need to compile 10 times — or store the binaries and think about what platform they'll execute on, etc.
I love the fact that the author is questioning this. No doubt the market for your favorite language will change. 80% of languages will go away — there is no market anymore for such a big variety of languages.
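Point 2 above can be sketched stdlib-only (this hypothetical `Order` type stands in for a pydantic model): evolving the validation rule is a one-line edit followed by an immediate re-run, with no compile step in the loop.

```python
from dataclasses import dataclass

# Stdlib stand-in for the pydantic case: the validation rule lives in
# plain code, so changing it is edit + re-run, nothing in between.
@dataclass
class Order:
    qty: int

    def __post_init__(self):
        if self.qty <= 0:   # tweak this rule, re-run, done
            raise ValueError("qty must be positive")
```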
> 80% of languages will go away — there is no market anymore for such a big variety of languages.
That's kind of sad, but so many older languages have been declared dead only to hang in various niches or out of sight for decades.
This post resonates. I recently built a little web service to scratch an itch I've been having and after discussing the options with Claude we settled on Go, and honestly it's been fantastic. Highly performant, native threading, dead simple to deploy with containers. And I don't even know how to read or write Go.
Go is fun, you should actually learn it
I did go through the Go tutorial many many years ago, but it's been so long I don't remember anything. I do remember it was an enjoyable process though, and I'd love to pick it up again.
Oh man... I like go because it is compiled, performant, strong and statically typed. But "fun" is not something I would say about it. The ergonomics of error handling, lack of ternary operator and other stuff that compiled 30yo languages already had ...
That sort of syntactic sugar goes against the Go philosophy. Don't get me wrong, I share your frustration, but I also see the value of consistency in their philosophy.
I'm starting to think all these languages having their own pet "philosophies" that are "totally better than X" is a shitshow: just personal preference masquerading as standards.
Go is less a language than a philosophy. It was an angry reaction to 10,000 ways to do things, and overly clever (ahem, expressive) syntactic sugar.
It is quite boring to write, but very easy to read.
Not a Go fanatic. I use Go and various other languages, and was a decade and a half late to the Go party anyway. Just trying to explain the outlook.
I know a couple of languages fairly well: C, Perl, Python, Bash. I never formally learned Go, but as a test of AI coding, I started some vibe-coded projects in Go. It worked very well: the code is minimal, there are few dependencies, and it compiles down to a static app. But most importantly, I can actually read the Go code and understand basically what it's doing. I can also use LLMs to critique the code if I'm uncertain. The big benefit of Go is the simpler language and "batteries included" standard library. This leads to fewer dependencies and fewer lines of code, which improves overall AI output. In theory, AI should be able to write better code faster in Go than in another language like Rust.
Python does have a much larger ecosystem of course, so with Go you have to develop from scratch what already exists in Python. But for smaller projects, you can also have an AI write a clean-room implementation in Go of some project in Python. So you aren't necessarily locked into one ecosystem anymore.
And in my experience, you don't even need to know the language. I have a co-worker who's basically not a programmer, but got multiple implementations of applications working sooner than our dev teams doing it by hand. You should be a coder so you can architect and orchestrate the coding, but 'language' isn't a barrier anymore.
> I have a co-worker who's basically not a programmer, but got multiple implementations of applications working sooner than our dev teams
Deployed to production, right?
Right??
(I’m just kidding, of course it’s only on their machine, no different than Excel 5 years ago)
> architect and orchestrate the coding, but 'language' isn't a barrier anymore.
Never was the barrier.
Here's the kicker: The devs spent nearly 5 months on a solution, and it ended up being so crap it was abandoned. The multiple vibe-coded solutions were all better.
Of course language was the barrier, that's part of why it was always hard to hire people. It takes years to get good at a particular language, and most people are idiots from bootcamps who learned a single framework.
Well don't ask us. If AI writes your code, why not ask it? You could probably make it write a whole article for ya.
This seems sort of like asking whether a chatbot should answer you in English or Japanese. Obviously, it should use whichever language you understand. If you understand Python best, why not write code in Python?
But on the other hand, maybe you could learn some other programming language, particularly with AI help. If that's what you wanted to do anyway, it seems like a good time to learn.
This seems to assume that all there is, is systems software, tools and frameworks. Why ignore the elephant in the room - business / enterprise / line-of-business software? The case for Rust, Go, Gleam and Zig vastly changes for these versus Java or C#.
Yeah, last year I discovered that AI writes better rust than C, so I switched to rust and it made some quick good code that it couldn't do in C.
But when I wanted to optimize, edit, and reorganize the code, it was difficult, so I did a rewrite in C and it was lighter, faster, simpler, and less of a headache.
C for humans, rust for AI.
Cute, interesting take, but I feel like it misses the point. Specifically, this makes sense where performance is necessary. Many projects have been written in suboptimal languages because the writers didn't want to learn lower-level languages.
Still, not ALL projects benefit from such an approach, and there are times when, yes, Python is the right tool: not just for human readability, but for the other qualities that make it really good for small, iterative apps.
My take has never changed. Knowledge is cheaper than ever, but wisdom is as rare as ever. This is a great example of mistaking the former for the latter.
I find if I ask most LLMs to write a self contained script/utility, even in codebases that are 90-100% written in some other language most will default to using python for it, or sometimes bash.
Usually those kinds of utility scripts are one-shotted without any further input from me, and once they're there and doing what I need I usually don't bother converting them to whatever I would have written them in otherwise (bash would be my usual preference for really small scripts, typescript or rust for bigger utilities, I hate writing python but reading it is fine... kind of).
Claude writes java pretty well, and faster than Rust. It's a great middle ground for some projects. I've switched back from Rust to Java for some things.
I don't know why you would use Python at all except for small iterative projects. If you hate java for some reason, there's Go...
It certainly makes sense to use python for ML or data science.
Right sorry, that's not in my wheelhouse so I didn't think of that. I should be more specific. For general backend / data processing/pipeline stuff, API servers ...
I had agents code up an app for me in Swift a while back and the entire experience was so much better than your typical Python experience. The agents took full advantage of the compiler and static typing. There were far fewer bugs than expected.
Lately I just have Claude build most things in Rust, and it's really amazing. I tried Go, but I found it wasn't as good; Rust really does feel like Python to me. That said, it still struggles with the same class of errors when building complex systems. I've tried using TLA+, Alloy, and other things but haven't found the trick yet. The best I've found is reimplementing all external systems in memory and e2e testing everything extensively; without reimplementing them, the tests become unusably slow, and Claude can rewrite huge surface areas with ease. It's somewhere between mocking and literally just reimplementing the external systems.
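A tiny Python sketch of that middle ground between mocking and reimplementing (`FakeKV` is a hypothetical name): a behavioral in-memory double for an external key-value store.

```python
# In-memory stand-in for an external key-value store: it implements
# real get/put semantics rather than canned mock responses, so e2e
# tests run fast while still exercising genuine behavior.
class FakeKV:
    def __init__(self):
        self._data = {}

    def put(self, key: str, value) -> None:
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)
```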
As others have said, the main benefit with Python over Rust is library support especially with ML features. The other gap as I see it with Rust is the lack of native flexible UI support. The nice thing about Rust though is it can serve as a very fast and stable core for an app and offload specifics to TS and Python as their strengths allow, so you get the best of all worlds.
My current go-to for desktop apps is Tauri, which gives us a Rust backend and a TS frontend (usually React). Local ML features can be easily loaded as a Python sidecar. Production bundling can be a little challenging, but it seems to work well so far.
Sidenote: Golang is also an amazing language for LLM use, I generally do most of my "infra" stuff in Golang over Rust, but either work fine most of the time.
I assume this is why things like PyO3 are popping up? If so, sort of a fascinating way to compartmentalize new rust code into legacy .py code in lieu of a refactor, or at least, a way to do a staggered refactor and eat the elephant in bites :)
I still use TypeScript because I know it best. When AI makes a mistake, I can find the bug much faster. For me, the speed of writing code doesn't matter as much as the speed of fixing it.
Better question: why use any code at all? Generate random functions and select based on measuring the distribution of their outputs against metrics of interest. A pure black box of instructions that is more performant than any verbose code or algorithm we could come up with, because all we select for is performance above all. Directed evolution, essentially, of the codebase, generated through mutation and selection, just like everything else on planet Earth.
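A toy sketch of the idea in Python (purely illustrative): mutate a candidate, keep it only if the selection metric improves.

```python
import random

# Toy "directed evolution": mutate a number, select on distance to a
# target metric. Only improvements survive, so the result can never
# be worse than the starting candidate.
def evolve(target: float = 42.0, generations: int = 500,
           seed: int = 0) -> float:
    rng = random.Random(seed)          # seeded for reproducibility
    best = rng.uniform(-100.0, 100.0)  # random initial "function"
    for _ in range(generations):
        candidate = best + rng.gauss(0.0, 1.0)   # mutation
        if abs(candidate - target) < abs(best - target):  # selection
            best = candidate
    return best
```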
I laughed hard!
So you have a chance of being able to read the absurdly baroque code AI produces.
you mean baroque-n code?
I have been wondering on a similar thing; am looking for feedback:
There are many existing, often mature, third-party software libraries or solutions that a new project could use, but which hide the internals, including how the data is organized behind the scenes*. Vibe-coding for the specific project requirements, instead of using the pre-existing third-party libraries, is now becoming a feasible option. The vibe-coded option may be simpler (no features beyond the actual need), more flexible (easier to add new needed features), and the data/model behind it could be more accessible.
Looking for feedback on pros/cons and experiences along this.
* I care about the data, as it can be longer-lived than the code itself.
Thanks.
Isn't the answer usually: because the same AI said Python is the right language for it?
Honestly I'm in the exact same boat, wondering why I don't write in C if Claude is writing it. However, I chickened out, figuring that if support for ML-model or LLM-based flows doesn't exist in C, it will be time-consuming to fall back to Python.
Assuming you are thinking about software architecture and looking under the hood, you are likely to be reading much more code than before. Python is really nice on the eyes and you can easily get a good grasp of what the code is doing. Plus, it's dynamically but strongly typed, so what you see in the code is usually what you get.
I think the rule of thumb is to use the tool that is right for the job and that you are going to be able to understand the output.
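A one-liner illustrating the "dynamically but strongly typed" point: Python refuses silent coercion at runtime.

```python
# Dynamic but strong: mixing types raises instead of silently
# coercing, so what you see in the code is what you get.
try:
    result = "1" + 1          # no implicit str/int conversion
except TypeError as exc:
    result = f"TypeError: {exc}"
print(result)
```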
I think it's an interesting question what kind of programming language one needs for an era of agents. It's clear that a programming language designed for humans is not necessarily the best for AI-driven software development. I guess the qualities one would want are formal correctness guarantees and high performance. The question is whether this language is Rust, or whether it's possible to design a better new one.
I'm still not sure. Would love thoughts on this, but in this new AI world we're in, is it better to go full-stack TypeScript, or go with proven, mature frameworks (.NET, Ruby, Django, etc.)? TS is moving fast, but maybe it's time to not reach for the shiny object and stick with proven tech? Or in 5 years will we regret it?
The main risk-of-regret is: How will you feel when/if the $20/month plan costs $2000/month?
May never happen. But be clear with yourself if you’re relying on it not happening.
It’s a hell of a nice risk mitigator to understand the code, in a language you know, if you have to print-debug it yourself at some point.
AI/ML stuff: Python
Personal: Rust/Go based on criticality of being able to glean code quickly, or memory usage, etc
This is it
Didn’t Tencent do a study comparing AI performance across about 20 languages showing that Elixir was the top performer?
Once you are over a certain threshold it’s more about the average quality of training data than the quantity.
If you're using GenAI, you should go through the process of selecting an optimal tech stack for each solution, but also take into consideration that Claude and other services probably have the most knowledge of Python, JavaScript, and TypeScript, with Go, Rust, Java, and C# following closely behind. Consider what you're building and which elements of the tech stack are optimal for your problem space.
I don't know rust at all and I've built three applications using it with Claude because it has speed and correctness built-in.
I use Typescript for 90% of the things I build. For web development I've used a number of tools, but mostly react, nextjs, or raw html/css/js. But if I were building an enterprise application I'd consider my team and whether opinionated (Angular) was optimal over flexible (React).
Each project should consider its own optimal tech stack.
Because AI creates unmaintainable messes in any language, and ergonomic ones help humans clean up.
Never mind cleaning up; you also have to understand the language just to judge and review the LLM's output. How else are you to separate a good design and implementation from a bad one?
The ideal language for AI coding:
1. Type safety as basic guard rails that LLM output is syntactically and schematically correct
2. Concise since you have to review a lot more code
3. Easy to debug / good observability since you can't rely on your understanding of the code. Something functional where you can observe the state at any moment would be ideal.
4. A very large set of public code examples across various domains so there's enough training data for the LLM to be proficient in that language
5. A large open source ecosystem of libraries to write less code and avoid the tendency for generated code to bloat
It's basically all the same things you look for in general. I think TypeScript scores high here but I'm curious if anyone knows of a language that fits these criteria better.
Java?
Was thinking the same. Modern Java is similar or at least quite a bit closer to many other less verbose languages. Not like your dad's Java anymore.
Type safety feels like the big one; anything you can shift to static/compile-time regimes benefits agents immensely.
There are two working LLM axes. Critic strength: how much the language catches before runtime. Sensor strength: how good the empirical feedback loop is. LLMs benefit from both, but the sensor axis often is undervalued.
Type safety is great, but you can't just quietly disregard the benefits some dynamically typed languages provide; that would be completely ignoring that different tasks weight the two axes differently.
Systems code, performance-critical code, code where correctness across all cases matters more than exploration: parsers, compilers, network protocols, data structures - statically typed languages (like Rust) give you an edge here. The compiler's depth pays for the verbosity, and exploration is less of the work because the problem shape is known up front.
For stuff like building a web scraper, or rapidly prototyping, or exploratory scripts, something like Rust would be actively bad. You cannot poke at a live browser (you can with Clojure). Async Rust adds another layer of type complexity. The signal-to-noise for "figure out what is on the page" collapses entirely.
If I were picking a single language for general LLM-assisted work, weighted across task types, it would be Clojure (or Elixir), with OCaml as the most interesting alternative if the ecosystem were stronger.
I'm using Clojure and Elixir, and LLMs are fantastic with both. Sure, if I get to a super-stable situation then maybe I'd consider moving to Rust (or Jank?), but for now I'm just so happy with Clojure and Elixir in this new world. I'm solving new problems with fully bespoke architecture, so the flexibility is key. Clojure for business logic and most DB work. With Elixir, it's the actor model and hand-holding, as I'm using it for the web layer. I bet Ruby on Rails would also shine for some cases, probably most CRUD for example.
What made you use Clojure for business logic and DBs rather than using Elixir for everything? The JVM ecosystem?
For me, I need to move fast and already knew Phoenix well, LiveView fits my use case, and websockets setup with Phoenix is very clear so switching to a two-language setup seemed better than CLJS. I could have gone CLJS re-frame and all that but it would have been more work and more unknowns. I call LLMs from Elixir also so all of the reconnect, backoffs, papercuts, shenanigans and so on, well I just know how to do this kind of thing better in Elixir. In its way Elixir is a great, like, defensive language. I was able to keep most async in Elixir and Clojure mostly synchronous. There was some pain though with bridge between the two and at times I thought I'd made a mistake. Clojure is fantastic with data and Datalog databases, so no regret. Outside world deals with Elixir, and the temple is in Clojure and Datalog.
> fantastic with both
Most developers evaluate programming languages by comparing features in isolation, never stepping back to consider the overall experience of using one.
Features are easy to talk about. They're discrete, nameable, and comparable. "Does it have Foo?" is a question you can actually answer. "What's it like to build and maintain a real system in language X for two or three years?" isn't. So people default to what's measurable.
Most devs haven't spent serious time in more than two or three languages in production. Without that contrast, the holistic experience is invisible - you don't know what you're missing, and you don't notice the pain you've learned to live with.
Language communities form around features because features make good rallying points. "We have algebraic types." "We have macros." These become identity markers. The holistic experience doesn't tribalize as cleanly - it's harder to put on a t-shirt.
There's also a sunk-cost angle: devs who've spent years in a language have every incentive to believe its features justify the investment. Honestly evaluating the overall experience might undermine that.
The irony is that the languages with the most devoted communities tend to be loved for exactly these holistic reasons - the ones that are nearly impossible to convey through a feature list. You can rave about Clojure or Elixir all day, but a curious newcomer will land on the homepage, scan the features, and walk away unimpressed: "Meh, it doesn't even have Foo. People say this is great? They clearly don't know what they're talking about."
Well in a recent project I tried TypeScript thinking, OK, LLMs, huge training corpus! massive adoption! api for everything already set up! swim with the current! and I tried various frameworks and so on, but for me reasoning about things and being able to make systems that I could adapt and pivot it was honestly inferior compared to niche Elixir and Clojure. But it's not like I hate JS; I use it in LiveView all the time. And don't mean to imply there are no problems in niche-land though; you've got to be willing to do more yourself and live in a tiny world. Really, LLMs kind of tamed Clojure for me because it seems so far at least that they can handle the glue code and stitching libraries together pretty decently as long as you don't get lazy with architectural choices and stay vigilant. And if I ever hire it pretty much has to be remote or learn on the job, though again LLMs reduce this pain greatly.
> Critic strength .... Sensor strength
that's a nice breakdown
I think there's something key you get at in terms of the combo of dynamic environment + type safety maximising both. With a dynamic environment, the LLM can do a lot of interrogation to understand the problem space on the fly. I've witnessed agents sort out pretty complex issues through `python -c "..."`, `groovy -e "..."`, executing snippets of code with Node etc which is much less accessible if they have to compile it first. They can also inject logging code that interrogates the runtime as well (what type do we really have at line 1003?) etc which works better with runtimes that have deep introspection capabilities.
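That "what type do we really have at line 1003?" interrogation can be as small as a pass-through helper an agent might inject (`probe` is a hypothetical name):

```python
# Pass-through probe: wrap any expression to log its runtime type
# and value without changing program behavior.
def probe(value, label: str = "probe"):
    print(f"{label}: type={type(value).__name__} value={value!r}")
    return value

total = probe(sum([1, 2, 3]), label="total")  # prints: total: type=int value=6
```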
What you're describing is fast scripting in a dynamic language, which is genuinely useful - I agree it beats 'edit, compile, link, run' for exploration. But a Lisp REPL isn't 'dynamic language plus introspection'. A Lisp REPL is a persistent connection to a running process where the agent evaluates expressions against live state and can redefine code in place. python -c throws the process away every time; a REPL keeps it. The difference is the same as between sending one-off curl requests to reconstruct a session versus having an open SSH shell into the box. Imagine using a Playwright/Puppeteer session where you can navigate to a page and interactively palpate every DOM element, like playing a video game, directly from where the code is. Now imagine giving that power to the LLM - it doesn't need to restart, re-compile or even save anything - it just goes and explores, changing the program behavior on the fly.
The type-safety-plus-dynamism point you make is real and interesting (basically Clojure with Spec/Malli), but it's orthogonal to whether you're using a REPL or just shelling out snippets.
Golang. People trash it for being verbose on errors, but it's an extremely readable language and it's almost like Bash, only much more strongly typed and with a very rich stdlib (so it's not likely you'll need a library for a quick script).
It's more or less a perfect replacement for Python for "one-off programs" and "quick scripts". Many bonus points for not having to fight shell quotation rules and trying to remember differences between sh, bash and zsh.
In a world where AI supposedly can write in any language, Go is much better choice than TypeScript. Imagine contemplating for more than a few seconds a choice between simple, fast, cross-compilable language, and a TypeScript -> JavaScript -> Interpreter -> JIT stack.
If you don't know Go, it's more efficient to learn it than to waste the hardware resources of thousands to stay within JavaScript.
Absolutely. And in this same thread I am noticing people offering Java (lol). Yeah, we all need 1.5s startup time for one-off scripts, surely.
Well, these days a small CLI program in Java (say, ls) starts up cold, runs, and terminates in ~70ms, not 1500ms, but yeah, sometimes 70ms is too long to wait for a script.
People never believe me when I say it but I start noticing scripts needing 75-100ms to start. Modern hardware is ultra fast; I want my programs to make full use of it. I got no patience for tech or people who keep insisting "it's not much, it'll not kill you". Well duh, obviously it will not but that's not the point and never was. I want stuff to work between my blinking my eyes and I have achieved that hundreds of times over the course of my career.
That's perfectly fine, and I totally understand people who don't want to sit and wait 70ms for their script to finish running (that 70ms is not the time it takes to start), but let's not turn a <40ms startup into 1.5s. Now, it is true that if you want to launch a minimal HTTP server in Java you may need to wait ~100ms, which may be too long for you, but is also a far cry from 1.5s.
It is, but I'm still quoting what I saw before; it was not a fantasy. I don't deny it's likely better nowadays, sure, but I remain moderately skeptical because the JVM is still a runtime that needs to boot.
Then again, Golang has one as well, though it does manage to start it up faster it seems.
'Waste of hardware resources'? Ok then write your apps in Rust.
If it doesn't matter, and for most applications it doesn't, then TypeScript is far more readable than Go - so use that.
I’d also argue it needs to compile fast/ have fast static analysis. Feedback loops like this are super helpful for agents
This is just Kotlin. Strongly typed, more concise than Java or Go (and probably Typescript), less likely to blow up at runtime than Typescript, epic tooling, plenty of public code, and a library for basically anything because JVM.
And needs the JVM to start for 1.5s before you get any results. Sure.
Golang or just shell scripts.
The JVM takes tens of milliseconds to boot up, not a second and a half.
Obviously it depends on a bunch of factors, but -- not on my machines. They all have Intel and AMD CPUs, and I don't use M-series Macs.
Never saw an instantly starting JVM in my life though.
Java runs a Hello World, cold, in a packaged JAR, in about 40ms. What you've seen isn't JVM startup but programs that do a lot at initialisation (like MS Word), as many Java programs like to do (because they often expect to run for a long time, so they don't care about startup time).
I have not worked with Java in a long time, but I seem to remember that most Java programs also accrue a good number of dependencies, and some of those have their own init routines.
That adds up, fast. No idea how it is nowadays, admittedly. Maybe a ton of optimization work was done.
> Concise since you have to review a lot more code
Isn't readability what matters here? Conciseness isn't the same thing.
C. At least with Gemma 4 it does a fine job. Writes good error checking. Writes memory management. Mostly straightforward and easy to read. A lot of libraries. Runs everywhere.
I use Lisp for my projects
1. Type checking built in.
2. More concise and readable than most languages.
3. Trivial to inspect while running, with the ability to change a running program.
4. There seems to be a massive amount of Lisp that it is inhaling from somewhere.
5. Large number of libraries.
This has the added benefit that even if you publish the code, nobody will be stealing it.
Edit -- I find it very useful to write tests for critical functions. This catches situations where the agent decides some interesting functionality is no longer interesting.
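A minimal sketch of that guard-test idea (the function and its spec are hypothetical, not from the comment): pin down a critical function's edge cases so an agent-driven rewrite can't silently drop behavior it decides is "no longer interesting".

```python
# Hypothetical critical function: lowercase, trim, collapse inner whitespace.
def normalize_username(name: str) -> str:
    return " ".join(name.strip().lower().split())

# Guard test: if an agent "simplifies" normalize_username and drops the
# whitespace-collapsing behavior, this fails immediately.
def test_normalize_username_keeps_edge_cases():
    assert normalize_username("  Alice  ") == "alice"
    assert normalize_username("BOB\tSmith") == "bob smith"
    assert normalize_username("") == ""
```

The point isn't coverage; it's that the tests encode intent the agent has no other way of knowing is load-bearing.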
Really agree. Python is popular because it's easy for humans to write. But now, if the coder is an AI, then Rust would be preferable for the agent, just as Python is for humans. In addition, it brings better performance.
This idea is already being taken to the next step in labs; why generate code?
When I run a game I don't care if the dev used C or whatever. Only programmers care about the syntactic representation.
I need the machine code/byte code patterns/geometric/color gradient data.
Eventually Python will be what you see on screen, but no CPython interpreter as we know it will be running.
The model will have an internal awareness of the result to return without running an actual REPL.
https://dev.to/zijianhuang/prompt-to-ai-generated-binary-is-...
Great question. And I don’t think that Python, Ruby and PHP have a good answer. Scripting languages cater to human weaknesses. The 10-100x perf cost was never really worth it but now it’s impossible to justify.
Nice perspective on languages in the AI era. I think AI should be used to build the best-performing and most scalable software systems.
Python is really a UI for human comprehension of logic, a mathematical notation for logic, not code to drive a computer.
And prompt does not replace that.
For me, whether it's AI or my own handcrafted artisanal code, the choice of language comes down to what has the least friction. This means I turn to vite/react for a lot of frontend requirements, and that the backend will be in nodejs or python, because those are easier for me to debug than writing an equivalent application in C++ or Rust.
The article applies to a narrow case of a totally green field application that's going to be completely vibecoded. This is the only case where you reasonably can be indifferent to what the language is, and so you can abandon familiar Python and go with unfamiliar Rust. (If you _are_ familiar with Rust, the point of the article is moot.)
This "fair weather development" approach feels very risky if that application is going to be exposed to any serious usage. There WILL be a situation when things break and the AI will be powerless to fix it (quickly) without breaking something else in a vicious loop. There WILL be a situation where things work fine and tests pass with 3 concurrent users but grind to a complete halt with 1000 because there is something O(N^2) deep in the code. And you NEED a human to save your day (which requires also proper architecture for that to be possible in the first place). If you don't plan for this, and just hope for the best, then you are building nothing more than a toy. And if you plan for this, then it matters again what the language is, and whether your team is proficient in it.
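That hidden O(N^2) failure mode is easy to sketch (a hypothetical example, not from any real codebase): the same de-duplication written twice, once with a quadratic membership test that no 3-user test will ever catch.

```python
def dedup_quadratic(items):
    seen = []                 # list membership is O(N) per lookup -> O(N^2) total
    out = []
    for x in items:
        if x not in seen:     # scans the whole list on every iteration
            seen.append(x)
            out.append(x)
    return out

def dedup_linear(items):
    seen = set()              # set membership is O(1) on average
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
```

Both return identical results and both pass any small-input test suite; only review by someone who knows what to look for, or a load test, tells them apart.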
Or maybe I'm too old-fashioned, or too far behind the state of the AI art...
You’re behind the state of the art. I’m not exaggerating when I say AI can diagnose and solve those issues for you too.
For the utilities I write it is faster to iterate without having to compile. When I get to the point where I'm done adding and changing features, and performance is an annoyance, I can always ask the AI to "rewrite this in Go". (I've never gotten to that point.)
First one to vibe code a language for LLMs, by LLMs, wins a cookie?
Yes, and I'm wondering why all the AI tooling is written in Node.
So I can fix it when it breaks. I don’t understand anyone shipping real code without human review.
Give it 2 years; the "blame the AI" incidents will increase. Like an unfaithful partner, you'll always return to it.
So we can read and debug it if we'd like?
If AI writes your code, why use frameworks?
Devs still have to maintain this code. The Python devs can definitely get the LLM to write (some kind of) Rust, but when it goes wrong and you hit the wall with the LLM, they will have to learn Rust, which might take a while. This sounds like a bit of a project risk.
https://arxiv.org/pdf/2508.09101
tl;dr: about a 2-point average loss on Rust compared to Python, with the gap varying by model; Go has a better upper bound, but Opus had it 3 points below Python.
The benchmark is a bit old, but at least the research into why is there; the article is just vibes.
1) python is one of the foremost trained upon languages
2) it's practically verbose, not technically
3) it resembles pseudocode
4) batteries included shortcuts a lot of work
all of these reasons are a boon for LLM work.
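Point 4 is easy to demonstrate. A small sketch (my own example, not the commenter's): a word-frequency report with nothing but the standard library, no packages to install before the LLM's output runs.

```python
import json
from collections import Counter

def word_report(text: str) -> str:
    # Counter, json, and str.split are all "batteries included".
    counts = Counter(text.lower().split())
    return json.dumps(counts.most_common(3))
```

The equivalent in a minimal-stdlib language means pulling in a hashmap-sorting helper and a JSON dependency before line one does anything.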
Clojure is better. REPL + immutable defaults.
> The strongest argument for Python and JavaScript was never the languages themselves. It was the ecosystems
That's already a glaring mistake. People could say Perl's CPAN is great. Well, it did not save Perl from declining over the last 20 years.
> The Python ecosystem is increasingly a Rust ecosystem wearing a Python hat.
Without statistics to prove this, this claim is useless.
Also, depending on Rust isn't that strange for a language that is based on ... C. The only way I would disagree with such an argument would be if Python were written in Python. But since it is syntactic sugar over C -- just like Ruby and Perl are -- the argument for using Rust here is simply no different from the argument for using C. Perhaps Rust is better than C, but it is not fundamentally different. Whether Python is written in Rust or C makes no functional difference here.
As for AI becoming our new Overlord: I honestly do not want to depend on US mega-corporations. I am not disputing the fact that AI has objective use cases. I am objecting this herd mentality of everyone putting an AI chip into their brain now.
Damn AI slop zombies everywhere - it's like in the old B movie "They Live". But with less entertainment value than that. If they chew bubblegum then it is to slop up everything, not to kick ass.
This point only makes sense if you ship AI code without reviewing it. And if you're shipping AI code without reviewing it, you're going to run into much bigger problems than Python performance limitations.
1) I still have to comprehend it.
2) The corpus for the sort of applications I build is likely larger for Python than it is for C++ and Rust. Bigger corpus == more training data == better generated code.
3) The bottleneck in the applications I run aren't in the execution of the code; they're in the database/network latency.
4) I don't get anything extra for pushing Rust or C++ over Python.
If all the libraries are Rust, as the article claims, having the top layer in Python probably makes even less difference.
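A rough stdlib-only analogy for that "thin Python over a compiled core" point (my own sketch, not the article's benchmark): the same reduction in pure Python versus the C-implemented builtin. When the hot loop lives in compiled code, the Python layer is just the call site.

```python
import timeit

def py_sum(xs):
    # Pure-Python loop: every addition goes through the bytecode interpreter.
    total = 0
    for x in xs:
        total += x
    return total

data = list(range(100_000))
assert py_sum(data) == sum(data)   # identical result either way

# sum() runs its loop in C, the same way numpy/pydantic/etc. push work
# down into compiled extensions.
slow = timeit.timeit(lambda: py_sum(data), number=10)
fast = timeit.timeit(lambda: sum(data), number=10)
print(f"pure Python: {slow:.4f}s, builtin: {fast:.4f}s")
```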
I tend to agree with the article’s statement about the value of the test code though, may even have been true before LLM code took over.
I dislike Go but I have to admit, it's a great language for AI generated code. Simple enough, it compiles quickly and it performs meh-well enough for most applications.
One of the reasons I dislike Go is because it's easy for most engineers to write really low grade code with it. But AI agents would probably not write the best code in any language anyway, so not much is lost.
… because model tool calling is non-standard, so Python as the only available tool call works wonders
(Joke but also not a joke)
If you can use Python, why have AI write your code? :-)
Also easier to ship a binary like a cli
Interesting question.
AI doesn't really write code for me, but I do use it to brainstorm and ask questions. Though I do not use Python; I have never been a fan of the language. I still think Python is a perfectly serviceable language, but it would solve no (important) problems I have ever had better than any other language.
I can see why Python is appealing to many people, and I applaud Guido for all the work and oversight over the years, but Python lacks a lot of the things I like in a language.
Because LLMs fuck it up near-constantly and I need to review it
Because I can understand and edit that code by hand if I need to.
A somewhat contrarian/pessimistic view: The hardest thing in any future of LLM generated code is going to be the verification step, and especially types of verification that require humans which are going to be the most expensive.
Therefore the "best" language is going to be whatever makes it easiest for humans to detect bugs, bad design, or that the "wrong thing" has been developed.
I stay for the libraries
The LLMs just churn out non-idiomatic slop in any language.
It doesn't matter if the 800-line if statement is able to use pattern matching.
There's been a lot of progress on making coding agents able to solve problems when they can easily evaluate solutions in a closed loop. We desperately need something similar for controlling complexity and using relevant abstractions.
One thing to consider:
The (well-known) Sapir-Whorf hypothesis (if you don't know it, look it up) is often invoked for natural languages, but there's a pretty direct analogue for programming languages: the language you "think in" while solving a problem biases which abstractions and idioms you reach for first.
If you force an LLM to first solve a problem in a highly abstract language (Lisp, APL, Prolog) and only then translate that solution to C++ or Rust, you're effectively changing the intermediate representation the model works in. That IR has very different "affordances", e.g.:
- Lisp pushes you toward recursive tree/list processing, higher-order functions and macro-like decomposition. (some nice web frameworks were initially written in Lisp, Scheme, etc.)
- APL pushes you toward whole-array transforms, point-free pipelines and exploiting data parallelism. (banks are still using it because of performance)
- Prolog pushes you toward facts/rules, constraint satisfaction, and backtracking search. (a very high level of abstraction, but it might suit LLMs very well)
OK, and when you then translate that program into C++/Rust/Python, a lot of this bias leaks through. You often end up with:
- Rule engines, constraint solvers, or table-driven dispatch code when the starting point was Prolog.
- Iterator/functor pipelines and EDSL-like combinators when the starting point was Lisp.
- Data-parallel kernels and "vectorized" loops when the starting point was APL.
In principle, an LLM could generate those idioms directly in C++/Rust. In practice, however, models are heavily shaped by their training distribution and default prompts. If you just say "write in Rust", they tend to regress towards the most common patterns in the corpus (framework‑heavy, imperative, not very aggressively functional or data‑parallel), even when the language would support richer abstractions.
By inserting a "thinking" step in a different paradigm, you bias the search over solution space before you ever get to Rust/C++. That doesn’t magically make the code better, but it does change which regions of the design space the model explores.
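A toy illustration of that idiom leakage, in Python for brevity (the function names are mine, not from the comment): the same task, sum of squares of even numbers, in the three styles a model might land on depending on which paradigm it "thought in" first.

```python
from functools import reduce

def imperative(xs):
    # Default-corpus style: mutable accumulator, explicit branching.
    total = 0
    for x in xs:
        if x % 2 == 0:
            total += x * x
    return total

def lispish(xs):
    # Lisp-flavored: higher-order functions composed into a pipeline.
    return reduce(lambda acc, x: acc + x * x,
                  filter(lambda x: x % 2 == 0, xs), 0)

def aplish(xs):
    # APL-flavored: one whole-collection transform, no explicit state.
    return sum(x * x for x in xs if x % 2 == 0)
```

All three are equivalent, but they occupy different regions of the design space; which one a model emits by default says a lot about its training distribution.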
The same would also be true for Python, which is already a multi-idiomatic language. So it might be a good idea to learn a portfolio of different languages and then try to tackle a problem in a specific language, instead of automatically reaching for Python/Go/Rust because of performance.
Something to consider...
P.S. How would a problem be solved if the LLM had to write it first in Erlang? Would it then automatically be distributed?
P.P.S. The GoF "design patterns" automatically come to mind; naming one might be a good hint to give the LLM.
As always, "it depends."
I'm using coding tools to build a complex media-intensive application. The approach I'm taking is to build a _reference implementation_ in Python, which is in its design specifics, constrained to use patterns which transliterate into the actual deployment targets (iPadOS/MacOS/Web).
Why start with Python?
Because I can read it, reason about it, and run it, trivially, which are Good Things for the reference. I intend to have multiple targets; I'd rather relate them to a source of ground truth I am fluent in.
For what I'm doing, there is also a very rich set of prior art and existing libraries for doing various esoteric things—my spidey sense is that I'm benefiting from that. More examples, more discourse.
I'm out of the prediction business and won't say this is either a good model for every new project, or one I will need in another N months/years.
But for the moment it sure feels like a sweet spot.
Ask me again, though, after the reference goes gold and I actually take up the transliteration... :)
You still need to look at the code one day, so I'd say C++ would still be a preferred target language, even for AI. I know I hear a lot about Rust, but I still get the impression it's a niche language overall. I know people love it and point out its advantages, but sometimes good enough is good enough (i.e. C++).
a) Python (and Node) comprise the largest training set for all the models, so you are likely to get way better accuracy, especially with local models
b) Python code is easier to introspect, and set up test harnesses around. And also extend in agentic frameworks
c) LLMs are really good at translation. I can give it python code and it can translate it into C.
Why use any general programming language at all? Just write it in assembly or binary. Skip the middleman bro
Let's go through some of the arguments, in no particular order:
> Klabnik vibe-coded a new language in Rust, therefore Claude + Rust = Good.
I argue the inverse -- Rust, being an ML-family language, is well suited for parsing and language design (I know! Shocker!). In more moderate translation -- ML-style languages are good for parsing, interpreting and compiling code. Claude is not the magic here -- ML is.
I would also add that I've had decent success vibe-coding+human-coding Haskell (contrary to the article). My experience is that if I can hand-write a rich set of types (blessed be IxMonad), I can throw Claude to fill in the blanks for the implementations. If I can design the data structures that make the program tick, bridging them is something Claude is awesome at. Again, no surprise -- it's intern-level work.
The key distinction between C, Zig and Rust is that Rust is designed around types. C and Zig are more memory-oriented -- they really see most of your program as flat memory, and you can kind of shoehorn a little bit of data layout into that flat memory. While this offers a large amount of flexibility, this philosophy isn't well suited for proving out correctness. But again -- this doesn't mean they don't have a spot.
When I was a junior at Tesla, I used to joke that senior staff had VMs in their heads, because that's really how you analyze C programs -- you try to execute them in your head, with interesting inputs, but that's about it. Claude's head-VM is quite fuzzy and often makes errors.
With Rust, if you design your type system, you prevent yourself from making dumb mistakes. Swap out "yourself" with Claude here and it's the same story.
I've yet to see Claude design really nice type systems, fwiw.
But the point is -- Claude is the enemy of beauty and correctness -- it's up to the SWE to design a type system that prevents it from wrecking both. To be clear, I obsess over type systems personally, but that's not the only way -- incredibly rich, comprehensive, huge type systems, fuzzing, Antithesis, and proptesting are all valid things you can do to minimize the impact of slop.
---
> Code is not written by humans therefore it doesn't matter that you don't know Rust.
Wouldn't say this was explicitly stated, but I definitely smelt this undertone throughout the article. If you don't understand the language you're reading, how can you understand whether the code in front of you is correct or not? If you have a systems engineer sitting across you to clean your PRs up, you can pass that responsibility onto them, but what about when they give their two weeks?
If all you know is Python, chances are you're going to make better software in Python than in Rust. Stick an `Arc<Mutex<T>>` everywhere and, as a matter of fact, chances are your code will be slower. If you want to learn Rust, please join us! But if all you're trying to do is vibe-code better code -- do it in the language you know and can actually debug when shit hits the fan.
---
> Anthropic C Compiler
It is impressive that Claude is awesome at taking existing code and rewriting it, this is certain, but I'd like to repeat the exact same rhetoric that many have given -- rewriting =/= original authorship. Awesome, we have a C compiler, but we already had one, and we just rewrote it? Seems like a little bit of wasted electricity.
To build on top of this, I am really happy that Bun is exploring Rust, and the Claude rewrite is truly impressive, but quite surprising at times, preserving strange anti-patterns (my name being said anti-pattern, teehee): https://github.com/oven-sh/bun/blob/ffa6ce211a0267161ae48b82.... It's hard to determine why Claude decided this -- I assume a really strict input prompt.
Do note that the current stage of that PR is much better than what it was at the state of that commit, and obviously Jarred isn't merging blind slop, but that is still human-driven by someone who has an understanding of their product.
My bet is actually that _rewrites_ of already-functioning, well-tested code, are likely to be more common as time progresses. I think that's what Claude is really awesome at, and I think Claude can often achieve 80-20 improvements through rewrites. Again, Claude alone will not be a silver bullet -- it won't generate data-oriented programs if the source material wasn't data-oriented. It won't optimize for cache coherency, if the source didn't, but moving from Python to Rust alone, with more-or-less the same code structure, you're likely to see improvements by virtue of common operations being memory-coherent and avoiding the GIL and so on.
---
> A C compiler written in Rust used to be a graduate thesis. It isn’t anymore.
Come on, this is disingenuous -- a simple C compiler is a 1-day long project. LLVM is a graduate thesis (and for good reason). Copy-pasting prior-art is academic dishonesty and Claude does a lot of that.
---
For transparency: I work with Noah.
EDIT: Wanted to add that not a single line of my comment was AI generated.
MISSION
FUCKING
ACCOMPLISHED
Perl might just be the most token efficient language
Code exists for humans to read and write. The fact it happens to compile and get executed by a computer system is a side effect.