Linux security mailing list 'almost unmanageable'

www.theregister.com

111 points by jonbaer 2 hours ago

l1k 1 hour ago

Fun fact (or not so fun if you're a subscriber):

Somebody is spamming kernel mailing lists under the name Marian Corcodel with a 26 MByte message multiple times per day containing a collection of nonsensical patches. Looks AI-generated, perhaps with the intention to poison LLMs. This has been going on for a few days now.

https://lore.kernel.org/all/CAGg4U=GNtCObd_Nbm_1Rr5FEvPb69Yz...

probably_wrong 1 hour ago

I'd warn HN users not to click on that link, simply because it will load a 26MB message that will likely put quite a strain on kernel.org's servers if everyone here does it.
- leonidasrup 1 hour ago
  
  https://web.archive.org/web/20260518134447/https://lore.kern...
  
  OuterVale 1 hour ago
  
  I don't think needlessly straining the Internet Archive's servers is any better.
  
  embedding-shape 2 minutes ago
  
  IA's infra is slightly better for big loads though; they tend to just have higher latency rather than aborted/timed-out requests, for better or worse. It can be a bit slow, but as long as you're ready to wait, you'll eventually get the response. Usually hosts just cut you off with a hardcoded timeout instead, which for people on high-latency/low-bandwidth connections can be super fun.
  
  grosswait 1 hour ago
  
  Will clicking on this link download a 26MB message putting extra load on archive.org's servers?
- jmalicki 48 minutes ago
  
  Does a 26MB message actually cause noticeable strain on the server much beyond loading the page? I would think serving a contiguous 26MB chunk would be roughly similar to serving, say, 20 normal-sized messages.
- shevy-java 37 minutes ago
  
  Thank you for the warning. I rarely click on links these days though; the only exception I make is for HN links to main articles.
  
  embedding-shape 1 minute ago
  
  How do you navigate the web? Is everything CTRL+L and then manually typing the address, or do you have some fancier solution?
- sillysaurusx 8 minutes ago
  
  I was curious how much of an impact HN could have. Napkin math:
  HN gets 24M views a day. Assume those views are evenly distributed across the front page (they aren’t), and that’s about 1M views for each front page post, assuming each user clicks on one post.
  By the rule of 10s (also not exact), there are 10x fewer views on comment threads. So assume around 100k views on a comment thread as a theoretical average.
  If everyone in this thread clicked on the link, that would be 2.6 TB of transfer across the day. But by the rule of 10s we have to assume 10x fewer people will interact (upvote, click, anything) than view. So we’re down to 260GB transfer over the course of a day.
  I wonder how close that is. It seems plausible that a link in the top comment of a thread could garner 10,000 clicks.
  That’s still about one click every 8 seconds, which at 10Mbit/s would indeed overwhelm the server by a factor of about 2x. But I clicked through and it loaded in just a few seconds, so presumably the pipe is faster than 10Mbit/s.
  Another caveat is that many websites are already several megabytes, so it seems strange that 26MB would be the breaking point for a reasonable web host.
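  The napkin math above can be rechecked with a few lines of Python. This is only a sketch: every input is the comment's own assumption (100k thread views, one click per ten views, a hypothetical 10 Mbit/s uplink), not a measurement of kernel.org, and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the figures in the comment above.
# All inputs are the comment's assumptions, not measurements.

MESSAGE_MB = 26                 # size of the spam message, in megabytes
THREAD_VIEWS_PER_DAY = 100_000  # assumed daily views of one comment thread
VIEWS_PER_CLICK = 10            # "rule of 10s": one click per ten views
LINK_MBIT_PER_S = 10            # hypothetical server uplink
SECONDS_PER_DAY = 24 * 60 * 60


def napkin(message_mb=MESSAGE_MB,
           views=THREAD_VIEWS_PER_DAY,
           views_per_click=VIEWS_PER_CLICK,
           link_mbit_per_s=LINK_MBIT_PER_S):
    """Return the daily transfer and how far the link is oversubscribed."""
    clicks = views // views_per_click
    transfer_gb = clicks * message_mb / 1000
    seconds_between_clicks = SECONDS_PER_DAY / clicks
    # 1 megabyte = 8 megabits, so one download occupies the link this long.
    seconds_per_download = message_mb * 8 / link_mbit_per_s
    overload_factor = seconds_per_download / seconds_between_clicks
    return {
        "clicks": clicks,
        "transfer_gb": transfer_gb,
        "seconds_between_clicks": seconds_between_clicks,
        "seconds_per_download": seconds_per_download,
        "overload_factor": overload_factor,
    }


if __name__ == "__main__":
    for name, value in napkin().items():
        print(f"{name}: {value:.2f}")
```

  With these inputs a single download ties up the assumed link for 20.8 seconds while a new click arrives every 8.64 seconds, which is where the "about 2x" figure comes from. Raising `link_mbit_per_s` to 100 drops the factor well below 1.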

st_goliath 2 hours ago

Here's the actual mailing list post: https://lore.kernel.org/lkml/CAHk-=wi+JvcuKF2NaD_rGiYrwkR6rx...

Actual context: Linux 7.1-rc4 release, Linus remarked on a specific documentation change.

The Register somehow turned this into an "article" that says a lot less with roughly the same number of words, and provides "context" by linking to a number of unrelated articles.

throawayonthe 1 hour ago

here is what seems to be the relevant documentation: https://docs.kernel.org/process/security-bugs.html
see "If you resorted to AI assistance to identify a bug, you must treat it as public." and https://docs.kernel.org/process/security-bugs.html#responsib...

Sweepi 2 hours ago

"Torvalds' remarks contrast with recent comments from fellow kernel maintainer Greg Kroah-Hartman, who recently told The Register that AI has become an increasingly useful tool for the FOSS community."

Does it? Both points can be true at the same time.

happytoexplain 2 hours ago

I mean, they are two (of many) contrasting results of AI. The writer didn't say "contradict". But I agree they probably could have chosen better wording.
ses1984 2 hours ago

Linus also said
“AI tools are great, but only if they actually help, rather than cause unnecessary pain and pointless make-believe work,” he wrote. “Feel free to use them, but use them in a way that is productive and makes for a better experience.”
So I think the closing remark from The Register isn’t really appropriate given the context from the quotes they pulled.
renegade-otter 1 hour ago

I will argue that ON AVERAGE, humans are lazy, and will use LLMs to generate walls of text and code. We like the easy way out - just pop a pill. Here we have a technology that can finally help us manage the crippling firehose of data, and instead, we are going to make it much worse. As expected.
A few of us will actually use these tools to reduce toil and achieve something useful.
orthoxerox 1 hour ago

AI can amplify your intelligence just as easily as it can amplify your stupidity. All while telling you how smart and brilliant you are.
j16sdiz 1 hour ago

Torvalds didn't say AI isn't useful. He is saying that everybody uses AI to file the same duplicate bug reports, causing extra churn.

trelbutate 58 minutes ago

Will never understand why some people prefer mailing lists to do development, it always feels like the most convoluted way to hold a discussion, especially if there are multiple topics at the same time.

It probably doesn't really change that much in this scenario but with a forum or any other topics-based platform you can at least just close and ignore these things without it affecting everyone else.

pixl97 46 minutes ago

Because it is an open and widely distributed system that is difficult to take down or otherwise have an extended outage.
PurpleRamen 45 minutes ago

A good mail client lets a skilled user communicate much more efficiently than most forums do.
> It probably doesn't really change that much in this scenario but with a forum or any other topics-based platform you can at least just close and ignore these things without it affecting everyone else.
True, external moderation is a benefit of centralized platforms, but a mail client allows personalized moderation, which, with a well-organized list, lets you filter out anything you are not interested in. Usenet had the benefit of both: a centralized platform with moderation, and powerful clients for further personalization. Too bad it died for most uses.
- SoKamil 7 minutes ago
  
  Is there a demo of such communication on YouTube, or at least some article with screenshots?

moezd 1 hour ago

I think it's time for report-only intake to stop. If a reporter can't reproduce at least one use case or can't summarise it in two sentences, the report should be classified as spam. LLMs write beautiful reports; it's just that sometimes they bear no resemblance to the truth.

nashadelic 28 minutes ago

Couldn't an LLM be used for verification, like we're seeing some OSS projects do? Some projects are moving so fast, it's almost certain there's little human involvement.

rnxrx 57 minutes ago

It seems like LLMs are actually pretty good at the sorts of things needed to manage a high-volume mailing list (summarizing, looking for dupes, sentiment, flagging things, etc), even if only as augmentation for human eyes.

That said, I get why this would rankle a lot of the folks involved.

rolandog 9 minutes ago

That's just a security/protection racket with extra steps: "Someone is paying us to hurt your business/site; pay us money to defend your site against our attacks".

olive-n 1 hour ago

I like to imagine that LLMs' ability to optimize code is like an extension of the training loop in deep learning. The loss function is some kind of metric representing security and/or performance (or the lack of it) of the code, and we use the LLM as the gradient/diff generator to iterate in batches over the code and fine-tune it.

Imagine the current state being for the most part a collection of local maxima in security. To push the system into a more optimal state, you either need skilled people and time to overcome the barrier to a new local maximum, or you throw AI at it and evaluate whether you land in a more optimal state.

I think after some time of turbulent exploit/patch cycles we will reach a stable state again, where the code converges on a new local maximum that even with AI requires significant effort (time and tokens) to overcome. Or ideally a global maximum.

With time, the LLMs improve, so the diffs/gradients get better and we will be able to reach optimal points for any software faster.

My problem with the idea is that apparently it is assumed that OSS contributors and especially maintainers will generously donate their time to get this machinery into a state that makes the optimization loop work well - just for the AI labs to turn around and sell access to the optimized models for increasingly larger amounts of money.

AI generated code can be great. Hand rolled code can be bad. The rules are the same in both cases. Make sure your code changes are focused (no random changes just because you happen to be in the file/dir or notice something) and make sure you don't break anything else along the way.

new_account_100 2 hours ago

AI (read: LLM technology) is the most powerful spam weapon ever invented.

stabbles 2 hours ago

Isn't it mostly the medium that's problematic? With an issue tracker it's easier to close as duplicate

cduzz 2 hours ago

If the AI is awesome at identifying security bugs in the Linux kernel, it likely can also identify if the thing it's found is similar to something that has already been reported on the security mailing list?
Or, put another way -- what flags the duplicate? The filer or the system? If my cheese factory is measured by the volume of cheese instead of the quality, I'll churn out the cheese even if it's sloppy duplicated cheese. And that is the case if a person has to flag a new ticket as "same as this" or not.
What's that law that says that any sufficiently large problem turns into a moderation problem?
- crote 1 hour ago
  
  The problem is that the tech companies are paying their research/marketing departments for headlines that go "Researcher uses powerful new Saga 6.2 release to find 597 kernel vulnerabilities! (Can your company afford NOT getting their $1000/month subscription?)", not for headlines that go "Researcher spends $50.000 to find 597 bugs, then spends $25.000 figuring out 540 of them are duplicates".
  Unless the kernel community starts banning & publicly shaming repeat offenders, there's zero incentive for them to put any effort in filtering out duplicates. They are mostly doing it for marketing after all, not out of a genuine interest in making the kernel better.
- flumes_whims_ 1 hour ago
  
  > “AI detected bugs are pretty much by definition not secret, and treating them on some private list is a waste of time for everybody involved – and only makes that duplication worse because the reporters can't even see each other's reports.”
  
  cduzz 1 hour ago
  
  Ah; so it _is_ a tool problem. It is _also_ a moderation problem.
  One could ban orgs that flood the zone with AI generated trash, but is there some potential middle ground where there are sets of filters to identify duplicated bugs, and possibly just internally dump "AI spam" to a lower queue?
  This seems like the sort of problem I'd addressed in the 90s with killfiles and spamassassin. In other words, can't the ingestion just go through some filters to shield the humans at the end of the pipe?
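  A filter of the kind asked about here could, as a first pass, flag reports whose wording closely matches an earlier one. The sketch below is not how killfiles or SpamAssassin work; it swaps in plain word-set (Jaccard) similarity, with an arbitrary threshold, purely to illustrate the idea. The function names and the sample reports (`foo_release`, `bar_ioctl`) are hypothetical.

```python
import re


def tokens(text):
    """Lowercased set of word-like tokens; crude, but enough for a first pass."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def triage(reports, threshold=0.6):
    """Split reports into (unique, duplicates).

    A report is a duplicate if its token set is at least `threshold`
    similar to an earlier unique report. `duplicates` holds pairs of
    (report, index of the matching report in `unique`). The threshold
    is an assumption and would need tuning on real traffic.
    """
    unique, unique_tokens, duplicates = [], [], []
    for report in reports:
        current = tokens(report)
        match = next(
            (i for i, seen in enumerate(unique_tokens)
             if jaccard(current, seen) >= threshold),
            None,
        )
        if match is None:
            unique.append(report)
            unique_tokens.append(current)
        else:
            duplicates.append((report, match))
    return unique, duplicates


if __name__ == "__main__":
    sample = [
        "use-after-free in foo_release() when device is unplugged",
        "Use-after-free in foo_release() when the device is unplugged",
        "integer overflow in bar_ioctl() size check",
    ]
    unique, duplicates = triage(sample)
    print(len(unique), "unique,", len(duplicates), "duplicate")
```

  Textual similarity will miss the same bug described in different words and can merge distinct bugs in the same function, so a filter like this could only reorder the queue for human eyes, not close reports on its own.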
- fiedzia 1 hour ago
  
  > it likely can also identify if the thing it's found is similar to something that is already found in the security mailing list?
  It cannot, because this mailing list is not public.
Aurornis 1 hour ago

An open visibility tracker would be a goldmine for finding new exploits before a fix is even available.
From what I’ve seen many of the AI bug search operators are newer to security research. They’re burning their tokens trying to find kernel bugs as their claim to fame before other people with AI tools find them first. They don’t spend time de-duplicating their own bugs.
Some of them may not be coming from real people. There are honeypot repos that are entirely fake and only have folders of simple files with clear security problems. They collect automated reports they get from all of the AI bots that people are running.
- smallerfish 1 hour ago
  
  So make it a closed issue tracker with a public email gateway. Get Anthropic to donate LLM time to classify and combine incoming reports.
  
  throwaway85825 1 hour ago
  
  If the LLM hallucinates bugs what makes you think any classification won't be hallucinated?
  
  quuxplusone 1 hour ago
  
  The issue highlighted in Linus's message isn't that the LLM is hallucinating fake bugs; it's that 100 people running the same LLM on the same codebase find the same real bug 100 times, and if they all send it to the private security mailing list, it's (1) unmanageably high volume and (2) stupid security theater [because by definition any bad actor with the same LLM would find that bug — it's effectively public at that point].
stonogo 1 hour ago

And with a mailing list you don't even have to do that! The problem doesn't really change, because you have to figure out whether it is a duplicate before you can mark it as duplicate, and that's the 'managing' part of 'unmanageable'.
dgellow 1 hour ago

You still spend time identifying duplicates and doing triage. That can be very significant for a project like Linux.
Interestingly enough, doing that type of triage is something LLMs are actually great at.
Cthulhu_ 57 minutes ago

While true, security reports should be treated as confidential until a patch is widely available.

NoSalt 1 hour ago

So ... who, exactly, is AI supposed to be "helping"???

nottorp 18 minutes ago

The "security researchers" who post those bugs. Their goal is self-promotion.

newswasboring 1 hour ago

> Torvalds' remarks contrast with recent comments from fellow kernel maintainer Greg Kroah-Hartman, who recently told The Register that AI has become an increasingly useful tool for the FOSS community

That's kinda a misrepresentation. They are talking about two different things. Linus is trying to point out incorrect use of a tool while GKH is praising a correct use. This sentence felt weird at the end of the article, kind of like rage bait. And I took it :P.

quotemstr 40 minutes ago

Maybe it's time to require public zero-knowledge proofs of a working exploit before privately delivered exploit details can be considered.

shevy-java 37 minutes ago

So ... first, AI slop is killing mankind slowly. Skynet is winning here.

On the other hand ... IF the bug report is real, and let's assume that AI slop reports at least a few bugs that are indeed real, then I really think it should not make a difference WHO or WHAT reports these bugs. I would not disagree that fake bugs or bogus bug reports waste humans' time, but that is a difference in quality. Surely people can tweak AI models to be better at finding bugs too. Besides, they should auto-fix that. Is AI still too stupid to fully replace humans? Other than killing them with spam, as it does right now.