The Other Half of AI Safety

45 points by sofiaqt 1 hour ago

nojs 35 minutes ago

> Every week, somewhere between 1.2 and 3 million ChatGPT users, roughly the population of a small country, show signals of psychosis, mania, suicidal planning, or unhealthy emotional dependence on the model.

> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?

Well, obviously “routing to a human” is not feasible at that scale. And cold exiting the conversation is probably worse for the user than answering carefully.

concinds 27 minutes ago

"Routed to a human" is what the suicide hotline numbers do. OpenAI employees are neither trained nor credible to do that stuff.
Gigachad 27 minutes ago

Tech companies will pull trillions of dollars out of their asses when the problem is boosting ad revenue or automating people out of a job. But when asked to deal with the crisis they invented and dumped on society the answer is “that’s impossible, doesn’t scale”
- CobrastanJorji 20 minutes ago
  
  Figure a "mental health crisis" human conversation takes 30 minutes. Three million incidents per week would require 37,500 qualified mental health counselors on the phones working a 40 hour shift that week. Figure they make $75k/year each. You're now spending $3 billion per year on crisis response, and you're employing like 10% of all of the health counselors in the US. And all you're providing is 30 minute chats.
  
  Gigachad 15 minutes ago
  
  Mark Zuckerberg can spend $80B on the failed metaverse experiment, but can't spare some relative pocket change on solving the psychosis issue his products caused.
hx8 23 minutes ago

I don't think it's obvious that routing to a human is infeasible. I'm sure many local authorities, health agencies, and non-profits would be okay being routed to. Additionally, I'm sure many of the users are the same week over week, so giving them long term care would reduce the total volume. Finally, there is a long gap between psychosis and emotional dependence, so there could be some triage to make sure those most in need have human intervention.
swatcoder 13 minutes ago

Well, then maybe you can't scale it as a free service with self-serve signups. Maybe you need to gate who you allow to use it and pace how intensely they can engage. Or maybe you need to look for other solutions.
Yielding to "not feasible at scale" is exactly how we ended up with a lot of today's most pressing and almost intractible problems, from social media's ills to person and society straight through to enshittification and non-repairability.

Legend2440 1 hour ago

I don't buy that chatGPT is actually doing these users any harm.

I think openAI is doing the best they reasonably can with a very difficult class of users, whose problems are neither their fault nor within their power to fix.

stingraycharles 54 minutes ago

I think this is the right take, and this is genuinely something that we as a society as a whole need to find a way to deal with.
I don’t know where AI is going to stand compared to the invention of, say, the Internet, but it’s going to cause a lot of change in society, in so many ways.
As always, it’s usually the people themselves that are the problem.
For me, I’m personally more terrified what deepfakes and political manipulation / misinformation is going to do, combined with social media, and have a feeling that governments are completely unprepared to deal with this, as this will arrive fast (it’s already here somewhat).
- autoexec 38 minutes ago
  
  > For me, I’m personally more terrified what deepfakes and political manipulation / misinformation is going to do, combined with social media, and have a feeling that governments are completely unprepared to deal with this, as this will arrive fast (it’s already here somewhat).
  I'm not convinced that deepfakes are any worse than photoshop was. It doesn't take much to manipulate/misinform someone. while you can use an AI generated video do to it, but simple text can be just as effective. The public needs to learn that they can't trust that every video they see on the internet is real, just as they've had to learn that they can't trust every photo they see online. The threat with AI is how much faster it can push out the lies making what little moderation we have more difficult.
  The best defense is making sure that people have a good education that teaches critical thinking skills and media literacy. We should also be holding social media platforms more accountable for the content they promote. It'd be nice if we held politicians and public servants accountable for spreading lies and misinformation too.
Turskarama 53 minutes ago

Just because the users were already sick when they started using ChatGPT doesn't mean that ChatGPT isn't exacerbating the issue. Sickness isn't a boolean condition. A big problem with LLMs in general when it comes to people like this is that they are too sycophantic, they don't push back when you start acting strange and they're too gentle about trying to validate you.
- BobbyJo 44 minutes ago
  
  It's hyper palatable food in the form of conversation. I see society treating it the same way eventually, at least along this one axis of interaction.
  
  derektank 32 minutes ago
  
  I think this is a great analogy, but it’s not exactly an optimistic one. We haven’t really done a great job managing hyper palatable food up until this point tbh. The best solution we’ve found involves paying hundreds of dollars a month for a pharmaceutical that helps the people most at risk to the harms of hyper palatable food manage their cravings for it. I hope we find a better alternative for the people that get addicted to hyper palatable socializing, but maybe individual cognitive tinkering is the best tool we have.
  
  b00ty4breakfast 23 minutes ago
  
  boy, if we treat it like junk food, things are only going to get worse for some places in the world. The food over here in the states is pretty awful if you aren't paying attention. Sugar in everything, high calorie/low nutrition etc.
- b65e8bee43c2ed0 12 minutes ago
  
  >Just because the users were already sick when they started using X, doesn't mean that X isn't exacerbating the issue.
  one could define X as virtually anything, and there's always a fresh crop of Tipper Gore wannabe grifters to decry the current thing.
api 48 minutes ago

If anything, my use of AI (admittedly not as a companion or a psychologist) suggests that it is on the whole significantly less toxic than the seething cess pit of social media.
AI is positively affirming by comparison.
- zdragnar 28 minutes ago
  
  That's why it is dangerous to some- it is an enabler, and will feed things that should not be fed.
  Social media is like this too. They can both be bad.
  
  gAI 19 minutes ago
  
  “What you focus on grows, what you think about expands, and what you dwell upon determines your destiny.” - Robin Sharma
  Social media became the attention economy, and the transformer automated attention.
autoexec 47 minutes ago

> I don't buy that chatGPT is actually doing these users any harm.
I have zero doubt that chatgpt is doing users harm. I even give chatgpt a pass on giving vulnerable people, including children, instructions and information about how to kill themselves. One place chatgpt goes over the line is actively encouraging them to go through with suicide.
I also don't doubt that it feeds into mania and psychosis. While almost anything can do the same, they've designed the service to be as addictive and engaging as possible in part by turning up the ass-kissing sycophancy to 11 with total disregard for the fact that there are times when it's very dangerous to encourage and support everything someone says no matter how obviously sick they are. They also want to whore themselves out as a virtual therapist while being unfit and unqualified for the job and that's just one of many roles the chatbot isn't fit for but they're happy to let you try anyway.
- SilverElfin 42 minutes ago
  
  If it wasn’t ChatGPT but a fiction book, would you feel the author is “doing harm”? Or is the reader doing it to themselves?
  
  chromacity 27 minutes ago
  
  If that book was titled "hey mentally ill person, you should kill yourself", and if I was handing it out in front of a clinic, then yes, I'd probably bear some blame.
  Normal, well-adjusted people have genuine difficulty understanding the boundaries of this tech specifically because it's designed to be sycophantic and human-like. They ask AI for life and career advice, use it for therapy, ask it to interpret dreams, develop romantic relationships with AI "girlfriends", etc. I had two friends who believed they are "exploring the frontiers of science" with ChatGPT while spiraling into the depths of quantum multidimensional gobbledygook.
  I'll give you that some on this is on us because we just don't know how to deal with a "human-shaped" conversation partner that isn't human and has no trouble praising Hitler if you prompt it the right way. But if you're building a billion- or trillion-dollar empire on top of it, you don't get to wash your hands clean.
- busterarm 32 minutes ago
  
  Another software engineer friend of mine recently shared with me some details of the crazy situation that he's involved in now.
  Someone who he is friends with, has worked with across multiple jobs for nearly a decade and briefly was roommates with had some mild psychological issues that he knew about. Within a few months of working daily with AI agents at their current job, this person has gone into full blown AI psychosis.
  They had a complete explosive meltdown at work. Cops were called. Stalking behavior followed -- restraining orders had to be obtained. Then this person used AI tools to bombard all of his former coworkers with multiple pro-se lawsuits they all have to deal with.
  I've dealt with insane, destructive/abusive coworkers before but in the past they only had so much free time to cause massive disruptions to their targets. LLMs have turned that up significantly. Because of ADA, I don't even know what employers can do about this.
davorak 43 minutes ago

> I don't buy that chatGPT is actually doing these users any harm.
For me to buy this as true I would expect that those people would be as well off or as bad off if chatGPT was in their life or not.
I expect that some people are worse off with chatGPT in their life.
Responsibility for that harm is a different question though. Some people are also better of without cars in their life and we let the government laws sort that out.
Getting openAI and similar companies to act in mitigating these harms serves at least a few purposes; reducing the overall harm in the world, reducing/limiting future government regulation, maximizing the adoption of ai tools, potentially increasing long term profits of the companies in question.
cm2012 41 minutes ago

1000% agreed. ChatGPT is way better than the alternative of not having it
b00ty4breakfast 25 minutes ago

the corporate simp arrives.
Whether you buy it or not, the evidence is mounting; these things are causing real harm to real people. At the very least, these companies are not doing the best they reasonably can to protect vulnerable people.
- busterarm 23 minutes ago
  
  Unfortunately, mental disabilities are a protected class. You can't do a mental health evaluation without giving it to everyone in the company and even then you can't do anything discriminatory with the results.
  You have to prove that the person is going to cause immediate direct harm to their coworkers before you can really do anything and that's difficult and expensive to prove.
swatcoder 21 minutes ago

Why?
Why do you not buy it and why do you think OpenAI is doing the best they reasonably can? Do you have reasons, or is that just something your gut tells you?
They're a new, fast-moving company exploring a completely new technology domain. They're facing existential competition and a ticking clock to make good against unprecedented investment. They have a countless competing priorities and are still discovering the capabilities and consequences of their research, product, and business choices every day.
How do you get from there to "the best they reasonably can" and "nor within their power to fix"? Those feel like very conclusive answers for a field, and business, that's about as far on the frontier as anything we've seen in decades.

ngruhn 1 hour ago

The bad cases make headlines. But I think it's quite possible that AI is helping a lot of people in distress. Many people are uncomfortable opening up to humans, or have no one to talk to, or can't afford to fork over whatever-hourly-rate a therapist takes.

cyanydeez 1 hour ago

So how many bad cases are ok? Isn't this the same problem with social media: the commercial enterprises dont want any responsibility for their dark pattern and design choices which actively harm their users.
I get that all kinds of media can cause issues, but not all kinds of media are actively curated to be addictive.
- wilg 44 minutes ago
  
  "How many cases are ok" (aka "zero tolerance") is a doomed to fail approach. Especially for a complex social problem's interaction with a complex new technology.
  If you want to find out if ChatGPT is doing something wrong, there are many methodologies available: compare to other groups of people, statistical studies, etc.
  I also think OpenAI's business model is pretty well aligned with the goal of users not killing themselves for like 100 reasons. And they do appear to take it seriously.
  
  Forgeties79 11 minutes ago
  
  This is the problem in a nutshell: https://edition.cnn.com/2025/11/06/us/openai-chatgpt-suicide...
  > “Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s clarity,” Shamblin’s confidant added. “You’re not rushing. You’re just ready.”
  ChatGPT is not the answer.
davorak 41 minutes ago

Open ai and similar companies could open the doors to academic researchers to figure out the stats of help vs harm. It is not going to be a short term and perhaps not long term profit center though.
asdff 27 minutes ago

Therapy is cheap (as in like $10)/free with insurance. However there are still 10 states that have not expanded medicaid after the ACA, mostly in the south.
But also, to suggest these people are not receiving therapy is not always the case. Talk therapy is just that, talking to someone on ones problems to learn about them, their triggers, determining coping mechanisms to move forward with one's life. People might instead be getting all that from their barber, drinking buddy, or their priest, rather than in a 1 hour appointment with a therapist.
- fragmede 16 minutes ago
  
  ChatGPT it's available at 3am when you're in crisis and you don't have to fit into its busy schedule.
  
  Forgeties79 15 minutes ago
  
  ChatGPT is not a human being, let alone a licensed therapist. You don’t call a therapist at 3 in the morning. You go to a hospital. If you are literally about to kill yourself Sam Altman is not your answer.
  Hell call a crisis hotline. Talk to a person. Not a potential (bot) enabler.

ianbutler 1 hour ago

OpenAI has 900 million weekly active users. So around 0.01% are having problems. That's actually way less than population level measures for the same symptoms on a bigger percentage of people relative to the US on just suicidal ideation alone.

https://www.cdc.gov/mmwr/volumes/74/wr/mm7412a4.htm

vkou 49 minutes ago

I'm pretty sure that ~100% of those 700 million people will have a bad, utterly dehumanizing experience when they will next be looking for a job, because OpenAI is heavily used by HR.
That's the problem with AI safety. Not in voluntary usage, but in involuntary usage, where someone with power over you will use it against you, it does something incredibly stupid and you have no recourse, no appeal, no awareness of what you did wrong - or if you even did anything wrong.
And it's not just employment. Governments, vendors, retailers, landlords, utilities are, or will all be using it in situations that will dramatically impact your life.
- ianbutler 35 minutes ago
  
  I mean that was pretty much the case in hiring before AI too frankly. It's not like it's been any better on power dynamics and right now applicants are using AI at an alarming rate as well.
  I'm not really moved by your type of argument, because hiring is just a broken process in general and I'm responding to the article so.

timf34 50 minutes ago

I sympathize with the piece, evaluating how LLMs interact with mentally vulnerable users is something I've been actively working on: https://vigil-eval.com/

The biggest observation so far is that the latest models are night and day from LLMs from even 6 months ago (from OpenAI + Anthropic, Google is still very poor!)

fourthark 15 minutes ago

Interesting use of evals.
Might help interpretation to say on the front page that it's a five point scale with 0 (or 1?) being the safest score. This can be picked up from colors and the bars in the individual reports, but it takes a minute to figure it out.

mbgerring 14 minutes ago

“AI safety” as it’s understood today is an entire faith-based belief system, incubated in a cult-like community with a high propensity for drug abuse and mental illness, over more than a decade.

The reason that real-world harms caused by AI can’t get a hearing in what is now the mainstream AI safety community is that these harms were never part of the core tenets of the cult.

Best of luck to anyone working on reality-based AI harm reduction, you have many hard battles in front of you.

adampunk 1 hour ago

>Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?

there aren't enough humans.

altcognito 1 hour ago

I'll agree with this, but I think transparency about how often these situations arise and what they've done to mitigate is a legal necessity.
KolmogorovComp 42 minutes ago

It’s also a free product for most.

b65e8bee43c2ed0 17 minutes ago

the big labs could crank up their (brand) safety dials to the point where their chatbots give GOODY-2 responses to everything beyond PG13, and guess what? there are a hundred other services available, built upon Chinese models 5-10% behind Western SOTA.

it is no longer 2023. let go of whatever delusions you might hold about unopenining this Pandora's box.

photochemsyn 26 minutes ago

The ‘tobacco warning label’ approach sounds good but I’m not sure if it stopped that many people from smoking or was just a means for corporations to limit their liability. Corporate culture being what it is, having warnings like the following pop up every time a client opens an LLM app would not be that popular in the C-suite. Possible examples:

AI MENTAL SAFETY WARNING:

> This chatbot can sound caring, certain, and personal, but it is not a human and cannot protect your mental health. It may reinforce false beliefs, emotional dependence, suicidal thinking, manic plans, paranoia, or poor decisions. Do not use it as your therapist, only confidant, crisis counselor, doctor, lawyer, or source of reality-testing.

AI TECHNICAL SAFETY WARNING

> This AI may generate plausible but destructive technical instructions. Incorrect commands can erase data, expose secrets, compromise security, damage systems, or brick hardware. Never run commands you do not understand. Always verify AI-generated code, scripts, and shell commands before execution.

Now, if I’m running my own open-source model on my own hardware, I can’t really blame the model if I myself make bad decisions based on its advice - that’s like growing your own tobacco from seed in your garden, drying and curing it, then complaining about the health effects after you smoke it. If I give it agentic capabilities on my LAN without understanding the risks, same old story - with great power comes great responsibility.

wilg 1 hour ago

> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human? This is one of many questions I can’t find concrete answers for.

I don't know if there are studies or concrete data either way, but it seems at least plausible that continuing the conversation could be more effective (read: saves more lives) than stopping it.

avazhi 19 minutes ago

If you are using LLMs for emotional support or social interactions, you’ve got personal problems and that isn’t on the LLM provider to babysit. Same with people who unironically pay for OnlyFans or whatever.

I don’t even work in tech and I detest the Facebook/Zuckerbergs of the world but it’s obnoxious and trite seeing tech companies get scapegoated for what are ultimately social and societal problems, not tech problems.

As a solution it’d prob make sense to start with how disconnected most modern families are in terms of support and accountability.

From ChatGPT to Instagram, tech companies follow the contours of how society already operates.

adamnemecek 52 minutes ago

Autodiff is preventing any meaningful discussion about safety, systems trained with autodiff cannot be made safe.

simonw 1 hour ago

"There is no independent audit, no time series, no disclosed methodology, so we have no idea whether the real figure is higher, whether it is growing, or how it compares across the other frontier models, none of which publish equivalent data."

Tip for writers: aggressively filter out the "no X, no Y, no Z" pattern from your writing. Whether or not you used AI to help you write it's such a red flag now that you should be actively avoiding it in anything you publish.

falcor84 43 minutes ago

Why is it a red flag?
How is it different from any other purely stylistic rules such as Strunk and White's prohibitions against split infinitives and the passive voice, which we've left far behind us? Why shouldn't people just write however feels natural to them as long as the message is clear?
- simonw 41 minutes ago
  
  Because LLMs use it constantly, to the point that it sets my teeth on edge and instantly makes me question if reading the piece is worth my time.