First, I agree with most commentators that they should just offer 3 modes of visibility: "default", "high", "verbose" or whatever
But I'm with you that this mode of working where you watch the agent work in real-time seems like it will be outdated soon. Even if we're not quite there, we've all seen how quickly these models improve. Last year I was saying Cursor was better because it allowed me to better understand every single change. I'm not really saying that anymore.
I noticed this too, but I think there's a much bigger problem.
The way Claude does research has dramatically changed for the worse. Instead of working through the code logically, it now spawns dozens of completely unrelated research threads for simple problems. I let it spin for over 30 minutes last night before realizing it was just "lost".
I have since been looking for these moments and killing it immediately. I tell Claude "just look at the related code" and it says, "sorry I'll look at this directly".
Calling it “hiding” assumes the default should be full exposure of internal reasoning. That’s not obviously true.
There are three separate layers here:

1. What the model internally computes
2. What the product exposes to the user
3. What developers need for debugging and control
Most outrage conflates all three.
Exposing raw reasoning tokens sounds transparent, but in practice it often leaks messy intermediate steps, half-formed logic, or artifacts that were never meant to be user-facing. That doesn’t automatically make a product more trustworthy. Sometimes it just creates noise.
The real issue is not whether internal thoughts are hidden. It's whether developers can see what the agent actually did: which files it read and modified, which tools it invoked, and why a run failed.
If those are restricted, that's a serious product problem. If what's being "hidden" is just chain-of-thought verbosity, that's a UI decision, not deception.
There’s also a business angle people don’t want to acknowledge. As models become productized infrastructure, vendors will protect internal mechanics the same way cloud providers abstract away hardware-level details. Full introspection is rarely a permanent feature in mature platforms.
Developers don’t actually want full transparency. They want reliability and control. If the system behaves predictably and exposes the right operational hooks, most people won’t care about hidden internal tokens.
The real question is: where should the abstraction boundary sit for a developer tool?
Claude logs the conversation to ~/.claude/projects, so you can write a tool to view them. I made a quick tool that has been valuable the last few weeks: https://github.com/panozzaj/cc-tail
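A minimal sketch of such a viewer, along the lines of cc-tail. It assumes the transcripts are JSONL files with `type` and `message` fields; the exact record shape isn't documented, so inspect your own files first and adjust the field names:

```python
import json
from pathlib import Path

LOG_ROOT = Path.home() / ".claude" / "projects"

def latest_log(root: Path = LOG_ROOT):
    """Return the most recently modified .jsonl transcript, or None."""
    if not root.is_dir():
        return None
    files = sorted(root.rglob("*.jsonl"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None

def dump(path: Path) -> None:
    """Print one line per record: its type plus a short preview."""
    for line in path.read_text().splitlines():
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        kind = rec.get("type", "?")
        preview = json.dumps(rec.get("message", ""))[:100]
        print(f"{kind:10} {preview}")

if __name__ == "__main__":
    log = latest_log()
    if log:
        dump(log)
```

From there it's a small step to `tail -f`-style following or filtering by record type.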
Unless I'm mixing up stuff, this was addressed explicitly by an Anthropic dev on HN (I am not a developer, don't use the product, and have zero equine animals in this game :)
And in turn, that discussion was addressed explicitly by this blog post, which is essentially a summary of the conversation that has been taking place across multiple venues.
I always get Claude Code to create a plan unless its trivial, it will describe all the changes its going to make and to which files, then let it rip in a new context.
Recently, Claude gives you these options when asking you to accept a plan:
Would you like to proceed?
> 1. Yes, clear context and auto-accept edits (shift+tab)
2. Yes, auto-accept edits
3. Yes, manually approve edits
4. Type here to tell Claude what to change
So the default is to do it in a new context.
If you examine what this actually does, it clears the context, and then says "here's the plan", points to the plan file, and also points to the logs of the previous discussion so that if it determines it should go back and look at them, it can.
It breaks a spec (or freeform input) down into a structured json plan, then kicks off a new non-interactive session of Claude or codex for each task. Sounds like it could fit your workflow pretty well.
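The structure described above can be sketched in a few lines. The plan format here is made up for illustration, and `claude -p` is Claude Code's non-interactive print mode (swap in your CLI of choice):

```python
import json
import subprocess

# Hypothetical plan format: a flat list of task prompts.
plan = json.loads("""
{"tasks": [
  {"id": 1, "prompt": "Add input validation to parse_config()"},
  {"id": 2, "prompt": "Write unit tests for parse_config()"}
]}
""")

def run_task(prompt: str, dry_run: bool = True) -> str:
    """Kick off one non-interactive agent session for a single task.
    With dry_run=True it only shows the command that would run."""
    cmd = ["claude", "-p", prompt]
    if dry_run:
        return " ".join(cmd)
    return subprocess.run(cmd, capture_output=True, text=True).stdout

for task in plan["tasks"]:
    print(run_task(task["prompt"]))
```

Each task gets a fresh context, which matches the "let it rip in a new context" workflow above.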
"Hiding" is doing some heavy lifting here. You can run --json and see everything pretty much (besides the system prompt and tool descriptions)....
I love the terminal more than the next guy but at some point it feels like you're looking at production nginx logs, just a useless stream of info that is very difficult to parse.
I vibe coded my own ADE for this called OpenADE (https://github.com/bearlyai/openade) it uses the native harnesses, has nice UIs and even comes with things like letting Claude and Codex work together on plans. Still very beta but has been my daily driver for a few weeks now.
Given that we're talking about terminals, I'd argue there's a pretty good precedent for "hidden" meaning "not visible by default but possible to view at the expense of less clarity and extra noise".
ADE! first time I've heard that acronym. (I assume it means Agent development environment?)
Your interface looks pretty cool! I built something similar-ish though with a different featureset / priority (https://github.com/kzahel/yepanywhere - meant to be a mobile first interface but I also use it at my desk almost exclusively)
It sounds like you have some features to comment directly on markdown? That sounds pretty useful. I love how Antigravity has that feature.
the project just does subprocess calls to Claude Code (the product/CLI). I think services like OpenCode were using it to make raw requests to the Claude API. Have any more context I can look into?
Debugging an LLM integration without seeing the reasoning is like debugging a microservice with no logs. You end up cargo-culting prompt changes until something works, with no idea why.
Honestly, this feels like a massive step back. When I use an agent, I'm not just a user, I'm a supervisor. I need observability. If Claude starts digging into node_modules or opening some stale config from 2019, I need to know immediately so I can smash Ctrl+C.
Hiding filenames turns the workflow into a black box. It’s like removing the speedometer from a car because "it distracts the driver". Sure it looks clean, but it's deadly for both my wallet and my context window
The issue of it burning through tokens grepping around should be fixed with language server integration, but that’s broken in Claude Code and the MCP code nav tools seem to use more tokens than just a home-built code map in markdown files.
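A home-built code map of the sort mentioned above can be as simple as this sketch, which scans a repo for top-level Python definitions and writes them to a markdown file the agent can read instead of grepping (file names and regex are illustrative; adapt per language):

```python
import re
from pathlib import Path

# Matches only unindented (top-level) class/def lines.
SIG = re.compile(r"^(class |def |async def )(\w+)", re.MULTILINE)

def code_map(root: str, out: str = "CODEMAP.md") -> str:
    """Build a compact markdown map: one section per file, one bullet
    per top-level class/function."""
    lines = ["# Code map\n"]
    for path in sorted(Path(root).rglob("*.py")):
        names = [m.group(2) for m in SIG.finditer(path.read_text())]
        if names:
            lines.append(f"## {path}\n")
            lines += [f"- {n}" for n in names]
    text = "\n".join(lines)
    Path(out).write_text(text)
    return text
```

Pointing the agent at the generated file in your project instructions tends to cost far fewer tokens than letting it rediscover the layout every session.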
They got so many things right in the beginning but now seem to lose touch with their core fan base, the developers. It's the typical corporate grind, a million competing interests arise where it's not anymore about the user but about politics and whoops, that's when you know you're not anymore a startup.
The real issue isn’t whether Claude hides actions or shows them. It’s that once you move from “assistant” to “agent”, observability becomes a hard requirement, not a nice-to-have.
When an agent can read, modify, and orchestrate multiple parts of a codebase, you need the equivalent of logs, traces, and diffs — not just summaries. Otherwise debugging becomes guesswork.
Traditional software became reliable only after we built strong observability tooling around it. Agent workflows will need the same evolution: clear execution traces, deterministic diffs, and full transparency into what happened and why.
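The "deterministic diffs" part of that evolution needs nothing exotic; a trace recorder can be sketched with the standard library (the record shape here is invented for illustration):

```python
import difflib
from datetime import datetime, timezone

def record_edit(path: str, before: str, after: str, trace: list) -> None:
    """Append a unified diff for one agent edit to an in-memory trace,
    so every change can be audited after the run."""
    diff = "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}", tofile=f"b/{path}",
    ))
    trace.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "file": path,
        "diff": diff,
    })

trace: list = []
record_edit("app.py", "x = 1\n", "x = 2\n", trace)
print(trace[0]["diff"])
```

Persisting such a trace per session gives you exactly the logs-and-diffs audit trail the comment argues for, independent of whatever the TUI chooses to display.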
Anthropic optimized for "clean UI" metrics and forgot developers care more about not having their codebase silently corrupted. Every AI company relearns the same lesson: autonomy is the enemy of trust.
Can anybody shatter my dark glasses and offer an anecdote of a high-headcount firm actually having humans read feedback? I suspect it's just there for "later", but never actually looked at by anyone...
You know when your game crashes on PS5 and you get a little popup that offers you the opportunity to write feedback/description of the crash?
Yeah, I used to sit and read all of these (at one of the largest video game publishers - does that count?). 95% of them were "your game sucks", but we fixed many bugs thanks to the detailed descriptions that people provided through that box.
Between this and 4.6's tendency to do so much more "exploratory" work, I am back to using ChatGPT Codex for some tasks.
Two months ago, Claude was great for "here is a specific task I want you to do to this file". Today, they seem to be pivoting towards "I don't know how to code but want this feature" usage. Which might be a good product decision, but makes it worse as a substitute for writing the code myself.
I feel the exact same way. Trying to cater to the "no-code" crowd is blurring the product's focus. It seems they've stuffed the system prompt with "be creative and explore" instructions, which kills determinism - so now we have to burn tokens just to tell it: "Don't think, just write the code"
Same here, both Claude Code due to this change, and how Opus 4.6 is setup, they think they can do things autonomously. But in my experience, they really can't. Letting it overthink something while being on the wrong track is what leads to AI slop.
How long until the status display is just an optimized display of what the human wants to see while being fully disconnected from what is actually happening?
Seems like this is the most probable outcome: LLM gets to fix the issues undisrupted while keeping the operator happy.
I find it interesting that this does lead to a pattern that consumes more tokens (and by extension usage and money). If you don’t interrupt something going wrong, you’ll burn more tokens faster. Food for thought, but it does seem like a perverse incentive.
Hopefully with the advent of AI coding, OSS frontends for all sorts of commercial backends will be more frequent, have higher quality, and consumers would be able to vote with their wallets for high-quality APIs enabling said frontends.
It's all well and good for Anthropic developers, who have 10x the model speed we regular users have, so their TUI streams quickly. But over here, it takes 20 minutes for Claude to do a basic task.
It feels like they're optimizing the UI for demo reels rather than real-world work. A clean screen is cool when everything is flying, but when things start lagging, I need verbose mode to see exactly where we're stuck and whether I should even bother waiting.
The root cause is that the TUI is just not the right surface for an agentic coding tool. If it were a GUI with mouse support and other rich affordances, it'd be trivial to click to expand.
We have been playing with GLM 4.7 on Cerebras, which I hope is the near future for any model; it generates thousands of lines before you recover from a sneeze. Whether you can see what it does is irrelevant, because there is no way you can read it live (at thousands of tokens/s) and you are not going to read it afterwards. Catching it before it does something weird is just silly; you won't be able to react. Works great for us combined with Claude Code: Claude does the senior work like planning and takes its time; GLM does the implementation in a few seconds.
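The back-of-the-envelope arithmetic behind "no way you can read it live" is worth making explicit. With rough assumptions of ~0.75 words per token and a fast technical reader managing ~5 words/second:

```python
# Rough, assumed constants: adjust for your own model and reading speed.
TOKENS_PER_SEC = 1000    # claimed generation speed
WORDS_PER_TOKEN = 0.75   # typical for English-heavy output
READER_WPS = 5           # fast technical reading speed

output_wps = TOKENS_PER_SEC * WORDS_PER_TOKEN  # 750 words/s
ratio = output_wps / READER_WPS                # 150x
print(f"model writes {ratio:.0f}x faster than you can read")
```

At a 150x gap, live supervision of raw output is physically off the table; review has to happen on artifacts (plans, diffs, tests) instead.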
That holds up for code generation (where tokens fly by), but not for tool use. The agent often stalls between tool calls, and those are exactly the moments I need to see what it's planning, not just stare at a blank screen
That is such silly framing. They are not "trying" to hide anything. They are trying to create a better product -- and might be making unpopular or simply bad choices along the way -- but the objective here is not to obfuscate which files are edited. It's a side effect.
Instead of adding a settings option to hide the filenames, they hide them for everyone AND rewrite verbose mode, which is no longer a verbose mode but the way to see filenames, thus breaking the workflows of everyone depending on these, for... what exactly?
If they tried to create a better product I'd expect them to just add the awesome option, not hide something that saves thousands of tokens and context if the model goes the wrong way.
Again, the framing is simply not sensible. Why would they want to break "everyone's" workflow ("everyone" including the people working at Anthropic, who use the product themselves, which should give us some pause)? Why would you ever want to make a bad decision?
The answer in both cases is: You don't. If it happens, it's because you sometimes make bad decisions, because it's hard to make good decisions.
"Boris Cherny" seems pretty good at this enshittification stuff. Think about it, normal coders would consider having a config like show details or don't, you know, a developers preference but no this guy wants you to control-o all the time, read the article its right there what this guy says:
" A GitHub issue on the subject drew a response from Boris Cherny, creator and head of Claude Code at Anthropic, that "this
isn't a vibe coding feature, it's a way to simplify the UI so you can focus on what matters, diffs and bash/mcp outputs." He
suggested that developers "try it out for a few days" and said that Anthropic's own developers "appreciated the reduced noise.""
Seriously man, whatever happened to configs that you can set once? They obviously realise that people want it, with the control-o, but why make them do this over and over without a way to just config it, or whatever the CLI does, like maybe:
./clod-code -v
or something. Man, I dislike these AI bros so much. They're always about "your personal preferences are wrong", but you know they are lying through their smirking teeth; they want you to burn tokens so the earth's inhabitability can die a few minutes earlier.
Speaking of burning tokens, they also like to waste our tokens with paragraphs of system messages for every single file read you do with Claude. Take a look at your jsonl files, search for <system-reminder>.
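A quick way to quantify that overhead is to count the `<system-reminder>` blocks in a transcript and how many characters they occupy (a rough proxy for tokens). A sketch, assuming the tags appear literally in the JSONL lines:

```python
from pathlib import Path

OPEN, CLOSE = "<system-reminder>", "</system-reminder>"

def reminder_overhead(transcript: Path) -> tuple[int, int]:
    """Return (count, total chars) of system-reminder blocks in a
    Claude Code .jsonl transcript."""
    count = chars = 0
    for line in transcript.read_text().splitlines():
        start = 0
        while True:
            i = line.find(OPEN, start)
            if i == -1:
                break
            j = line.find(CLOSE, i)
            if j == -1:
                break  # unterminated block; stop scanning this line
            count += 1
            chars += j + len(CLOSE) - i
            start = j + len(CLOSE)
        # Note: content may also be JSON-escaped; adjust if so.
    return count, chars
```

Dividing `chars` by roughly 4 gives a crude token estimate of what those reminders cost you per session.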
Are you one of those developers that hates debuggers and stack traces, and would rather spend three hours looking at the output or adding prints for something that would take 5 minutes to any sane developer?
This is very much a tangent, and was asked in bad faith, but I’ll answer anyways!
One of the interesting things about working on distributed systems is that you can reproduce problems without having to reproduce or mock a long stack trace.
So I certainly don’t see the case you’re talking about where it takes hours to reproduce or understand a problem without a debugger. Of course there are still many times when a debugger should be consulted! There is always a right tool for a given job.
The nice thing about the competition in the CLI space is that... you can just move? CC has always been a bit wonky, and this is active enshittification. There are the likes of Codex, etc...
"You can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem" — https://news.ycombinator.com/item?id=9224
> Cherny responded to the feedback by making changes. "We have repurposed the existing verbose mode setting for this," he said, so that it "shows file paths for read/searches. Does not show full thinking, hook output, or subagent output (coming in tomorrow's release)."
How to comply with a demand to show more information by showing less information.
Words have lost all meaning. "Verbose" no longer means "containing more words than necessary" but rather "a bit more than usual". "Fast" no longer means "characterized by quick motion, operation, or effect"; instead its meaning depends on the company: each uses a slightly different approach at the same underlying speed, yet calls it "fast mode".
It's just a whole new world where words suddenly mean something completely different. You can no longer understand programs just by reading the labels they use for things; you also need to look up whether what they think "verbose" means matches the meaning you've already built up.
I’m literally dead
My condolences.
You thought "fast mode" was describing the agent? No no no, it's describing your spend, since it only uses "extra usage."
Out of principle I'm never paying them a cent for "fast mode". I've already started using Codex anyway, will probably just cancel my sub since I've found I actually haven't needed CC at all since making the switch.
To be fair, some of the best software out there has multiple levels of verbosity. Usually enabled with extra ‘-v’ short options.
This is really the kind of things Claude sometimes does. "Actually, wait... let's repurpose the existing verbose mode for this, simpler, and it fits the user's request to limit bloating"
They changed it from showing just the number of files read to showing the actual paths/filenames. I.e., it shows more information.
EDIT: Ah, looks like verbose mode might show less than it used to, and you need to use a new mode (^o) to show very verbose.
Yeah, I understood it such that the information was first moved from standard to verbose mode, and when people pointed out that it would be drowned out in noise there, the response was to cut down verbose mode as well.
I didn't know about the ^o mode though, so good that the verbose information is at least still available somewhere. Even though now it seems like an enormously complicated maneuver with no purpose.
Yeah, but did he actually try to use the repurposed "verbose" mode? I did, and it's way more verbose than I need, while the regular mode is now basically a mute mode. In addition, it recently started running a lot of stuff in the background, which causes some crazy flicker, and Claude has become stubbornly autonomous. It runs stuff in a flyby mode: asks me a question, waits a couple of seconds, then proceeds with a default choice while I am still reading and considering options. I am left mashing Esc, and sometimes even that does not stop it. The last couple of updates have really annoyed me, tbh.
Did you enable verbose output in the settings? That gives you a different view than pressing ctrl-o. And this naming is endlessly confusing!
Yes I did. It was an overwhelming amount of stuff. It was noisy.
Anthropic is walking a very thin line here. The competition between models is intense and the only differentiator right now is the so-called harness that gets put over them. Anthropic needs a niche and they tried to find one by addressing developers. And they have been doing very well!
What I think they are forgetting in this silly stubbornness is that competition is really fierce, and just as they have gained appreciation from developers, they might very quickly lose it because of this sort of stupidity (for no good reason).
Well they've successfully burned a bridge with me. I had 2 max subs, cancelled one of them and have been using Codex religiously for the last couple of weeks. Haven't had a need for Claude Code at all, and every time I open it I get annoyed at how slow it is and the lack of feedback - looking at it spin for 20 minutes on a simple prompt with no feedback is infuriating. Honestly, I don't miss it at all.
Check your model thinking effort.
You have to go into /models then use the left/right arrow keys to change it. It’s a horrible UI design and I had no idea mine was set to high. You can only tell by the dim text at the bottom and the 3 potentially highlighted bars.
On high, it would think for 30+ minutes and make a plan; then when I started the plan it would either compact and reread all my files, or start fresh and read my files, then compact after 2-3 changes and reread the files.
High reasoning is unusable with Opus 4.6 in my opinion. They need at least 1M context for this to work.
You can press Ctrl-P instead of typing /model too
Doesn't work on Windows in a VS Code terminal window, unfortunately.
Is Claude Code really what makes them money, or is it their models? Both? Neither?
Do they believe that owning the harness (Claude Code) itself will lead to significantly more money? I can sort of see that, but I wouldn't think they are necessarily betting on it?
I use Anthropic's models wherever, whenever I can, be it cursor, copilot, you name it. I can't stand Claude Code for some reason, but I'll kill for those models.
On the other hand, I've seen some non-tech people have their "Holy shit!" moment with Claude Co-work (which I personally haven't tried yet), and that's a market I can see them wanting to hold on to in order to branch out of the dev niche. The same moment happened when they tried the Excel integration; they were completely mind-blown.
Well they can just wait until other AI companies like Cursor become dependent on them, then jack up the prices without warning[0].
[0] https://generativeai.pub/cursors-pricing-change-sparks-outra...
I was experimenting with other tools a while back and the visibility/interactivity was one of the main benefits of Claude Code. Now that it's gone, maybe I can move on and just learn to work with the new tool/model.
Well, there is OpenCode [1] as an alternative, among many others. I have found OpenCode to be the closest to the Claude Code experience, and I find it quite good. Having said that, I still prefer Claude Code for the moment.
[1] https://opencode.ai/
oh-my-pi plug https://github.com/can1357/oh-my-pi
I don't get this. Isn't it contradictory to the philosophy of pi to start as slick as possible?
Yes it is. However, I played with it a bit and it feels good. You can modify pretty much anything.
I have some cheap gpt 5 mini tokens that I can burn on sub agents. Each sub agent is configurable down to which llm to use
OpenCode would be nicer if they used normal terminal scrolling and not their own thing :(
Terminal scrolling opens a big can of worms for them, I doubt they'll ever implement it. The best you can do is enable scrollbars in opencode so you can quickly jump places.
we are going to implement this
lmao
It's a client/server architecture with an Open API spec at the boundary. You can tear off either side, put a proxy in the middle, whatever. Few hundred lines of diff weaponizes it.
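That "proxy in the middle" idea can be sketched with the standard library alone. This is a toy tee proxy (no streaming, auth, or error handling), and the upstream port is a placeholder; point it at whatever server the client exposes:

```python
import http.server
import urllib.request

UPSTREAM = "http://localhost:4096"  # placeholder upstream address

class LoggingProxy(http.server.BaseHTTPRequestHandler):
    """Tee proxy: log every request line, then forward upstream.
    Sketch only, for illustrating the observability hook."""
    def do_GET(self):
        print("->", self.command, self.path)  # your logging/inspection point
        with urllib.request.urlopen(UPSTREAM + self.path) as resp:
            body = resp.read()
        self.send_response(resp.status)
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8080) -> None:
    """Run the proxy until interrupted."""
    http.server.HTTPServer(("", port), LoggingProxy).serve_forever()
```

Once requests flow through a shim like this, you can record, filter, or rewrite anything crossing the API boundary regardless of what the official client shows.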
I haven't tried it myself, but there were plenty of people in the other thread complaining that even on the Max subscription they couldn't use OpenCode.
What does Claude Code do differently that makes you still prefer it? I'm so in love with OpenCode, I just can't go back. It's such a nicer way of working. I even love the more advanced TUI.
Are you paying per-token after Anthropic closed the loophole on letting you log in to OpenCode?
If one has a GitHub sub, you can use OpenCode -> GitHub -> \A models. It's not 100% (the context window, I think, is smaller, and they can be behind on model version updates), but it's another way to get to \A models and not use CC.
Yup, the context window there is only half of what you get in CC, so it's only a weak alternative. They burned bridges with the dev community with their decision to block any other clients.
When did they successfully close the loophole? I know they tried a few times, but even the last attempt from a week or two ago was circumvented rather easily.
Oh, sounds like I'm just out of the loop then. I had an Opencode install that I was planning to check out, and then like, the next day there was the announcement from a week or two ago, so I just kinda shrugged and forgot about it.
Claude Code's handling of multiple choice questions is awfully nice (it uses an interactive interface to let you use arrows to select answers, and supports multiple answers). I haven't seen opencode do that yet, although I don't know if that's just a model integration issue -- I've only tried with GLM 4.7, GPT 5.1 Codex Mini, and GPT 5.2 Codex.
Opencode also has that feature, I've seen it multiple times in the last days (mostly using Opus 4.5/4.6/Gemini 3)
Indeed, Opencode has it too. They've been improving it the past few weeks to look more like the one in Claude-Code. I disable it all the time though, I find it such a pain (in both Claude-Code and OpenCode)
Interesting. I wonder if it's just a matter of prompting properly or if GPT/GLM just doesn't have that training.
I've liked opencode+glm5 quite a bit so far.
It's probably in their interest to have as many vibed codebases out there as possible, that no human would ever want to look at. Incentivising never-look-at-the-code is effectively a workflow lockin.
I always review every single change / file in full, and spend around 40% of the time it takes to produce something doing so. I assume it's the same for a lot of people who used to develop code and swapped to mostly code generation (since it's just faster). The time I spend looking at it depends on how much I care about it - a choice you don't really get when writing things manually.
Not trying to tell anyone else how to live, just want to make sure the other side of this argument is visible. I run 5+ agents all day every day. I measure, test, and validate outputs exhaustively. I value the decrease in noise in output here because I am very much not looking to micromanage process because I am simply too slow to keep up. When I want logging I can follow to understand “thought process” I ask for that in a specific format in my prompt something like “talk through the problem and your exploration of the data step by step as you go before you make any changes or do any work and use that plan as the basis of your actions”.
I still think it’d be nice to allow an output mode for you folks who are married to the previous approach since it clearly means a lot to you.
> I run 5+ agents all day every day
Curious what plans you’re using? running 24/7 x 5 agents would eat up several $200 subscriptions pretty fast
My primary plan is the $200 Claude max. They only operate during my working hours and there is significant downtime as they deliver results and await my review.
This is the fundamental tension in this story, yes.
+1
First, I agree with most commentators that they should just offer 3 modes of visibility: "default", "high", "verbose" or whatever
But I'm with you that this mode of working where you watch the agent work in real-time seems like it will be outdated soon. Even if we're not quite there, we've all seen how quickly these models improve. Last year I was saying Cursor was better because it allowed me to better understand every single change. I'm not really saying that anymore.
> I measure, test, and validate outputs exhaustively.
How do you do this? Do you follow traditional testing practices or do you have novel strategies like agents with separate responsibilities?
I noticed this too, but I think there's a much bigger problem.
The way Claude does research has dramatically changed for the worse. Instead of piping through code logically, it's now spawning dozens of completely unrelated research threads to look at simple problems. I let it spin for over 30 minutes last night before realizing it was just "lost".
I have since been looking for these moments and killing it immediately. I tell Claude "just look at the related code" and it says, "sorry I'll look at this directly".
WTF Anthropic?
Was this from a specific model or all of them?
I mostly run on 4.6.
It is not related to the model, I think, it is the newer Claude Code versions.
Calling it “hiding” assumes the default should be full exposure of internal reasoning. That’s not obviously true.
There are three separate layers here:
• What the model internally computes
• What the product exposes to the user
• What developers need for debugging and control
Most outrage conflates all three.
Exposing raw reasoning tokens sounds transparent, but in practice it often leaks messy intermediate steps, half-formed logic, or artifacts that were never meant to be user-facing. That doesn’t automatically make a product more trustworthy. Sometimes it just creates noise.
The real issue is not whether internal thoughts are hidden. It’s whether developers can:
• Inspect tool calls
• See execution traces
• Debug failure modes
• Reproduce behavior deterministically
If those are restricted, that’s a serious product problem. If what’s being “hidden” is just chain-of-thought verbosity, that’s a UI decision, not deception.
There’s also a business angle people don’t want to acknowledge. As models become productized infrastructure, vendors will protect internal mechanics the same way cloud providers abstract away hardware-level details. Full introspection is rarely a permanent feature in mature platforms.
Developers don’t actually want full transparency. They want reliability and control. If the system behaves predictably and exposes the right operational hooks, most people won’t care about hidden internal tokens.
The real question is: where should the abstraction boundary sit for a developer tool?
Claude logs the conversation to ~/.claude/projects, so you can write a tool to view them. I made a quick tool that has been valuable the last few weeks: https://github.com/panozzaj/cc-tail
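The idea is simple enough to sketch. Here's a minimal, hypothetical reader (not the linked cc-tail tool itself) that scans the JSONL session logs under ~/.claude/projects and prints tool calls; it assumes each line is a JSON object whose assistant messages carry `tool_use` content blocks, which is the shape the logs have had recently, but the format is undocumented and can change:

```python
import json
from pathlib import Path

def summarize_tool_calls(lines):
    """Extract (tool_name, file_path) pairs from JSONL session-log lines."""
    calls = []
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(rec, dict):
            continue
        content = rec.get("message", {}).get("content", [])
        if not isinstance(content, list):
            continue  # plain-text messages carry no tool calls
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                name = block.get("name", "?")
                path = block.get("input", {}).get("file_path", "")
                calls.append((name, path))
    return calls

if __name__ == "__main__":
    # Read the most recently modified session log, if any exist.
    logs = sorted(Path.home().glob(".claude/projects/*/*.jsonl"),
                  key=lambda p: p.stat().st_mtime)
    if logs:
        for name, path in summarize_tool_calls(
                logs[-1].read_text(errors="replace").splitlines()):
            print(f"{name:10s} {path}")
```

Run it in a loop (or re-read on file change) and you get a live feed of which files the agent is touching, regardless of what the TUI decides to show.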
Unless I'm mixing up stuff, this was addressed explicitly by an Anthropic dev on HN (I am not a developer, don't use the product, have zero equine animals in the game :)
https://news.ycombinator.com/item?id=46981968
And in turn, that discussion was addressed explicitly by this blog post, which is essentially a summary of the conversation that has been taking place across multiple venues.
I don't know a lot about The Register, but I thought it was a news platform?
Although, this post surely isn't "news" as much as it is, as you said, a summary of a conversation being held on other platform(s).
So maybe it is just a blog post?
I always get Claude Code to create a plan unless its trivial, it will describe all the changes its going to make and to which files, then let it rip in a new context.
Why use a new context? Or do you mean you just accept the plan and it automatically clears the context?
Yes sorry, CC does it, then rereads all the files from scratch with the plan in mind.
Recently, Claude gives you these options when asking you to accept a plan:
So the default is to do it in a new context.
If you examine what this actually does, it clears the context, and then says "here's the plan", points to the plan file, and also points to the logs of the previous discussion so that if it determines it should go back and look at them, it can.
Yes, it's basically another way to compact context; it means there's less chance of it starting to compact partway through the plan.
(Mildly) shameless plug, but you might be interested in a tool I’ve been building: https://github.com/jbonatakis/blackbird
It breaks a spec (or freeform input) down into a structured json plan, then kicks off a new non-interactive session of Claude or codex for each task. Sounds like it could fit your workflow pretty well.
"Hiding" is doing some heavy lifting here. You can run --json and see everything pretty much (besides the system prompt and tool descriptions)....
I love the terminal more than the next guy but at some point it feels like you're looking at production nginx logs, just a useless stream of info that is very difficult to parse.
I vibe coded my own ADE for this called OpenADE (https://github.com/bearlyai/openade) it uses the native harnesses, has nice UIs and even comes with things like letting Claude and Codex work together on plans. Still very beta but has been my daily driver for a few weeks now.
> --json
Seriously? This can't be a comparable experience in terms of UX.
I think my read of "hiding" was more of a "trying to hide the secret sauce" which was implied in a few places.
Otherwise it seems like a minor UI decision any other app would make, and it's surprising there are whole articles on it.
> I think my read of "hiding" was more of a "trying to hide the secret sauce" which was implied in a few places.
That was very much not my read of it.
Given that we're talking about terminals, I'd argue there's a pretty good precedent for "hidden" meaning "not visible by default but possible to view at the expense of less clarity and extra noise"; no one th
ADE! first time I've heard that acronym. (I assume it means Agent development environment?)
Your interface looks pretty cool! I built something similar-ish though with a different featureset / priority (https://github.com/kzahel/yepanywhere - meant to be a mobile first interface but I also use it at my desk almost exclusively)
It sounds like you have some features to comment directly on markdown? That sounds pretty useful. I love how Antigravity has that feature.
Why does it say "Works with your existing Claude Code subscription"? I thought Anthropic banned use of CC subscriptions in third-party software?
The project just does subprocess calls to claude code (the product/CLI). I think services like OpenCode were using it to make raw requests to the Claude API. Have any more context I can look into?
This article is mostly about this discussion on hn: https://news.ycombinator.com/item?id=46978710
Debugging an LLM integration without seeing the reasoning is like debugging a microservice with no logs. You end up cargo-culting prompt changes until something works, with no idea why.
Honestly, this feels like a massive step back. When I use an agent, I'm not just a user, I'm a supervisor. I need observability. If Claude starts digging into node_modules or opening some stale config from 2019, I need to know immediately so I can smash Ctrl+C
Hiding filenames turns the workflow into a black box. It’s like removing the speedometer from a car because "it distracts the driver". Sure it looks clean, but it's deadly for both my wallet and my context window
The issue of it burning through tokens grepping around should be fixed with language server integration, but that’s broken in Claude Code and the MCP code nav tools seem to use more tokens than just a home-built code map in markdown files.
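As a sketch of what such a home-built code map could look like (assuming a Python codebase and that top-level names are enough orientation for the agent), something like this can regenerate a markdown map the agent reads instead of grepping:

```python
import ast
from pathlib import Path

def code_map(root):
    """Return markdown listing top-level classes/functions per .py file."""
    lines = ["# Code map", ""]
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(errors="replace"))
        except SyntaxError:
            continue  # skip files that don't parse
        names = [n.name for n in tree.body
                 if isinstance(n, (ast.FunctionDef,
                                   ast.AsyncFunctionDef,
                                   ast.ClassDef))]
        if names:
            lines.append(f"## {path.relative_to(root)}")
            lines.extend(f"- `{name}`" for name in names)
            lines.append("")
    return "\n".join(lines)

if __name__ == "__main__":
    # Regenerate the map for the current repo; point the agent at the output.
    print(code_map("."))
```

Rerun it in a pre-commit hook or before each session so the map stays fresh; a stale map is arguably worse than none.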
They got so many things right in the beginning but now seem to lose touch with their core fan base, the developers. It's the typical corporate grind, a million competing interests arise where it's not anymore about the user but about politics and whoops, that's when you know you're not anymore a startup.
I've noticed more and more of the LLM providers are trying to hide as much as possible of their thinking and inner workings.
Anthropic doesn't want you to be easily able to jump off claude code into open code + open weight llm.
The problem with monetizing AI is that a useful model can build a better version of itself and will guide you through the process.
The real issue isn’t whether Claude hides actions or shows them. It’s that once you move from “assistant” to “agent”, observability becomes a hard requirement, not a nice-to-have.
When an agent can read, modify, and orchestrate multiple parts of a codebase, you need the equivalent of logs, traces, and diffs — not just summaries. Otherwise debugging becomes guesswork.
Traditional software became reliable only after we built strong observability tooling around it. Agent workflows will need the same evolution: clear execution traces, deterministic diffs, and full transparency into what happened and why.
Anthropic optimized for "clean UI" metrics and forgot developers care more about not having their codebase silently corrupted. Every AI company relearns the same lesson: autonomy is the enemy of trust.
When their questionnaire asked me for feedback, I specifically mentioned that I hoped they would not reduce visibility to the point of GitHub Actions.
I guess that fell on deaf ears.
Can anybody break my dark glasses and offer an anecdote of a high-employee-count firm actually having humans read the feedback? I suspect it's just there for "later", but never actually looked at by anyone...
You know when your game crashes on PS5 and you get a little popup that offers you the opportunity to write feedback/description of the crash?
Yeah, I used to sit and read all of these (at one of the largest video game publishers - does that count?). 95% of them were "your game sucks", but we fixed many bugs thanks to detailed descriptions that people provided through that box.
Between this and 4.6's tendency to do so much more "exploratory" work, I am back to using ChatGPT Codex for some tasks.
Two months ago, Claude was great for "here is a specific task I want you to do to this file". Today, they seem to be pivoting towards "I don't know how to code but want this feature" usage. Which might be a good product decision, but makes it worse as a substitute for writing the code myself.
Have you played with the effort setting? I'm finding medium effort on 4.6 to give more satisfactory results for that kind of thing.
I feel the exact same way. Trying to cater to the "no-code" crowd is blurring the product's focus. It seems they've stuffed the system prompt with "be creative and explore" instructions, which kills determinism - so now we have to burn tokens just to tell it: "Don't think, just write the code"
Same here, both with Claude Code due to this change, and with how Opus 4.6 is set up: they think they can do things autonomously. But in my experience, they really can't. Letting it overthink something while being on the wrong track is what leads to AI slop.
Perhaps they can just make it an option??
How long until the status display is just an optimized display of what the human wants to see while being fully disconnected from what is actually happening?
Seems like this is the most probable outcome: LLM gets to fix the issues undisrupted while keeping the operator happy.
heh, kind of like giving an engineering manager a nice dashboard with lots of graphs and knobs. It keeps them out of your hair.
I find it interesting that this does lead to a pattern that consumes more tokens (and by extension usage and money). If you don’t interrupt something going wrong, you’ll burn more tokens faster. Food for thought, but it does seem like a perverse incentive.
Hopefully with the advent of AI coding, OSS frontends for all sorts of commercial backends will be more frequent, have higher quality, and consumers would be able to vote with their wallets for high-quality APIs enabling said frontends.
I made a little TUI app to monitor CC sessions and show you the commands. https://github.com/joshpearce/cc_session_mon
It's all well and good for Anthropic developers, who have 10x the model speed us regular users have, so their TUI streams quickly. But over here, it takes 20 minutes for Claude to do a basic task.
It feels like they're optimizing the UI for demo reels rather than real-world work. A clean screen is cool when everything is flying, but when things start lagging, I need verbose mode to see exactly where we're stuck and if I should even bother waiting
The root cause is that the TUI is just not the right surface for an agentic coding tool. If it were a GUI with mouse support and other rich affordances, it'd be trivial to click to expand.
Claude Code in VSCode works exactly like this. Click to expand file search results, command output, etc.
I am not surprised they do that. Traditionally, there doesn't seem to be that much money in interactive theorem proving.
If you use Claude Code in Nimbalyst it tracks every file change for you and gives you red/green diffs for your session.
We have been playing with GLM 4.7 on Cerebras, which I hope is the near future for any model; it generates 1000s of lines in the time it takes you to recover from a sneeze. It's absolutely irrelevant whether you can see what it does, because there is no way you can read it live (at 1000s of tokens/s) and you are not going to read it afterwards. Catching it before it does something weird is just silly; you won't be able to react. Works great for us combined with Claude Code: Claude does the senior work like planning and takes its time; GLM does the implementation in a few seconds.
That holds up for code generation (where tokens fly by), but not for tool use. The agent often stalls between tool calls, and those are exactly the moments I need to see what it's planning, not just stare at a blank screen
Depends on the tools I guess. It can race through 100s of commands in bash in a blink.
They are doing this so they can eventually remove the feature entirely in the future.
That is such silly framing. They are not "trying" to hide anything. They are trying to create a better product -- and might be making unpopular or simply bad choices along the way -- but the objective here is not to obfuscate which files are edited. It's a side effect.
How can you combat one unprovable framing by insisting on another unprovable framing?
Instead of adding a settings option to hide the filenames, they hide them for everyone AND rewrite verbose mode, which is no longer a verbose mode but merely the way to see filenames, thus breaking the workflows of everyone depending on these, for... what exactly?
If they were trying to create a better product, I'd expect them to just add the option, not hide something that saves thousands of tokens and context when the model goes the wrong way.
Again, the framing is simply not sensible. Why would they want to break "everyone's" workflow ("everyone" including the people working at Anthropic, who use the product themselves, which should give us some pause)? Why would you ever want to make a bad decision?
The answer in both cases is: You don't. If it happens, it's because you sometimes make bad decisions, because it's hard to make good decisions.
"Boris Cherny" seems pretty good at this enshittification stuff. Think about it, normal coders would consider having a config like show details or don't, you know, a developers preference but no this guy wants you to control-o all the time, read the article its right there what this guy says:
" A GitHub issue on the subject drew a response from Boris Cherny, creator and head of Claude Code at Anthropic, that "this isn't a vibe coding feature, it's a way to simplify the UI so you can focus on what matters, diffs and bash/mcp outputs." He suggested that developers "try it out for a few days" and said that Anthropic's own developers "appreciated the reduced noise.""
Seriously man, whatever happened to configs that you can set once? They obviously realise that people want it, given the Ctrl+O escape hatch, so why make them do this over and over without a way to just set it in config, or however the CLI does it, like maybe:
./clod-code -v
or something. Man, I dislike these AI bros so much. They're always about "your personal preferences are wrong", but you know they're lying through their smirking teeth: they want you to burn tokens so the earth's inhabitability can die a few minutes earlier.
Speaking of burning tokens, they also like to waste our tokens with paragraphs of system messages for every single file read you do with Claude. Take a look at your jsonl files, search for <system-reminder>.
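To see how much that adds up, a quick sketch that counts `<system-reminder>` blocks across your session logs (the ~/.claude/projects JSONL location is assumed from the comments above; the logs are treated as raw text, which is enough for counting):

```python
from pathlib import Path

def count_reminders(text):
    """Count <system-reminder> occurrences in one log's raw text."""
    return text.count("<system-reminder>")

if __name__ == "__main__":
    total = 0
    for log in Path.home().glob(".claude/projects/**/*.jsonl"):
        n = count_reminders(log.read_text(errors="replace"))
        if n:
            print(f"{n:5d}  {log.name}")
        total += n
    print("total:", total)
```

Each hit is a chunk of injected boilerplate you paid tokens for; the per-session totals make the overhead concrete.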
Keep cattle, not pets! The advice that used to apply for managing large numbers of machines also applies to managing coding agents.
If you rely on monitoring the behavior of an individual coding agent to produce the output you want, you won't scale.
Are you one of those developers that hates debuggers and stack traces, and would rather spend three hours looking at the output or adding prints for something that would take 5 minutes to any sane developer?
This is very much a tangent, and was asked in bad faith, but I’ll answer anyways!
One of the interesting things about working on distributed systems, is that you can reproduce problems without having to reproduce or mock a long stack trace
So I certainly don’t see the case you’re talking about where it takes hours to reproduce or understand a problem without a debugger. Of course there are still many times when a debugger should be consulted! There is always a right tool for a given job.
Correction: "some devs hate it"
lol the title of this post immediately feels like something I'd see on buzzfeed or my google news feed in mobile chrome.
The nice thing about the competition in the CLI space is that... you can just move? CC has always been a bit wonky, and this is active enshittification; there are the likes of Codex etc...
Feels like moment that would be looked back at as the beginning of enshittification
Srsly? People actually watch all the chatter in the little window?
Pro tip: "git diff"
ctrl+o ?
Why not run Claude on a FUSE-based filesystem, and write a script that shows the user which files are being accessed?
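In the same spirit but lighter than FUSE, here's a hypothetical polling watcher: snapshot mtimes under the repo, diff the snapshots while the agent runs. (Mtime-based, so pure reads won't show up; a real FUSE layer would catch those too.)

```python
from pathlib import Path

def snapshot(root):
    """Map every file under root to its current modification time."""
    return {p: p.stat().st_mtime for p in Path(root).rglob("*") if p.is_file()}

def changed_files(before, after):
    """Files that are new, or whose mtime changed, between two snapshots."""
    return sorted(str(p) for p, t in after.items() if before.get(p) != t)

if __name__ == "__main__":
    # Self-contained demo: simulate an agent touching one file.
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        before = snapshot(d)
        Path(d, "edited.txt").write_text("agent wrote this")
        print(changed_files(before, snapshot(d)))
    # In practice: snapshot the repo once, then poll in a loop
    # alongside the agent and print the diff every second or so.
```

Crude, but it restores exactly the signal being argued about: an independent record of which files got touched, outside the agent's own UI.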
"You can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem" — https://news.ycombinator.com/item?id=9224
You can basically ask Claude to build it for you :)
If you read their immediate reply you’ll see it invalidates this tired point: https://news.ycombinator.com/item?id=9479
Better link shows exchange:
https://news.ycombinator.com/item?id=9224
Or this pulls the exchange under the famous HN post itself:
https://news.ycombinator.com/item?id=8863
Why not script automatic C-o injection?
Ultimately, the problem is the tool turning against the user. Maybe it is time to get a new tool.