Seems harsh. I love using the LLM as a rubber duck, *especially* for brainstorming, and this rubber duck talks back! Usefully, in the main. Sure it’s often subtly wrong, but then again, so am I. Dwarkesh’s sin seems to be insufficient domain knowledge to properly sanity-check the LLM’s reasoning?
The bullshit velocity is increased, and LLM BS *sounds* dangerously plausible, but in my experience at work, the actually-good-idea velocity has also increased.
Nevertheless, a valuable cautionary tale.
I saw this post, really liked it, then immediately started using ChatGPT to interpret a friend's message.
Vibe-thinking is suboptimal, but it's going to happen. What are the mitigation tactics / ways to make it less destructive?
I also need to think more about the RL sample-efficiency thing. I feel like Daniel hasn't gotten to the end of this.
Yeah, I definitely wouldn't say that people shouldn't use LLMs. They're basically my friends, I feel.
I think it's more like we should be prepared to be a bit more stubborn and pugilistic when encountering new information. Like, just being more forward about saying stuff like: "sorry, I don't actually understand what is being argued here. You need to walk me through this much more slowly."
More systematically, here's an idea: Markus Strasser talked a lot about LLM order effects and positional preferences (e.g. https://www.cip.org/blog/llm-judges-are-unreliable). When interfacing with LLMs, he likes to randomize the order in which ideas are introduced and average responses over separate sessions. Perhaps (for the next couple of years) we need to get used to doing this more. If you give Gemini 3 a link to Daniel's contra-Dwarkesh piece, it understands and endorses the argument.
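Not from the linked post, just a rough sketch of what that could look like in practice: shuffle the candidate ideas in each session, query a handful of fresh sessions, and tally the picks. The `ask_llm` helper, the OpenAI client, and the model name are all placeholder assumptions, not anything Strasser specifies.

```python
import random
import re
from collections import Counter

from openai import OpenAI  # assumption: any chat-style LLM client would do

client = OpenAI()

def ask_llm(prompt: str) -> str:
    """Hypothetical helper: one fresh, stateless session per call."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def judge_ideas(ideas: list[str], question: str, n_sessions: int = 5) -> Counter:
    """Present the same ideas in a different random order in each independent
    session, then tally which one the model picks, to wash out position bias."""
    votes: Counter = Counter()
    for _ in range(n_sessions):
        shuffled = random.sample(ideas, k=len(ideas))  # new random order per session
        numbered = "\n".join(f"{i + 1}. {idea}" for i, idea in enumerate(shuffled))
        prompt = (
            f"{question}\n\nOptions:\n{numbered}\n\n"
            "Reply with the number of the single best option."
        )
        answer = ask_llm(prompt)
        match = re.search(r"\d+", answer)  # map the numeric reply back to the idea
        if match and 1 <= int(match.group()) <= len(shuffled):
            votes[shuffled[int(match.group()) - 1]] += 1
    return votes
```

Averaging over sessions like this won't fix a model that's confidently wrong, but it does stop a single lucky ordering from deciding the outcome.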
Vibe-thought response: I think it's worth calling out that some "good" has come out of this, in that Dwarkesh's vibe-thinking epitomises information-efficient supervised learning:
1. Dwarkesh vibe-thoughts shared in public forum, with rational truth-seekers able to comment
2. Individual with deeply studied background (Daniel Paleka) provides information update
3. Dwarkesh (+ a subset of his audience, + others who deeply study this in the future) steered towards improved understanding.
Even though, on balance, more people may have been misled right now than received true information, over a longer time scale I think it improves understanding of the topic.
Yes, 100% this. Don't just expect LLMs to do frontier science for you.
I do wonder if this would have happened at all if Dwarkesh used Claude instead of Gemini 3 though. Gemini is so flagrantly deceptive and misaligned that I do not trust it for anything important. Claude might be just as dumb, but it does have a marginally better bullshit detector.