Salesforce, Python, SQL, & other ways to put your data where you need it

LLMs, rubber ducks, and doubt

03 Jun 2026 🔖 prompt engineering

Two LLM-related things that reminded me of “doubt” this morning:

  1. I played with the GitHub Copilot CLI tool’s new /rubber-duck mode and was not impressed, but I think that’s because I already word most of my prompts in rubber-duck “change my mind” fashion.
  2. I stumbled upon a blog post by some guy named Scott Alexander, and am pretty sure he forgot something: under the hood, LLMs are more or less only “next-token prediction,” whereas our genes simultaneously encode lots of algorithms of similar complexity to “next-sense-datum prediction” (the ones that I swear must be hugely behind doubt).

Rubber duck mode

Copilot said that rubber duck mode:

“is purpose-built for adversarial critique – explicitly tuned to find bugs, logic errors, and design flaws – and explicitly not to comment on style, formatting, or other trivial things.”

Whereas it said ask mode:

“answers whatever you ask, in whatever tone you set; no special ‘adversarial’ framing – it’ll critique if you ask, but it may also compliment, hedge, or drift into style suggestions.”

Got it. So … rubber-duck mode is basically what you’d get by letting me write your “ask-mode” prompts.

Seriously, here are some quotes pulled straight from one of my recent chats with an LLM in “ask mode”:

  • “Same question, but now with me clarifying to you: I meant around (name-of-system) modernization, specifically.”

  • “In your response to my last prompt, you wrote both of these two quotes:

    1. ”‘(censored; private)’
    2. ”‘(also censored; private)’

    “Are these two quotes referring to the same thing or different?”

  • “If (censored; private) modernizes (name-of-system) as planned, what do you think (also censored; private) will turn all these old (name-of-other system) apps into instead (or, alternatively, why would they deprecate them altogether)? What would (also censored; private) likely replace them with, if anything?”

  • “Yes, go ahead and tell me about the published modernization roadmaps. Only about the actual coming (censored; private), though, not looking back at the (also censored; private).”

  • “If you had tons of (censored; private) to squander … which parts of the modernization roadmaps you just told me about would you consider such practices optimized to accelerate versus a potential time-wasting distraction?”

  • “Same question, except now imagine that the (censored; private) modernization ain’t going so hot and you work for (also censored; private) and are still stuck on (name-of-other-system) strangler fig best efforts and whatnot for a few years longer than expected. But you are now the one with all the (yes, also censored; private). Same question about which parts of your responsibilities benefit versus are wasted with such resources.”

“But,” “or,” “if,” “though,” “versus,” “except” – I kid you not, in a 12-prompt chat transcript, I jumped into the middle, looked at 8 prompts, and found words like these in 6 of those 8. I rubber duck the world as a matter of personality. 🤣
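
That tally is easy to reproduce on your own transcripts. Here is a minimal Python sketch, assuming a transcript is just a list of prompt strings and that matching the six words above as whole words is a good-enough stand-in for "doubt." Both are assumptions of this sketch, and `doubt_markers_in` and `share_with_doubt` are hypothetical helper names, not part of any tool. The sample prompts are abridged from the quotes above.

```python
import re

# The six words listed above. Treating them as whole-word "doubt markers"
# is an assumption of this sketch, not a linguistic claim.
DOUBT_MARKERS = frozenset({"but", "or", "if", "though", "versus", "except"})


def doubt_markers_in(prompt: str) -> set[str]:
    """Return the doubt markers that appear as whole words in a prompt."""
    # Split into lowercase words so "about" does not count as "but"
    # and "for" does not count as "or".
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return words & DOUBT_MARKERS


def share_with_doubt(prompts: list[str]) -> float:
    """Fraction of prompts that contain at least one doubt marker."""
    if not prompts:
        return 0.0
    hits = sum(1 for p in prompts if doubt_markers_in(p))
    return hits / len(prompts)


if __name__ == "__main__":
    # Abridged excerpts of the prompts quoted earlier in this post.
    sample = [
        "Same question, but now with me clarifying to you",
        "Are these two quotes referring to the same thing or different?",
        "Yes, go ahead and tell me about the published modernization roadmaps.",
    ]
    for prompt in sample:
        print(sorted(doubt_markers_in(prompt)), "<-", prompt)
    print("share with doubt:", share_with_doubt(sample))
```

A word count like this obviously can't tell whether a prompt is adversarial in any meaningful sense; it only gives a rough way to compare one chat log against another.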

I think this harnessing of doubt is why I feel pretty satisfied with my results when I prompt LLMs, which are, mathematically under the hood, sycophantic because their math is about “next-token prediction” (not “next if there is a next” – just “next”). My animal neurons’ parallel algorithms for doubt are how I add the “if there is a next” and steer the LLM as best I can.

I heard an NPR broadcast this winter with the following amazing quotes on the value of doubt:

Winthrop says that if children are building social-emotional skills largely through interactions with chatbots that were designed to agree with them, “it becomes very uncomfortable to then be in an environment when somebody doesn’t agree with you.”

Winthrop offers an example of a child interacting with a chatbot, “complaining about your parents and saying, ‘They want me to wash the dishes — this is so annoying. I hate my parents.’ The chatbot will likely say, ‘You’re right. You’re misunderstood. I’m so sorry. I understand you.’ Versus a friend who would say, ‘Dude, I wash the dishes all the time in my house. I don’t know what you’re complaining about. That’s normal.’ That right there is the problem.”

A (report accompanying a) recent survey from the Center for Democracy and Technology, a nonprofit that advocates for civil rights and civil liberties in the digital age, … warns that AI’s echo chamber can stunt a child’s emotional growth: “We learn empathy not when we are perfectly understood, but when we misunderstand and recover,” one of the surveyed experts said.

A friend once showed me his LLM chat logs trying to sentiment-analyze another acquaintance’s text messages, after I disagreed that the messages seemed as worrisome as my friend feared. Sure enough, I noticed that we had completely different LLM-prompting styles.

My friend basically asked the LLM:

“Do these texts mean what I think they mean? It seems like this person is saying X.”

Whereas in his shoes (after censoring and rewriting the texts, for good measure, because blegh, even starting afresh every time in incognito mode, these things are probably already building enough of a profile on me by IP address and device fingerprint), I imagine I would’ve prompted the LLM:

“My brain’s over-loaded fearing these texts mean this person is saying X, but that’s probably just my human-brain hardwired negativity bias. Sanity-check me, please – what other possibilities might it be reasonable to read into these texts?”
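
The difference between those two prompts can be written down as a pair of templates. This is only a sketch: `confirmation_prompt` and `sanity_check_prompt` are hypothetical helper names, and the wording is adapted from the two examples above rather than from any LLM vendor's guidance. It just shows the shape of the second prompt: state the fear, name your own bias, and ask for alternatives.

```python
def confirmation_prompt(reading: str) -> str:
    """The 'do you agree with me?' style: invites the model to confirm."""
    return (
        "Do these texts mean what I think they mean? "
        f"It seems like this person is saying {reading}."
    )


def sanity_check_prompt(reading: str) -> str:
    """The rubber-duck style: asks the model to look for other readings."""
    return (
        "My brain's overloaded fearing these texts mean this person is "
        f"saying {reading}, but that's probably just my negativity bias. "
        "Sanity-check me, please: what other possibilities might it be "
        "reasonable to read into these texts?"
    )


if __name__ == "__main__":
    # "X" stands in for the feared interpretation, as in the examples above.
    print(confirmation_prompt("X"))
    print()
    print(sanity_check_prompt("X"))
```

Neither template guarantees a better answer; the second one simply builds the doubt into the question instead of hoping the model supplies it.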

After seeing that not everyone intuitively rubber-ducks LLM chatbots the way I do, I’ve started asking for more context whenever peers tell me that a position I disagree with was influenced by a conversation with an LLM: what, exactly, were the prompts? I also remind them that LLMs are next-token-prediction machines.

I try to help people enjoy their LLM-assisted lives more by reminding them that working with LLMs is very “garbage in, garbage out.”

That specifically, you have to work hard to be adversarial.

That you have to exaggerate doubt if you want to get anything actually-useful out of them.

Fascinating that /rubber-duck mode basically just helps scale up … me. 😉


We are our doubt algorithms

I realized I already covered a lot of what I want to say to Scott in “A related problem: what actually is second-guessing?” in my “Music, bird brains and LLM math” post 2 months ago.

Animal brains are made of so much more than “next-sense-datum prediction,” even while that is definitely an important part of animal brains (possibly related to birdsong).

Perhaps, as Scott guesses, we’ll mathematically imitate another part of animal brains in 9-16 years. Maybe it’ll be that round in which we find out what doubt is made of.

Maybe.

I have my doubts.

