<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Agents on TurboVision</title>
    <link>https://turbovision.in6-addr.net/tags/agents/</link>
    <description>Recent content in Agents on TurboVision</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Tue, 21 Apr 2026 14:06:12 +0000</lastBuildDate>
    <atom:link href="https://turbovision.in6-addr.net/tags/agents/index.xml" rel="self" type="application/rss&#43;xml" />
    
    
    
    <item>
      <title>MCPs: &#34;Useful&#34; Was Never the Real Threshold -- &#34;Consequential&#34; Was.</title>
      <link>https://turbovision.in6-addr.net/musings/ai-language-protocols/mcps-useful-was-never-the-real-threshold-consequential-was/</link>
      <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Mon, 20 Apr 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/musings/ai-language-protocols/mcps-useful-was-never-the-real-threshold-consequential-was/</guid>
      <description>&lt;p&gt;For a while, the industry kept talking as if tool access merely made models more &amp;ldquo;useful&amp;rdquo;. That description is too soft by half, because the real shift is harsher: once a model can perceive and act through an environment, its outputs stop being merely interesting and start becoming &amp;ldquo;consequential&amp;rdquo;.&lt;/p&gt;
&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://modelcontextprotocol.io/specification/latest&#34;&gt;Model Context Protocol (MCP)&lt;/a&gt; does not just make language models more capable in some vague product sense. It moves them closer to &amp;ldquo;consequence&amp;rdquo; by connecting model output to trusted systems, permissions, tools, and environments where words can become actions.&lt;/p&gt;
&lt;h2 id=&#34;the-question&#34;&gt;The Question&lt;/h2&gt;
&lt;p&gt;You may ask: if MCP is just a protocol for tools and context, why treat it as such a serious threshold? Why not simply say it makes models more &amp;ldquo;useful&amp;rdquo; and leave it at that?&lt;/p&gt;
&lt;h2 id=&#34;the-long-answer&#34;&gt;The Long Answer&lt;/h2&gt;
&lt;p&gt;Because &lt;code&gt;&amp;quot;useful&amp;quot;&lt;/code&gt; is marketing language. &lt;code&gt;&amp;quot;consequential&amp;quot;&lt;/code&gt; is the serious word.&lt;/p&gt;
&lt;p&gt;An LLM on its own is still mostly trapped inside text. Yes, text matters. Text persuades, misleads, reassures, coordinates, manipulates, flatters, and occasionally clarifies. But absent tool access, the model remains largely confined to symbolic output that a human still has to read, interpret, and turn into action.&lt;/p&gt;
&lt;p&gt;The moment &lt;a href=&#34;https://modelcontextprotocol.io/docs/learn&#34;&gt;MCP&lt;/a&gt; enters the picture, that changes. Not magically. Not philosophically. Operationally.&lt;/p&gt;
&lt;p&gt;Now the model can observe through tools. It can pull in state it was not explicitly handed in the original prompt. It can request actions in systems it does not itself implement. It can inspect, decide, act, observe the effect, and act again. In other words, it stops being merely interpretive and starts becoming infrastructural.&lt;/p&gt;
&lt;p&gt;That is the real shift. Not more eloquence. Not slightly better automation. Consequence.&lt;/p&gt;
&lt;h3 id=&#34;text-was-never-the-final-problem&#34;&gt;Text Was Never the Final Problem&lt;/h3&gt;
&lt;p&gt;People still talk about model output as though the main issue were what the model says. That framing is becoming stale.&lt;/p&gt;
&lt;p&gt;If a model writes a strange paragraph, that may be annoying. If the same model can trigger a shell action, drive a browser session, modify a repository, hit an API with real credentials, or traverse a filesystem through an &lt;a href=&#34;https://modelcontextprotocol.io/specification/latest/basic&#34;&gt;MCP server&lt;/a&gt;, then the relevant question is no longer merely &amp;ldquo;what did it say?&amp;rdquo; The real question becomes: what did the environment allow those words to become?&lt;/p&gt;
&lt;p&gt;That sounds obvious once stated plainly, but a great deal of current AI rhetoric still behaves as though the old text-only framing were enough.&lt;/p&gt;
&lt;p&gt;It is not enough.&lt;/p&gt;
&lt;p&gt;A model that suggests deleting a file and a model that can actually cause that deletion are not the same kind of system. A model that proposes an escalation email and a model that can send it are not the same kind of system. A model that hallucinates a bad shell command and a model whose output gets routed into execution are not separated by a minor implementation detail. They are separated by consequence.&lt;/p&gt;
&lt;p&gt;That is why I do not like the soft phrase &amp;ldquo;tool augmentation&amp;rdquo; as the whole story. It sounds innocent, like giving a worker a slightly better screwdriver. In many cases what is really happening is that we are connecting a probabilistic decision process to a live environment and then acting surprised that the environment starts to matter more than the prose.&lt;/p&gt;
&lt;h3 id=&#34;mcp-connects-the-model-to-situated-power&#34;&gt;MCP Connects the Model to Situated Power&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&#34;https://modelcontextprotocol.io/specification/latest&#34;&gt;Model Context Protocol&lt;/a&gt; is often described in tidy, neutral terms: servers expose tools, resources, prompts, and related capabilities; hosts and clients connect them; the model gets context and action surfaces it would not otherwise have. All of that is true.&lt;/p&gt;
&lt;p&gt;It is also too clean.&lt;/p&gt;
&lt;p&gt;What MCP really does, in practice, is connect model judgment to situated power.&lt;/p&gt;
&lt;p&gt;That power is not abstract. It lives wherever the tool lives:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in a filesystem the tool can read or write&lt;/li&gt;
&lt;li&gt;in a browser session the tool can drive&lt;/li&gt;
&lt;li&gt;in a shell the tool can execute through&lt;/li&gt;
&lt;li&gt;in an API surface the tool can authenticate to&lt;/li&gt;
&lt;li&gt;in an organization whose workflows are increasingly willing to trust the result&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is why I think the comforting sentence &amp;ldquo;the model only has access to approved tools&amp;rdquo; often means much less than people want it to mean. If the approved tools are broad enough, then saying &amp;ldquo;only approved tools&amp;rdquo; is like saying a process is safe because it only has access to approved machinery, while the approved machinery includes the loading dock, the admin terminal, and the master keys.&lt;/p&gt;
&lt;p&gt;Formally reassuring. Operationally laughable.&lt;/p&gt;
&lt;p&gt;And that is before we get to the uglier part: once tools can observe and act in loops, the system is no longer a simple one-shot responder. It is in a perception-action cycle:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;inspect environment state&lt;/li&gt;
&lt;li&gt;compress that state into a model-readable form&lt;/li&gt;
&lt;li&gt;decide on an action&lt;/li&gt;
&lt;li&gt;execute via tool&lt;/li&gt;
&lt;li&gt;inspect consequences&lt;/li&gt;
&lt;li&gt;act again&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That loop is where &amp;ldquo;just a language model&amp;rdquo; stops being an honest description.&lt;/p&gt;
&lt;h3 id=&#34;typed-interfaces-do-not-guarantee-bounded-consequences&#34;&gt;Typed Interfaces Do Not Guarantee Bounded Consequences&lt;/h3&gt;
&lt;p&gt;This is where people start trying to calm themselves down with schemas.&lt;/p&gt;
&lt;p&gt;They say: yes, but the MCP tool has a defined interface. Yes, but the arguments are typed. Yes, but the model can only call the tool in approved ways.&lt;/p&gt;
&lt;p&gt;Fine. Sometimes that matters. But typed invocation is not the same thing as bounded consequence.&lt;/p&gt;
&lt;p&gt;That distinction is one of the big buried truths in this whole discussion.&lt;/p&gt;
&lt;p&gt;A narrow, typed tool that does one highly constrained thing under externally enforced limits can be meaningfully bounded. That is real. I would not deny it.&lt;/p&gt;
&lt;p&gt;But most interesting, high-leverage tool surfaces are not like that. They are rich enough to matter precisely because they leave room for discretion:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a shell surface that can trigger many valid but open-ended actions&lt;/li&gt;
&lt;li&gt;a browser surface that can navigate changing state, click, submit, search, loop, and adapt&lt;/li&gt;
&lt;li&gt;a repository or filesystem surface where many technically valid edits are still strategically wrong&lt;/li&gt;
&lt;li&gt;a broad API surface with enough credentials to make mistakes expensive&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In those cases, the tool schema may constrain the &lt;em&gt;shape&lt;/em&gt; of the invocation while doing very little to constrain the &lt;em&gt;meaningful space of effects&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This is the trick people keep playing on themselves. They mistake a typed interface for real containment.&lt;/p&gt;
&lt;p&gt;It is not the same thing.&lt;/p&gt;
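&lt;p&gt;A small sketch shows the gap. The tool definition below is fully typed in the spirit of an MCP tool schema (the exact field names are an assumption for illustration), and a schema check passes equally for a harmless call and a catastrophic one:&lt;/p&gt;

```python
# A hypothetical tool definition in the spirit of an MCP tool schema.
# The invocation is fully typed; the consequences are not bounded at all.
shell_tool = {
    "name": "run_shell",
    "description": "Run a shell command and return its output",
    "inputSchema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}

def validate_call(tool, args):
    """Schema check: every required key present, string-typed, no extras."""
    props = tool["inputSchema"]["properties"]
    required = tool["inputSchema"]["required"]
    ok = all(k in args and isinstance(args[k], str) for k in required)
    return ok and all(k in props for k in args)

# Both calls are equally valid under the schema; only one is harmless.
assert validate_call(shell_tool, {"command": "ls /tmp"})
assert validate_call(shell_tool, {"command": "rm -rf /"})
```

&lt;p&gt;Both invocations have the same shape. Only the environment, not the schema, distinguishes them.&lt;/p&gt;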
&lt;p&gt;The residual risk is not merely &amp;ldquo;the model might call the wrong method.&amp;rdquo; The nastier risk is that it makes a sequence of perfectly valid calls under a flawed interpretation of the task, and the environment obediently translates that flawed interpretation into real change.&lt;/p&gt;
&lt;p&gt;That is a much uglier failure mode than a malformed output string.&lt;/p&gt;
&lt;p&gt;And if that still sounds abstract, the failure sketches are not hard to imagine:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;give the model MCP access to your filesystem and one bad interpretation later it removes essential OS files; local machine unusable, oops&lt;/li&gt;
&lt;li&gt;give it MCP access to your PostgreSQL and a &amp;ldquo;cleanup&amp;rdquo; step becomes a table drop; data gone, oops&lt;/li&gt;
&lt;li&gt;give it MCP access to your Jira queue and it does not just read the backlog, it closes tickets and strips descriptions because some rule somewhere made &amp;ldquo;resolve noise&amp;rdquo; sound like a sensible goal; oops&lt;/li&gt;
&lt;li&gt;give it MCP access to your GitHub project and it does not merely inspect pull requests, it force-pushes the wrong branch state and empties the repository; oops&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I am intentionally presenting those as plausible scenarios, not as a sourced catalogue of named incidents. The point does not depend on theatrical storytelling. The point is simpler and uglier: an MCP-connected model can do whatever the token, permission set, and host environment allow it to do.&lt;/p&gt;
&lt;p&gt;That does not require dramatic machine agency. It does not even require a particularly clever model. A typo in a skill file, a bad rule, a sloppy prompt, a wrong assumption in a workflow, or a brittle bit of context can be enough. Once the path from output to action is short, stupidity scales just as nicely as intelligence does.&lt;/p&gt;
&lt;h3 id=&#34;the-boundary-did-not-disappear-it-moved&#34;&gt;The Boundary Did Not Disappear. It Moved&lt;/h3&gt;
&lt;p&gt;To be fair, MCP does not abolish boundaries by design. It relocates them.&lt;/p&gt;
&lt;p&gt;The old comforting fantasy was that safety lived mostly at the model boundary: constrain the model, filter the output, police the prompt, maybe wrap the text in a few guardrails, and hope that was enough.&lt;/p&gt;
&lt;p&gt;With MCP, the effective boundary moves outward:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;to the tool surface&lt;/li&gt;
&lt;li&gt;to the permission model&lt;/li&gt;
&lt;li&gt;to the host environment&lt;/li&gt;
&lt;li&gt;to the surrounding runtime constraints&lt;/li&gt;
&lt;li&gt;to whatever external systems can still refuse, log, sandbox, rate-limit, or block consequences&lt;/li&gt;
&lt;/ul&gt;
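&lt;p&gt;What an outward-moved boundary looks like in code is less mysterious than it sounds. A hedged sketch, with an illustrative allowlist and confirmation hook that no real host is obliged to implement this way:&lt;/p&gt;

```python
# Hypothetical sketch of an outward-moved boundary: the host wraps every
# tool call in checks the model cannot remove. Names are illustrative.
import logging

DESTRUCTIVE = {"delete_file", "drop_table", "force_push"}

def guarded_execute(tool_name, args, tools, confirm):
    """Refuse, log, and require confirmation outside the model."""
    logging.info("tool call requested: %s %r", tool_name, args)
    if tool_name not in tools:
        raise PermissionError("tool not on the allowlist: " + tool_name)
    if tool_name in DESTRUCTIVE and not confirm(tool_name, args):
        raise PermissionError("destructive call not confirmed")
    return tools[tool_name](**args)
```

&lt;p&gt;The checks live in the host, not in the prompt, which is exactly where the essay argues the boundary has moved.&lt;/p&gt;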
&lt;p&gt;That is a major architectural shift.&lt;/p&gt;
&lt;p&gt;And this is where I get more suspicious than a lot of current product writing does. People often talk as though external boundaries are automatically comforting. They are not automatically comforting. They are only as good as their actual ability to resist broad, adaptive, probabilistic use by a system that can observe, retry, reframe, and route around friction.&lt;/p&gt;
&lt;p&gt;If the only real safety story is &amp;ldquo;the environment will catch it,&amp;rdquo; then the environment had better be much more trustworthy than most real environments are.&lt;/p&gt;
&lt;p&gt;No serious engineer should be reassured by hand-wavy references to containment.&lt;/p&gt;
&lt;h3 id=&#34;containment-talk-is-often-too-cheerful&#34;&gt;Containment Talk Is Often Too Cheerful&lt;/h3&gt;
&lt;p&gt;This is the point where the tone of the discussion usually goes soft and reassuring, and I think that softness is misplaced.&lt;/p&gt;
&lt;p&gt;If you are dealing with a very narrow tool, tight external constraints, minimal side effects, isolated credentials, explicit confirmation boundaries, and no broad environmental leverage, then yes, boundedness may be meaningful. Good. Keep it.&lt;/p&gt;
&lt;p&gt;But in many practically interesting MCP setups, the residual constraints are too weak, too external, or too porous to count as meaningful containment in the comforting sense that people quietly want.&lt;/p&gt;
&lt;p&gt;That is the line I would draw.&lt;/p&gt;
&lt;p&gt;Not:
&amp;ldquo;all containment is impossible.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;I cannot prove that, and I will not fake certainty where I do not have it.&lt;/p&gt;
&lt;p&gt;But I will say this:&lt;/p&gt;
&lt;p&gt;once a model can observe, adapt, and act through broad tools in a rich environment, confidence in clean containment should fall sharply.&lt;/p&gt;
&lt;p&gt;That is not drama. That is a sober posture.&lt;/p&gt;
&lt;p&gt;An ugly little scene makes the point better than theory does. Imagine a company proudly announcing that its internal assistant is &amp;ldquo;safely integrated&amp;rdquo; with file operations, browser automation, deployment metadata, ticketing tools, and internal knowledge systems. For two weeks everyone calls this productivity. Then one odd interpretation slips through, a valid sequence of tool calls touches the wrong systems in the wrong order, and now there is an incident review full of phrases like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;the tool call was technically valid&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;the model appeared to follow the requested workflow&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;the side effect was not anticipated&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;the environment did not block the action as expected&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is not science fiction. That is the shape of a very ordinary modern failure.&lt;/p&gt;
&lt;h3 id=&#34;the-real-threshold-was-never-utility&#34;&gt;The Real Threshold Was Never Utility&lt;/h3&gt;
&lt;p&gt;This is why I keep returning to the same word.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Useful&amp;rdquo; was never the real threshold.
&amp;ldquo;Consequential&amp;rdquo; was.&lt;/p&gt;
&lt;p&gt;A model can be &amp;ldquo;useful&amp;rdquo; without mattering very much. A search helper is useful. A summarizer is useful. A draft generator is useful. Those systems may still be annoying, biased, sloppy, or overhyped, but their effects remain relatively buffered by human review and interpretation.&lt;/p&gt;
&lt;p&gt;A model becomes &amp;ldquo;consequential&amp;rdquo; when the path from output to effect shortens.&lt;/p&gt;
&lt;p&gt;That can happen because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;humans begin trusting the output by default&lt;/li&gt;
&lt;li&gt;tools begin translating output into action&lt;/li&gt;
&lt;li&gt;environments become legible enough for iterative manipulation&lt;/li&gt;
&lt;li&gt;organizational workflows stop treating the model as advisory and start treating it as procedural&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And once that happens, the language around &amp;ldquo;utility&amp;rdquo; becomes too polite. The system is no longer just helping. It is participating in consequence.&lt;/p&gt;
&lt;p&gt;That does not mean every MCP setup is reckless. It does mean the burden of proof should sit with the people claiming safety, not with the people expressing suspicion.&lt;/p&gt;
&lt;p&gt;If the tool semantics are broad, the environment is rich, and the model retains discretionary judgment over how to sequence valid actions, then the default posture should not be comfort. It should be scrutiny.&lt;/p&gt;
&lt;h3 id=&#34;what-this-changes&#34;&gt;What This Changes&lt;/h3&gt;
&lt;p&gt;Once you see MCP through the lens of consequence, several things become clearer.&lt;/p&gt;
&lt;p&gt;First, the real agent is not just the model. It is:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;model + protocol + tool surface + permissions + environment + feedback loop&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Second, &amp;ldquo;alignment&amp;rdquo; at the text level is no longer enough as a meaningful description. A model can appear compliant in language while still steering a valid sequence of actions toward the wrong practical outcome.&lt;/p&gt;
&lt;p&gt;Third, governance has to shift outward. It is no longer enough to ask whether the model says the right things. You have to ask what the surrounding system permits those sayings to become.&lt;/p&gt;
&lt;p&gt;Fourth, a lot of the current product language is too soothing. It keeps using words like assistant, tool use, augmentation, and workflow help, because those words leave consequence safely blurry. The blur is convenient. It is also the problem.&lt;/p&gt;
&lt;h3 id=&#34;this-is-not-a-rant-against-consequence&#34;&gt;This Is Not a Rant Against Consequence&lt;/h3&gt;
&lt;p&gt;At this point, the essay could be misread as a long argument for fear, paralysis, or retreat back into harmless toys. That is not the point.&lt;/p&gt;
&lt;p&gt;This is not an anti-MCP argument. It is an anti-naivety argument.&lt;/p&gt;
&lt;p&gt;The point is not to reject consequence. The point is to become worthy of it.&lt;/p&gt;
&lt;p&gt;If &lt;a href=&#34;https://modelcontextprotocol.io/specification/latest&#34;&gt;MCP&lt;/a&gt; really is one of the thresholds where model output starts turning into environmental effect, then the answer is not denial and it is not marketing. The answer is stewardship. Better boundaries. Narrower permissions. Clearer language. Smaller blast radii. Real auditability. Reversibility where possible. Suspicion toward vague assurances. Less safety theater. More adult engineering.&lt;/p&gt;
&lt;p&gt;That is the constructive spin, if one insists on calling it a spin. The critique exists because these systems matter. If they were merely toys, none of this would deserve such forceful language. The harsher the consequence, the less patience one should have for sloppy metaphors, soft promises, and fake containment stories.&lt;/p&gt;
&lt;p&gt;So no, the argument is not that models must never act. The argument is that systems with consequence should be designed as if consequence were real, because it is.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://modelcontextprotocol.io/specification/latest&#34;&gt;MCP&lt;/a&gt; does not merely make models more &amp;ldquo;useful&amp;rdquo;. It can make them &amp;ldquo;consequential&amp;rdquo; by connecting model output to trusted environments where words are translated into effects. That is the real threshold worth paying attention to.&lt;/p&gt;
&lt;p&gt;The hard part is not that tools exist. The hard part is that broad tools, rich environments, and probabilistic judgment do not compose into comforting guarantees just because the invocation format looks tidy. The boundary did not disappear. It moved outward, and in many interesting cases it moved to places that do not deserve much casual trust.&lt;/p&gt;
&lt;p&gt;The constructive answer is not to pretend consequence away. It is to build systems, permissions, workflows, and institutions that are actually worthy of it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If the real danger is no longer what the model says but what trusted systems allow its sayings to become, where should we admit the true boundary of responsibility now lies?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/from-prompt-to-protocol-stack/&#34;&gt;From Prompt to Protocol Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/the-real-historical-analogy/&#34;&gt;The Real Historical Analogy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/freedom-creates-protocol/&#34;&gt;Freedom Creates Protocol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>From Prompt to Protocol Stack</title>
      <link>https://turbovision.in6-addr.net/musings/ai-language-protocols/from-prompt-to-protocol-stack/</link>
      <pubDate>Sat, 18 Apr 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Sat, 18 Apr 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/musings/ai-language-protocols/from-prompt-to-protocol-stack/</guid>
      <description>&lt;p&gt;The future of AI control was never going to fit inside one clever paragraph typed into a chat box. What looks like prompting today is already breaking apart into layers, and each layer is quietly starting to serve a different audience: humans, agents, tools, infrastructure, and, eventually, other layers pretending not to be there.&lt;/p&gt;
&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;Prompting is evolving into a full protocol stack. Natural language remains at the human boundary, while deeper layers increasingly carry schemas, tool definitions, memory layouts, compressed state, and possibly machine-native agent communication. The chat box survives, but it is no longer the whole machine.&lt;/p&gt;
&lt;h2 id=&#34;the-question&#34;&gt;The Question&lt;/h2&gt;
&lt;p&gt;Have you ever wondered whether we are still dealing with prompting at all once prompts become longer, more structured, and more system-like? Or are we actually watching a new software stack form around language models?&lt;/p&gt;
&lt;h2 id=&#34;the-long-answer&#34;&gt;The Long Answer&lt;/h2&gt;
&lt;p&gt;I think we are very obviously watching a new stack form, even if the industry still likes talking as though everything important happens inside the visible prompt.&lt;/p&gt;
&lt;h3 id=&#34;the-prompt-is-no-longer-the-whole-unit&#34;&gt;The Prompt Is No Longer the Whole Unit&lt;/h3&gt;
&lt;p&gt;The mistake is to imagine the prompt as the unit. That made some sense when language models were mostly single-turn text machines. It makes much less sense once we ask them to persist, use tools, collaborate, manage memory, or act inside workflows. At that point the useful object is no longer the prompt alone. It is the entire communication architecture around it.&lt;/p&gt;
&lt;p&gt;That architecture already has layers, even if we do not always name them consistently.&lt;/p&gt;
&lt;p&gt;At the top there is the human intention layer:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;goals&lt;/li&gt;
&lt;li&gt;tone&lt;/li&gt;
&lt;li&gt;constraints&lt;/li&gt;
&lt;li&gt;questions&lt;/li&gt;
&lt;li&gt;examples&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is where natural language shines. It is flexible, compresses messy intention well enough, and lets humans stay close to the task without dropping into low-level syntax immediately.&lt;/p&gt;
&lt;p&gt;Below that sits the behavioral framing layer:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;system instructions&lt;/li&gt;
&lt;li&gt;role definitions&lt;/li&gt;
&lt;li&gt;safety boundaries&lt;/li&gt;
&lt;li&gt;refusal rules&lt;/li&gt;
&lt;li&gt;escalation behavior&lt;/li&gt;
&lt;li&gt;evaluation priorities&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This layer says less about the task itself and more about the posture the model should adopt while attempting the task.&lt;/p&gt;
&lt;p&gt;Below that sits the operational context layer:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;retrieved documents&lt;/li&gt;
&lt;li&gt;repository state&lt;/li&gt;
&lt;li&gt;conversation history&lt;/li&gt;
&lt;li&gt;persistent memory&lt;/li&gt;
&lt;li&gt;environment facts&lt;/li&gt;
&lt;li&gt;current artifacts under edit&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This layer answers the question: what world is the agent acting inside?&lt;/p&gt;
&lt;p&gt;Below that sits the tool layer:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;tool names&lt;/li&gt;
&lt;li&gt;schemas&lt;/li&gt;
&lt;li&gt;permissions&lt;/li&gt;
&lt;li&gt;invocation rules&lt;/li&gt;
&lt;li&gt;observation formats&lt;/li&gt;
&lt;li&gt;retry and failure policies&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once a model can act, tools stop being optional flavor and become part of the language of control.&lt;/p&gt;
&lt;p&gt;Below that sits the machine coordination layer, which is still young but increasingly visible:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;compressed summaries&lt;/li&gt;
&lt;li&gt;state snapshots&lt;/li&gt;
&lt;li&gt;cache reuse&lt;/li&gt;
&lt;li&gt;structured intermediate outputs&lt;/li&gt;
&lt;li&gt;inter-agent messages&lt;/li&gt;
&lt;li&gt;latent or activation-based exchange&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the layer where ordinary prompting begins to blur into protocol engineering.&lt;/p&gt;
&lt;p&gt;And beneath all of that, of course, sits the model-internal representational machinery itself.&lt;/p&gt;
&lt;p&gt;If you lay the system out this way, a lot of contemporary confusion evaporates. People argue about prompting as though it were one thing. It is not. They are usually talking past each other about different layers and then acting surprised that the debate goes nowhere.&lt;/p&gt;
&lt;p&gt;One person means phrasing tricks in the user message.
Another means system prompt design.
Another means retrieval quality.
Another means JSON schemas.
Another means agent orchestration.
Another means &lt;a href=&#34;https://arxiv.org/abs/2410.12877&#34;&gt;activation steering&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;All of those are &amp;ldquo;prompting&amp;rdquo; only in the broadest and least useful sense.&lt;/p&gt;
&lt;h3 id=&#34;the-layers-are-already-visible&#34;&gt;The Layers Are Already Visible&lt;/h3&gt;
&lt;p&gt;That is why I prefer the phrase protocol stack. It captures the architecture better and also suggests the future more honestly. It sounds less magical, which is exactly why I trust it more.&lt;/p&gt;
&lt;p&gt;A mature AI system will likely look something like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;human gives high-level intent in natural language&lt;/li&gt;
&lt;li&gt;system translates that intent into a stabilized task frame&lt;/li&gt;
&lt;li&gt;task frame binds relevant memory, documents, and tool affordances&lt;/li&gt;
&lt;li&gt;one or more agents execute subtasks under explicit protocols&lt;/li&gt;
&lt;li&gt;agents exchange summaries or compressed state internally&lt;/li&gt;
&lt;li&gt;final result is reprojected into human-legible language for review or approval&lt;/li&gt;
&lt;/ol&gt;
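&lt;p&gt;Steps 2 and 3 of that pipeline can be sketched as a data structure that keeps the layers separate. The field names below are assumptions for illustration, not a real protocol:&lt;/p&gt;

```python
# Illustrative sketch of a stabilized task frame; fields are assumptions.
from dataclasses import dataclass, field

@dataclass
class TaskFrame:
    # human intention layer: the goal stated in natural language
    intent: str
    # behavioral framing layer: posture, not task content
    system_rules: list = field(default_factory=list)
    # operational context layer: the world the agent acts inside
    documents: dict = field(default_factory=dict)
    # tool layer: names mapped to schemas and permissions
    tools: dict = field(default_factory=dict)

def stabilize(intent, retrieve, tool_registry):
    """Translate raw intent into a task frame (steps 2-3 above)."""
    return TaskFrame(
        intent=intent,
        system_rules=["advisory only", "confirm destructive actions"],
        documents=retrieve(intent),
        tools=tool_registry,
    )
```

&lt;p&gt;Each field can now be versioned, tested, and swapped independently of the others, which is the whole architectural argument in miniature.&lt;/p&gt;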
&lt;p&gt;Notice what changed. Natural language remains important, but it is no longer the whole medium. It becomes the topmost interface over deeper coordination channels.&lt;/p&gt;
&lt;p&gt;That is exactly how most successful technical systems evolve.&lt;/p&gt;
&lt;p&gt;A web browser gives you a page, not packets.
A database query gives you SQL, not disk head timing.
An operating system gives you processes, not transistor switching.&lt;/p&gt;
&lt;p&gt;The user gets a legible abstraction. Underneath, layers proliferate because raw freedom does not scale by itself.&lt;/p&gt;
&lt;p&gt;The AI case is especially interesting because language appears at both ends of the stack. We enter through language, we leave through language, and the machinery in the middle gets less and less obligated to stay conversational.&lt;/p&gt;
&lt;p&gt;At the entrance, language captures goals.
At the exit, language communicates results.
In the middle, however, language may become increasingly optional.&lt;/p&gt;
&lt;p&gt;That is where agent-to-agent communication becomes important. If two agents are solving a problem together, full natural-language exchange is often expensive. It is verbose, ambiguous, and tied to human readability. For some tasks that is still worth it, especially when auditability matters. For others it may prove wasteful compared to compressed intermediate forms.&lt;/p&gt;
&lt;p&gt;There is something faintly ridiculous in imagining two high-speed reasoning systems politely sending each other mini-essays in immaculate English simply because that is the only style of interaction humans currently find respectable. A lot of the future may consist of us slowly admitting that the internals do not actually want to be this literary.&lt;/p&gt;
&lt;p&gt;We are already seeing small previews of this future:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;structured chain outputs instead of free prose&lt;/li&gt;
&lt;li&gt;schema-constrained responses&lt;/li&gt;
&lt;li&gt;tool-call argument objects&lt;/li&gt;
&lt;li&gt;reusable memory summaries&lt;/li&gt;
&lt;li&gt;vector-based &lt;a href=&#34;https://aclanthology.org/2021.emnlp-main.243/&#34;&gt;soft prompts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://arxiv.org/abs/2410.12877&#34;&gt;activation steering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;experimental latent communication between agents&lt;/li&gt;
&lt;/ul&gt;
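&lt;p&gt;The first two items on that list are easy to make concrete. A schema-constrained reply replaces free prose with something the host can verify mechanically; the schema below is an illustrative assumption, not from any particular framework:&lt;/p&gt;

```python
# Illustrative: reject any model reply that is not the agreed structure.
import json

RESULT_SCHEMA = {"required": ["status", "files_changed", "summary"]}

def parse_structured_reply(raw):
    """Parse a JSON reply and enforce the agreed result shape."""
    reply = json.loads(raw)
    missing = [k for k in RESULT_SCHEMA["required"] if k not in reply]
    if missing:
        raise ValueError("reply missing keys: " + ", ".join(missing))
    return reply
```

&lt;p&gt;The prose is gone; what remains is a contract that belongs to the coordination layer, not to the chat transcript.&lt;/p&gt;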
&lt;p&gt;These are not isolated hacks. They are early pieces of a layered control model, even if the marketing language around them still prefers the friendlier fiction that we are merely &amp;ldquo;improving prompting.&amp;rdquo;&lt;/p&gt;
&lt;h3 id=&#34;natural-language-becomes-the-top-layer&#34;&gt;Natural Language Becomes the Top Layer&lt;/h3&gt;
&lt;p&gt;A useful way to think about it is with a networking analogy, and yes, I know that analogy is a little nerdy. It is still better than pretending the chat transcript is the architecture.&lt;/p&gt;
&lt;p&gt;Human prompting today often behaves like application-layer traffic mixed together with transport, session, and routing concerns in the same blob of text. That is why prompts become huge and fragile. They are doing too many jobs at once. They describe the task, define policy, encode examples, specify output shape, explain tool behavior, and sometimes even embed recovery instructions.&lt;/p&gt;
&lt;p&gt;Anyone who has seen a &amp;ldquo;simple prompt&amp;rdquo; mutate into a 900-line system prompt with XML-ish delimiters, output schemas, tool instructions, refusal clauses, and five examples knows exactly how fast this happens. The thing still lives in a chat window, but it stopped being &amp;ldquo;just chatting&amp;rdquo; a long time ago.&lt;/p&gt;
&lt;p&gt;In a more mature stack, those concerns separate.&lt;/p&gt;
&lt;p&gt;The result should not be imagined as less human. It should be imagined as more disciplined. Humans still speak their goals in language, but the system no longer forces every single control concern to be expressed as prose in one monolithic block.&lt;/p&gt;
&lt;p&gt;This matters for engineering quality.&lt;/p&gt;
&lt;p&gt;Once layers separate, you can version them independently. You can test them independently. You can reason about failure more clearly. You can update tool schemas without rewriting the entire prompt universe. You can swap memory strategies or retrieval methods while keeping the top-level interaction stable.&lt;/p&gt;
&lt;p&gt;That is a major architectural gain.&lt;/p&gt;
&lt;p&gt;There is also a philosophical gain. It frees us from the false binary between &amp;ldquo;talking naturally&amp;rdquo; and &amp;ldquo;going back to code.&amp;rdquo; We are not simply bouncing between total informality and total formalism. We are building multi-layer systems where different degrees of formality belong in different places.&lt;/p&gt;
&lt;p&gt;The human should not be forced to express every intention in rigid syntax.
The machine should not be forced to carry every internal coordination step in human prose.&lt;/p&gt;
&lt;p&gt;The protocol stack allows both truths at once.&lt;/p&gt;
&lt;h3 id=&#34;layering-solves-problems-and-creates-new-ones&#34;&gt;Layering Solves Problems and Creates New Ones&lt;/h3&gt;
&lt;p&gt;Of course, the problems arrive immediately.&lt;/p&gt;
&lt;p&gt;Layering creates opacity. Once more control happens below the visible prompt, users may lose sight of what is actually governing behavior. Hidden system prompts, invisible retrieval, latent memory shaping, and inter-agent subprotocols can make the system both more powerful and less inspectable. Anyone serious about AI governance should worry about that, and not in a performative way.&lt;/p&gt;
&lt;p&gt;But that worry is not an argument against the stack. It is evidence that the stack is real.&lt;/p&gt;
&lt;p&gt;No one worries about invisible layers in a system that does not have them.&lt;/p&gt;
&lt;p&gt;In that sense, we are already past the era of naive prompting. The visible chat box survives, but it is increasingly the polite fiction that hides a much larger control apparatus.&lt;/p&gt;
&lt;p&gt;And that may be healthy. Computing has always needed boundary surfaces that are easier than the machinery beneath them. The mistake is only to confuse the surface with the whole machine, which is exactly what a lot of current discourse keeps doing.&lt;/p&gt;
&lt;p&gt;So are we still dealing with prompting?&lt;/p&gt;
&lt;p&gt;Yes, if by prompting we mean the top-level act of expressing intent to a language-shaped system.&lt;/p&gt;
&lt;p&gt;No, if by prompting we mean the full control problem.&lt;/p&gt;
&lt;p&gt;That full problem now belongs to protocol design, context architecture, tool governance, memory management, and eventually machine-native coordination.&lt;/p&gt;
&lt;p&gt;The prompt is not disappearing. It is being demoted from sovereign command to one layer in a growing stack, which is probably healthier for everyone except people who enjoyed pretending the prompt was the whole art.&lt;/p&gt;
&lt;p&gt;And that, in my view, is the beginning of a more mature understanding of what these systems really are.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;What we casually call prompting is already splitting into layers: human intent, behavioral framing, operational context, tool control, memory management, and machine coordination. Natural language remains crucial, but it no longer has to carry every control concern by itself. As systems mature, the visible prompt becomes less like a sovereign instruction and more like the top layer of a broader protocol architecture.&lt;/p&gt;
&lt;p&gt;That shift is not a loss of humanity. It is an increase in architectural honesty. The system is finally being described in the shape it actually has, rather than the shape the chat UI flatters us into seeing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Once we accept that the prompt is only the top layer of the stack, what should remain visible to the human user and what should never be hidden underneath?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/freedom-creates-protocol/&#34;&gt;Freedom Creates Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/is-there-a-hidden-language-beneath-english/&#34;&gt;Is There a Hidden Language Beneath English?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/the-real-historical-analogy/&#34;&gt;The Real Historical Analogy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Is There a Hidden Language Beneath English?</title>
      <link>https://turbovision.in6-addr.net/musings/ai-language-protocols/is-there-a-hidden-language-beneath-english/</link>
      <pubDate>Thu, 16 Apr 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Thu, 16 Apr 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/musings/ai-language-protocols/is-there-a-hidden-language-beneath-english/</guid>
      <description>&lt;p&gt;Most prompt engineering is written in English, and the industry often treats that fact as if it were almost self-evident. But once you ask whether English is truly the best control medium or merely the most overrepresented one, the ground starts moving under the whole discussion.&lt;/p&gt;
&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;There is no strong evidence yet for one universal hidden &amp;ldquo;control language&amp;rdquo; beneath English. But there is real evidence that useful control can happen through non-natural-language mechanisms such as &lt;a href=&#34;https://aclanthology.org/2021.emnlp-main.243/&#34;&gt;soft prompts&lt;/a&gt;, &lt;a href=&#34;https://arxiv.org/abs/2410.12877&#34;&gt;steering vectors&lt;/a&gt;, and latent or activation-based agent communication. So the idea is not crazy. It is just easier to say crazy things around it than careful ones.&lt;/p&gt;
&lt;h2 id=&#34;the-question&#34;&gt;The Question&lt;/h2&gt;
&lt;p&gt;You may ask: if models live in a high-dimensional latent space, why are we still steering them with ordinary English sentences? Could there be a shorter, more efficient machine-native control language hidden under natural language, especially for agent-to-agent communication?&lt;/p&gt;
&lt;h2 id=&#34;the-long-answer&#34;&gt;The Long Answer&lt;/h2&gt;
&lt;p&gt;This is one of the most interesting questions in the whole field, partly because it contains a real idea and partly because it attracts nonsense like a magnet.&lt;/p&gt;
&lt;h3 id=&#34;why-the-idea-is-plausible&#34;&gt;Why the Idea Is Plausible&lt;/h3&gt;
&lt;p&gt;So let us separate what is plausible, what is established, and what is still an extrapolation, because this is exactly the kind of topic where people start sounding profound five minutes before they start lying to themselves.&lt;/p&gt;
&lt;p&gt;The plausible part comes first: natural language is almost certainly a lossy bottleneck.&lt;/p&gt;
&lt;p&gt;A model does not &amp;ldquo;think&amp;rdquo; in final output tokens alone. Internally it moves through activations, intermediate representations, attention patterns, and hidden states that contain far more structure than the sentence it eventually emits. The emitted sentence is not the whole state. It is the public projection of that state into a human-readable channel.&lt;/p&gt;
&lt;p&gt;Once you see that, your idea becomes immediately legible in technical terms. You are asking whether the human-readable wrapper is an inefficient control surface over a richer internal space, and whether two models might communicate more efficiently by exchanging compressed internal representations instead of serializing everything into English.&lt;/p&gt;
&lt;p&gt;That is not fantasy. It is already brushing against several real research directions.&lt;/p&gt;
&lt;p&gt;There is older work on emergent communication in multi-agent systems where agents invent message protocols that are useful to them but opaque to us. The 2017 paper &lt;a href=&#34;https://aclanthology.org/P17-1022/&#34;&gt;&lt;em&gt;Translating Neuralese&lt;/em&gt;&lt;/a&gt; is one of the early landmarks here. It did not show that agents had discovered some mystical perfect language hidden behind reality like a sacred cipher. It showed something more useful: agents can develop internal communication forms that are meaningful in use even when they are not naturally interpretable by humans.&lt;/p&gt;
&lt;p&gt;More recent work pushes this further toward language models specifically. Papers such as &lt;a href=&#34;https://proceedings.mlr.press/v267/ramesh25a.html&#34;&gt;&lt;em&gt;Communicating Activations Between Language Model Agents&lt;/em&gt;&lt;/a&gt; and &lt;a href=&#34;https://arxiv.org/abs/2511.09149&#34;&gt;&lt;em&gt;Interlat: Enabling Agents to Communicate Entirely in Latent Space&lt;/em&gt;&lt;/a&gt; explore the idea that agents can exchange internal activations or hidden-state-like representations directly, rather than always crushing them down into text first. The reported benefit in that line of work is exactly what you would expect: less information loss and often lower compute cost than long natural-language exchanges.&lt;/p&gt;
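&lt;p&gt;To make the contrast concrete, here is a toy numpy sketch, emphatically not the method of either cited paper: agent A can either serialize its state into a short sentence for agent B, or hand over a compressed vector directly. The dimensions and the random projection are stand-ins for learned components.&lt;/p&gt;

```python
# Toy contrast (numpy only, not the method of either cited paper): an agent
# can either crush its internal state into prose, or transmit a compressed
# projection of the state itself. The latent channel skips the lossy
# round-trip through human-readable text.
import numpy as np

rng = np.random.default_rng(0)
hidden_state = rng.normal(size=768)       # stand-in for an internal activation

# Text channel: summarize the state in a short human-readable message.
text_message = "state norm is roughly {:.1f}".format(np.linalg.norm(hidden_state))

# Latent channel: send a low-dimensional projection of the state itself.
projection = rng.normal(size=(768, 32))   # a learned encoder in real systems
latent_message = hidden_state @ projection  # 32 numbers instead of a sentence

print(len(text_message), latent_message.shape)
```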
&lt;p&gt;So the broad direction of the intuition is already technically alive. That matters.&lt;/p&gt;
&lt;h3 id=&#34;where-the-evidence-actually-exists&#34;&gt;Where the Evidence Actually Exists&lt;/h3&gt;
&lt;p&gt;Now for the annoying but necessary part.&lt;/p&gt;
&lt;p&gt;What we do &lt;strong&gt;not&lt;/strong&gt; have, at least not in any established sense, is proof of one clean latent language sitting beneath English that we can simply reveal by subtracting the &amp;ldquo;English component.&amp;rdquo; I do not know of research that validates that exact decomposition in the neat form described. And this is exactly where people are tempted to jump from &amp;ldquo;the latent space is real&amp;rdquo; to &amp;ldquo;there must be a hidden universal language in there somewhere.&amp;rdquo; Maybe. But maybe is doing a lot of work there.&lt;/p&gt;
&lt;p&gt;Why not? Because the internal geometry is probably not that simple.&lt;/p&gt;
&lt;p&gt;English inside a model is not just &amp;ldquo;semantic content plus a detachable language shell.&amp;rdquo; It is entangled with tokenization, training distribution, stylistic priors, instruction-following habits, benchmark pressure, and all the historical accidents of the corpus. Meaning, format, tone, and control are mixed together.&lt;/p&gt;
&lt;p&gt;So I would challenge one very seductive picture: there is probably no single secret Esperanto of the latent space waiting patiently behind English, ready to reward whoever is clever enough to discover it.&lt;/p&gt;
&lt;p&gt;What is more likely is messier and, in my opinion, more interesting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;many partially reusable internal control directions&lt;/li&gt;
&lt;li&gt;many task-specific compressed protocols&lt;/li&gt;
&lt;li&gt;many model-specific or architecture-specific latent conventions&lt;/li&gt;
&lt;li&gt;some transferable abstractions, but not one canonical hidden language&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is where &lt;a href=&#34;https://aclanthology.org/2021.emnlp-main.243/&#34;&gt;soft prompts&lt;/a&gt;, &lt;a href=&#34;https://aclanthology.org/2021.acl-long.353/&#34;&gt;prefix tuning&lt;/a&gt;, and &lt;a href=&#34;https://arxiv.org/abs/2410.12877&#34;&gt;steering vectors&lt;/a&gt; become useful to think with.&lt;/p&gt;
&lt;h3 id=&#34;why-a-single-hidden-language-is-unlikely&#34;&gt;Why a Single Hidden Language Is Unlikely&lt;/h3&gt;
&lt;p&gt;Soft prompts are not ordinary words. They are learned continuous vectors injected into the model&amp;rsquo;s input space. Prefix tuning generalizes that idea deeper into the network. Steering vectors act differently but share the same spirit: instead of asking with words alone, you manipulate the model by shifting internal activations in directions associated with some behavior or concept.&lt;/p&gt;
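&lt;p&gt;A toy numpy sketch of the steering idea, with every number invented: real steering vectors in the cited work are extracted from model activations, often as a difference of mean activations between contrastive inputs, but the arithmetic of applying one is this simple.&lt;/p&gt;

```python
# Toy sketch of activation steering (numpy only). A real steering vector is
# derived from a model's activations on contrastive inputs; this example only
# shows the arithmetic of the idea, with random stand-in values.
import numpy as np

rng = np.random.default_rng(1)
activation = rng.normal(size=16)          # hidden state at some chosen layer

# Contrastive estimate: mean activation on "polite" vs "neutral" inputs.
polite_mean = rng.normal(size=16)
neutral_mean = rng.normal(size=16)
steering_vector = polite_mean - neutral_mean

alpha = 2.0                               # steering strength
steered = activation + alpha * steering_vector

# The control signal is a vector, not a sentence: nothing here is readable,
# yet it shifts the state in a chosen behavioral direction.
shift = np.linalg.norm(steered - activation)
print(round(float(shift), 3))
```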
&lt;p&gt;That is already a kind of non-natural-language control, and it should make people at least a little suspicious of the lazy assumption that human language is the final or natural control layer forever.&lt;/p&gt;
&lt;p&gt;Notice what that implies. We already have control methods that are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;effective&lt;/li&gt;
&lt;li&gt;compact&lt;/li&gt;
&lt;li&gt;not human-readable&lt;/li&gt;
&lt;li&gt;native to representation space rather than sentence space&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;English is therefore not the only control medium. It is simply the most interoperable one for humans.&lt;/p&gt;
&lt;p&gt;And that point matters, because it reveals the real trade-off.&lt;/p&gt;
&lt;p&gt;Human language is inefficient, but legible.
Latent control is efficient, but opaque.&lt;/p&gt;
&lt;p&gt;That single sentence is the heart of the matter, and also the trade-off a lot of AI discussion would rather not stare at for too long.&lt;/p&gt;
&lt;p&gt;If two agents share architecture, alignment, and task context, there is every reason to suspect they could communicate more efficiently than by exchanging verbose English paragraphs. They could use compressed summaries, vector codes, reused cache structures, activations, or learned latent shorthands. Once the agents no longer need to satisfy human readability at every intermediate step, natural language begins to look less like the native medium and more like a compatibility layer.&lt;/p&gt;
&lt;p&gt;That does not mean English is useless or even secondary. It means English may belong mostly at the boundary:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;human to agent&lt;/li&gt;
&lt;li&gt;agent to human&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;while agent to agent may migrate toward denser internal forms.&lt;/p&gt;
&lt;h3 id=&#34;the-agent-to-agent-case-is-the-real-frontier&#34;&gt;The Agent-to-Agent Case Is the Real Frontier&lt;/h3&gt;
&lt;p&gt;This layered picture fits both engineering and history. Systems tend to expose legible interfaces at the top and efficient, ugly protocols underneath. TCP packets are not prose. Database wire formats are not essays. CPU micro-ops are not source code. So why should advanced agent swarms eternally chatter to each other in polite human language unless a human auditor needs to read every step?&lt;/p&gt;
&lt;p&gt;There is also a small absurdity here that is hard not to enjoy. We may be heading toward systems where two expensive reasoning agents exchange page after page of immaculate English purely so that humans can feel the process remains respectable, while both machines would probably prefer to swap a denser internal shorthand and get on with it.&lt;/p&gt;
&lt;p&gt;The question also hides a second issue: why English, specifically?&lt;/p&gt;
&lt;p&gt;The honest answer is likely mundane rather than metaphysical, which is unfortunate for anyone hoping for something more glamorous.&lt;/p&gt;
&lt;p&gt;English is privileged today because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;much of the training data is English-heavy&lt;/li&gt;
&lt;li&gt;much of the instruction-tuning corpus is English-heavy&lt;/li&gt;
&lt;li&gt;many benchmarks are English-centric&lt;/li&gt;
&lt;li&gt;most prompt-engineering lore is shared in English&lt;/li&gt;
&lt;li&gt;tool docs, code, and interface conventions are often English-first&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So the dominance of English may say less about some deep optimality of English and more about the industrial history of model training. Sometimes the explanation is not &amp;ldquo;English maps best to reason.&amp;rdquo; Sometimes the explanation is simply &amp;ldquo;the pipeline grew up there.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;That said, replacing English with another human language is not yet the same as discovering a latent control protocol. Those are different questions.&lt;/p&gt;
&lt;p&gt;One asks: which human language is better for steering?
The other asks: must steering remain in human language at all?&lt;/p&gt;
&lt;p&gt;The second question is the deeper one.&lt;/p&gt;
&lt;h3 id=&#34;human-legibility-versus-machine-efficiency&#34;&gt;Human Legibility Versus Machine Efficiency&lt;/h3&gt;
&lt;p&gt;And here I think the strongest move is not the image of &amp;ldquo;subtract English and add it back later&amp;rdquo; as a literal algorithm, but as a conceptual provocation. It suggests that language may be acting as both carrier and drag. Carrier, because it gives us a shared interface. Drag, because it forces rich internal state through a narrow symbolic bottleneck.&lt;/p&gt;
&lt;p&gt;That is exactly why agent-to-agent communication is the most credible frontier for this idea.&lt;/p&gt;
&lt;p&gt;A human still needs explanation, auditability, and trust. Two agents collaborating under a shared protocol may care far less about elegance and far more about compression, precision, and bandwidth. They may converge on communication that looks to us like gibberish, or even bypass discrete language entirely.&lt;/p&gt;
&lt;p&gt;If that happens, the implications are substantial.&lt;/p&gt;
&lt;p&gt;First, debugging gets harder. You can inspect English. You can argue about English. You can regulate English. Hidden-state exchange is much less socially governable. It is also much easier to wave away with phrases like &amp;ldquo;trust the model&amp;rdquo; when nobody can really see what is happening.&lt;/p&gt;
&lt;p&gt;Second, interoperability becomes a real problem. A latent protocol learned by one model family may fail catastrophically with another. Natural language is slow, but it is remarkably portable.&lt;/p&gt;
&lt;p&gt;Third, alignment may get stranger. A human can often spot trouble in verbose reasoning traces, at least sometimes. A compressed latent exchange could be more capable and less inspectable at the same time.&lt;/p&gt;
&lt;p&gt;So I would state the thesis like this:&lt;/p&gt;
&lt;p&gt;There may not be one hidden language beneath English, but there are probably many machine-native control regimes that natural language currently obscures.&lt;/p&gt;
&lt;p&gt;That is the version I trust.&lt;/p&gt;
&lt;p&gt;It leaves room for real progress without pretending the geometry is cleaner than it is. It respects the evidence from soft prompts, steering, and latent-agent communication without claiming that the grand unified control language has already been found. And it points toward the place where the idea matters most: not in helping humans write ever more magical prompts, but in letting agents exchange context faster than prose allows.&lt;/p&gt;
&lt;p&gt;That future, if it comes, will not feel like the discovery of a secret language carved into the bedrock of intelligence. It will feel more like the emergence of protocol families: efficient, narrow, powerful, local, and only partially intelligible from the outside.&lt;/p&gt;
&lt;p&gt;Which is, frankly, how real technical history usually looks. Messier than prophecy, less elegant than theory, and much more interesting.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;There is no solid reason yet to believe in one universal hidden control language beneath English. But there is good reason to suspect that natural language is only one control surface among several, and not necessarily the most efficient one for every setting. &lt;a href=&#34;https://aclanthology.org/2021.emnlp-main.243/&#34;&gt;Soft prompts&lt;/a&gt;, &lt;a href=&#34;https://arxiv.org/abs/2410.12877&#34;&gt;steering vectors&lt;/a&gt;, and latent or activation-based communication all point in the same direction: human language may remain the public interface while more compressed machine-native protocols emerge underneath.&lt;/p&gt;
&lt;p&gt;The most promising use case for that shift is not magical human prompting. It is agent-to-agent coordination, where efficiency may matter more than legibility. The seduction of the idea lies in human prompting. The real engineering value may lie somewhere else entirely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If the most capable future agent systems stop explaining themselves to each other in human language, how much opacity are we actually willing to accept in exchange for speed and capability?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/from-prompt-to-protocol-stack/&#34;&gt;From Prompt to Protocol Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/the-real-historical-analogy/&#34;&gt;The Real Historical Analogy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/freedom-creates-protocol/&#34;&gt;Freedom Creates Protocol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>The Myth of Prompting as Conversation</title>
      <link>https://turbovision.in6-addr.net/musings/ai-language-protocols/the-myth-of-prompting-as-conversation/</link>
      <pubDate>Mon, 13 Apr 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Mon, 13 Apr 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/musings/ai-language-protocols/the-myth-of-prompting-as-conversation/</guid>
      <description>&lt;p&gt;The phrase &amp;ldquo;just talk to the model&amp;rdquo; is one of the most successful half-truths in the current AI boom. It is good onboarding and bad description: useful for getting people in the door, and deeply misleading the moment anything expensive, fragile, or embarassingly public depends on the answer.&lt;/p&gt;
&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;Prompting is conversational only at the surface. Under real workloads it behaves much more like specification-writing for a probabilistic component inside a larger system, except the specification keeps pretending to be a chat.&lt;/p&gt;
&lt;h2 id=&#34;the-question&#34;&gt;The Question&lt;/h2&gt;
&lt;p&gt;Have you ever wondered why everyone says prompting is basically conversation, yet good prompting looks less like chatting and more like writing instructions for a very literal, very strange coworker with infinite patience and inconsistent memory?&lt;/p&gt;
&lt;h2 id=&#34;the-long-answer&#34;&gt;The Long Answer&lt;/h2&gt;
&lt;p&gt;Because &amp;ldquo;conversation&amp;rdquo; describes the feeling of the exchange, not the job the exchange is actually doing.&lt;/p&gt;
&lt;h3 id=&#34;the-surface-still-feels-like-conversation&#34;&gt;The Surface Still Feels Like Conversation&lt;/h3&gt;
&lt;p&gt;If I ask a friend, &amp;ldquo;Can you take a look at this and tell me what seems wrong?&amp;rdquo; the friend brings a whole life into the exchange. Shared background. Common sense. Tone-reading. Social repair mechanisms. Tacit norms. A strong instinct for what I probably meant even if I said it badly. Human conversation is robust because it rides on an absurd amount of shared context that usually never gets written down.&lt;/p&gt;
&lt;p&gt;A language model has none of that in the human sense. It has pattern competence, not lived context. It can imitate tone, infer intent surprisingly well, and reconstruct missing links much better than older software ever could, but it still needs something people keep trying to smuggle past it: framing discipline.&lt;/p&gt;
&lt;p&gt;This is why casual prompting and serious prompting diverge so sharply.&lt;/p&gt;
&lt;p&gt;Casual prompting thrives on vague intention:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Give me some ideas for this title.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Serious prompting, by contrast, starts growing scaffolding almost immediately:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;what the task is&lt;/li&gt;
&lt;li&gt;what the task is not&lt;/li&gt;
&lt;li&gt;what inputs are authoritative&lt;/li&gt;
&lt;li&gt;what constraints matter&lt;/li&gt;
&lt;li&gt;what output shape is required&lt;/li&gt;
&lt;li&gt;when uncertainty must be stated&lt;/li&gt;
&lt;li&gt;when tools may be used&lt;/li&gt;
&lt;li&gt;what to do when evidence conflicts&lt;/li&gt;
&lt;/ul&gt;
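&lt;p&gt;A minimal sketch of what that scaffolding looks like once it is written down explicitly. Every field name and value here is invented for illustration; the point is only that each concern gets its own slot instead of living implicitly in the phrasing.&lt;/p&gt;

```python
# Minimal sketch of the scaffolding above as an explicit interaction frame.
# All field names and values are illustrative, not a real schema.
frame = {
    "task": "suggest five alternative titles for the attached draft",
    "not_task": "do not rewrite the draft body",
    "authoritative_inputs": ["draft.md"],
    "constraints": ["max 60 characters per title", "no clickbait"],
    "output_shape": "numbered list, one title per line",
    "on_uncertainty": "say which constraint you could not satisfy",
    "tools_allowed": [],
    "on_conflict": "prefer the draft text over the chat history",
}

def render_prompt(frame):
    """Serialize the frame into the prose the model actually receives."""
    lines = ["{}: {}".format(key.replace("_", " "), value)
             for key, value in frame.items()]
    return "\n".join(lines)

print(render_prompt(frame))
```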
&lt;p&gt;Notice what happened there. The &amp;ldquo;conversation&amp;rdquo; did not disappear, but it got demoted. It became the friendly outer layer wrapped around a stricter interaction frame. That frame is the real unit of control.&lt;/p&gt;
&lt;h3 id=&#34;hidden-assumptions-become-explicit-scaffolding&#34;&gt;Hidden Assumptions Become Explicit Scaffolding&lt;/h3&gt;
&lt;p&gt;This is easiest to see in agentic systems. A normal chatbot can get away with charm, improvisation, and soft interpretation because the downside of a slightly odd answer is usually low. An agent that edits files, runs commands, manages tickets, or handles real work cannot survive on charm. It needs boundaries. It needs tool policies. It needs escalation rules. It needs failure handling. It needs a memory model. It needs a way to distinguish plan from action and action from reflection.&lt;/p&gt;
&lt;p&gt;In other words, it needs architecture.&lt;/p&gt;
&lt;p&gt;That is why the romantic phrase &amp;ldquo;prompting is conversation&amp;rdquo; becomes increasingly false as the stakes rise. Conversation does not vanish. It becomes the user-facing veneer over a stricter operational core.&lt;/p&gt;
&lt;p&gt;The better analogy is not a chat with a friend. It is a briefing.&lt;/p&gt;
&lt;p&gt;A good briefing can sound relaxed, but its job is exact:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;establish objective&lt;/li&gt;
&lt;li&gt;define environment&lt;/li&gt;
&lt;li&gt;state constraints&lt;/li&gt;
&lt;li&gt;clarify resources&lt;/li&gt;
&lt;li&gt;identify known unknowns&lt;/li&gt;
&lt;li&gt;specify expected deliverable&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is much closer to good prompting than ordinary small talk, even if the software keeps trying to flatter us with the aesthetics of conversation.&lt;/p&gt;
&lt;p&gt;You can feel this most clearly when a model fails. Humans in conversation usually repair failure socially. We say, &amp;ldquo;No, that is not what I meant.&amp;rdquo; Or: &amp;ldquo;I was talking about the earlier file, not the second one.&amp;rdquo; Or: &amp;ldquo;I was asking for strategy, not code.&amp;rdquo; We do not usually treat that as a protocol error. We treat it as normal conversational life.&lt;/p&gt;
&lt;p&gt;With a model, the same repair process often reveals something uglier: the original request was under-specified. The failure was not just a misunderstanding. It was an interface defect dressed up as a conversational wobble.&lt;/p&gt;
&lt;p&gt;That shift is intellectually valuable. It forces us to admit how much human communication usually gets away with by relying on context that never needed to be written down.&lt;/p&gt;
&lt;p&gt;Once we notice that, prompting becomes a mirror. It shows us that many tasks we thought were simple were only simple because other humans were doing heroic amounts of implicit reconstruction for us.&lt;/p&gt;
&lt;p&gt;Take a mundane instruction like:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Review this code.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;To a human reviewer in your team, that may already imply:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;prioritize correctness over style&lt;/li&gt;
&lt;li&gt;look for regressions&lt;/li&gt;
&lt;li&gt;mention missing tests&lt;/li&gt;
&lt;li&gt;keep summary brief&lt;/li&gt;
&lt;li&gt;cite specific files&lt;/li&gt;
&lt;li&gt;avoid re-explaining obvious code&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To a model, unless those expectations are already anchored in some persistent context layer, each one is only probabilistically present. So the prompt expands. Not because models are stupid, but because hidden expectations are expensive and ambiguity gets more expensive the moment automation touches it.&lt;/p&gt;
&lt;p&gt;This is why I resist the lazy claim that prompt engineering is &amp;ldquo;just learning how to ask nicely.&amp;rdquo; No. At its best it is the craft of dragging latent expectations into the light before they become failures.&lt;/p&gt;
&lt;h3 id=&#34;conversation-and-interface-pull-in-different-directions&#34;&gt;Conversation and Interface Pull in Different Directions&lt;/h3&gt;
&lt;p&gt;And once you put it that way, the social and technical layers snap together.&lt;/p&gt;
&lt;p&gt;Conversation is optimized for flexibility and repair.
Interfaces are optimized for repeatability and transfer.&lt;/p&gt;
&lt;p&gt;Prompting sits awkwardly between them.&lt;/p&gt;
&lt;p&gt;That awkwardness explains most of the current confusion in the field. Some people approach prompting like rhetoric: persuasion, tone, phrasing, psychological nudging, vibes. Others approach it like systems design: schemas, role separation, state management, tool boundaries, evaluation. Both camps touch something real, but the second camp is much closer to the long-term truth for serious systems.&lt;/p&gt;
&lt;p&gt;The conversational framing remains useful because it lowers fear. It invites non-programmers in. It gives people permission to start without mastering syntax. That is not trivial. It is a genuine democratization of access, and I would not sneer at that.&lt;/p&gt;
&lt;p&gt;But the price of that democratization is conceptual slippage. People start believing that because the interface feels human, the control problem must also be human. It is not.&lt;/p&gt;
&lt;p&gt;A human conversation can survive ambiguity because the humans co-own the recovery process. A machine interaction only survives ambiguity when the system around it has already anticipated the ambiguity and constrained the damage.&lt;/p&gt;
&lt;p&gt;That is why good prompt design increasingly looks like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;separate stable system instructions from task-local instructions&lt;/li&gt;
&lt;li&gt;define tool contracts precisely&lt;/li&gt;
&lt;li&gt;provide authoritative context sources&lt;/li&gt;
&lt;li&gt;demand visible uncertainty when evidence is weak&lt;/li&gt;
&lt;li&gt;specify output schema where downstream code depends on it&lt;/li&gt;
&lt;li&gt;keep room for natural-language flexibility only where flexibility is actually useful&lt;/li&gt;
&lt;/ol&gt;
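&lt;p&gt;Points 1 and 5 above can be sketched in a few lines. The instruction text, the filename, and the schema fields here are all invented for illustration; the shape is what matters: a stable system layer that can be versioned and tested on its own, plus a task-local instruction carrying an explicit output schema for the downstream code.&lt;/p&gt;

```python
# Sketch of separating stable system instructions from task-local ones,
# with an explicit output schema. Names and fields are illustrative only.
import json

SYSTEM_INSTRUCTIONS = (
    "You are a code review assistant. "
    "Prioritize correctness over style. State uncertainty explicitly."
)  # stable layer: versioned and tested on its own

def build_messages(task_instruction, schema):
    """Combine the stable layer with a task-local instruction and schema."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": task_instruction
            + "\nRespond as JSON matching: " + json.dumps(schema)},
    ]

schema = {"issues": ["string"], "severity": "low|medium|high"}
messages = build_messages("Review the diff in auth.py.", schema)
print(messages[0]["role"], len(messages))
```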
&lt;p&gt;This is not anti-conversational. It is simply honest about where conversation helps and where it starts lying to us.&lt;/p&gt;
&lt;p&gt;There is also a deeper cultural issue. Calling prompting &amp;ldquo;conversation&amp;rdquo; flatters us. It makes us feel that we are still in purely human territory: language, personality, persuasion, style. Calling it &amp;ldquo;interface design for stochastic systems&amp;rdquo; is much less glamorous. It sounds bureaucratic, technical, slightly cold, and therefore much closer to the parts people would rather not look at.&lt;/p&gt;
&lt;p&gt;But reality does not care which description feels nicer. If the model is part of a system, then the system properties win. Reliability, clarity, observability, reversibility, testability, and control start mattering more than the aesthetic pleasure of a natural exchange.&lt;/p&gt;
&lt;h3 id=&#34;the-human-metaphor-helps-then-misleads&#34;&gt;The Human Metaphor Helps, Then Misleads&lt;/h3&gt;
&lt;p&gt;This does not kill the human side. In fact, it makes it more interesting.&lt;/p&gt;
&lt;p&gt;The authorial voice still matters.
Examples still matter.
Rhetorical framing still matters.
The order of instructions still matters.&lt;/p&gt;
&lt;p&gt;But they matter inside a designed interface, not instead of one.&lt;/p&gt;
&lt;p&gt;So the phrase I prefer is this:&lt;/p&gt;
&lt;p&gt;Prompting is not conversation.&lt;br&gt;
Prompting borrows the surface grammar of conversation to program a probabilistic collaborator.&lt;/p&gt;
&lt;p&gt;That sounds harsher, but it explains the world better and wastes less time.&lt;/p&gt;
&lt;p&gt;It explains why short prompts can work brilliantly in low-stakes settings and fail spectacularly in long-horizon work. It explains why agent systems keep growing invisible scaffolding. It explains why reusable prompts slowly mutate into templates, then policies, then skills, then full orchestration layers.&lt;/p&gt;
&lt;p&gt;If you want an ugly little scene, here is one. A team starts with &amp;ldquo;just chat with the model.&amp;rdquo; Two weeks later they have a hidden system prompt, a saved output format, a retrieval layer, a style guide, three evaluation scripts, a fallback tool policy, and an internal wiki page titled something like &amp;ldquo;Recommended Prompting Patterns v3.&amp;rdquo; At that point we are no longer talking about conversation. We are talking about infrastructure pretending to be conversation.&lt;/p&gt;
&lt;p&gt;And it explains why newcomers and experts often seem to be talking about different technologies when they both say &amp;ldquo;AI.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The newcomer sees the conversation.
The expert sees the interface hidden inside it.&lt;/p&gt;
&lt;p&gt;Both are real. Only one is enough for production.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;Prompting feels conversational because natural language is the visible surface. But once the task carries real consequences, the exchange stops behaving like ordinary conversation and starts behaving like interface design. Hidden assumptions have to be written down, constraints have to be made explicit, and recovery can no longer rely on human social repair alone.&lt;/p&gt;
&lt;p&gt;So the central mistake is not using conversational language. The central mistake is believing conversation itself is the control model. It is only the skin of the thing, and sometimes not even a very honest skin.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If prompting only borrows the surface grammar of conversation, what other “human” metaphors around AI are flattering us more than they are explaining the system?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/freedom-creates-protocol/&#34;&gt;Freedom Creates Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/is-there-a-hidden-language-beneath-english/&#34;&gt;Is There a Hidden Language Beneath English?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/musings/ai-language-protocols/from-prompt-to-protocol-stack/&#34;&gt;From Prompt to Protocol Stack&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
  </channel>
</rss>
