
Orchestration Was Always the Point

What the “Salesforce Capitulated on LLMs” crowd reveals about the chatbot fallacy in enterprise AI

The Information ran a piece in December that launched a thousand bad takes. The headline premise: Salesforce executives admitted they had “more trust in the LLM a year ago,” and the company was now pivoting to “deterministic automation,” which the tech press gleefully interpreted as capitulation. Retreat. An admission that the AI emperor has no clothes.

The stock continued a year-long slide. The Hacker News crowd did victory laps. Analysts published notes with phrases like “deterministic is a euphemism for less AI.” One outlet ran with “Salesforce Steps Back from AI” as if Marc Benioff had personally surrendered to a flowchart.

This is what happens when people evaluate enterprise AI using a mental model built entirely on chatting with ChatGPT. They get it completely, embarrassingly wrong.

The Chatbot Fallacy
Architecture as Strategy
What Orchestration Actually Looks Like
What The Critics Got Right
The Real Signal
What to Ask Instead

The Chatbot Fallacy

Here’s the category error at the heart of this narrative: the tech press, dozens of click-thirsty LinkedIn posters, and frankly most of the market have been evaluating enterprise AI through the lens of personal AI. They see a chatbot. They type, it responds, magic happens in the black box. The LLM is the product.

Enterprise AI has never worked this way, at least not reliably at scale.

Figure: the personal AI mental model (user to LLM to response) beside the enterprise AI reality (an orchestration layer with context, guardrails, state management, deterministic logic, routing, observability, and a human in the loop). The tech press evaluates enterprise AI through a chatbot lens. On the left: how they think it works. On the right: how it actually ships. The LLM is a component, not the product.

In enterprise deployments, the LLM is a component—a powerful one, but still just one node in an orchestration graph that includes retrieval systems, guardrail evaluators, deterministic business logic, state management, routing and escalation logic, and observability infrastructure. The LLM handles natural language understanding and generation. Everything else handles making sure the system actually works in production without hallucinating your customer’s account balance or skipping steps in a regulated workflow.
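To make the "one node in a graph" point concrete, here is a minimal Python sketch of a single turn flowing through retrieval, an LLM call, a guardrail, and an escalation route, with a trace for observability. Everything in it is an illustrative assumption: `Turn`, `retrieve`, `stub_llm`, `guardrail`, and `handle` are hypothetical names, the model is a stub that deliberately returns a wrong figure, and none of this reflects how Salesforce or any other vendor actually implements orchestration.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    user_text: str
    context: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)  # observability: every step is recorded


def retrieve(turn, accounts):
    # Retrieval: fetch the authoritative figure instead of letting the model guess it.
    turn.context["balance"] = accounts[turn.context["account_id"]]
    turn.trace.append("retrieve")


def stub_llm(prompt):
    # Hypothetical stand-in for a model call; it hallucinates a balance on purpose.
    return "Your balance is $9,999.00."


def guardrail(reply, turn):
    # Guardrail: the reply must quote the retrieved balance exactly.
    expected = f"${turn.context['balance']:,.2f}"
    ok = expected in reply
    turn.trace.append("guardrail:pass" if ok else "guardrail:fail")
    return ok


def handle(turn, accounts, llm=stub_llm):
    retrieve(turn, accounts)
    reply = llm(f"Balance is {turn.context['balance']}. Answer: {turn.user_text}")
    turn.trace.append("llm")
    if guardrail(reply, turn):
        return reply
    # Routing and escalation: a failed check goes to a person, not to the customer.
    turn.trace.append("escalate")
    return "I'm routing you to a human agent to confirm your balance."
```

In this sketch the model produces one string. The other steps decide whether that string ever reaches the customer, and the trace records which path was taken.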

Agentscript—Salesforce’s new declarative control layer—constrains and instruments agent planning and execution. It’s not a replacement for LLM reasoning; it’s the scaffolding that makes LLM reasoning safe to deploy.

As Salesforce CTO Muralidhar Krishnaprasad put it in a recent briefing: “People always thought LLMs were the thing—that with LLMs, you can do everything. But the reality is, it’s really hard to take the power of the LLM and make it work for businesses.”

He offered an analogy worth remembering: “You have these nuclear power plants. You can build all these awesome transmission engines, but if you don’t have that last mile transformer down to your house with all the right sockets, it’s not going to work.”

Figure: the LLM as a power plant, connected through an orchestration-layer transformer that delivers policy controls, determinism, and 100% process execution to business workflows. The LLM is the power plant. Orchestration is the grid that makes it usable.

The LLM is the power plant. The orchestration layer is the power transformer. Salesforce just shipped a better transformer, and the press reported it as a power outage.

Architecture as Strategy

Our research on enterprise AI deployment patterns—published as Architecture as Strategy in late 2025—documented a fundamental dichotomy that explains why this narrative is so misguided.

Overlay AI solutions deploy in 30 days. They sit on top of existing systems, make API calls, and deliver quick wins. The activation energy is low. The integration is shallow. Governance can be overridden in the name of convenience.
Embedded AI solutions require high data gravity and expert system integration. They integrate deeply with data infrastructure, workflow engines, and governance frameworks. The activation energy is high. But the orchestration and security capabilities are enterprise-grade. Many regulated industries demand nothing less.

By “deterministic” here, we mean policy-constrained execution: rules, validations, approvals, retries, auditability. Not a return to brittle scripting. It is a graduation to governed intelligence.
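As a rough illustration of what policy-constrained execution can mean in code, the Python sketch below wraps one action (applying a discount) in a validation, an approval gate, bounded retries, and an audit log. The thresholds, the `execute_discount` function, and its `apply` and `approver` callbacks are all hypothetical, chosen only to show the shape of the idea.

```python
import json

MAX_AUTO_DISCOUNT = 0.10  # assumed policy: anything above this needs human approval
MAX_DISCOUNT = 0.30       # assumed policy: anything above this is never allowed


def execute_discount(proposed, apply, approver, audit, max_retries=2):
    """Run one model-proposed discount through rules, approval, retries, and audit."""

    def log(event, **detail):
        # Auditability: every decision is appended as a structured record.
        audit.append(json.dumps({"event": event, **detail}))

    log("proposed", discount=proposed)

    # Validation: reject anything outside the policy range, whatever the model says.
    if not 0 <= proposed <= MAX_DISCOUNT:
        log("rejected", reason="outside policy range")
        return "rejected"

    # Approval: larger discounts wait for a human decision.
    if proposed > MAX_AUTO_DISCOUNT:
        approved = approver(proposed)
        log("approval", approved=approved)
        if not approved:
            return "denied"

    # Retries: transient failures are retried a bounded number of times.
    for attempt in range(1, max_retries + 2):
        try:
            apply(proposed)
            log("applied", attempt=attempt)
            return "applied"
        except ConnectionError as exc:
            log("retry", attempt=attempt, error=str(exc))

    log("failed")
    return "failed"
```

The model can propose any number it likes. Whether that number is applied is decided by code that behaves the same way on every run.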

Here’s what the “capitulation” crowd missed: embedded architectures were always going to require deterministic control layers. That’s not a bug—it’s the entire point. Enterprises cannot run pre-clinical experiments for a new drug on vibes. You cannot execute a regulated financial workflow by hoping the LLM remembers all the compliance steps. You need orchestration that guarantees process execution while letting the LLM handle the parts it’s good at: natural language interaction, intent recognition, and adaptive conversation.

This is why embedded vendors are forced into deterministic control planes sooner: they’re the ones being asked to run real workflows under governance.

Agentscript isn’t Salesforce retreating from AI. It’s Salesforce shipping the declarative control layer that enterprise-scale orchestration always required. The fact that observers read this as “less AI” tells you everything about their mental model and nothing about the technology.

What Orchestration Actually Looks Like

In that same briefing, Krishnaprasad laid out the four components of enterprise agent infrastructure:

“What we are doing to solve that last mile problem—making sure LLMs can be harnessed and controlled for business—is four things. One, we need to make sure LLMs get the shared context, the right data at the right time so they can make the right decisions. Two, businesses aren’t just about ‘here’s the data, go make a decision.’ There are rules and regulations—what discounts you can offer, how to handle escalated customers. So, you also need deterministic controls. Three, you need observability—how do you know it’s actually working? You want to capture all the signals, not just to observe but to learn. And four, agents don’t work in isolation. They must work with humans and potentially with other agents—because handoffs, approvals, and exception handling are where enterprise work actually lives. So, orchestration becomes critical.”

Notice that the LLM is implicit in component one and nearly absent from components two through four. That’s because the LLM handles reasoning and language; everything else handles making sure the system performs reliably at scale.

Krishnaprasad offered another analogy that should reframe this entire debate: “It’s almost like databases used to have a single update or insert, and then we added pre-triggers and post-triggers—letting you do deterministic operations around that transaction block. That’s what we’ve now done at the planner level.”

Agentscript is the equivalent of database triggers for AI reasoning. It’s a maturation of the orchestration layer, not a retreat from intelligence. Anyone who has shipped production software understands this instantly. Anyone whose only AI experience is prompting Claude apparently does not.
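The trigger analogy can be sketched in a few lines of Python. This is not Agentscript syntax, which is not shown in this article. It is a hypothetical `with_triggers` wrapper that runs deterministic checks before a planner and deterministic filters after it, with a stub planner standing in for LLM reasoning.

```python
ALLOWED_ACTIONS = {"lookup_order", "issue_refund", "send_email"}  # assumed allowlist


def with_triggers(planner, pre, post):
    """Wrap a planner with deterministic pre- and post-triggers."""

    def governed(request):
        for check in pre:
            check(request)        # pre-trigger: may raise to block planning entirely
        plan = planner(request)   # the non-deterministic step
        for transform in post:
            plan = transform(plan)  # post-trigger: deterministic rewrite of the plan
        return plan

    return governed


def require_verified_identity(request):
    if not request.get("identity_verified"):
        raise PermissionError("identity not verified")


def drop_disallowed_actions(plan):
    return [step for step in plan if step in ALLOWED_ACTIONS]


def stub_planner(request):
    # Hypothetical stand-in for an LLM planner; it proposes one action it should not.
    return ["lookup_order", "delete_account", "issue_refund"]


governed_planner = with_triggers(
    stub_planner, [require_verified_identity], [drop_disallowed_actions]
)
```

Called with a verified request, `governed_planner` returns the plan without `delete_account`. Called with an unverified one, it raises before the planner runs at all.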

What The Critics Got Right

To be fair: if vendors marketed “autonomy” without guardrails, customers were right to question reliability when agents hallucinated compliance steps or forgot to send satisfaction surveys. Early Agentforce deployments surfaced real problems—Vivint discovered surveys weren’t being sent despite clear instructions; customers found that LLMs started omitting steps when given more than eight directives.
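One way to read the survey failure is that a mandatory step was left to the prompt. The sketch below, in Python and entirely hypothetical, moves the checklist into code: whatever the agent reports having done, any required step it skipped is run deterministically before the case closes. The step names and the `close_case` helper are assumptions for illustration, not details of any real deployment.

```python
REQUIRED_STEPS = ["verify_identity", "resolve_issue", "log_case", "send_survey"]


def missing_steps(done_by_agent, required=REQUIRED_STEPS):
    # Compare what the agent reports against the checklist, preserving checklist order.
    return [step for step in required if step not in done_by_agent]


def close_case(done_by_agent, handlers):
    """Run every required step the agent skipped, then return what was run."""
    ran = []
    for step in missing_steps(done_by_agent):
        handlers[step]()  # deterministic handler, no model involved
        ran.append(step)
    return ran
```

With this shape, adding a ninth or tenth directive lengthens a list in code rather than a prompt the model may partially ignore.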

The correction isn’t capitulation—it’s the industry learning what production actually requires. Salesforce didn’t retreat from AI; they shipped the governance layer that the “deploy it and pray” crowd never understood was coming.

The Real Signal

The original reporting twisted a nuanced technical discussion into a retreat narrative. What Salesforce executives actually described was the opposite of capitulation—it was the delivery of infrastructure that enterprise AI always required but that most observers never understood was necessary.

Madhav Thattai, COO of Salesforce AI, explained the hybrid approach directly: “For enterprises to use agents to execute process, we need this hybrid approach—where we get the best of the LLMs, all the creativity and expression, but tied into core process execution within one reasoning chain. That is what the new hybrid reasoning capability is about.”

The stakes are non-negotiable. As Thattai noted: “When you start to think about regulated industries—healthcare, financial services—90% or 95% accuracy is not good enough. It must work every single time.”
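A back-of-the-envelope calculation shows why those percentages fall short in multi-step workflows. The Python sketch below assumes, purely for illustration, that each step succeeds independently with the same probability and that a workflow has ten steps. Neither assumption comes from the briefing.

```python
def workflow_success(per_step, steps):
    # Probability that every step succeeds, assuming independent steps.
    return per_step ** steps


for accuracy in (0.90, 0.95, 0.99):
    print(f"{accuracy:.2f} -> {workflow_success(accuracy, 10):.3f}")
# 0.90 -> 0.349
# 0.95 -> 0.599
# 0.99 -> 0.904
```

Under those assumptions, a step that is right 95% of the time completes a ten-step process correctly only about 60% of the time.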

Here’s the prediction that matters: by Q2 2026, every major enterprise AI vendor will be marketing policy DSLs, auditable execution traces, planner constraints, and agent observability dashboards. The vendors currently selling pure LLM magic aren’t ahead of Salesforce—they simply haven’t hit the wall yet. When their customers try to scale from pilot to production, they’ll discover what Salesforce already learned: you need orchestration. You need control planes. You need the boring infrastructure that makes intelligent systems reliable.

Salesforce got there first. The press called it a retreat.

What to Ask Instead

When evaluating enterprise AI platforms, stop asking “how good is the LLM?” That question made sense in 2023. A capable LLM is table stakes now, and model quality is increasingly commoditized for most workflows.

Start asking:

How mature is the orchestration layer?
What deterministic controls exist for policy-constrained execution?
How do I observe and optimize agent performance over time?
Can this system guarantee process execution in regulated workflows?
What happens when the agent hits an edge case it can’t handle?

The LLM is the power plant. Orchestration is the grid that makes it usable.

Vernon Keenan is CEO of Keenan Vision LLC and founder of SalesforceDevops.net. His “Architecture as Strategy” research with UC Berkeley Haas documents enterprise AI deployment patterns and why 95% of GenAI pilots fail.