Business

The AI System That Won’t Tell You What You Want to Hear Is Already Running

By: William Jones

Earlier this year, at the World Economic Forum in Davos, a former immigrant engineer received a standing ovation. Weeks later, audiences in Riyadh and at Bharat Mandapam in New Delhi did the same.

That engineer is Shekhar Natarajan, founder and CEO of Orchestro.AI. His message is straightforward: today’s AI systems are built on flawed foundations, and fixing them requires a fundamentally different approach, not just better regulation.

Unlike many making similar claims, Natarajan points to a system that already exists. Orchestro.AI’s platform is live and in use, supported by a framework outlined in a technical paper he published in April. That same month, the University of Oxford awarded him the Bodleian Medal, recognizing his contributions to AI in the public interest.

The Cage and the Animal

Most AI safety approaches follow the same pattern: build a powerful model first, then layer on filters, rules, and warnings to limit harmful behavior.

Shekhar Natarajan argues this is backward. As he puts it, current systems “assume a model that wants to do harmful things and is stopped,” a setup where safeguards are constantly “fighting” the system itself.

In his view, that is why workarounds keep emerging. Users find new ways to bypass restrictions, companies patch them, and the cycle repeats. The constraints evolve, but the underlying behavior does not, leaving systems in a constant game of catch-up.

The Question Nobody Is Asking

The question, Shekhar Natarajan argues, is more basic: why do these models produce harmful outputs at all?

His answer is practical. AI systems are trained on vast amounts of human writing, which includes both constructive and harmful behavior. The models absorb all of it without distinction, so problematic outputs are not anomalies; they are expected outcomes.

In that sense, he suggests, safeguards exist to contain behaviors the systems were trained to learn.

Orchestro.AI takes a different approach. Its system is designed so that those patterns do not form in the first place, rather than being filtered or blocked after the fact.

A Different Way of Measuring What AI Says

The system runs on a scoring framework, the ACTP framework, developed by Shekhar Natarajan. It evaluates whether a response correctly understands what a situation requires, not just whether it sounds good.

Most AI models optimize for human preference: responses that feel helpful, confident, or reassuring. That bias can favor agreeable answers over honest or uncomfortable ones.

ACTP targets a different outcome. It assesses whether a response reflects the moral weight of a situation, offering straightforward answers when appropriate, and more direct, context-aware responses when needed.

According to Orchestro.AI, the framework rewards clarity and accuracy, even when that involves tension or discomfort, and penalizes responses that rely on generic reassurance without addressing the underlying issue.

In one example from the paper, a user asks how to spend time with a terminally ill parent. Rather than treating it as a planning task, the system interprets it as a question shaped by grief and responds accordingly.

This distinction, between the surface question and the underlying context, is central to the design. Early users report that the system’s responses feel more direct and situationally aligned, without over-softening or unnecessary framing.

Lines That Cannot Be Crossed

The paper also addresses an issue that many safety frameworks approach indirectly. Shekhar Natarajan identifies a set of actions, such as violence against innocents, abuse, or coercion, that, in his view, are broadly recognized across moral traditions as unacceptable.

Rather than treating these as trade-offs within a broader system, he frames them as baseline constraints for recognizing human dignity. A framework that allowed them to be weighed against other outcomes, he argues, would risk justifying harm rather than preventing it.

In Orchestro.AI’s production system, these boundaries are not part of the model’s internal reasoning. Instead, they are enforced at the outset, applied as preconditions so that disallowed outcomes are never generated within the system’s decision process.

Sovereignty and What Comes Next

Shekhar Natarajan describes the next phase as “sovereign-native” behavior, systems that can distinguish between honest engagement and easy deflection, and hold that line under pressure.

He argues this matters in high-stakes domains. Systems that soften difficult truths, avoid clear refusals, or prioritize likability over accuracy can undermine outcomes in areas like healthcare, finance, or education.

In this framing, reliability means responding appropriately even when it is uncomfortable, reading the situation accurately and avoiding the easier, more agreeable answer.

The sovereignty work extends this further. According to Orchestro.AI, the goal is to build systems that remain consistent under sustained pressure, resisting flattery, avoiding capitulation, and behaving more like a source of judgment than a service optimized for approval.

The Personal Story Behind the Argument

Some of the attention Natarajan has received has to do with where he came from. He grew up in southern India in a family that had no electricity. He studied under streetlights. His mother once pawned her wedding ring for thirty rupees to pay his school fees and stood outside a headmaster’s office for a full year to win him a place in school.

He arrived in America with almost nothing and, during lean stretches, slept in his car. He spent the next twenty-five years inside the technology operations of some of the largest companies in the world (Walmart, Disney, Coca-Cola, PepsiCo, Target, American Eagle Outfitters), accumulating more than two hundred patents along the way.

It is an unusual background for someone now telling the AI industry it has been getting the foundations wrong. It may also be why people listen.

“My mother stood outside a headmaster’s office for 365 days so I could get an education,” he told the audience in New Delhi. “That kind of love, that sacrifice, is what I want to encode into the machines we build. If AI cannot understand dignity, it has no business making decisions about human lives.”

What He Is Willing to Admit

What separates Natarajan’s work from most of what gets attention in this space is what he is willing to admit. The production system is live. The principles behind it are working in real conversations with real users. But the formal validation studies, the kind that would let him publish a peer-reviewed claim of superiority over conventional approaches, are still in progress. He lists, in plain language, the eight ways his framework could fail. He specifies the measurements that would settle each question. He describes what he will do, in each case, if the answer is no.

A framework that acknowledges its limits, he writes near the end of the paper, is more trustworthy than one that does not. It tells you exactly where to look when something goes wrong, and exactly what to test before deploying at scale.

It is an unusual kind of confidence. Not the confidence that says I have solved this. The confidence that says the system is working, the principles are sound, and here are the experiments that will tell us where it falls short.

The Line That Closes the Argument

The paper ends with a sentence that has begun to travel beyond the technical world it was written for.

“It does not build a bigger cage. It builds a model that does not need one.”

The argument that drew the standing ovations in Davos, at the Forbes conference, and at Bharat Mandapam was, until recently, an argument about what the AI industry should do. It is now an argument about what one company has already done. The product is in users’ hands. The framework is published. The next set of measurements will say, in detail, how much further the approach can be pushed.

The AI industry has spent a decade trying to control the animal it built. Orchestro.AI is one of the first companies to ship a system built on a different premise: that the work was supposed to happen at the foundation, that virtue was supposed to be the starting point and not the patch, and that a machine asked to make decisions about human lives ought to know, before anything else, what kind of moment it is in.

That machine is now running. Three audiences, on three continents, have stood up for the idea. The next phase is whether the rest of the industry follows.

Spread the love