One hundred days in, the question I get most often is some version of: "So it's like a chatbot for healthcare data?"
The gap between that framing and what we're actually building is the whole story.
What we're building is a different architecture for software, one designed specifically for domains where the path from question to answer is long, ambiguous, and governed. Healthcare is exactly that kind of domain, and it's why we started here.
Agents aren't a new interface pattern or a friendlier way to write SQL; they're a different way of thinking about what the software is actually responsible for doing.
A healthcare analyst or business user doesn't start with a perfectly formed query. She starts with intent: find the right providers in a market, understand a patient population, compare utilization across payer types, build an audience, identify what changed and why.
That shows up very clearly in the questions people actually ask. In recent customer chats, users asked things like:
"Who prescribes Apligraf and performs skin grafts in Cook County?"
"Who treats inflammatory skin conditions in Illinois?"
"Who are the top half of GLP-1 prescribers this year who were not in the top half two years ago?"
"I want to create a brief for dermatologists prescribing Dupixent nationwide."
These are not dashboard-filter questions. Each one carries a chain of hidden decisions. Apligraf has to be treated as a product, skin grafts as procedures, Cook County as a geography, and the answer as an intersection of provider behavior rather than a simple list. "Inflammatory skin conditions" needs to resolve into a governed condition class. A "top half" question depends on denominator, ranking period, provider identity, and comparison logic across time. A brief is not a number at all; it is an artifact built from an audience, context, and supporting evidence.
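To make the hidden decisions in the "top half" question concrete, here is a minimal sketch of the comparison logic. It is an illustration, not our production method. The function names, the sample counts, and the three rules it encodes are all assumptions: the denominator is every provider with at least one claim in the period, ties are broken by provider ID, and an odd-sized population rounds the top half up.

```python
from math import ceil


def top_half(counts: dict[str, int]) -> set[str]:
    """Return the providers in the top half by claim count for one period.

    Assumed rules (hypothetical): providers with zero claims are excluded
    from the denominator, ties are broken by provider ID, and an odd-sized
    population rounds the top half up.
    """
    active = {provider: n for provider, n in counts.items() if n > 0}
    ranked = sorted(active, key=lambda provider: (-active[provider], provider))
    return set(ranked[: ceil(len(ranked) / 2)])


def new_to_top_half(current: dict[str, int], baseline: dict[str, int]) -> set[str]:
    """Providers in the top half now who were not in the top half at baseline."""
    return top_half(current) - top_half(baseline)


# Made-up claim counts keyed by a made-up provider ID.
claims_this_year = {"A": 120, "B": 90, "C": 40, "D": 10}
claims_two_years_ago = {"A": 100, "B": 5, "C": 60, "D": 0, "E": 80}

result = new_to_top_half(claims_this_year, claims_two_years_ago)
```

Change any one of those assumed rules (who counts in the denominator, how ties break, how the two periods are aligned) and the answer can change, which is why the question cannot be answered by a filter alone.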
Turning that intent into a number, a provider list, or a usable artifact requires a chain of work the user never sees. The system has to parse the language, resolve the terminology (is "cardiologist" a specialty, a taxonomy code, or a procedure context?), map provider types to the right hierarchy, distinguish a drug name from an NDC from a therapeutic class, apply geography at the right granularity, choose the right dataset, construct the query, enforce permissions, run the calculations, persist the result, and explain what happened. That entire chain is an orchestration problem, and it's one that better UI design alone is never going to solve.
For years, healthcare analytics has lived in an uncomfortable split: rigid dashboards that work only for anticipated questions, and SQL tools that require knowing the database, the coding systems, and the right methodology to even get started. Most of the actual work happens in the gap between those two worlds: in spreadsheets, in Slack threads, in ad hoc requests to data science teams who are already stretched.
Agents are well suited to closing that gap because software can now carry out multi-step work on behalf of a user while staying inside governed boundaries. A well-designed agent can plan, ask clarifying questions, call tools, retrieve context, validate its own assumptions, execute against trusted systems, and return not just an answer but a traceable path to that answer. That is meaningfully different from returning a number in a dashboard cell.
The first instinct, when you start building with LLMs, is to let the model do more. That instinct feels justified early on, because the capability is genuinely impressive. Then you ship it to someone who actually knows the domain, and they immediately find the edge where the model hallucinates a methodology, invents a code set, or answers confidently with the wrong denominator. In healthcare analytics, those aren't edge cases. They're the core of the job.
The right architecture sits between two failure modes: a model that does everything and a model that does nothing. In the version we've built, models handle ambiguity, language, intent, and planning, while deterministic software handles validation, execution, permissions, persistence, observability, and calculation. The agent is the control loop between them, making decisions about when to interpret, when to resolve, when to execute, and when to stop and ask.
In practice, that means natural language becomes structured intent, structured intent becomes entity resolution, and entity resolution becomes a query plan against a governed semantic layer. The plan runs, and results are stored and summarized. Each step is auditable, can be constrained, and can be improved without touching the others. We built it this way because the alternative, letting the model improvise directly against claims data, produces answers that sound right until they don't.
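A rough sketch of that pipeline shape follows, under heavy assumptions. Every name in it is hypothetical, the model step is replaced by a hard-coded stub, and the codes are placeholders rather than real identifiers. The point is only the separation: the model proposes structured intent, deterministic code resolves and plans, and each step is recorded.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    output: object


@dataclass
class Run:
    """Ordered record of every step, so the run can be audited later."""
    steps: list[Step] = field(default_factory=list)

    def record(self, name: str, output: object) -> object:
        self.steps.append(Step(name, output))
        return output


def parse_intent(question: str) -> dict:
    # Stand-in for the model. A real system would call an LLM here and
    # validate its structured output; this stub returns a fixed intent.
    return {"metric": "provider_count", "drug": "Dupixent", "specialty": "dermatologist"}


# Deterministic lookups. The codes are placeholders, not real identifiers.
DRUG_CODES = {"dupixent": ["DRUG-CODE-PLACEHOLDER"]}
SPECIALTY_CODES = {"dermatologist": ["TAXONOMY-CODE-PLACEHOLDER"]}


def resolve_entities(intent: dict) -> dict:
    drug = intent["drug"].lower()
    specialty = intent["specialty"].lower()
    if drug not in DRUG_CODES or specialty not in SPECIALTY_CODES:
        # The model is never asked to guess a code it cannot resolve.
        raise LookupError("unresolved entity; the agent should stop and ask")
    return {"drug_codes": DRUG_CODES[drug], "taxonomy_codes": SPECIALTY_CODES[specialty]}


def build_plan(intent: dict, entities: dict) -> dict:
    # A plan against a semantic layer, expressed as data rather than raw SQL.
    return {"metric": intent["metric"], "filters": entities}


def answer(question: str) -> Run:
    run = Run()
    intent = run.record("parse_intent", parse_intent(question))
    entities = run.record("resolve_entities", resolve_entities(intent))
    run.record("build_plan", build_plan(intent, entities))
    return run


run = answer("How many dermatologists prescribe Dupixent?")
```

Because each stage takes and returns plain data, any one of them can be tested, constrained, or replaced without touching the others.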
The idea of intelligent software assistants predates LLMs by decades. What's different now is that the underlying stack has matured to the point where you can actually build on it.
A year ago, you couldn't reliably get a model to call a tool, check its output, and decide whether to call another one. You couldn't build long-running workflows where the model maintains context across a dozen steps. The latency and cost curves didn't support anything interactive. The semantic layers for healthcare data didn't exist in a form models could use. Those constraints aren't gone, but they've shifted enough that a robust product is now possible.
Modern reasoning models can handle genuinely messy requests and hold context across a workflow. Tool calling is mature enough to build on. Semantic layers give agents a governed contract for querying data rather than raw SQL access. Terminology systems can map "common blood thinner" to the right NDC set. Streaming makes long-running work feel interactive instead of frozen. Persistence lets intermediate results become artifacts you can come back to. Observability finally gives teams a way to inspect what the agent actually did, not just what it returned. The convergence of those capabilities is what makes the timing real.
Healthcare has more hidden translation layers than almost any other domain. "Cardiologist" implies a taxonomy hierarchy. A drug name needs to resolve to NDCs. A therapeutic class needs to expand into the right set of products. A condition maps to ICD-10 or CCSR groupings. A provider is an individual NPI or an organization or both. A geography is a state or a ZIP or a metro or a custom region. A claims question involves time windows, denominators, payer type, procedure codes, prescribing behavior, and patient cohorts, and the right answer changes depending on which combination you mean.
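One way to picture a single translation layer is a resolver that either returns one governed entity or hands back a clarifying question. The sketch below is illustrative only. The vocabulary, the entity kinds, and the identifiers are invented placeholders, and a real terminology service would be far larger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str        # e.g. "specialty" or "geography:metro"
    identifier: str  # placeholder ID, not a real code


# Hypothetical vocabulary: one unambiguous term and one ambiguous term.
VOCABULARY = {
    "cardiologist": [Candidate("specialty", "SPECIALTY-PLACEHOLDER")],
    "chicago": [
        Candidate("geography:city", "CITY-PLACEHOLDER"),
        Candidate("geography:metro", "METRO-PLACEHOLDER"),
    ],
}


def resolve(term: str) -> dict:
    """Resolve a term to exactly one entity, or return a question to ask."""
    candidates = VOCABULARY.get(term.strip().lower(), [])
    if len(candidates) == 1:
        return {"status": "resolved", "entity": candidates[0]}
    if not candidates:
        return {
            "status": "unknown",
            "question": f"I don't recognize '{term}'. Can you rephrase it?",
        }
    options = ", ".join(c.kind for c in candidates)
    return {
        "status": "ambiguous",
        "question": f"Should '{term}' be read as one of: {options}?",
    }


resolved = resolve("Cardiologist")
ambiguous = resolve("Chicago")
unknown = resolve("heart person")
```

The design choice worth noting is that ambiguity is a first-class result. The resolver never picks silently between a city and a metro area; it returns the question for the agent to ask.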
The practical implication is that no static dashboard can anticipate every combination, no analyst should have to memorize every schema, and no model should be trusted to improvise answers across that level of complexity without a governed layer underneath it.
An agent-native system offers a better path: experts express intent in their own language, the platform translates that intent into governed execution, and the result is the flexibility of conversation without giving up the discipline of software.
One thing I feel strongly about: trust in this context is an engineering problem, not a messaging one.
A trustworthy agent shows its work: what entities it resolved, what codes it used, what filters were applied, what query ran, what data source was used, what calculation produced the result, and what limitations apply. If the system can't surface those answers, it's not ready for serious healthcare work. Observability has to be built into the architecture from the start, so that every step the agent takes is logged, inspectable, and explainable to the analyst who needs to defend the number in a meeting. That's the baseline requirement for this category of software, not a differentiating feature.
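As a sketch of what "shows its work" could look like in data terms, the record below carries the items listed above and can report which ones are missing. The class name, field names, and sample values are all assumptions made for illustration, and which fields count as required is a policy choice rather than a fixed rule.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnswerTrace:
    """Hypothetical record attached to every answer the agent returns."""
    entities: list[str]
    codes: list[str]
    filters: dict[str, str]
    query: str
    data_source: str
    calculation: str
    limitations: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        # Assumed policy: these five must be non-empty before release.
        required = ("entities", "codes", "query", "data_source", "calculation")
        return [name for name in required if not getattr(self, name)]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; none of these are real codes or datasets.
trace = AnswerTrace(
    entities=["Dupixent (drug)", "dermatologist (specialty)"],
    codes=["DRUG-CODE-PLACEHOLDER", "TAXONOMY-CODE-PLACEHOLDER"],
    filters={"geography": "nationwide"},
    query="placeholder query text",
    data_source="claims-dataset-placeholder",
    calculation="count of distinct providers",
    limitations=["illustrative values only"],
)

incomplete = AnswerTrace([], [], {}, "", "claims-dataset-placeholder", "count")
```

A release gate can then be as simple as refusing to return any answer whose `missing_fields()` list is non-empty.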
Agents are becoming the orchestration layer for the data platform, not a replacement for it.
The companies that win in this space won't be the ones with the most impressive demo. They'll be the ones that can turn expert intent into governed action reliably, repeatedly, and in a way the user can actually trust. Healthcare analytics is ultimately a sequence of decisions made under constraints, and it requires judgment, context, translation, execution, guardrails, and accountability at every step. Agents, built the right way, are how you get all of that without forcing every user to become a data engineer.
That's the architecture we're building. One hundred days in, I'm more convinced of it than when we started.