Beyond llms.txt: Structuring Business Data So Agents Do Not Hallucinate Your Hours and Prices

To format business data so AI agents read your hours and prices correctly, you do not need a new file. You need three things, all built on a standard that already exists. First, mark up your facts as typed schema.org JSON-LD: OpeningHoursSpecification with dayOfWeek, opens, and closes for hours, and priceRange (or Offer.price where you model products) for prices. Second, keep that markup identical to what a human sees on the page, because Google's structured data policies require the markup to "be a true representation of the page content." Third, keep it synced as your real hours and prices change. Agents do not hallucinate your Tuesday hours because a standard is missing. They hallucinate because your structured data has drifted from your visible page and gone stale. The problem is freshness and consistency, not the lack of a new file.

01

llms.txt is a proposal, and it was never built for your prices

Most people searching for how to format business data for agents reach for llms.txt. It is worth being precise about what that file is. By its own description, llms.txt is "a proposal to standardise on using an /llms.txt file to provide information to help LLMs use a website at inference time." It was authored by Jeremy Howard and published on 2024-09-03. It remains an open proposal, not a ratified standard.

Its status matters less than its shape. The format is a markdown file whose only required section is an H1 with the site name, plus an optional blockquote summary and H2 link lists. It exists to point a language model at brief background and detailed markdown documentation. It does not carry typed commerce facts. There is no field in llms.txt for "opens at 09:00, closes at 17:30, Monday through Friday," and none for "price range two dollar signs." You would be writing prose and hoping the model parses it correctly, which is exactly the ambiguity you set out to remove.

Adoption is low. No major model provider has committed to honouring it, and figures at Google have publicly questioned the proposal. None of that makes llms.txt useless. It is a reasonable way to hand a model a curated map of your documentation. It is simply the wrong instrument for hours and prices.

02

The boring standard already has typed fields for both

Schema.org has carried the answer for over a decade. The OpeningHoursSpecification type is "a structured value providing information about the opening hours of a place or a certain service inside a place." Its typed properties are dayOfWeek, opens, closes, plus validFrom and validThrough for temporary changes like a holiday schedule. These are not free text. They are machine-typed values an agent can resolve without guessing.
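A minimal Python sketch of those typed properties, assuming a business open 09:00 to 17:30 on weekdays with one shortened day on 24 December 2025. The helper name hours_spec and the sample dates are illustrative assumptions; only the @type and the property names come from schema.org.

```python
import json
from datetime import date, time


def hours_spec(days, opens, closes, valid_from=None, valid_through=None):
    """Build one OpeningHoursSpecification node (hypothetical helper)."""
    spec = {
        "@type": "OpeningHoursSpecification",
        "dayOfWeek": list(days),
        "opens": opens.strftime("%H:%M:%S"),
        "closes": closes.strftime("%H:%M:%S"),
    }
    # validFrom / validThrough are only emitted for temporary schedules.
    if valid_from is not None:
        spec["validFrom"] = valid_from.isoformat()
    if valid_through is not None:
        spec["validThrough"] = valid_through.isoformat()
    return spec


weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Regular schedule: no validity window.
regular = hours_spec(weekdays, time(9, 0), time(17, 30))

# Temporary holiday schedule for a single day (sample date, an assumption).
holiday = hours_spec(
    ["Wednesday"],
    time(10, 0),
    time(14, 0),
    valid_from=date(2025, 12, 24),
    valid_through=date(2025, 12, 24),
)

print(json.dumps([regular, holiday], indent=2))
```

Building the values from time and date objects rather than hand-typed strings is one way to keep the output in a consistent, machine-typed form.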

For commercial terms, LocalBusiness offers openingHours, priceRange (a categorical text value such as $$$), currenciesAccepted, and paymentAccepted, and it inherits openingHoursSpecification from Place. Note one real limit: there is no exact-price-amount property at the LocalBusiness level. priceRange is categorical by design. When you need to publish an exact figure, you model the item as a Product or Offer and use Offer.price together with priceCurrency.
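The split between a categorical range and an exact figure can be sketched in Python as follows. The business, the product, and the helper exact_offer are made-up examples; the property names are schema.org's.

```python
import json
from decimal import Decimal


def exact_offer(name, amount, currency):
    """Model an item as a Product with an Offer so it can carry an exact price.

    Hypothetical helper; the amount is passed as a string to avoid float rounding.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": format(Decimal(amount), ".2f"),
            "priceCurrency": currency,
        },
    }


# Business level: only a categorical range, no exact amount.
business = {
    "@context": "https://schema.org",
    "@type": "Bakery",
    "name": "Example Bakery",
    "priceRange": "$$",
    "currenciesAccepted": "USD",
}

# Item level: an exact, typed price.
loaf = exact_offer("Sourdough loaf", "4.5", "USD")

print(json.dumps([business, loaf], indent=2))
```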

Google's guidance is concrete. Its Local Business documentation recommends JSON-LD and the most specific LocalBusiness sub-type. It specifies opens and closes in hh:mm:ss format, represents 24-hour operation as opens 00:00 and closes 23:59, and caps priceRange at under 100 characters. The only strictly required properties are name and address. openingHoursSpecification and priceRange are recommended, not mandatory. This is a solved data-modeling problem.
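Those constraints are simple enough to check before publishing. The sketch below is an illustrative pre-publish check, not a Google tool: the function name check_local_business is hypothetical, and accepting times with or without seconds is an assumption made here for leniency.

```python
import re

# hh:mm with optional :ss (accepting both is an assumption of this sketch).
TIME_RE = re.compile(r"([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?")


def check_local_business(node):
    """Return a list of problems found in a LocalBusiness JSON-LD dict."""
    problems = []

    # name and address are the only strictly required properties.
    for required in ("name", "address"):
        if not node.get(required):
            problems.append(f"missing required property: {required}")

    # priceRange is optional, but must stay under 100 characters.
    price_range = node.get("priceRange")
    if price_range is not None and len(price_range) >= 100:
        problems.append("priceRange must be under 100 characters")

    # openingHoursSpecification may be a single object or a list of them.
    specs = node.get("openingHoursSpecification", [])
    if isinstance(specs, dict):
        specs = [specs]
    for index, spec in enumerate(specs):
        for key in ("opens", "closes"):
            value = spec.get(key)
            if not isinstance(value, str) or not TIME_RE.fullmatch(value):
                problems.append(f"spec {index}: {key} is not a valid time: {value!r}")

    return problems


always_open = {
    "name": "Example Shop",
    "address": "1 Example Street",
    "priceRange": "$$",
    # 24-hour operation expressed as 00:00 to 23:59.
    "openingHoursSpecification": [{"opens": "00:00", "closes": "23:59"}],
}
print(check_local_business(always_open))
```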

03

Hallucination is a consistency failure, not a format gap

Here is the part that reframes the whole question. An agent that tells a customer you close at 6pm when you close at 5pm is almost never confused by a missing standard. It is reading a number that was once true. Your JSON-LD says 6pm because someone set it eighteen months ago. Your visible page now says 5pm because a manager updated the website copy. Nobody reconciled the two.

Google treats this drift as a policy violation, not a cosmetic flaw. The structured data guidelines state plainly that "your structured data must be a true representation of the page content" and "don't mark up content that is not visible to readers of the page." Violations can trigger manual actions that strip your rich-result eligibility. So the discipline that protects your search appearance is the same discipline that keeps an agent honest. The typed data and the human-readable page must say the same thing, and both must be current.
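Drift of this kind can be detected mechanically. The following is a deliberately naive sketch using only the Python standard library: it pulls the JSON-LD out of a page, then checks that each opens and closes time also appears somewhere in the visible text. It assumes each JSON-LD block is a single object, matches only a few common time spellings (17:00, 5pm, 5:00 pm), and ignores days of the week. All names in it are hypothetical.

```python
import json
import re
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collect JSON-LD nodes and human-visible text separately."""

    def __init__(self):
        super().__init__()
        self.jsonld, self.visible = [], []
        self._mode = None  # None = visible text, "jsonld", or "skip"
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            is_jsonld = dict(attrs).get("type") == "application/ld+json"
            self._mode = "jsonld" if is_jsonld else "skip"
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag == "script" and self._mode == "jsonld":
            self.jsonld.append(json.loads("".join(self._buffer)))
            self._buffer = []
        if tag in ("script", "style"):
            self._mode = None

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode is None:
            self.visible.append(data)


def time_patterns(value):
    """Regexes for common ways a page might display an hh:mm[:ss] time."""
    hour, minute = int(value[:2]), value[3:5]
    suffix = "am" if hour < 12 else "pm"
    hour12 = hour % 12 or 12
    minutes_12h = f"(?::{minute})?" if minute == "00" else f":{minute}"
    return [
        rf"(?<![\d:]){hour:02d}:{minute}",               # e.g. 17:00
        rf"(?<![\d:]){hour12}{minutes_12h}\s*{suffix}",  # e.g. 5pm, 5:00 pm
    ]


def find_drift(html):
    """Return (property, value) pairs present in JSON-LD but not shown on the page."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    text = "".join(reader.visible).lower()
    drift = []
    for node in reader.jsonld:
        specs = node.get("openingHoursSpecification", [])
        if isinstance(specs, dict):
            specs = [specs]
        for spec in specs:
            for key in ("opens", "closes"):
                if not any(re.search(p, text) for p in time_patterns(spec[key])):
                    drift.append((key, spec[key]))
    return drift


# Sample page: the copy says 5pm, the stale markup still says 18:00.
PAGE = """
<html><body>
<p>Open Monday to Friday, 9am to 5pm.</p>
<script type="application/ld+json">
{"@type": "LocalBusiness", "name": "Example Shop",
 "openingHoursSpecification": [
   {"@type": "OpeningHoursSpecification", "dayOfWeek": ["Monday"],
    "opens": "09:00:00", "closes": "18:00:00"}]}
</script>
</body></html>
"""

print(find_drift(PAGE))
```

A check like this is a safety net rather than a fix. It only tells you that the two representations disagree, not which one is right.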

That is why the real work is operational, not notational. You need a single source of truth for your hours and prices. Both the visible page and the JSON-LD must be generated from that source. And the pipeline that generates them must run whenever an underlying fact changes. A new file format does nothing for a sync problem.
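One way to picture that pipeline, as a hedged Python sketch: a single record feeds both the visible sentence and the JSON-LD, so the two cannot disagree. The BusinessFacts record and the render functions are hypothetical names, and a real system would read the record from wherever the business actually maintains its hours.

```python
import json
from dataclasses import dataclass
from datetime import time
from html import escape


@dataclass(frozen=True)
class BusinessFacts:
    """The one record that both the page copy and the markup are built from."""
    name: str
    address: str
    days: tuple
    opens: time
    closes: time
    price_range: str


def render_visible(facts):
    """Human-readable sentence shown on the page."""
    return (
        f"{facts.name} is open {facts.days[0]} to {facts.days[-1]}, "
        f"{facts.opens:%H:%M} to {facts.closes:%H:%M}. "
        f"Price range: {facts.price_range}."
    )


def render_jsonld(facts):
    """Typed markup built from the very same record."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": facts.name,
        "address": facts.address,
        "priceRange": facts.price_range,
        "openingHoursSpecification": [{
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": list(facts.days),
            "opens": facts.opens.strftime("%H:%M:%S"),
            "closes": facts.closes.strftime("%H:%M:%S"),
        }],
    })


def render_page(facts):
    """Emit both representations together (JSON escaping for HTML omitted for brevity)."""
    return (
        f"<p>{escape(render_visible(facts))}</p>\n"
        f'<script type="application/ld+json">{render_jsonld(facts)}</script>'
    )


facts = BusinessFacts(
    name="Example Shop",
    address="1 Example Street",
    days=("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
    opens=time(9, 0),
    closes=time(17, 0),
    price_range="$$",
)
page = render_page(facts)
print(page)
```

Changing closes in the record and re-rendering updates the sentence and the markup in the same step, which is the property the sync pipeline exists to guarantee.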

04

On top of the data sits a consent and integration layer

Typed, consistent data is the foundation. The layer above it governs how an agent actually reaches in and uses it. Here the relevant standard is the Model Context Protocol (MCP), "an open protocol that enables seamless integration between LLM applications and external data sources and tools." MCP runs on JSON-RPC 2.0 across Hosts, Clients, and Servers, and exposes Resources, Prompts, and Tools. Its current specification revision is dated 2025-11-25.

MCP matters because consent is built in. Its security principles require explicit user consent before any tool is invoked, and they state that tool behavior descriptions "should be considered untrusted, unless obtained from a trusted server." That is the line between a model passively reading your published facts and an agent taking an action against your business with a human in the loop.
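The consent step can be sketched as a gate in front of the JSON-RPC message. This is an illustration of the principle, not an MCP SDK: tools/call is the protocol's method name, while the tool book_table, its arguments, and the approve and send callbacks are hypothetical stand-ins for a host's user prompt and transport.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def call_with_consent(request, approve, send):
    """Forward the request only if the user explicitly approves it."""
    params = request["params"]
    if not approve(params["name"], params["arguments"]):
        return None  # declined: nothing leaves the host
    return send(json.dumps(request))


sent = []


def fake_send(message):
    """Stand-in transport that just records what would be sent."""
    sent.append(message)
    return "sent"


request = build_tool_call(1, "book_table", {"party_size": 2, "time": "19:00"})

# The same request, once declined and once approved by the user.
declined = call_with_consent(request, lambda name, args: False, fake_send)
approved = call_with_consent(request, lambda name, args: True, fake_send)

print(declined, approved, len(sent))
```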

These layers are converging on schema.org rather than replacing it. Microsoft's open-source NLWeb natively supports MCP, positions itself as "to MCP/A2A what HTML is to HTTP," and deliberately reuses existing schema.org markup. NLWeb argues that schema.org and semi-structured formats, which it says are used by over 100 million websites, have become "a semantic layer for the web," and every NLWeb instance runs as an MCP server with an ask method. The direction of travel is clear. The data stays in schema.org. The access and consent layer is MCP.

05

Control who trains on it, separately from who reads it

One more distinction keeps two different decisions from collapsing into one. Marking your facts up for agents to read is not the same as consenting to have your content train a model. Those are governed separately. Google's Google-Extended is "a standalone product token that web publishers can use to manage whether content Google crawls from their sites may be used for training future generations of Gemini models." You set it through the Google-Extended user-agent token in robots.txt. It governs Gemini Apps, the Vertex AI API for Gemini, and grounding features, and it does not affect your inclusion or ranking in Google Search.
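As a small illustration of keeping the two decisions separate, the sketch below parses a sample robots.txt with Python's standard urllib.robotparser and shows that a rule group for the Google-Extended token does not touch the group that applies to Googlebot. The robots.txt content and URL are examples. Google-Extended is a control token rather than a separate crawler, so can_fetch is used here only to show which rule group matches which token.

```python
from urllib.robotparser import RobotFileParser

# Example policy: opt out of Gemini training use, stay open to everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/menu"

# Which rule group applies to each token for the same URL.
extended_allowed = parser.can_fetch("Google-Extended", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)

print(extended_allowed, googlebot_allowed)
```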

The practical posture has three moves. Publish typed, accurate JSON-LD so agents quote your hours and prices correctly. Expose action through a consented protocol like MCP. Use training-control tokens to make a deliberate, separate choice about model training. Three layers, three decisions, all governable.

06

Where Origin Pi fits

Origin Pi builds the governed business layer that sits underneath all of this. The thesis is simple. An agent is only as trustworthy as the business facts it reads, so those facts have to be typed, consistent with what humans see, and continuously synced from one source of truth. Our work on agent readiness treats your structured data as a maintained system rather than a one-time markup task. Our marketing AI practice keeps the facts that agents quote, including hours and prices, aligned with the live business behind them. The standard already exists. The discipline is the deliverable.

Questions

Common questions.

What is the best file format for AI agents to read my business hours and prices?

Schema.org JSON-LD, not a new file like llms.txt. JSON-LD provides typed fields built exactly for this: OpeningHoursSpecification with dayOfWeek, opens, and closes for hours, and priceRange or Offer.price for prices. Google recommends JSON-LD and the most specific LocalBusiness sub-type. It is a decade-old, widely supported standard, whereas llms.txt is an unratified proposal designed to point models at documentation, not to carry typed commerce facts.

Is llms.txt an official standard I should adopt?

No. By its own description, llms.txt is a proposal, authored by Jeremy Howard and published on 2024-09-03, that remains open for community input. It is a markdown file whose only required section is an H1 site name plus optional link lists, built to surface documentation. Adoption is low, no major model provider has committed to honouring it, and it has been publicly questioned by figures at Google. It can usefully map your docs, but it is the wrong tool for structured hours and prices.

Why do AI agents get my opening hours or prices wrong even when I have structured data?

Almost always because your structured data has drifted from your visible page and gone stale. The agent is reading a number that was true once. Google's policies require that your markup be a true representation of the page content and that you not mark up content invisible to readers, so the fix is operational: one source of truth, a page and JSON-LD that are both generated from it, and a pipeline that re-syncs whenever a fact changes.

How do I represent opening hours in schema.org?

Use OpeningHoursSpecification, a structured value with typed properties: dayOfWeek, opens, and closes, plus validFrom and validThrough for temporary changes such as holiday hours. Google specifies opens and closes in hh:mm:ss format, and represents 24-hour operation as opens 00:00 and closes 23:59. This removes the ambiguity of describing hours in prose.

Can schema.org hold an exact price, not just a price range?

At the LocalBusiness level, no. LocalBusiness offers priceRange, a categorical text value such as $$$, capped by Google at under 100 characters, along with currenciesAccepted and paymentAccepted. To publish an exact figure, you model the item as a Product or Offer and use Offer.price, which is standard practice when the entity is modeled that way.

What is the Model Context Protocol and how does it relate to schema.org markup?

MCP is an open protocol that connects LLM applications to external data sources and tools using JSON-RPC 2.0, with the current specification revision dated 2025-11-25. It sits above your data as a consent and integration layer: it requires explicit user consent before invoking any tool and treats tool descriptions as untrusted unless from a trusted server. Your typed facts stay in schema.org; MCP governs how an agent reaches in and acts. Microsoft's NLWeb shows the pattern, reusing schema.org markup and running each instance as an MCP server.
