Jeremy Howard proposed llms.txt on 3 September 2024. By Q1 2026 over 844,000 sites had implemented it (Mintlify, Q1 2026). The vendor blogs say it's a 2-3× citation lift. The AI labs say nothing.
This guide separates what's verified from what's vendor narrative, and gives you a working template if you decide to ship one anyway.
The spec — non-negotiable structure
The canonical document at /llms.txt must be CommonMark and follow this strict order:
- H1: site / project name (`# Site Name`). Required.
- Blockquote summary: one paragraph (`> ...`) immediately after the H1. Required.
- Optional context paragraphs: plain prose, no markdown blocks.
- H2 sections: bullet list of `- [Title](URL): Description` links.
- "Optional" H2: has reserved semantics. Consumers running tight context budgets may skip these.
That's the entire spec. There's a sibling file, /llms-full.txt, that concatenates the full markdown content of every linked page. Use it when you want long-context retrieval systems to ingest your docs in one fetch.
The community LLMs.txt Validator flags any llms-full.txt larger than 500 KB, which makes 500 KB a useful red line for your generator's full variant.
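If you generate the full variant yourself, a size check can live in the same script. The following is a minimal sketch, not a reference generator: the function names, the `Source:` header and the `---` separator are conventions invented for this example, and it assumes "500 KB" means 500 × 1024 bytes, since the validator's exact threshold definition is not stated here.

```python
from typing import Iterable, Tuple

# Assumption: "500 KB" is read as 500 * 1024 bytes.
SIZE_LIMIT_BYTES = 500 * 1024


def build_llms_full(pages: Iterable[Tuple[str, str]]) -> str:
    """Concatenate (url, markdown) pairs into a single document.

    Each page is preceded by a 'Source:' line so a passage can be traced
    back to its URL. That header and the '---' separator are choices made
    for this sketch, not something the spec prescribes.
    """
    parts = []
    for url, markdown in pages:
        parts.append(f"Source: {url}\n\n{markdown.strip()}\n")
    return "\n---\n\n".join(parts)


def size_in_bytes(text: str) -> int:
    """Size of the text once encoded as UTF-8, which is what gets served."""
    return len(text.encode("utf-8"))


def over_limit(text: str, limit: int = SIZE_LIMIT_BYTES) -> bool:
    """True when the generated file is larger than the chosen red line."""
    return size_in_bytes(text) > limit
```

A build step could call `over_limit(build_llms_full(pages))` and fail, or fall back to a trimmed page list, when it returns `True`.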
Who actually reads it — the honest answer
| Consumer | Officially confirmed? | Behavior observed |
|---|---|---|
| OpenAI / ChatGPT | No | No public statement; no observed traffic from OAI-SearchBot requesting /llms.txt. |
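| Anthropic / ClaudeBot | No | No public statement; no observed traffic from ClaudeBot requesting /llms.txt. |
| Perplexity / PerplexityBot | No | No public statement; no observed traffic from PerplexityBot requesting /llms.txt. |
| Google (Gemini, AIO, AI Mode) | No | No public statement; no observed crawler traffic requesting /llms.txt. |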
| Mintlify-hosted docs traffic | Yes (Mintlify Q1 2026 telemetry) | Bots fetch llms-full.txt more frequently than llms.txt on docs platforms. |
| You.com | Unclear | Some community reports. |
The honest summary: no major engine has confirmed consumption. What we have is observational telemetry on docs platforms (Mintlify) and a lot of pattern-matching from SEO vendors. Skepticism is warranted — but the file costs ~5 minutes to ship and creates zero downside.
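You can check the "behavior observed" column against your own server rather than taking anyone's word for it. The sketch below counts requests for /llms.txt and /llms-full.txt per bot. It assumes your access log uses the common "combined" format and matches only the three bot names from the table; both are assumptions to adjust for your setup, and user-agent strings can be spoofed, so treat the counts as a rough signal.

```python
import re
from collections import Counter
from typing import Iterable

# Pulls the request path and user-agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Bot tokens named in the table above; extend the tuple as needed.
BOT_TOKENS = ("OAI-SearchBot", "ClaudeBot", "PerplexityBot")
TARGET_PATHS = ("/llms.txt", "/llms-full.txt")


def count_llms_fetches(lines: Iterable[str]) -> Counter:
    """Count (bot, path) pairs for requests to llms.txt or llms-full.txt."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a GET/HEAD line in the expected format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if path not in TARGET_PATHS:
            continue
        agent = match.group("agent").lower()
        for token in BOT_TOKENS:
            if token.lower() in agent:
                counts[(token, path)] += 1
    return counts
```

Passing an open log file, as in `count_llms_fetches(open("access.log"))`, returns a `Counter` keyed by `(bot, path)`; an empty result after a few weeks is itself a data point.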
What it actually does, in our opinion
llms.txt is most useful as a stake in the ground for entity disambiguation:
- Tells a retrieval system what you call yourself and what your top-priority URLs are.
- Concentrates your highest-authority content in one fetch — useful if a long-context model decides to ingest it.
- Acts as a structured signal that you've thought about AI consumption (correlates with sites that also nail schema + author entities).
What it is not:
- A direct rank lever.
- A replacement for `robots.txt` (different layer: `robots.txt` is permission, `llms.txt` is hint).
- A guarantee of citation lift. Anyone claiming a specific number (1.4×, 2.2×) is extrapolating from a vendor cohort without a control group.
A template that actually validates
Save as /public/llms.txt (or wherever your static files live):
```markdown
# Your Brand
> One-paragraph summary of what your brand does, what category you compete in,
> and one phrase a journalist would use to introduce you. Keep it under 60 words.

This is your brand's "stake in the ground" for AI search. The lines above are
the only required fields; everything below is optional context that
long-context retrieval systems may ingest.

## Core pages
- [Homepage](https://yourbrand.com/): Headline + main value prop.
- [Pricing](https://yourbrand.com/pricing): Tier breakdown and what's in/out.
- [About](https://yourbrand.com/about): Founders, team, what makes you different.

## Documentation
- [Quickstart](https://yourbrand.com/docs/quickstart): 5-minute hands-on.
- [API reference](https://yourbrand.com/docs/api): Full endpoint catalog.
- [Integrations](https://yourbrand.com/docs/integrations): Stack compatibility.

## Trust & authority
- [Security](https://yourbrand.com/security): SOC 2, encryption, data handling.
- [Customer stories](https://yourbrand.com/customers): Named accounts + outcomes.
- [Changelog](https://yourbrand.com/changelog): Recent updates (proves freshness).

## Optional
- [Blog](https://yourbrand.com/blog): Long-form research.
- [Newsroom](https://yourbrand.com/press): Press coverage.
```
Run it through the validator before shipping. The blockquote line is the #1 thing people get wrong — it must come immediately after the H1 with no blank line and no other markdown between them.
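For a quick local pre-flight before the validator, something like the sketch below can catch the common structural mistakes. It is a rough approximation of the checklist in this guide, not a reimplementation of the validator: `check_llms_txt` and its messages are invented for this example, and it ignores blank lines, so it is more lenient than the no-blank-line rule described above.

```python
import re
from typing import List

# One "- [Title](URL): Description" bullet; the description part is optional.
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")
H1 = re.compile(r"^# \S")


def check_llms_txt(text: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems: List[str] = []
    # Assumption of this sketch: blank lines carry no meaning and are skipped.
    content = [ln.rstrip() for ln in text.splitlines() if ln.strip()]

    if not content or not H1.match(content[0]):
        return ["first non-blank line must be an H1 ('# Name')"]
    if len(content) < 2 or not content[1].startswith(">"):
        problems.append("H1 must be followed by a blockquote summary ('> ...')")
    if sum(1 for ln in content if H1.match(ln)) > 1:
        problems.append("only one H1 is expected")

    in_section = False
    for ln in content:
        if ln.startswith("## "):
            in_section = True  # everything after the first H2 should be links
            continue
        if in_section and not LINK_ITEM.match(ln):
            problems.append(f"not a '- [Title](URL): Description' item: {ln!r}")
    return problems
```

Running it over the template above should return an empty list; a file that opens with an H2, or that puts free prose under an H2, should not.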
Cloudflare's nested-llms.txt pattern — worth copying
Cloudflare's root /llms.txt links to a separate, product-scoped llms.txt for each product, and there's a single archive at developers.cloudflare.com/llms-full.txt for offline / bulk ingestion. This nested approach scales better than a single mega-file once you have more than ~20 top-level sections.
Fetch their llms.txt if you want to see the pattern. It is particularly useful for SaaS with multiple products.
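To make the nesting concrete, here is a small sketch that renders a root file whose only section lists product-scoped files. It does not reproduce Cloudflare's actual layout: the function name, the `## Products` heading, the description text and the example.com URLs in the usage note are all hypothetical.

```python
from typing import Dict


def render_root_llms_txt(site: str, summary: str, products: Dict[str, str]) -> str:
    """Render a root llms.txt that points at one llms.txt per product.

    `products` maps a product name to the URL of that product's own
    llms.txt. The '## Products' heading is a choice made for this sketch.
    """
    lines = [f"# {site}", f"> {summary}", "", "## Products"]
    for name, url in products.items():
        lines.append(f"- [{name}]({url}): Product-scoped llms.txt for {name}.")
    return "\n".join(lines) + "\n"
```

For example, `render_root_llms_txt("Example Cloud", "Docs for Example Cloud.", {"Storage": "https://docs.example.com/storage/llms.txt"})` yields a root file with a single `Storage` entry; each product team then owns its own file.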
What to do if you don't ship one
Run our free audit — it checks for llms.txt presence and content quality, and also checks the 14 AI crawler user-agent rules in your robots.txt. The robots.txt side has more confirmed consumption than llms.txt does today, so it's the higher-leverage half of the access-protocol stack.
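If you would rather check the robots.txt side by hand, Python's standard-library `urllib.robotparser` can evaluate the rules for a given user-agent. The sketch below is an illustration, not the audit's implementation: the helper name is invented, and it covers only the three crawler names that appear in this guide, not the full list of 14.

```python
from typing import Dict, Iterable
from urllib.robotparser import RobotFileParser

# Only the crawler names mentioned in this guide; not the full audit list.
AI_USER_AGENTS = ("OAI-SearchBot", "ClaudeBot", "PerplexityBot")


def ai_crawler_access(
    robots_txt: str,
    url: str,
    agents: Iterable[str] = AI_USER_AGENTS,
) -> Dict[str, bool]:
    """Map each user-agent to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the text, no network call
    return {agent: parser.can_fetch(agent, url) for agent in agents}
```

Feed it the contents of your own robots.txt and a URL such as your /llms.txt; any `False` in the result means that crawler is blocked from the very file you published for it.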
Sources
- llms.txt spec — https://llmstxt.org/
- Jeremy Howard introduction (Sep 3 2024) — https://www.answer.ai/posts/2024-09-03-llmstxt.html
- LLMs.txt Validator — https://llmstxtvalidator.dev/
- Mintlify: "The value of llms.txt: hype or real?" — https://www.mintlify.com/blog/the-value-of-llms-txt-hype-or-real
- Cloudflare docs llms.txt — https://developers.cloudflare.com/llms.txt