llms.txt: who actually uses it

Three independent studies converged: AI-search bots largely ignore llms.txt, and Ahrefs found that 97% of published files received zero requests. But Cursor, Claude Code, and Copilot do fetch it. Ship for your docs; skip the SEO pitch. Here's the spec, the studies, and the honest decision tree.

llms.txt · defined

llms.txt is a markdown file served at the root of a website that gives AI agents a curated, human-readable index of the site's content — its sections, key pages, and structure — formatted for direct ingestion by large language models. It was proposed by Jeremy Howard at Answer.AI in September 2024.

In short: llms.txt tells AI agents what content exists on your site. The spec is currently at v0.0.6. As of mid-2026, no major LLM search provider has publicly confirmed consuming it as a primary signal. Coding agents (Cursor, Claude Code, Copilot) do fetch it. Three independent studies (Ahrefs 137K domains, Semrush, SE Ranking 300K) converged on the same finding: AI-search bots largely ignore it. In the Ahrefs data, 97% of llms.txt files received zero requests. Ship it for your documentation; skip the SEO pitch.

I built TurboAudit, an AI search visibility tool. I have no llms.txt product to sell, no plugin to upsell, and no commercial relationship with any of the platforms mentioned below. The conclusion the data forces — "this does not measurably help AI search citations" — is unpopular with SEO vendors. Three independent studies say it anyway.

What llms.txt actually is

llms.txt is a markdown file served at the root of a website, at the path `/llms.txt`. The format is plain markdown — not the robots.txt grammar, not XML, not JSON. The file lists the site's name, a one-line summary, optional context, and a set of links organised under markdown H2 sections.

The proposal was published by Jeremy Howard at Answer.AI on September 3, 2024, in a blog post titled "The /llms.txt file." The current spec version is v0.0.6 (January 2026). It is community-managed, hosted on the AnswerDotAI/llms-txt GitHub repository, and has no formal RFC or W3C track. The version number itself is the spec author's signal that the standard is provisional.

The intent: where robots.txt is a *restrictive* directive — it tells crawlers what they may not read — llms.txt is a *descriptive* signal. It tells AI agents what content exists on your site and how to find it. Two different mechanisms, two different purposes. They are not substitutes.

A valid llms.txt is short. The spec example is roughly twenty lines of markdown. It is meant to fit inside an LLM's context window without consuming it, so a 50,000-line dump of every page on your site is the wrong shape; that is what the separate `/llms-full.txt` is for.

The sibling-file family

Most articles about llms.txt mention only the headline file. The proposal, together with the FastHTML / Answer.AI conventions that grew up around it, actually covers four files with different purposes. Most sites ship only the first.

  1. /llms.txt

    The curated index. Short. Markdown. Meant for context-window-friendly ingestion.

    The canonical file the spec defines. Site name, one-line summary, optional context, then H2 sections of links. Think of it as a sitemap written in prose for an agent, not a crawler. Every site that ships llms.txt at all ships this one.

  2. /llms-full.txt

    Full content concatenated. Heavy. For agents that want the corpus, not the index.

    Same markdown format, but instead of links it inlines the full content of each section. Useful when an agent has the context budget to ingest the whole site at once. Cloudflare publishes one of these for every product. Most sites do not.

  3. /llms-ctx.txt

    Context-window-trimmed processed version. A FastHTML convention.

    Generated by FastHTML's `llms_txt2ctx` tool. Strips and reshapes content so the result fits cleanly inside a typical LLM context window. Not part of the core spec — a convention from the Answer.AI ecosystem.

  4. /llms-ctx-full.txt

    Full content with context-window structuring. Also a FastHTML convention.

    Same idea as llms-ctx.txt but with the fuller content payload preserved. Used by agents and tooling inside the FastHTML ecosystem. Outside that ecosystem, rarely seen in the wild.
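To make the index-versus-corpus difference concrete, here is a minimal sketch of expanding an llms.txt index into an llms-full.txt-style file. The layout of the output (an H3 per page followed by its body) is my assumption, not something the proposal mandates, and `INDEX`, `PAGE_BODIES`, and `build_llms_full` are hypothetical names for illustration.

```python
import re

# Matches a markdown list link such as "- [Title](https://example.com/page): note"
LINK = re.compile(r"^-\s*\[([^\]]+)\]\(([^)]+)\)")

def build_llms_full(index_text, page_bodies):
    """Expand an llms.txt index into llms-full.txt-style text.

    page_bodies maps each linked URL to that page's markdown body.
    How you obtain those bodies depends on your site (assumption).
    Links with no known body are left as they are.
    """
    out = []
    for line in index_text.splitlines():
        m = LINK.match(line.strip())
        if m and m.group(2) in page_bodies:
            title, url = m.group(1), m.group(2)
            out.append(f"### {title}")
            out.append("")
            out.append(page_bodies[url].strip())
            out.append("")
        else:
            out.append(line)
    return "\n".join(out) + "\n"

# Hypothetical inputs for illustration only.
INDEX = (
    "# Example Site\n\n"
    "## Docs\n\n"
    "- [Getting Started](https://example.com/docs/getting-started): minimal setup\n"
)
PAGE_BODIES = {
    "https://example.com/docs/getting-started": "Install the package, then run it.\n",
}

full_text = build_llms_full(INDEX, PAGE_BODIES)
```

The index stays the source of truth here: the full file is derived from it, so the two cannot drift apart.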

Who actually uses it

This is the section every existing page hedges on. The honest answer up front: as of mid-2026, no major LLM search provider has publicly confirmed consuming llms.txt as a primary signal. Coding agents do consume it. Three independent studies, using different methodologies, have converged on the same conclusion. The provider-by-provider map and the study data are below.

  • OpenAI

    No official endorsement. Bot controls route through robots.txt.

    OpenAI's published crawler documentation (developers.openai.com/api/docs/bots) describes four crawlers and routes all crawler controls through robots.txt. The term "llms.txt" does not appear on the page. Profound's GEO tracking has observed occasional GPTBot fetches of llms.txt and llms-full.txt — exploratory, not documented consumption.

  • Anthropic

    Hosts llms.txt for their own documentation. No statement on third-party consumption.

    docs.anthropic.com serves an llms.txt. There is no public statement that Claude consumes third-party llms.txt as a primary signal. Anthropic's own documentation is a coding-agent surface — Claude Code is the obvious agent that benefits.

  • Perplexity

    Hosts llms.txt for their own docs. PerplexityBot has not been documented to consume third-party llms.txt.

    Semrush's August–October 2025 server-log analysis recorded zero PerplexityBot hits on llms.txt files. Same pattern as Anthropic: ships one for their own docs, no confirmation of consuming others.

  • Google

    Contradictory signals from the same company. Search team disowns; Chrome DevRel ships a Lighthouse audit.

    John Mueller (Google Search) publicly compared llms.txt to the deprecated keywords meta tag and stated "no AI system currently uses llms.txt." Reiterated in 2026. Separately, Chrome DevRel shipped a Lighthouse audit for llms.txt at developer.chrome.com/docs/lighthouse/agentic-browsing/llms-txt. Mueller has explicitly clarified that the Lighthouse audit is not a Google Search endorsement.

  • Cursor / Claude Code / Copilot / Cline / Aider / Windsurf

    Consume llms.txt when pointed at a docs site. This is the real audience.

    All major coding agents read llms.txt when fetching documentation. Cloudflare's docs-for-agents project explicitly positions llms.txt as a coding-agent entry point. This is the B2A (business-to-agent) use case where shipping the file has measurable value.

Three converging studies

  • Ahrefs — 137,210 domains, 97% zero requests

    Ahrefs analysed 137,210 domains. Roughly 28% publish an llms.txt. Of the ~38,000 valid files identified, only about 1,100 received any traffic during the study window — meaning 97% of llms.txt files received zero requests. Of the bots that did fetch llms.txt files, 96% were non-AI bots (SEO crawlers, Googlebot, profilers). The two most common AI bots seen fetching them were GPTBot and Claude-Code — training crawlers and coding agents, not AI-search bots. Critical finding: AI bots never go looking for llms.txt files. They only fetch them incidentally.

  • Semrush — own server logs, August–October 2025

    Semrush analysed their own server logs across three months and reported zero AI-bot hits on their llms.txt. Their analysis found no correlation between shipping llms.txt and AI-search performance on Semrush's own properties.

  • SE Ranking — 300,000 domains

    SE Ranking's adoption survey across 300,000 domains found 10.13% had shipped an llms.txt. Their conclusion was consistent with Ahrefs and Semrush: no measurable impact on AI search citations attributable to the file's presence.

Three independent studies, three different methodologies, one convergent conclusion: AI-search bots ignore llms.txt at scale. Coding agents fetch it. The next section is the decision tree that follows from this data.
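You can run a small version of the same check on your own logs. The sketch below assumes access logs in the common "combined" format and counts requests for `/llms.txt` and `/llms-full.txt` by user agent. The user-agent substrings in `AI_BOTS` are assumptions; confirm them against each vendor's current bot documentation. `SAMPLE_LOG` is made-up data for illustration.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Assumed user-agent substrings; verify against vendor documentation.
AI_BOTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")
TARGETS = ("/llms.txt", "/llms-full.txt")

def llms_txt_hits(log_lines):
    """Count requests for llms.txt files, labelled by AI bot name or 'other'."""
    counts = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        path = m.group("path").split("?")[0]  # ignore any query string
        if path not in TARGETS:
            continue
        ua = m.group("ua")
        label = next((bot for bot in AI_BOTS if bot in ua), "other")
        counts[label] += 1
    return counts

# Made-up log lines for illustration only.
SAMPLE_LOG = [
    '1.2.3.4 - - [01/Oct/2025:00:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '5.6.7.8 - - [01/Oct/2025:00:00:05 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '1.2.3.4 - - [01/Oct/2025:00:00:09 +0000] "GET /index.html HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
]

hits = llms_txt_hits(SAMPLE_LOG)
```

User agents can be spoofed, so treat the counts as an upper bound on real bot traffic rather than a precise measurement.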

llms.txt vs robots.txt

These are not substitutes. They control different things and have different grammars.

robots.txt is a restrictive directive. It tells crawlers what they may not read. It follows RFC 9309, has a well-defined grammar (`User-agent:` and `Disallow:` / `Allow:` directives), and is broadly honoured: compliance is voluntary, but major crawlers respect it. If you do nothing else, ship a robots.txt.

llms.txt is a descriptive signal. It tells AI agents what content exists on your site and gives them a curated path to find it. It is plain markdown. It is not enforced by anyone in particular; it is read by whoever decides to read it. As of mid-2026, that is mostly coding agents.

Sites can ship both, one, or neither. They answer different questions and operate in different modes. Treating llms.txt as a robots.txt replacement is the most common confusion in published guidance, and it is wrong.
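The grammar difference is easy to see in code. robots.txt is regular enough that Python's standard library ships a parser for it; llms.txt is markdown and has no equivalent. A minimal sketch, with a made-up `ROBOTS_TXT` and example.com URLs as placeholders:

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: GPTBot is kept out of /private/, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The question robots.txt answers: may this agent fetch this URL?
gptbot_private = parser.can_fetch("GPTBot", "https://example.com/private/report")
gptbot_docs = parser.can_fetch("GPTBot", "https://example.com/docs/")
other_private = parser.can_fetch("SomeOtherBot", "https://example.com/private/report")
```

Every answer robots.txt gives is a yes or a no about access. llms.txt has no such question to answer; it only describes what exists.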

Should you ship one? A decision tree

Three categories. Match yourself to one. Move on.

Ship it if any of these apply
  1. You operate a documentation site

    Coding agents (Cursor, Claude Code, Copilot, Cline, Aider, Windsurf) actively fetch llms.txt when pointed at documentation. If your audience is developers using agents to read your docs, this file has real users. Mintlify's customers (Anthropic, Cursor, Cloudflare, Perplexity, Replit, Zapier, Kalshi, Loops) all ship one for exactly this reason.

  2. You sell developer-facing tools, SDKs, APIs, or infrastructure

    Your buyers' workflows include reading your docs through an agent. Shipping llms.txt is a small, defensible quality signal in that distribution channel. Cost is minutes; upside is concrete even if narrow.

  3. You publish technical reference content

    Frameworks, libraries, language tutorials, infrastructure guides. The same agent-readers apply. The Mintlify and GitBook ecosystems exist for this reason.

Never do this
  1. Do not auto-generate a fabricated llms.txt

    Several plugins and SaaS tools will generate llms.txt content algorithmically, including descriptions and section labels that do not reflect what is on your site. This is the keywords meta tag failure mode John Mueller cited — a self-declared signal an LLM by design cannot trust. Even where llms.txt is read, agents that find a fabricated one will learn to discount the signal, not trust it more.

  2. Do not ship llms.txt with content that does not exist at the URLs you list

    If you list /docs/getting-started and that page has been returning 404 since last month, you are sending agents to dead destinations. Either keep the file accurate or do not ship it.

  3. Do not treat shipping llms.txt as a substitute for any GEO work

    Citation engineering, AI crawler accessibility, entity authority, evidence density — those are the levers that affect AI search visibility. llms.txt is none of them. Adding the file does not replace doing the work.
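The dead-destination rule is the easiest of the three to automate. Here is a minimal sketch of a link check, assuming the links in your file use the `[title](url)` markdown form. `http_status` and `dead_links` are hypothetical helper names, and the checker takes the status lookup as a parameter so it can be exercised without network access.

```python
import re
import urllib.error
import urllib.request

# Pulls the URL out of every markdown link of the form [title](url).
URL_IN_LINK = re.compile(r"\]\((https?://[^)\s]+)\)")

def http_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None on a network failure."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions
    except urllib.error.URLError:
        return None

def dead_links(llms_txt, status_of=http_status):
    """Return (url, status) pairs for every listed URL that does not answer 200."""
    results = []
    for url in URL_IN_LINK.findall(llms_txt):
        status = status_of(url)
        if status != 200:
            results.append((url, status))
    return results

# Offline illustration with made-up statuses instead of real requests.
SAMPLE = (
    "## Docs\n\n"
    "- [Getting Started](https://example.com/docs/getting-started): minimal setup\n"
    "- [API Reference](https://example.com/docs/api): full endpoint list\n"
)
FAKE_STATUSES = {
    "https://example.com/docs/getting-started": 200,
    "https://example.com/docs/api": 404,
}
broken = dead_links(SAMPLE, status_of=FAKE_STATUSES.get)
```

Some servers reject HEAD requests, so a non-200 result is a prompt to look at the page by hand, not proof that it is gone. Running a check like this on a schedule is one way to keep the file accurate.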

A working minimal example

A real, valid llms.txt in the v0.0.6 spec format. Copyable. The whole file is under twenty lines, and that's the point.

  /llms.txt (v0.0.6 format):

    # Example Site
    
    > One sentence describing what this site is.
    
    Optional second paragraph giving an agent a little more context: who
    the audience is, what the priorities are, when the content was last
    updated.
    
    ## Docs
    
    - [Getting Started](https://example.com/docs/getting-started): minimal setup
    - [API Reference](https://example.com/docs/api): full endpoint list
    - [Tutorials](https://example.com/docs/tutorials): step-by-step guides
    
    ## Optional
    
    - [Changelog](https://example.com/changelog): version history
    - [Blog](https://example.com/blog): essays and announcements

    Notes on the shape: the H1 is the site name. The blockquote is the one-sentence summary. The optional paragraph after is context. H2 sections group related links. Sections named "Optional" carry a specific spec meaning: agents can skip them when working within a tight context budget. Validate the file at llmstxt.org or with the asvaai.com generator.
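For a sense of how little an agent needs to do with that shape, here is a minimal parsing sketch. It assumes only the elements described in the notes (H1, blockquote, H2 sections, list links) and ignores anything else. The function names are hypothetical, and this is not a reference implementation of the spec.

```python
import re

# "- [Title](url): optional note"
LINK = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)

def parse_llms_txt(text):
    """Parse an llms.txt document into title, summary, and link sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            m = LINK.match(line)
            if m:
                doc["sections"][current].append(m.groupdict())
    return doc

def links_for_budget(doc, tight=False):
    """Return all links, dropping the 'Optional' section when the budget is tight."""
    out = []
    for name, links in doc["sections"].items():
        if tight and name == "Optional":
            continue
        out.extend(links)
    return out

SAMPLE = """\
# Example Site

> One sentence describing what this site is.

## Docs

- [Getting Started](https://example.com/docs/getting-started): minimal setup
- [API Reference](https://example.com/docs/api): full endpoint list

## Optional

- [Changelog](https://example.com/changelog): version history
"""

doc = parse_llms_txt(SAMPLE)
```

With `tight=True`, `links_for_budget` returns only the two Docs links; by default it returns all three.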

Shipping it on the major platforms

Most platforms have either a first-party feature, a community plugin, or a one-file path that works without anything special. The list below is non-exhaustive; it covers what the audience for this page most often uses.

  • WordPress

    The community plugin "Website LLMs.txt" on wordpress.org/plugins/website-llms-txt/ supports per-post-type config, daily or weekly regeneration, and optional llms-full.txt. Activate, configure which post types to include, save. Verify by visiting yoursite.com/llms.txt.

  • Webflow

    First-party feature in 2025: upload via SEO settings, served from root, excluded from indexing automatically. Webflow also maintains an open-source generator at github.com/Webflow-Examples/llms-txt-generator-webapp.

  • Shopify

    The "LLMs.txt Generator" app on apps.shopify.com handles generation and serving. For technical store operators, a custom template at /llms.txt with the site's product catalog as markdown is a hand-rolled alternative.

  • Mintlify

    Auto-generates per docs site. No config needed. Shipped by default for all Mintlify customers — Anthropic, Cursor, Cloudflare, Perplexity, Replit, Zapier all use it.

  • GitBook

    First-party support shipped. Configurable from project settings. Same shape as Mintlify — automatic generation, no manual file maintenance.

  • Vercel / Next.js / static sites

    Add a static file at public/llms.txt. Generate it from the same content source that produces your sitemap. For Next.js apps using the App Router, a route handler at app/llms.txt/route.ts that emits the markdown server-side also works cleanly.

  • Plain static HTML

    Create the file, name it llms.txt, place it at the site root, ensure your web server serves .txt with `Content-Type: text/markdown` or `text/plain`. That is the entire setup.
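If you hand-roll the file, generating it from structured page data keeps it in sync with the site. A minimal sketch of such a build step follows. `render_llms_txt` and its input shape are hypothetical; in practice the sections would come from whatever already feeds your sitemap.

```python
def render_llms_txt(site_name, summary, sections, context=None):
    """Render llms.txt markdown from structured page data.

    sections maps an H2 heading to a list of (title, url, note) tuples.
    This input shape is an assumption made for the sketch.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    if context:
        lines += [context, ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{title}]({url}){suffix}")
        lines.append("")
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"

# Hypothetical page data for illustration only.
rendered = render_llms_txt(
    "Example Site",
    "One sentence describing what this site is.",
    {
        "Docs": [
            ("Getting Started", "https://example.com/docs/getting-started", "minimal setup"),
        ],
    },
)
```

Writing `rendered` to `llms.txt` at the site root during the build is then a one-line file write.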

After shipping, verify with `curl https://your-site/llms.txt`. The response should be the file you wrote, with a 200 status. If you are behind a CDN, confirm the CDN is not stripping or rewriting it. Optionally validate the markdown structure with the llmstxt.org reference or asvaai.com's validator.
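The same verification can be scripted for a deploy pipeline. The sketch below checks the three things mentioned above: a 200 status, a markdown or plain-text content type, and a body that still starts with the H1 you wrote. `check_response` and `check_url` are hypothetical names, and the H1 test is a rough heuristic for catching a CDN that rewrites the file into HTML.

```python
import urllib.request

ACCEPTED_TYPES = ("text/markdown", "text/plain")

def check_response(status, content_type, body):
    """Return a list of problems found in an llms.txt HTTP response."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected Content-Type: {content_type!r}")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with a markdown H1")
    return problems

def check_url(url, timeout=10):
    """Fetch url and run check_response on the result.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses,
    so a missing file surfaces as an exception rather than a problem list.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return check_response(resp.status, resp.headers.get("Content-Type"), body)

# Offline illustration with made-up responses.
ok = check_response(200, "text/plain; charset=utf-8", "# Example Site\n")
rewritten = check_response(200, "text/html", "<!doctype html><html></html>")
```

An empty list means the response looks right; calling `check_url("https://your-site/llms.txt")` runs the same checks against the live file.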

Frequently asked questions

Q · 01

Does Google use llms.txt?

No. Google Search has publicly disowned llms.txt twice. John Mueller compared it to the deprecated keywords meta tag and stated "no AI system currently uses llms.txt." Chrome's DevRel team shipped a Lighthouse audit that measures whether a site has one, but Mueller has explicitly clarified that the Lighthouse audit is not a Google Search endorsement. Shipping llms.txt does not affect Google Search ranking or AI Overview citation likelihood.

Q · 02

Does ChatGPT use llms.txt?

Not as a primary signal. OpenAI's bot documentation routes all crawler controls through robots.txt and does not mention llms.txt. GPTBot has been observed fetching llms.txt files occasionally, but the behavior is exploratory rather than documented consumption. Three independent studies (Ahrefs 137K domains, Semrush, SE Ranking 300K) found that AI-search bots, including OpenAI's, largely ignore llms.txt; in the Ahrefs data, 97% of files received zero requests.

Q · 03

What's the difference between llms.txt and robots.txt?

robots.txt is a restrictive directive that tells crawlers what they may not read. llms.txt is a descriptive signal that tells AI agents what content exists on your site. robots.txt follows the RFC 9309 grammar and is broadly honoured by major crawlers, although compliance is voluntary; llms.txt is plain markdown and is read by whoever decides to read it (currently mostly coding agents). They control different things and have different grammars. They are complementary, not substitutes.

Q · 04

Will shipping llms.txt improve my AI search citations?

Three independent studies say no. Ahrefs found 97% of llms.txt files received zero requests across 137,210 domains. Semrush's own server logs showed zero AI-bot hits over three months. SE Ranking's 300K-domain survey found no measurable impact on AI search citations. If your goal is AI search citations from ChatGPT, Claude, Perplexity, or Gemini, llms.txt is not the lever. Work on citation engineering, AI crawler accessibility, and evidence density instead.

Q · 05

What are llms-full.txt, llms-ctx.txt, and llms-ctx-full.txt?

Three sibling files that grew up around the core spec. llms-full.txt is the same markdown shape as llms.txt but inlines the full content of each section instead of linking out — meant for agents that want the whole corpus. llms-ctx.txt and llms-ctx-full.txt are FastHTML / Answer.AI conventions, produced by the llms_txt2ctx tool, that reshape content to fit cleanly inside a typical LLM context window. The core spec defines llms.txt and llms-full.txt; the context variants are FastHTML conventions. Most sites ship only llms.txt.

Q · 06

Should I auto-generate llms.txt with a plugin?

Cautiously. A plugin that auto-generates from your actual sitemap and content is fine — it stays in sync. A plugin that algorithmically fabricates descriptions and section labels that do not reflect your site is the keywords meta tag failure mode John Mueller cited. If you cannot read the generated file and verify it accurately describes your site, do not ship it. A bad llms.txt is worse than no llms.txt because the agents that do read it will learn to discount your signal.
