Technical · 13 min read · Article 10

llms.txt: who actually uses it

Three independent studies converged: AI-search bots almost never fetch llms.txt, and in the largest study 97% of files received zero requests. But Cursor, Claude Code, and Copilot do fetch it. Ship it for your docs; skip the SEO pitch. Here's the spec, the studies, and the honest decision tree.

Ibrahim Furkan Ozcelik · works on GEO and AI search

Published June 21, 2026 · Updated June 21, 2026 · Source: ibrahimfurkanozcelik.com

llms.txt · defined

llms.txt is a markdown file served at the root of a website that gives AI agents a curated, human-readable index of the site's content — its sections, key pages, and structure — formatted for direct ingestion by large language models. It was proposed by Jeremy Howard at Answer.AI in September 2024.

llms.txt is a markdown file at the root of your site that tells AI agents what content exists. It was proposed by Jeremy Howard at Answer.AI in September 2024 and is currently at v0.0.6. As of mid-2026, no major LLM search provider has publicly confirmed consuming it as a primary signal. Coding agents such as Cursor, Claude Code, and Copilot do fetch it. Three independent studies (Ahrefs 137K domains, Semrush, SE Ranking 300K) found that AI-search bots almost never request it; in the Ahrefs data, 97% of llms.txt files received zero requests. Ship it for your documentation; skip the SEO pitch.

I built TurboAudit, an AI search visibility tool. I have no llms.txt product to sell, no plugin to upsell, and no commercial relationship with any of the platforms mentioned below. The conclusion the data forces — "this does not measurably help AI search citations" — is unpopular with SEO vendors. Three independent studies say it anyway.

Spec · 01

What llms.txt actually is

llms.txt is a markdown file served at the root of a website, at the path `/llms.txt`. The format is plain markdown — not the robots.txt grammar, not XML, not JSON. The file lists the site's name, a one-line summary, optional context, and a set of links organised under markdown H2 sections.

The proposal was published by Jeremy Howard at Answer.AI on September 3, 2024, in a blog post titled "The /llms.txt file." The current spec version is v0.0.6 (January 2026). It is community-managed, hosted on the AnswerDotAI/llms-txt GitHub repository, and has no formal RFC or W3C track. The version number itself is the spec author's signal that the standard is provisional.

The intent: where robots.txt is a *restrictive* directive — it tells crawlers what they may not read — llms.txt is a *descriptive* signal. It tells AI agents what content exists on your site and how to find it. Two different mechanisms, two different purposes. They are not substitutes.

A valid llms.txt is short. The spec example is roughly twenty lines of markdown. It is meant to fit inside an LLM's context window without consuming it, so a 50,000-line dump of every page on your site is the wrong shape — that's what `/llms-full.txt` is for, separately. The structure matters.

Family · 02

The sibling-file family

Most articles about llms.txt mention only the headline file. Together, the proposal and the FastHTML / Answer.AI conventions that grew up around it describe four files with different purposes. Most sites ship only the first.

01
/llms.txt
The curated index. Short. Markdown. Meant for context-window-friendly ingestion.
The canonical file the spec defines. Site name, one-line summary, optional context, then H2 sections of links. Think of it as a sitemap written in prose for an agent, not a crawler. Every site that ships llms.txt at all ships this one.
02
/llms-full.txt
Full content concatenated. Heavy. For agents that want the corpus, not the index.
Same markdown format, but instead of links it inlines the full content of each section. Useful when an agent has the context budget to ingest the whole site at once. Cloudflare publishes one of these for every product. Most sites do not.
03
/llms-ctx.txt
Context-window-trimmed processed version. A FastHTML convention.
Generated by the `llms_txt2ctx` tool that the FastHTML project uses for its own docs. It strips and reshapes content so the result fits cleanly inside a typical LLM context window. Not part of the core spec; it is a convention from the Answer.AI ecosystem.
04
/llms-ctx-full.txt
Full content with context-window structuring. Also a FastHTML convention.
Same idea as llms-ctx.txt but with the fuller content payload preserved. Used by agents and tooling inside the FastHTML ecosystem. Outside that ecosystem, rarely seen in the wild.

Adoption · 03

Who actually uses it

This is the section every existing page hedges on. The honest answer up front: as of mid-2026, no major LLM search provider has publicly confirmed consuming llms.txt as a primary signal. Coding agents do consume it. Three independent server-log studies have converged on the same conclusion. The provider-by-provider map and the study data are below.

OpenAI
No official endorsement. Bot controls route through robots.txt.
OpenAI's published crawler documentation (developers.openai.com/api/docs/bots) describes four crawlers and routes all crawler controls through robots.txt. The term "llms.txt" does not appear on the page. Profound's GEO tracking has observed occasional GPTBot fetches of llms.txt and llms-full.txt — exploratory, not documented consumption.
Anthropic
Hosts llms.txt for their own documentation. No statement on third-party consumption.
docs.anthropic.com serves an llms.txt. There is no public statement that Claude consumes third-party llms.txt as a primary signal. Anthropic's own documentation is a coding-agent surface — Claude Code is the obvious agent that benefits.
Perplexity
Hosts llms.txt for their own docs. PerplexityBot has not been documented to consume third-party llms.txt.
Semrush's August–October 2025 server-log analysis recorded zero PerplexityBot hits on llms.txt files. Same pattern as Anthropic: ships one for their own docs, no confirmation of consuming others.
Google
Contradictory signals from the same company. Search team disowns; Chrome DevRel ships a Lighthouse audit.
John Mueller (Google Search) publicly compared llms.txt to the deprecated keywords meta tag, stated "no AI system currently uses llms.txt" and reiterated that position in 2026. Separately, Chrome DevRel shipped a Lighthouse audit for llms.txt at developer.chrome.com/docs/lighthouse/agentic-browsing/llms-txt. Mueller has explicitly clarified that the Lighthouse audit is not a Google Search endorsement.
Cursor / Claude Code / Copilot / Cline / Aider / Windsurf
Consume llms.txt when pointed at a docs site. This is the real audience.
All major coding agents read llms.txt when fetching documentation. Cloudflare's docs-for-agents project explicitly positions llms.txt as a coding-agent entry point. This is the B2A (business-to-agent) use case where shipping the file has measurable value.

Three converging studies

Ahrefs — 137,210 domains, 97% zero requests
Ahrefs analysed 137,210 domains. Roughly 28% publish an llms.txt. Of the ~38,000 valid files identified, only about 1,100 received any traffic during the study window — meaning 97% of llms.txt files received zero requests. Of the bots that did fetch llms.txt files, 96% were non-AI bots (SEO crawlers, Googlebot, profilers). The two most common AI bots seen fetching them were GPTBot and Claude-Code — training crawlers and coding agents, not AI-search bots. Critical finding: AI bots never go looking for llms.txt files. They only fetch them incidentally.
Semrush — own server logs, August–October 2025
Semrush analysed their own server logs across three months and reported zero AI-bot hits on their llms.txt. Their analysis found no correlation between shipping llms.txt and AI-search performance on Semrush's own properties.
SE Ranking — 300,000 domains
SE Ranking's adoption survey across 300,000 domains found 10.13% had shipped an llms.txt. Their conclusion was consistent with Ahrefs and Semrush: no measurable impact on AI search citations attributable to the file's presence.
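
The Ahrefs figures quoted above can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers stated in this article; two of the inputs are rounded in the source, so the results are approximate.

```python
# Figures as quoted in this article (the last two are rounded in the source).
domains_analysed = 137_210
share_publishing = 0.28        # "roughly 28%"
valid_files = 38_000           # "~38,000 valid files"
files_with_traffic = 1_100     # "about 1,100 received any traffic"

# 28% of the sample should land near the ~38,000 files reported.
estimated_files = domains_analysed * share_publishing

# Share of files requested at least once, and its complement.
share_requested = files_with_traffic / valid_files
share_never_requested = 1 - share_requested

print(f"estimated files: {estimated_files:,.0f}")                 # 38,419
print(f"requested at least once: {share_requested:.1%}")          # 2.9%
print(f"never requested: {share_never_requested:.1%}")            # 97.1%
```

The 97% headline is therefore a statement about files that were never requested, which is worth keeping in mind when it is paraphrased as bot behavior.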

Three independent studies, three different methodologies, one convergent conclusion: AI-search bots ignore llms.txt at scale. Coding agents fetch it. The decision tree in section 05 follows from this data.
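
If you want to repeat this kind of measurement on your own site, a rough starting point is to count requests for `/llms.txt` in your access logs. The sketch below assumes logs in the common "combined" format; the user-agent token list is illustrative rather than exhaustive, and the sample log lines are made up.

```python
import re
from collections import Counter

# Assumption: logs use the "combined" format common to nginx and Apache.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; extend with whichever agents you care about.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Claude-Code")


def count_llms_txt_hits(lines):
    """Count requests for /llms.txt, split into AI and other user agents."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("path").split("?")[0] != "/llms.txt":
            continue
        agent = match.group("agent").lower()
        is_ai = any(token.lower() in agent for token in AI_AGENT_TOKENS)
        hits["ai" if is_ai else "other"] += 1
    return hits


# Made-up sample lines for demonstration.
sample_log = [
    '203.0.113.5 - - [01/Sep/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Sep/2025:10:01:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "AhrefsBot/7.0"',
    '203.0.113.7 - - [01/Sep/2025:10:02:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]
hits = count_llms_txt_hits(sample_log)
print(hits["ai"], hits["other"])  # 1 1
```

User-agent strings can be spoofed, so treat counts like these as an upper bound on genuine AI-bot traffic.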

Disambiguation · 04

llms.txt vs robots.txt

These are not substitutes. They control different things and have different grammars.

robots.txt is a restrictive directive. It tells crawlers what they may not read. It follows RFC 9309, has a well-defined grammar (`User-agent:` and `Disallow:` / `Allow:` directives), and is broadly enforced — major crawlers respect it. If you do nothing else, ship a robots.txt.

llms.txt is a descriptive signal. It tells AI agents what content exists on your site and gives them a curated path to find it. It is plain markdown. It is not enforced by anyone in particular; it is read by whoever decides to read it. As of mid-2026, that is mostly coding agents.

Sites can ship both, one, or neither. They answer different questions and operate in different modes. Treating llms.txt as a robots.txt replacement is the most common confusion in published guidance, and it is wrong.
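
To make the "restrictive directive" point concrete, here is a small sketch using Python's standard-library `urllib.robotparser`. The rules and URLs are hypothetical and are parsed from an in-memory list rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, supplied as lines instead of fetched.
robots_lines = [
    "User-agent: GPTBot",
    "Disallow: /private/",
    "",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(robots_lines)

# robots.txt answers one question: may this crawler read this URL?
blocked = parser.can_fetch("GPTBot", "https://example.com/private/report")
allowed = parser.can_fetch("GPTBot", "https://example.com/docs/")
other = parser.can_fetch("OtherBot", "https://example.com/private/report")

print(blocked, allowed, other)  # False True True
```

Nothing comparable exists for llms.txt: there is no permission to evaluate, only a list of links an agent may or may not choose to follow.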

Decision · 05

Should you ship one? A decision tree

Three categories. Match yourself to one. Move on.

Ship it if any of these

01
You operate a documentation site
Coding agents (Cursor, Claude Code, Copilot, Cline, Aider, Windsurf) actively fetch llms.txt when pointed at documentation. If your audience is developers using agents to read your docs, this file has real users. Mintlify customers such as Anthropic, Cursor, Cloudflare, Perplexity, Replit, Zapier, Kalshi, and Loops all ship one for exactly this reason.
02
You sell developer-facing tools, SDKs, APIs, or infrastructure
Your buyers' workflows include reading your docs through an agent. Shipping llms.txt is a small, defensible quality signal in that distribution channel. Cost is minutes; upside is concrete even if narrow.
03
You publish technical reference content
Frameworks, libraries, language tutorials, infrastructure guides. The same agent-readers apply. The Mintlify and GitBook ecosystems exist for this reason.

Skip the SEO pitch if

01
Your goal is AI search citations from ChatGPT, Claude, Perplexity, or Gemini
Three studies say it does not measurably help. Google Search has publicly disowned the file. Shipping llms.txt because an SEO vendor told you it would lift your AI citation rate is the keywords-meta-tag move thirty years later. Do not do it.
02
You run a marketing site, blog, ecommerce store, or news property
Your audience is humans through traditional search and AI answer surfaces, not agents reading docs. The cost is low, but so is the upside. The opportunity cost is the time you spent reading articles like this one. Pick a different lever.
03
You cannot keep it updated when your site changes
A stale llms.txt that lists pages that have moved, content that has changed, or sections that no longer exist is actively misleading to the agents that do consume it. If your site changes weekly and you cannot automate regeneration, do not ship a snapshot that will rot.

Never do this

01
Do not auto-generate a fabricated llms.txt
Several plugins and SaaS tools will generate llms.txt content algorithmically, including descriptions and section labels that do not reflect what is on your site. This is the keywords meta tag failure mode John Mueller cited — a self-declared signal an LLM by design cannot trust. Even where llms.txt is read, agents that find a fabricated one will learn to discount the signal, not trust it more.
02
Do not ship llms.txt with content that does not exist at the URLs you list
If you list /docs/getting-started and that page returned 404 last month, you are sending agents to dead destinations. Either keep the file accurate or do not ship it.
03
Do not treat shipping llms.txt as a substitute for any GEO work
Citation engineering, AI crawler accessibility, entity authority, evidence density — those are the levers that affect AI search visibility. llms.txt is none of them. Adding the file does not replace doing the work.
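
The dead-destination failure mode described above is easy to check mechanically. The following is a minimal sketch, not a full link checker: it extracts markdown links from an llms.txt and reports any that do not answer with HTTP 200. The status lookup is injectable so the logic can be exercised without network access; the sample URLs and statuses are made up.

```python
import re
import urllib.error
import urllib.request

# Matches markdown links of the form [title](https://...).
MARKDOWN_LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def http_status(url):
    """Return the HTTP status for a URL via HEAD, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except urllib.error.URLError:
        return None


def find_dead_links(llms_txt, get_status=http_status):
    """Return (url, status) pairs for listed URLs that do not answer 200."""
    dead = []
    for url in MARKDOWN_LINK.findall(llms_txt):
        status = get_status(url)
        if status != 200:
            dead.append((url, status))
    return dead


# Demonstration with made-up statuses instead of live requests.
sample_file = (
    "- [Getting Started](https://example.com/docs/getting-started): minimal setup\n"
    "- [API Reference](https://example.com/docs/api): full endpoint list\n"
)
fake_statuses = {
    "https://example.com/docs/getting-started": 404,
    "https://example.com/docs/api": 200,
}
dead_links = find_dead_links(sample_file, get_status=fake_statuses.get)
print(dead_links)  # [('https://example.com/docs/getting-started', 404)]
```

Run against the real file on a schedule, a check like this turns "keep the file accurate" from a promise into a failing build. Some servers reject HEAD requests, so a production version may need to fall back to GET.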

Example · 06

A working minimal example

A real, valid llms.txt in the v0.0.6 spec format. Copyable. The whole file is deliberately short. That's the point.

/llms.txt (v0.0.6 format)

```markdown
# Example Site

> One sentence describing what this site is.

Optional second paragraph giving an agent a little more context: who
the audience is, what the priorities are, when the content was last
updated.

## Docs

- [Getting Started](https://example.com/docs/getting-started): minimal setup
- [API Reference](https://example.com/docs/api): full endpoint list
- [Tutorials](https://example.com/docs/tutorials): step-by-step guides

## Optional

- [Changelog](https://example.com/changelog): version history
- [Blog](https://example.com/blog): essays and announcements
```

Notes on the shape: the H1 is the site name. The blockquote is the one-sentence summary. The optional paragraph after is context. H2 sections group related links. Sections named "Optional" carry a specific spec meaning: agents can skip them when working within a tight context budget. Validate the file at llmstxt.org or with the asvaai.com generator.
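
The shape described in these notes is simple enough to parse in a few lines. The sketch below is an illustrative reader, not the reference parser from the spec repository: it pulls out the H1, the blockquote summary, and the links under each H2, and shows how an agent might drop the "Optional" section under a tight context budget. The function names are hypothetical.

```python
import re

# One list item: - [title](url): optional note
LINK_LINE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?:: (?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Parse the H1, blockquote summary, and H2 link sections of an llms.txt."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            match = LINK_LINE.match(line)
            if match:
                doc["sections"][current].append(match.groupdict())
    return doc


def drop_optional(doc):
    """Return a copy without the "Optional" section, for tight context budgets."""
    sections = {k: v for k, v in doc["sections"].items() if k != "Optional"}
    return {**doc, "sections": sections}


example = """# Example Site

> One sentence describing what this site is.

## Docs

- [Getting Started](https://example.com/docs/getting-started): minimal setup

## Optional

- [Changelog](https://example.com/changelog): version history
"""
doc = parse_llms_txt(example)
print(doc["title"], list(doc["sections"]))   # Example Site ['Docs', 'Optional']
print(list(drop_optional(doc)["sections"]))  # ['Docs']
```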

Implementation · 07

Shipping it on the major platforms

Most platforms have either a first-party feature, a community plugin, or a one-file path that works without anything special. The list below is non-exhaustive; it covers what the audience for this page most often uses.

WordPress
The community plugin "Website LLMs.txt" on wordpress.org/plugins/website-llms-txt/ supports per-post-type config, daily or weekly regeneration, and optional llms-full.txt. Activate, configure which post types to include, save. Verify by visiting yoursite.com/llms.txt.
Webflow
First-party feature in 2025: upload via SEO settings, served from root, excluded from indexing automatically. Webflow also maintains an open-source generator at github.com/Webflow-Examples/llms-txt-generator-webapp.
Shopify
The "LLMs.txt Generator" app on apps.shopify.com handles generation and serving. For technical store operators, a custom template at /llms.txt with the site's product catalog as markdown is a hand-rolled alternative.
Mintlify
Auto-generates per docs site. No config needed. Shipped by default for all Mintlify customers — Anthropic, Cursor, Cloudflare, Perplexity, Replit, Zapier all use it.
GitBook
First-party support shipped. Configurable from project settings. Same shape as Mintlify — automatic generation, no manual file maintenance.
Vercel / Next.js / static sites
Add a static file at public/llms.txt (or your framework's equivalent static directory). Generate it from the same content source that produces your sitemap. For Next.js apps, a route handler at app/llms.txt/route.ts that emits the markdown server-side also works cleanly.
Plain static HTML
Create the file, name it llms.txt, place it at the site root, ensure your web server serves .txt with `Content-Type: text/markdown` or `text/plain`. That is the entire setup.
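
For hand-rolled setups, generating the file from structured page data keeps it in sync with the site. Here is a minimal sketch of such a generator; the function name and the shape of the page records are assumptions for illustration, and in practice the records would come from whatever source already feeds your sitemap.

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt string from a name, a summary, and {heading: pages}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page['title']}]({page['url']}): {page['note']}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical page records standing in for a CMS export or docs manifest.
site_sections = {
    "Docs": [
        {"title": "Getting Started",
         "url": "https://example.com/docs/getting-started",
         "note": "minimal setup"},
    ],
    "Optional": [
        {"title": "Changelog",
         "url": "https://example.com/changelog",
         "note": "version history"},
    ],
}

output = render_llms_txt(
    "Example Site", "One sentence describing what this site is.", site_sections
)
print(output)
```

Writing `output` to the site root as part of the build means the file is regenerated whenever the page data changes, which addresses the staleness concern raised in the decision tree.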

After shipping, verify with `curl https://your-site/llms.txt`. The response should be the file you wrote, with a 200 status. If you are behind a CDN, confirm the CDN is not stripping or rewriting it. Optionally validate the markdown structure with the llmstxt.org reference or asvaai.com's validator.
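
The same verification can be scripted. The sketch below separates the checks (status, content type, leading H1) from the network call so the checks can be run on their own; the accepted content types are the two named in this article, and the function names are hypothetical.

```python
import urllib.request

ACCEPTED_TYPES = ("text/markdown", "text/plain")


def check_llms_txt_response(status, content_type, body):
    """Return a list of problems found in a fetched /llms.txt response."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected Content-Type: {content_type!r}")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with a markdown H1")
    return problems


def fetch_and_check(url):
    """Fetch a live URL and run the checks above (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
        return check_llms_txt_response(
            response.status, response.headers.get("Content-Type"), body
        )


good = check_llms_txt_response(200, "text/plain; charset=utf-8", "# Example Site\n")
bad = check_llms_txt_response(200, "text/html", "<!doctype html>")
print(good)      # []
print(len(bad))  # 2
```

An HTML body with a 200 status is the typical symptom of a CDN or single-page-app fallback rewriting the path. Note that `urlopen` raises `HTTPError` for 4xx and 5xx responses, so `fetch_and_check` surfaces a missing file as an exception rather than a problem list.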

FAQ · 08

Frequently asked questions

Q · 01

Does Google use llms.txt?

No. Google Search has publicly disowned llms.txt twice. John Mueller compared it to the deprecated keywords meta tag and stated "no AI system currently uses llms.txt." Chrome's DevRel team shipped a Lighthouse audit that measures whether a site has one, but Mueller has explicitly clarified that the Lighthouse audit is not a Google Search endorsement. Shipping llms.txt does not affect Google Search ranking or AI Overview citation likelihood.

Q · 02

Does ChatGPT use llms.txt?

Not as a primary signal. OpenAI's bot documentation routes all crawler controls through robots.txt and does not mention llms.txt. GPTBot has been observed fetching llms.txt files occasionally, but the behavior is exploratory rather than documented consumption. Three independent studies (Ahrefs 137K domains, Semrush, SE Ranking 300K) found that AI-search bots, including OpenAI's, almost never request llms.txt; in the Ahrefs data, 97% of files received zero requests.

Q · 03

What's the difference between llms.txt and robots.txt?

robots.txt is a restrictive directive that tells crawlers what they may not read. llms.txt is a descriptive signal that tells AI agents what content exists on your site. robots.txt follows the RFC 9309 grammar and is broadly enforced; llms.txt is plain markdown and is read by whoever decides to read it (currently mostly coding agents). They control different things and have different grammars. They are complementary, not substitutes.

Q · 04

Will shipping llms.txt improve my AI search citations?

Three independent studies say no. Ahrefs found 97% of llms.txt files received zero requests across 137,210 domains. Semrush's own server logs showed zero AI-bot hits over three months. SE Ranking's 300K-domain survey found no measurable impact on AI search citations. If your goal is AI search citations from ChatGPT, Claude, Perplexity, or Gemini, llms.txt is not the lever. Work on citation engineering, AI crawler accessibility, and evidence density instead.

Q · 05

What are llms-full.txt, llms-ctx.txt, and llms-ctx-full.txt?

Three sibling files that grew up around the core spec. llms-full.txt is the same markdown shape as llms.txt but inlines the full content of each section instead of linking out — meant for agents that want the whole corpus. llms-ctx.txt and llms-ctx-full.txt are FastHTML / Answer.AI conventions, produced by the llms_txt2ctx tool, that reshape content to fit cleanly inside a typical LLM context window. The core spec defines llms.txt and llms-full.txt; the context variants are FastHTML conventions. Most sites ship only llms.txt.

Q · 06

Should I auto-generate llms.txt with a plugin?

Cautiously. A plugin that auto-generates from your actual sitemap and content is fine — it stays in sync. A plugin that algorithmically fabricates descriptions and section labels that do not reflect your site is the keywords meta tag failure mode John Mueller cited. If you cannot read the generated file and verify it accurately describes your site, do not ship it. A bad llms.txt is worse than no llms.txt because the agents that do read it will learn to discount your signal.

Sources

Jeremy Howard / Answer.AI, The /llms.txt file (Sept 2024 proposal). answer.ai
llmstxt.org, canonical spec page. llmstxt.org
answerdotai/llms-txt, GitHub repo (v0.0.6 release). github.com
Ahrefs, We Analyzed 137K Sites: 97% of llms.txt Files Never Get Read. ahrefs.com
Semrush, What Is LLMs.txt & Should You Use It? semrush.com
Search Engine Journal, Google's Mueller Says llms.txt Can't Help LLMs Differentiate Sites. searchenginejournal.com
Search Engine Journal, Google Says LLMs.txt Comparable To Keywords Meta Tag. searchenginejournal.com
Search Engine Roundtable, Google Search Team Does Not Endorse LLMs.txt. seroundtable.com
Chrome for Developers, Lighthouse llms.txt audit. developer.chrome.com
Cloudflare Developers, llms.txt + Docs for agents. developers.cloudflare.com
SE Ranking, LLMs.txt adoption across 300K domains. seranking.com
Mintlify, llms.txt docs (auto-generation for docs sites). mintlify.com
