Custom robots.txt


By default, Fern serves an auto-generated robots.txt at the root of your documentation site that allows all crawlers and points to your sitemap.xml. Use the agents.robots-txt key in docs.yml to serve your own file instead — useful for opting in or out of specific AI crawlers, gating sensitive sections, or signaling preferences with the Cloudflare Content Signals Policy.

robots.txt is advisory: compliant crawlers honor your Disallow and Allow directives, but bots that ignore the protocol still reach those paths. For content that must stay private, use authentication.

robots.txt governs which crawlers may access your site and what AI training signals you broadcast. Its companions, llms.txt and llms-full.txt, shape what AI agents receive once they crawl.

Configuration

1. Point agents.robots-txt at your file in docs.yml

docs.yml
agents:
  robots-txt: ./robots.txt

The path is relative to docs.yml.
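For example, with robots.txt placed next to docs.yml (a hypothetical layout; adjust the relative path to wherever you keep the file):

fern/
├── docs.yml       # contains the agents.robots-txt entry above
└── robots.txt     # your custom file, referenced as ./robots.txt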

2. Write your custom robots.txt

robots.txt
# Allow search engines
User-Agent: Googlebot
Allow: /
# Restrict an AI crawler from a private path
User-Agent: GPTBot
Disallow: /private
# Declare AI usage preferences via Cloudflare Content Signals
Content-Signal: ai-train=yes, search=yes, ai-input=yes
# Point crawlers at your sitemap — Fern's default robots.txt includes this,
# so add it back when you replace the default with a custom file
Sitemap: https://docs.example.com/sitemap.xml

Place named bots (e.g., GPTBot, Googlebot) before any wildcard groups in your file — Fern appends its own User-Agent: * block when it serves the file.
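To sanity-check the directives before deploying, you can parse the file locally with Python's standard-library urllib.robotparser (a minimal sketch using the example paths and user agents above; the script name is hypothetical):

check_robots.py
from urllib.robotparser import RobotFileParser

# Load the local robots.txt and feed its lines to the parser
with open("robots.txt") as f:
    parser = RobotFileParser()
    parser.parse(f.read().splitlines())

# GPTBot is disallowed from /private by its group above
print(parser.can_fetch("GPTBot", "/private"))      # False
# Googlebot's group allows everything
print(parser.can_fetch("Googlebot", "/private"))   # True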

3. Fern serves your file

Your file is served verbatim at /robots.txt. Fern appends a managed block at the end that disallows internal API routes:

# Fern-managed routes — automatically disallowed
User-Agent: *
Disallow: /api/fern-docs/
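Once your docs are published, you can confirm the served file and the appended managed block by pointing the same parser at the live URL (a sketch; the hostname is a placeholder for your docs domain):

check_live.py
from urllib.robotparser import RobotFileParser

# Fetch the served robots.txt from the live site
parser = RobotFileParser("https://docs.example.com/robots.txt")
parser.read()

# A crawler with no named group falls through to the appended wildcard block,
# which disallows Fern's internal API routes
print(parser.can_fetch("SomeOtherBot", "/api/fern-docs/"))  # False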