Prominara Team
robots.txt for AI Crawlers: Complete Configuration Guide
Your robots.txt file controls which AI crawlers can access your content. Proper configuration is essential for AI visibility. This guide covers all major AI crawlers and provides copy-paste configurations. Also consider setting up an llms.txt file alongside your robots.txt.
Key Takeaways
- 8 major AI crawlers exist in 2026: GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, Amazonbot, FacebookBot, and Applebot-Extended. Allow the ones that matter for your audience.
- The recommended approach is selective access: allow public content (blog, docs, guides) while blocking sensitive areas (admin, API, dashboard).
- Blocking GPTBot keeps your content out of OpenAI's training data, and with it out of ChatGPT, the world's most popular AI assistant. Do not block it unless you have a specific legal or competitive reason.
- Crawl-delay is unreliable for AI bots. Not all crawlers respect it. Use server-side rate limiting instead if crawl volume is a concern.
- Review your robots.txt quarterly as new AI crawlers emerge regularly.
Understanding AI Crawlers
AI platforms use specialized crawlers to index web content for their models. Unlike traditional search engine crawlers that focus on indexing for search results, AI crawlers gather information to improve AI responses.
Major AI Crawlers
| Crawler | Platform | User-Agent |
|---|---|---|
| GPTBot | ChatGPT/OpenAI | GPTBot |
| ChatGPT-User | ChatGPT Browsing | ChatGPT-User |
| ClaudeBot | Claude/Anthropic | ClaudeBot |
| PerplexityBot | Perplexity | PerplexityBot |
| Google-Extended | Google AI/Gemini | Google-Extended |
| Amazonbot | Amazon Alexa | Amazonbot |
| FacebookBot | Meta AI | FacebookBot |
| Applebot-Extended | Apple AI | Applebot-Extended |
Basic Configurations
Allow All AI Crawlers
# AI Crawlers - Allow All
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Amazonbot
Allow: /
User-agent: FacebookBot
Allow: /
User-agent: Applebot-Extended
Allow: /
Block All AI Crawlers
# AI Crawlers - Block All
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
Selective Access (Recommended)
# AI Crawlers - Selective Access
# Allow public content, block sensitive areas
User-agent: GPTBot
Allow: /blog/
Allow: /docs/
Allow: /guides/
Allow: /glossary/
Disallow: /admin/
Disallow: /api/
Disallow: /dashboard/
Disallow: /account/
User-agent: ChatGPT-User
Allow: /
Disallow: /admin/
Disallow: /api/
User-agent: ClaudeBot
Allow: /blog/
Allow: /docs/
Allow: /guides/
Disallow: /admin/
Disallow: /api/
User-agent: PerplexityBot
Allow: /
Disallow: /admin/
Disallow: /api/
User-agent: Google-Extended
Allow: /
Disallow: /admin/
Disallow: /api/
Platform-Specific Considerations
OpenAI (GPTBot)
Crawl behavior:
- Respects robots.txt
- Focuses on content quality
- Used for model training and ChatGPT browsing
Recommendation: Allow access to authoritative content you want cited.
Anthropic (ClaudeBot)
Crawl behavior:
- Respects robots.txt
- Less frequent crawling than GPTBot
- Used for Claude's knowledge
Recommendation: Allow same access as GPTBot.
Perplexity (PerplexityBot)
Crawl behavior:
- Very active crawler
- Real-time search focus
- Respects robots.txt
Recommendation: Allow broad access for search visibility.
Google (Google-Extended)
Crawl behavior:
- Separate from Googlebot (search)
- Used for Gemini/AI features
- Introduced in September 2023
Recommendation: Allow if you want Gemini/AI Overview visibility.
Advanced Configurations
Rate Limiting
# Crawl delay for AI bots (in seconds)
User-agent: GPTBot
Allow: /
Crawl-delay: 10
User-agent: ClaudeBot
Allow: /
Crawl-delay: 10
Note: Not all crawlers respect Crawl-delay.
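Since Crawl-delay support is inconsistent, enforcing limits server-side is the reliable option. A minimal sketch of that idea: a per-crawler token bucket checked before serving each request. The `CrawlerThrottle` class, the bot list, and the refill rates here are illustrative assumptions, not a prescribed implementation.

```python
import time

# Bot names and rate/burst values below are illustrative defaults.
AI_BOTS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")

class CrawlerThrottle:
    """Token bucket keyed by AI crawler name (hypothetical helper)."""

    def __init__(self, rate=0.1, burst=5):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum bucket size
        self.buckets = {}     # bot name -> (tokens, last_refill_time)

    def allow(self, user_agent, now=None):
        """Return True if the request may proceed, False to answer HTTP 429."""
        bot = next((b for b in AI_BOTS if b in user_agent), None)
        if bot is None:
            return True       # not an AI crawler: no throttling
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(bot, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[bot] = (tokens, now)
            return False
        self.buckets[bot] = (tokens - 1, now)
        return True
```

In a real deployment this check would sit in middleware (or be replaced by your web server's native rate limiting), returning a 429 with a Retry-After header when `allow()` is False.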
Sitemap Reference
# Include sitemap reference
Sitemap: https://yoursite.com/sitemap.xml
User-agent: GPTBot
Allow: /
Combined with Traditional Bots
# Traditional Search Engines
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
# AI Crawlers
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
# General Rules
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
Sitemap: https://yoursite.com/sitemap.xml
Testing Your Configuration
1. Verify File Location
Your robots.txt must be at your domain root:
https://yoursite.com/robots.txt
2. Test Accessibility
Access the URL directly in a browser.
3. Validate Syntax
Use the robots.txt report in Google Search Console or another validator (Google retired its standalone robots.txt Tester in 2023).
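You can also check rules offline with Python's standard-library `urllib.robotparser`. The sketch below parses an abbreviated version of the selective-access example and confirms what GPTBot may fetch; the domain and paths are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated selective-access rules; paths and domain are illustrative.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /blog/
Disallow: /admin/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot may read the blog but not the admin area.
print(parser.can_fetch("GPTBot", "https://yoursite.com/blog/post"))  # True
print(parser.can_fetch("GPTBot", "https://yoursite.com/admin/"))     # False
```

The same parser can be pointed at your live file with `set_url()` plus `read()` if you prefer to validate the deployed version.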
4. Monitor Crawl Logs
Check server logs for AI crawler activity.
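A quick way to do that check is to count hits per AI crawler from your access log. This sketch assumes combined log format, where the user-agent is the last quoted field; the sample lines and the `count_ai_hits` helper are made-up illustrations.

```python
import re
from collections import Counter

AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
           "Google-Extended", "Amazonbot", "FacebookBot", "Applebot-Extended"]

def count_ai_hits(log_lines):
    """Return a Counter of AI-crawler hits keyed by bot name."""
    hits = Counter()
    for line in log_lines:
        # In combined log format, the user-agent is the last quoted field.
        quoted = re.findall(r'"([^"]*)"', line)
        ua = quoted[-1] if quoted else ""
        for bot in AI_BOTS:
            if bot in ua:
                hits[bot] += 1
                break
    return hits

# Made-up example log lines:
sample = [
    '1.2.3.4 - - [01/Jan/2026:00:00:00 +0000] "GET /blog/ HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"',
    '5.6.7.8 - - [01/Jan/2026:00:00:05 +0000] "GET /docs/ HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
]
print(count_ai_hits(sample))
```

Run against a real log, the counts show which crawlers are actually visiting and whether your allow/block rules are being respected.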
Common Mistakes
1. Blocking All Bots Accidentally
# WRONG - This blocks everything including AI bots
User-agent: *
Disallow: /
2. Incorrect File Location
Place robots.txt at domain root, not in subdirectories.
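For mistake 1 above, a safer pattern keeps the blanket block for unlisted bots while adding explicit groups for the crawlers you do want; under the Robots Exclusion Protocol, a crawler follows only the most specific matching User-agent group, so the named groups override the wildcard. The bots chosen here are just examples.

```
# Blanket rule for unlisted bots
User-agent: *
Disallow: /

# Explicit exceptions for AI crawlers you want
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /
```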
3. Conflicting Rules
Under the Robots Exclusion Protocol (RFC 9309), conflicts between Allow and Disallow are resolved by the most specific (longest) matching path, not by rule order. Write unambiguous rules rather than relying on ordering, since older crawlers may interpret conflicts differently.
4. Missing AI Crawlers
Update your robots.txt as new AI crawlers emerge.
5. Not Updating After Site Changes
Review robots.txt when restructuring your site.
Implementation Checklist
- [ ] Identify which AI platforms matter for your business
- [ ] Decide on allow/block strategy
- [ ] Create or update robots.txt
- [ ] Place at domain root
- [ ] Validate syntax
- [ ] Test accessibility
- [ ] Monitor crawler activity
- [ ] Review quarterly
Framework-Specific Implementation
Next.js
Create public/robots.txt, or generate the file dynamically with an app/robots.ts metadata route (App Router).
WordPress
Use SEO plugins like Yoast or RankMath.
Static Sites
Add robots.txt to your build output directory.
Configure your robots.txt properly to maximize your AI visibility while protecting sensitive content.
Related Resources
- AI Crawler
- Claude
- Perplexity AI
- Perplexity SEO: How to Get Cited in Perplexity Search [2026]
- ChatGPT Optimization Guide [2026]: Get Cited by AI
- llms.txt Guide: How to Set Up Your File in 5 Steps [2026]
