Robots.txt Generator — Free Advanced Tool

Pick a platform preset to get a smart starting point instantly. You can fine-tune everything in the Advanced Builder.

WordPress: Protects admin, allows AJAX

Shopify: E-commerce best practices

WooCommerce: WP + shop-specific rules

Next.js / React: Modern JS app defaults

Blog / CMS: General blog setup

Laravel / PHP: Protects app internals

News / Media: Google News optimised

Start Blank: Build from scratch

Your Website Domain

Used to generate absolute sitemap URLs

Tip: Every preset follows Google's crawl recommendations. The WordPress preset allows /wp-admin/admin-ajax.php so that live search and other AJAX-powered features keep working for crawlers.

Global Settings

Website Domain

Sitemap URL

Additional Sitemap URLs (optional, one per line)

Access Controls — All Bots ( * )

Disallow These Paths (one per line) Leave empty to allow all. Use / to block everything.

Allow These Paths (overrides disallows above)

Advanced Options

Set Crawl-delay

Block AI Training Bots (GPTBot, ClaudeBot, Google-Extended etc.)

Block SEO Scrapers (AhrefsBot, SemrushBot, MJ12bot)

Block Bad Bots (DotBot, MJ12bot, DataForSeoBot)

Disallow Googlebot-Image

Allow AdsBot (required for Google Ads)

Add custom rules for specific crawlers. Rules here are merged into the final output alongside the global rules in Advanced Builder.

How priority works: Named bot rules (e.g. Googlebot) take priority over wildcard (*) rules. Among rules for the same bot, the longest matching path wins.

Next step: Upload the generated robots.txt to your website root (same level as your homepage). Then open the robots.txt report in Google Search Console (under Settings) to confirm Google can fetch it, and use URL Inspection on a few key pages to verify they are not blocked.

How This Robots.txt Generator Works

Everything runs in your browser. Nothing is sent to any server — your domain and rules stay private.

1. Pick a Preset

Choose WordPress, Shopify, WooCommerce, or start blank. Each preset follows platform-specific SEO best practices.

2. Customise Rules

Add your domain, sitemap URLs, blocked paths, and toggle advanced options like AI bot blocking or crawl delay.

3. Manage Bots

Use the Bot Manager to write granular rules per crawler — Googlebot, Bingbot, Baiduspider, or any custom agent.

4. Copy or Download

Grab your clean robots.txt file, edit it inline if needed, then upload it to your website root folder.

What Makes a Good Robots.txt File in 2026?

A robots.txt file is one of those things that's easy to set up and easy to get catastrophically wrong. At its best, it quietly saves your crawl budget and keeps bots focused on your most important pages. At its worst, a single mistyped line can block Googlebot from your entire site — and you might not notice for weeks.

A well-structured robots.txt does three things well. First, it protects internal pages that serve no purpose in search — admin areas, login pages, internal search result URLs, shopping cart pages, and duplicate parameter-based URLs that would just dilute your index. Second, it explicitly allows anything that Googlebot needs to render your pages correctly — CSS files, JavaScript bundles, web fonts, and image directories. Third, it points crawlers directly to your sitemap, which speeds up discovery of new and updated content.
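The three jobs above can be sketched as a small builder. This is a minimal illustration, not the generator's actual code: the function name build_robots_txt and the example paths and domain are assumptions chosen for the sketch.

```python
from typing import Iterable


def build_robots_txt(
    disallow: Iterable[str],
    allow: Iterable[str],
    sitemaps: Iterable[str],
    user_agent: str = "*",
) -> str:
    """Assemble one rule group followed by Sitemap lines."""
    lines = [f"User-agent: {user_agent}"]
    disallow = list(disallow)
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    lines += [f"Allow: {path}" for path in allow]
    lines.append("")  # blank line separates the group from the sitemap lines
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines) + "\n"


# Hypothetical site: protect internal pages, keep one needed file open, list the sitemap.
example = build_robots_txt(
    disallow=["/wp-admin/", "/cart/", "/search/"],
    allow=["/wp-admin/admin-ajax.php"],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(example)
```

Keeping the Sitemap lines outside the rule group reflects that they apply to every crawler, not just the group above them.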

What's changed in recent years is the rise of AI crawlers. Bots like GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot now actively crawl the web for AI products, and Google-Extended is a robots.txt token (not a separate crawler) that controls whether Google may use your content for its AI models. If you don't want your content used for AI training, robots.txt is the standard way to opt out, and this generator includes a one-click toggle to block them all.

Crawl Budget: Why It Matters More Than You Think

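Google limits how much it crawls each site. The limit depends on how much load your server can handle (response times and errors) and on crawl demand (how popular your URLs are and how often they change). If Googlebot spends those requests on admin pages, login flows, and URL parameter duplicates, it has fewer left for your actual content. For large e-commerce or content-heavy sites, the effect is practical rather than theoretical: new and updated pages get discovered and refreshed more slowly. Crawl budget is not itself a ranking factor, but a page that has not been crawled cannot rank.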

Use the Disallow rules in this generator to block low-value URL patterns like /tag/, /search/, /?s=, and /page/ where they generate infinite or near-duplicate pages. This is especially important for WordPress sites running lots of taxonomy pages.
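Before adding patterns like these, it helps to check which of your URLs they would actually catch. The sketch below assumes plain prefix matching (how a Disallow value without wildcards behaves); the helper name is_blocked and the sample URLs are made up for illustration.

```python
LOW_VALUE_PREFIXES = ["/tag/", "/search/", "/?s=", "/page/"]


def is_blocked(path_and_query: str, prefixes=LOW_VALUE_PREFIXES) -> bool:
    """Plain prefix match against the path plus query string."""
    return any(path_and_query.startswith(prefix) for prefix in prefixes)


sample_urls = [
    "/tag/seo/",
    "/?s=robots",
    "/page/2/",
    "/blog/robots-txt-guide/",
    "/blog/page/2/",  # not caught: the prefix only matches at the start of the path
]
for url in sample_urls:
    print(url, "blocked" if is_blocked(url) else "crawlable")
```

Note that /blog/page/2/ stays crawlable here. Catching paginated archives below the root would need a wildcard pattern such as /*/page/ instead.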

5 robots.txt Generation Mistakes That Kill Rankings

Using a relative Sitemap path

Writing Sitemap: /sitemap.xml instead of the full https:// URL. The Sitemap line must be a fully qualified URL, and crawlers may ignore a relative path. This generator always produces the correct absolute format.

Blocking CSS and JavaScript from Googlebot

If your theme files, JavaScript bundles, or web fonts are blocked, Googlebot renders a broken page. Google explicitly says this hurts how it understands and ranks your content. Always check the Resource Checker in the Validator tool after generating your file.

Generating robots.txt without setting a domain

Sitemap URLs without a domain (like just /sitemap.xml) are not valid: the Sitemaps protocol expects the complete URL, including scheme and host. Enter your full domain in the generator to get correct absolute sitemap references every time.

Thinking Disallow removes pages from Google's index

It doesn't. Disallow only stops Googlebot from crawling. If the page is linked from elsewhere, Google can still index the URL, just without being able to read the content. For actual deindexing, you need a noindex tag on the page itself, and the page must stay crawlable so that Googlebot can see the tag.

Skipping validation after generating

Even tools can produce files with logical errors, especially when you mix preset rules with manual overrides. Run every generated file through the Robots.txt Validator before uploading to your site. It checks syntax, rule conflicts, and flags any blocked resources.
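The two sitemap mistakes in this list come down to the same check: does the Sitemap value include a scheme and a host? Here is a minimal sketch using Python's standard urllib.parse. The helper names is_absolute_url and sitemap_line are assumptions for illustration, and defaulting to https:// when the domain has no scheme is a choice made for the sketch.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute_url(url: str) -> bool:
    """True when the URL carries both an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def sitemap_line(domain: str, sitemap: str = "/sitemap.xml") -> str:
    """Return a Sitemap line with an absolute URL, resolving relative paths against the domain."""
    base = domain if "://" in domain else f"https://{domain}"
    if is_absolute_url(sitemap):
        url = sitemap
    else:
        url = urljoin(base.rstrip("/") + "/", sitemap)
    return f"Sitemap: {url}"


print(is_absolute_url("/sitemap.xml"))
print(sitemap_line("example.com"))
```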

Robots.txt Directives — What Each One Does

Understanding each directive helps you generate rules that actually do what you intend, rather than guessing and hoping for the best.

User-agent

Identifies which crawler the following rules apply to. * means all bots. A crawler that finds a group naming it (for example Googlebot) follows only that group and ignores the * group.

Disallow

Tells a bot not to crawl this path or any URL starting with this prefix. Empty value means nothing is disallowed.

Allow

Overrides a Disallow for a specific sub-path. Most useful for allowing a single file inside a blocked folder.

Sitemap

Points any crawler to your XML sitemap. Must be an absolute URL. Can appear multiple times for multiple sitemaps.

Crawl-delay

Asks a bot to wait N seconds between page fetches. Ignored by Googlebot. Bingbot and some other crawlers honor it, but it is not part of the robots.txt standard (RFC 9309), so support varies.

Wildcard *

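Matches any sequence of characters in a path. /search/* blocks all paths starting with /search/ regardless of what follows. Because rules already match by prefix, the trailing * is redundant here: /search/ has the same effect. The wildcard is most useful in the middle of a pattern.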

End anchor $

Forces the pattern to match only at the end of a URL. /*.pdf$ blocks only URLs ending in .pdf.

Specificity wins

When two rules match the same URL, the longer (more specific) rule takes priority. Allow beats Disallow on equal-length matches.
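The wildcard, end-anchor, and specificity rules above can be combined into one small matcher. This is a simplified sketch rather than a full parser: the names pattern_to_regex and is_allowed are assumptions, rule length is measured as the raw pattern length, and percent-encoding is ignored.

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]  # ("allow" or "disallow", pattern)


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt pattern: * matches any characters, a trailing $ anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules: List[Rule]) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if not pattern:  # an empty value matches nothing
            continue
        # re.match anchors at the start, which gives prefix-matching behaviour.
        if pattern_to_regex(pattern).match(path):
            length, allow = len(pattern), kind == "allow"
            if length > best_len or (length == best_len and allow):
                best_len, best_allow = length, allow
    return best_allow


rules = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/search/*"),
]
for path in ["/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/files/report.pdf", "/search/shoes"]:
    print(path, "allowed" if is_allowed(path, rules) else "blocked")
```

Python's built-in urllib.robotparser is not used here because it applies rules in file order rather than by longest match, which would not illustrate the specificity rule.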

Frequently Asked Questions About Generating robots.txt

Do I need a robots.txt file if my whole site should be crawled?

Not strictly. A missing robots.txt means bots assume full access. That said, having one is still recommended because it lets you point crawlers to your sitemap, which can speed up content discovery. You can generate a minimal file here with just the sitemap line and no Disallow rules.

How do I generate a robots.txt file for WordPress?

Use the WordPress preset in the Quick Presets tab. It comes pre-configured to block /wp-admin/ while allowing /wp-admin/admin-ajax.php (needed for live search and AJAX), and blocks tag/search/author archive pages that commonly create duplicate content. Add your domain and sitemap URL, then download the file and upload it to your site root — replacing any existing robots.txt.

Where exactly do I upload the robots.txt file?

The file must live at the root of your domain — at https://yourdomain.com/robots.txt. Upload it via FTP, cPanel File Manager, or your hosting panel to the public_html folder (or whatever your web root is). For WordPress, this is the same folder that contains wp-config.php. After uploading, visit the URL in your browser to confirm it's live.
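Since crawlers only look at the root of each host, the expected location can be derived from any page URL on the site. A small sketch with Python's urllib.parse follows; the function name robots_txt_url and the example URLs are illustrative assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(any_page_url: str) -> str:
    """Keep the scheme and host, replace everything else with /robots.txt."""
    parts = urlsplit(any_page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/post/?utm=1#top"))
print(robots_txt_url("https://shop.example.com/cart"))
```

The second call shows that a subdomain resolves to its own robots.txt, so each host you serve needs its own file.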

How do I block AI bots like GPTBot and ClaudeBot from scraping my site?

Toggle on "Block AI Training Bots" in the Advanced Builder tab. This adds individual Disallow: / rules for GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, PerplexityBot, FacebookBot, and Bytespider. These opt-outs are voluntary: compliant AI crawlers respect them, though not all do. robots.txt is currently the most widely supported way to signal your content preference to AI companies.
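The output of that toggle is one group per agent, each with a blanket Disallow. A minimal sketch of how such groups could be produced is below. The agent list is the one named in this answer; the function name block_agents is an assumption, not the tool's real code.

```python
AI_TRAINING_AGENTS = [
    "GPTBot",
    "ClaudeBot",
    "Google-Extended",
    "PerplexityBot",
    "FacebookBot",
    "Bytespider",
]


def block_agents(agents) -> str:
    """Emit one 'User-agent / Disallow: /' group per agent, separated by blank lines."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


print(block_agents(AI_TRAINING_AGENTS))
```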

Can I have different rules for Googlebot and Bingbot?

Yes. Use the Bot Manager tab to create separate rule groups for each crawler. You might want Googlebot to follow your main rules while giving Bingbot slightly different access, or add a Crawl-delay specifically for Bingbot without affecting Google. A named group replaces the wildcard group for that bot rather than adding to it, so copy over any wildcard rules you still want applied.
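Group selection can be pictured as a lookup: a crawler uses its own named group if one exists, and falls back to the * group otherwise. The sketch below is simplified (it compares whole tokens case-insensitively, whereas real crawlers match their product token against the User-agent line), and the names select_group and groups are assumptions for illustration.

```python
from typing import Dict, List

Groups = Dict[str, List[str]]  # user-agent token -> rule lines


def select_group(groups: Groups, crawler: str) -> List[str]:
    """Return the rules a crawler would follow: its named group if present, else the * group."""
    wanted = crawler.lower()
    for token, rules in groups.items():
        if token != "*" and token.lower() == wanted:
            return rules
    return groups.get("*", [])


groups = {
    "*": ["Disallow: /wp-admin/", "Disallow: /search/"],
    # The shared rules are repeated because this group replaces the * group for Bingbot.
    "Bingbot": ["Disallow: /wp-admin/", "Disallow: /search/", "Crawl-delay: 5"],
}

print(select_group(groups, "bingbot"))
print(select_group(groups, "Googlebot"))
```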

Should I use this generator or let WordPress auto-generate my robots.txt?

WordPress generates a virtual robots.txt automatically, but it's extremely minimal — it only blocks /wp-admin/. It doesn't block SEO-wasteful URLs like search pages, tag archives, or URL parameters. For serious SEO work, generating a custom file with this tool and uploading a physical robots.txt gives you far more control. The physical file always takes precedence over WordPress's virtual one.
