robots.txt Builder

Build robots.txt files with presets for common configurations.


Generated robots.txt

```
User-agent: *
Allow: /
```

💡 Tip: Place your robots.txt file in the root of your website (e.g., https://example.com/robots.txt). Use * as a wildcard in paths.

What is a Robots.txt Generator?

A Robots.txt Generator is a technical SEO tool that helps you create a valid `robots.txt` file. This plain-text file sits at the root of your website and acts as a set of instructions for search engine crawlers (like Googlebot). It tells compliant bots which pages or directories they may crawl and which private or low-value areas they should skip, helping you manage your crawl budget efficiently. Keep in mind that `robots.txt` is advisory: well-behaved crawlers honor it, but it is not an access-control mechanism.

How to Create a Robots.txt File

  1. Choose a Quick Preset: If you aren't sure where to start, click a preset like "Standard" or "Allow All". "Block All" is crucial for staging environments.
  2. Define User-Agents: Use `*` to target all bots, or name a particular crawler (e.g., `Googlebot` or `Bingbot`) to give it its own rule group.
  3. Set Allow & Disallow Rules: Enter the relative paths (like `/admin/` or `/cart/`) that you want blocked from search engines under the Disallow fields.
  4. Add Your Sitemap: Paste the absolute URL to your `sitemap.xml` file. This tells crawlers exactly where to find your most important pages.
  5. Download and Upload: Download the generated text file and place it exactly at the root of your domain (e.g., `https://yoursite.com/robots.txt`).
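Putting the steps together, a typical generated file might look like the following (the domain and blocked paths are placeholders for your own):

```
# Example robots.txt — domain and paths are illustrative
User-agent: *
Disallow: /admin/
Disallow: /cart/
Allow: /

Sitemap: https://example.com/sitemap.xml
```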

Why Crawl Budget and Robots Syntax Matter

Search engines allocate a "crawl budget" to every website—a rough limit on how many requests their bots will make in a given period. If a crawler spends that budget scanning thousands of internal search result pages, shopping cart URLs, or admin dashboards, it may be slower to crawl and index your newly published blog posts. A well-scoped `robots.txt` file steers that budget toward your high-value SEO landing pages instead.
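Before uploading, you can sanity-check how crawlers will interpret your rules with Python's standard-library `urllib.robotparser` (the rules and URLs below are a hypothetical example):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking low-value crawl paths
rules = """
User-agent: *
Disallow: /cart/
Disallow: /search/
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Blocked paths are reported as not fetchable for any bot
print(parser.can_fetch("*", "https://example.com/cart/item-42"))   # False
print(parser.can_fetch("*", "https://example.com/blog/new-post"))  # True
```

This is the same parsing logic the Python ecosystem uses for polite crawling, so it gives a quick local answer to "would this URL be blocked?" without deploying anything.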

Frequently Asked Questions (FAQs)

Does Disallow mean a page won't be indexed?

Not necessarily. `Disallow` prevents crawling, but if a page has external backlinks pointing to it, Google may still index the bare URL. To reliably prevent indexing, use a `noindex` meta tag (or `X-Robots-Tag` HTTP header) on the page itself—and note that the page must remain crawlable, because Google can only see the `noindex` directive if it is allowed to fetch the page.
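For reference, the `noindex` directive uses the standard robots meta tag syntax and goes in the page's `<head>`:

```
<head>
  <!-- Tells compliant search engines not to index this page -->
  <meta name="robots" content="noindex">
</head>
```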

What does User-agent: * mean?

The asterisk (`*`) is a wildcard that makes a rule group apply to any crawler that does not have its own, more specific `User-agent` group—a bot that finds a group naming it directly follows that group and ignores the `*` rules. Most sites only need the wildcard group unless they are deliberately targeting specific AI scrapers or aggressive bots.
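For example, to block one crawler while leaving everyone else unrestricted (GPTBot shown purely as an illustration):

```
# GPTBot matches its own group and ignores the * group
User-agent: GPTBot
Disallow: /

# All other crawlers fall back to the wildcard group
User-agent: *
Allow: /
```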

Why put the sitemap in robots.txt?

Placing your absolute sitemap URL in your `robots.txt` file uses the `Sitemap` directive from the Sitemaps protocol, which all major search engines support. Since `robots.txt` is typically the first file a crawler requests on your domain, this lets bots discover your sitemap immediately, without relying on a Google Search Console submission.

Should I use Crawl-delay?

Usually no. Modern servers can handle web crawlers easily. Furthermore, Googlebot explicitly ignores the `Crawl-delay` directive. Only use it if you are dealing with aggressive secondary bots on a very weak shared server.
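If you do need it for a secondary bot, the directive takes a delay in seconds (the bot name here is hypothetical, and support varies by crawler):

```
# Ask this bot to wait 10 seconds between requests
# (ignored by Googlebot; honored by some other crawlers)
User-agent: SomeAggressiveBot
Crawl-delay: 10
```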