robots.txt Generator
Create a correctly formatted `robots.txt` file for your website.
About the robots.txt Generator
A `robots.txt` file is a simple text file that lives on your website and tells search engine bots (like Googlebot) which pages they are allowed to crawl and which ones they should ignore. We built this tool to help you create a perfectly formatted `robots.txt` file without having to remember the specific syntax. A correct file is an important part of good technical SEO.
How to Use the Generator
- Start by choosing a "Preset" or set a default policy for all bots in the first section. "Allow All" is the most common setting.
- To add rules for a specific bot (like Googlebot), click the "Add Bot Rules" button and enter the bot's name in the "User-agent" field.
- For each bot, add as many `Disallow` or `Allow` rules as you need. All paths must start with a `/`.
- It is highly recommended to add the full URL of your sitemap in the "Sitemap URL" field.
- As you make changes, the final `robots.txt` code is generated in the results panel on the right.
- When you're finished, you can copy the code or download the final `robots.txt` file, ready to be uploaded to the root directory of your website. A sample of the kind of file the generator produces is shown below.
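For illustration, a file generated with an "Allow All" default policy, one extra rule group for Googlebot, and a sitemap URL might look like the sketch below (the `/internal-search/` path and the `example.com` domain are placeholders for your own values):

```
# Default policy: every bot may crawl everything
User-agent: *
Disallow:

# Extra rules for one specific bot
User-agent: Googlebot
Disallow: /internal-search/

# Location of the sitemap
Sitemap: https://www.example.com/sitemap.xml
```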
Frequently Asked Questions
What are the different types of rules in a robots.txt file?
There are several main directives (rules) you can use; an example that combines all of them follows this list:
- `User-agent`: This specifies which web crawler the following rules apply to. The name of the bot is used here.
- `Disallow`: This is the main instruction. It tells a bot not to enter a specific file or directory. For example, `Disallow: /admin/` blocks the admin folder.
- `Allow`: This is used to create an exception to a `Disallow` rule. For instance, you could disallow an entire folder but then use `Allow` for a specific file inside it.
- `Sitemap`: This is a very important rule that tells bots the location of your sitemap.xml file, which helps them discover all the pages you want them to crawl.
- `Crawl-delay`: This is a non-standard rule (but respected by some bots, such as Bingbot) that tells a crawler to wait a certain number of seconds between visiting pages to avoid overloading your server.
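Putting those directives together, a short illustrative file might look like this (the domain, paths, and the 10-second delay are placeholder values, not recommendations):

```
# Rules for all bots
User-agent: *
Disallow: /admin/
Allow: /admin/help.html
Crawl-delay: 10

# Location of the sitemap (always a full, absolute URL)
Sitemap: https://www.example.com/sitemap.xml
```

Here `/admin/` is blocked as a whole, `/admin/help.html` is carved out as an exception, and the `Crawl-delay` line is simply ignored by crawlers that do not support it (such as Googlebot).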
What are the most common User-agents?
A User-agent is the name of a specific web crawler. While there are hundreds, you'll most often want to set rules for these common ones (an example that targets one of them follows the list):
- `*`: This is a wildcard that applies to all bots that don't have a more specific set of rules.
- `Googlebot`: Google's main crawler for search results.
- `Googlebot-Image`: Google's crawler for Google Images.
- `Bingbot`: Microsoft Bing's main crawler.
- `DuckDuckBot`: The crawler for DuckDuckGo.
- `YandexBot`: The crawler for the Yandex search engine.
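For instance, the sketch below gives every bot one set of rules but adds a separate, stricter group just for `Googlebot-Image` (the `/photos/` path is a placeholder):

```
# Applies to every bot without a more specific group
User-agent: *
Disallow: /admin/

# Applies only to Google's image crawler, which then
# follows this group instead of the * group above
User-agent: Googlebot-Image
Disallow: /photos/
```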
Do I need a robots.txt file?
If you want all the content on your site to be indexed by search engines, you don't technically need a `robots.txt` file. However, it is a best practice to have one, even if it's just to "Allow All" and point to your sitemap. It's essential if you have areas you want to keep private from search engines, like an admin login page or internal search results.
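A minimal "Allow All" file that still points bots at your sitemap can be as short as this (swap in your own sitemap URL):

```
# An empty Disallow means nothing is blocked
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```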
What's the difference between Allow and Disallow?
`Disallow` is the main instruction; it tells a bot not to enter a specific directory. For example, `Disallow: /admin/` blocks the admin folder. `Allow` is used to create an exception to a `Disallow` rule. For instance, you could disallow an entire folder but then `Allow` a specific file inside it.
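For example, in the sketch below (the folder and file names are placeholders), everything under `/private/` is blocked except one PDF:

```
User-agent: *
# Block the whole folder...
Disallow: /private/
# ...but make an exception for one file inside it
Allow: /private/press-kit.pdf
```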
Where do I upload my robots.txt file?
The `robots.txt` file must be uploaded to the root directory of your website. For example, if your site is `www.example.com`, the file must be accessible at `www.example.com/robots.txt`.