Oh So SEO
Technical SEO · 5 min read

Robots.txt: What It Is and How It Affects Your SEO

Robots.txt is small and powerful. Here's what it does, why it matters, and how to make sure yours isn't accidentally hiding your site from Google.

By Oh So SEO

What Is a Robots.txt File?

A robots.txt file is a small text file that sits at the root of your website (yoursite.com/robots.txt) and gives instructions to search engine crawlers about which pages they should and shouldn't visit.

Why It Matters for SEO

The robots.txt file is your first line of communication with Googlebot. Misconfigure it, and you could accidentally prevent Google from crawling your entire site, which means none of your pages will rank. That sounds dramatic, but it happens more often than you'd think, and it's one of the first things to check when a site suddenly loses rankings.

How to Read a Robots.txt File

A basic robots.txt looks like this:

User-agent: *
Disallow: /admin/
Disallow: /checkout/

Sitemap: https://www.yoursite.com/sitemap.xml
User-agent: which crawlers the rules apply to. * means all crawlers; Googlebot means just Google.
Disallow: which paths the crawler should not visit. /admin/ blocks the admin section of the site.
Sitemap: tells crawlers where your sitemap is.
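If you want to see how these rules behave, Python's standard library ships a robots.txt parser. Here's a small sketch using the example file above (the URLs are placeholders):

```python
from urllib import robotparser

# The example robots.txt from above, as a list of lines
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /checkout/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Any crawler may fetch regular content, but not the disallowed paths
print(rp.can_fetch("*", "https://www.yoursite.com/blog/some-post"))  # True
print(rp.can_fetch("*", "https://www.yoursite.com/admin/"))          # False
```

This is the same matching logic crawlers apply: a path that starts with a Disallow prefix is off-limits, everything else is fair game.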

Common Robots.txt Mistakes

Disallowing everything: Disallow: / blocks all crawlers from all pages. It's sometimes left in accidentally after development, and it's catastrophic for SEO.
Blocking CSS and JavaScript: search engines need to render your pages to understand them, and blocking CSS/JS files prevents this.
Blocking important pages: disallowing your blog, product pages, or other content you want to rank means Google can't crawl it.
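That first mistake is easy to demonstrate with the standard-library parser (the domain is a placeholder):

```python
from urllib import robotparser

# A robots.txt often left over from a staging site: blocks everything
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# Even the homepage is off-limits to every crawler, Googlebot included
print(rp.can_fetch("Googlebot", "https://www.yoursite.com/"))  # False
```

Two characters of configuration, and the whole site is invisible to search engines.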

What You Should and Shouldn't Block

Block: admin areas, login pages, internal search results pages, duplicate content pages, and shopping cart/checkout pages.
Don't block: your homepage, product pages, service pages, blog content, landing pages, or anything else you want to rank.
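Putting those rules together, a typical small-business robots.txt might look like this. The exact paths depend on your platform; these are illustrative:

```
User-agent: *
Disallow: /admin/
Disallow: /login/
Disallow: /search/
Disallow: /cart/
Disallow: /checkout/

Sitemap: https://www.yoursite.com/sitemap.xml
```

Search and cart pages are blocked because they generate near-duplicate, low-value URLs, while everything you actually want to rank stays crawlable by default.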

How to Check Your Robots.txt

Simply visit yoursite.com/robots.txt in your browser. You'll see the file (or a 404 if it doesn't exist, which is fine: crawlers treat a missing file as no restrictions). You can also use the robots.txt report in Google Search Console to see how Googlebot fetched your file and whether specific pages are blocked.
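If you'd rather script the check, here's a small helper using Python's standard-library parser. The function name blocked_paths is our own, and the sample rules and URLs are placeholders:

```python
from urllib import robotparser

def blocked_paths(robots_lines, urls, agent="Googlebot"):
    """Return the URLs the given crawler is not allowed to fetch."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_lines)
    return [u for u in urls if not rp.can_fetch(agent, u)]

sample = ["User-agent: *", "Disallow: /admin/", "Disallow: /checkout/"]
pages = [
    "https://www.yoursite.com/",
    "https://www.yoursite.com/blog/seo-basics",
    "https://www.yoursite.com/checkout/",
]

# Only the checkout page should come back blocked
print(blocked_paths(sample, pages))
```

To test against your live file instead of a hard-coded list, you can point the parser at a URL with rp.set_url("https://yoursite.com/robots.txt") followed by rp.read().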

FAQ

Does blocking a page in robots.txt make it disappear from Google?
Blocking crawling doesn't remove a page from Google's index. To prevent indexing, use a noindex meta tag. Robots.txt and noindex serve different purposes.

Should I have a robots.txt file?
Most small business sites don't need to change their robots.txt at all. The defaults are fine. Check it exists, check it's not accidentally blocking important pages, and move on.

Who sets up robots.txt?
Usually your developer or your website platform. Shopify, WordPress, and most platforms generate a sensible robots.txt automatically.
Tags
robots txt, technical seo, crawling
Put this into practice — free.
Audit your site, track keywords and generate SEO copy with Oh So SEO. Built for small businesses. No agency required.
Start free trial →

Related posts

Page Speed and SEO: Why Your Slow Website Is Costing You Rankings (7 min read)
What Is Technical SEO? A Plain-English Explanation (7 min read)
Schema Markup: What It Is and How to Add It to Your Website (7 min read)