MyIPScan

Website Security and SEO Tool

Robots.txt Checker

Fetch a public robots.txt file, parse crawler directives, review sitemap declarations, and spot common SEO configuration signals. This is a limited syntax and accessibility check, not a full crawler simulation.

Check robots.txt

Enter one public domain or HTTP/HTTPS URL. MyIPScan will check the root robots.txt file only.

Trust note: this server-assisted check fetches only one public robots.txt URL, blocks private/internal targets, and does not crawl the site.
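
The private/internal target blocking mentioned in the trust note can be approximated with Python's standard ipaddress module. This is a minimal sketch under assumptions, not MyIPScan's actual implementation; the function names is_public_address and host_is_allowed are hypothetical.

```python
import ipaddress
import socket


def is_public_address(ip_text: str) -> bool:
    """Return True only for globally routable IP addresses."""
    ip = ipaddress.ip_address(ip_text)
    # is_global is False for loopback, private (RFC 1918), link-local,
    # and other reserved ranges.
    return ip.is_global


def host_is_allowed(hostname: str) -> bool:
    """Resolve a hostname and require every resolved address to be public."""
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A production guard would also need to connect to the address it vetted rather than resolving the name a second time, since DNS answers can change between the check and the fetch.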

What this checks

MyIPScan normalizes the input to the site root, fetches /robots.txt with strict limits, and parses common directives including User-agent, Allow, Disallow, Crawl-delay, Sitemap, and Host.
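
The normalize-then-parse flow described above can be sketched as follows. This is an illustration under stated assumptions rather than the tool's real code: the names robots_url and parse_robots are hypothetical, bare domains are assumed to default to HTTPS, and the fetch step is left out.

```python
from urllib.parse import urlsplit


def robots_url(user_input: str) -> str:
    """Reduce a domain or HTTP/HTTPS URL to its root robots.txt URL."""
    text = user_input.strip()
    if "://" not in text:
        text = "https://" + text  # assumption: bare domains default to HTTPS
    parts = urlsplit(text)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("expected a domain or an HTTP/HTTPS URL")
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def parse_robots(text: str) -> dict:
    """Group Allow/Disallow/Crawl-delay rules by User-agent; collect
    Sitemap and Host lines, which are not tied to a group."""
    groups, sitemaps, hosts = [], [], []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current["rules"]:
                current = {"user_agents": [], "rules": []}
                groups.append(current)
            current["user_agents"].append(value)
        elif field == "sitemap":
            sitemaps.append(value)
        elif field == "host":
            hosts.append(value)
        elif field in ("allow", "disallow", "crawl-delay") and current is not None:
            current["rules"].append((field, value))
    return {"groups": groups, "sitemaps": sitemaps, "hosts": hosts}
```

For example, robots_url("example.com/blog?page=2") yields https://example.com/robots.txt, because only the scheme and host survive normalization.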

What the results mean

A missing file is not always a problem, but it means crawlers receive no explicit crawl guidance or sitemap declarations from robots.txt. A Disallow: / rule under User-agent: * tells every crawler that has no more specific group of its own not to crawl any path on the site. Sitemap declarations help crawlers discover the site's sitemap URLs.
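
To see the effect of a global Disallow: / in practice, the sketch below uses Python's standard urllib.robotparser. The helper name can_crawl and the crawler names ExampleBot and OtherBot are made up for illustration. Note that this standard-library parser applies its own matching rules, which may differ from a given search engine's for more complex files.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_text: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# A global block: every crawler without its own group is shut out.
blocked = "User-agent: *\nDisallow: /\n"

# An empty Disallow value places no restriction.
unrestricted = "User-agent: *\nDisallow:\n"

# A specific group overrides the global block for that one crawler.
mixed = "User-agent: *\nDisallow: /\n\nUser-agent: ExampleBot\nAllow: /\n"

page = "https://example.com/page"
print(can_crawl(blocked, "ExampleBot", page))
print(can_crawl(unrestricted, "ExampleBot", page))
print(can_crawl(mixed, "ExampleBot", page), can_crawl(mixed, "OtherBot", page))
```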

How to use this tool

  1. Enter a public domain or URL such as example.com.
  2. Review the HTTP status, parsed directive groups, sitemap declarations, and warning notes.
  3. For related website diagnostics, use Sitemap Checker, Canonical / Noindex Checker, Meta Title / Description Checker, HTML Heading / Content Structure Checker, Structured Data / JSON-LD Validator, Open Graph / Social Preview Checker, Redirect Checker, SSL Certificate Checker, DNS Lookup, and Security Headers Checker.

FAQ

What is robots.txt?

robots.txt is a public text file at the root of a site that gives crawlers guidance through directives such as User-agent, Allow, Disallow, Crawl-delay, and Sitemap.

Does robots.txt block indexing?

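It can block crawling, but a blocked URL may still appear in search results if it is discovered through links elsewhere. When indexing control is required, use a page-level noindex robots meta tag or an X-Robots-Tag response header. Crawlers must be able to fetch the page to see either signal, so do not also block that page in robots.txt.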
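
As a rough illustration of page-level indexing control, the following sketch looks for a noindex directive in an X-Robots-Tag response header or a robots meta tag. The function name has_noindex is hypothetical, and the sketch is deliberately simplified: it ignores crawler-specific forms such as a user agent prefix in the header or a meta tag named after a particular bot.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(headers: dict, html: str) -> bool:
    """Return True if the header or the HTML carries a noindex directive."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":  # header names are case-insensitive
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in header_directives or "noindex" in finder.directives
```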

What happens if robots.txt is missing?

Many sites work without one. Missing robots.txt means crawlers do not receive explicit crawl guidance or sitemap declarations from that file.
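
How a crawler reacts to the HTTP status of /robots.txt can be summarized in a few lines. The mapping below follows the general guidance in RFC 9309 (the Robots Exclusion Protocol); individual crawlers may deviate from it, and the function name and return labels are invented for this sketch.

```python
def crawl_policy_for_status(status: int) -> str:
    """Map the HTTP status of a robots.txt fetch to a crawl policy label."""
    if 200 <= status < 300:
        return "parse"         # file exists: apply the rules it contains
    if 400 <= status < 500:
        return "allow_all"     # file unavailable (e.g. 404): no restrictions
    if 500 <= status < 600:
        return "disallow_all"  # server error: assume everything is blocked
    return "undefined"         # e.g. an unfollowed redirect; handling varies
```

Under this mapping a 404 leaves the site open to crawling, while a persistent 503 has the opposite effect.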

Should robots.txt include sitemap.xml?

A Sitemap directive is useful because it points crawlers to the site's sitemap URLs, but it is not the only way crawlers discover sitemaps. Sitemaps can also be submitted directly, for example through a search engine's webmaster tools.

Can robots.txt hide private pages?

No. robots.txt is public and should not be used to protect private pages. Use authentication, authorization, and proper indexing controls for sensitive URLs.

Limitations

This tool parses common robots.txt syntax and flags obvious signals, but it does not fully emulate Googlebot or every crawler-specific rule precedence model. It also does not crawl pages or verify whether indexed URLs exist. See the methodology for how MyIPScan labels limited checks.
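
Rule precedence is one place where parsers disagree. RFC 9309 specifies that the most specific (longest) matching rule wins, with Allow preferred on a tie, while some simpler parsers apply the first matching rule in file order. The sketch below contrasts the two models using plain prefix matching only (no * or $ wildcards); both function names are hypothetical, and neither is claimed to be the model MyIPScan uses.

```python
def longest_match_allows(rules, path):
    """Longest matching prefix wins; Allow wins a tie. Default is allowed."""
    best_len = -1
    allowed = True
    for field, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue
        longer = len(pattern) > best_len
        tie_for_allow = len(pattern) == best_len and field == "allow"
        if longer or tie_for_allow:
            best_len = len(pattern)
            allowed = field == "allow"
    return allowed


def first_match_allows(rules, path):
    """The first matching rule in file order wins. Default is allowed."""
    for field, pattern in rules:
        if pattern and path.startswith(pattern):
            return field == "allow"
    return True


rules = [("disallow", "/private/"), ("allow", "/private/public/")]
path = "/private/public/page.html"
print(longest_match_allows(rules, path))  # the longer Allow rule wins
print(first_match_allows(rules, path))    # the earlier Disallow rule wins
```

The same file can therefore yield different answers for the same URL depending on which model a crawler implements, which is why a syntax check alone cannot predict every crawler's behavior.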