
Robots.txt

Robots Exclusion Protocol

A text file at the site root instructing search engine crawlers which pages or directories to avoid.

Technical details

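Crawler control relies on two mechanisms: robots.txt (site-level, controls crawling but not indexing) and meta robots tags (page-level, control indexing and link following). Common meta robots directives: 'noindex' (exclude the page from search results), 'nofollow' (don't follow the page's links or pass link equity through them), 'noarchive' (no cached copy). X-Robots-Tag HTTP headers provide the same controls for non-HTML resources (PDFs, images). A URL blocked in robots.txt can still appear in search results if other pages link to it, because the crawler never fetches the page and so never sees its directives. To keep a page out of search results reliably, serve 'noindex' in a meta tag or an X-Robots-Tag header and leave the page crawlable so the directive can be read.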
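
To make the page-level mechanism concrete, here is a minimal Python sketch that reads the directives out of a meta robots tag using only the standard library. The names `MetaRobotsParser`, `robots_directives`, and `PAGE` are hypothetical and chosen for illustration. The sketch assumes a single comma-separated `content` attribute and does not handle crawler-specific tags such as `name="googlebot"`.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Only the generic "robots" name is handled in this sketch.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the set of meta robots directives found in an HTML string."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

A crawler can only run a check like this on pages it is allowed to fetch, which is why a robots.txt block and a 'noindex' tag on the same page work against each other.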

Example

```
# robots.txt
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/internal/

Sitemap: https://peasytools.com/sitemap.xml
```
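
The file above combines a broad `Allow: /` with narrower `Disallow` rules. Under the Robots Exclusion Protocol (RFC 9309), the most specific matching rule wins, and `Allow` wins a tie. The Python sketch below illustrates that resolution for plain path prefixes. The names `is_allowed` and `RULES` are hypothetical, and the sketch deliberately ignores wildcards (`*`, `$`) and multiple user-agent groups, so treat it as an illustration and not a full parser.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs for one user-agent group.
    The longest matching prefix wins; on a tie, Allow wins.
    Wildcards are not handled in this sketch.
    """
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


# The rules from the example robots.txt, for the "*" user-agent group.
RULES = [
    ("Allow", "/"),
    ("Disallow", "/admin/"),
    ("Disallow", "/api/internal/"),
]

if __name__ == "__main__":
    for path in ("/blog/post", "/admin/users", "/api/internal/keys", "/api/public"):
        print(path, is_allowed(RULES, path))
```

Because `/admin/` is a longer match than `/`, paths under `/admin/` stay blocked even though `Allow: /` appears first in the file.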
