# ============================================================
# robots.txt
#
# POLICY SUMMARY
# --------------
# Allow: Major search engines (Google, Bing, Apple, Amazon),
#        Google Shopping, AI search indexers (OpenAI, Perplexity,
#        Claude/Anthropic, DuckDuckGo AI), Google Ads quality bot,
#        Meta Ads crawler, UptimeRobot, Pingdom, facebookexternalhit,
#        HeadlessChrome (internal sitemap tool)
#
# Block via robots.txt: Well-behaved crawlers that are expected
#        to respect this file.
#
# ============================================================

# ============================================================
# GLOBAL FALLBACK RULES
# (Applies to all unknown bots not explicitly listed below)
# ============================================================
User-agent: *
Disallow: /cart
Disallow: /checkout/
Disallow: /my-list
Disallow: /my-account/
Disallow: /search?*

# ============================================================
# ALLOWED — explicitly permitted (no "Disallow: /"). A bot that
# matches a named group ignores the "User-agent: *" rules, so the
# shared restrictions are repeated at the end of this group.
# ============================================================
# Google Search
User-agent: Googlebot
# Google Shopping / Merchant Center
User-agent: Storebot-Google
# Google Ads landing page quality
User-agent: AdsBot-Google
# OpenAI — Search indexing (not training)
User-agent: OAI-SearchBot
# OpenAI — ChatGPT browsing agent
User-agent: ChatGPT-User
# OpenAI — Web training crawler
# Allowed per client request for AI search visibility
User-agent: GPTBot
# Anthropic — Claude AI indexing
User-agent: ClaudeBot
# Perplexity AI — search indexing
User-agent: PerplexityBot
# DuckDuckGo AI assistant
User-agent: DuckAssistBot
# DuckDuckGo main search crawler
User-agent: DuckDuckBot
# Meta Ads — link preview and ad quality (NOT the AI training bot)
User-agent: facebookexternalhit
# Meta Ads — ad campaign quality crawler
User-agent: meta-externalads
# UptimeRobot — hosting provider monitoring
User-agent: UptimeRobot
# HeadlessChrome — internal sitemap generation tool
User-agent: HeadlessChrome
# Pingdom — performance monitoring (Must Have)
User-agent: Pingdom
# Apply dynamic page restrictions to ALL the allowed bots above
Disallow: /cart
Disallow: /checkout/
Disallow: /my-list
Disallow: /my-account/
Disallow: /search?*

# ============================================================
# ALLOWED - with crawl delay
# ============================================================
# Bing / Microsoft Search
User-agent: bingbot
# Apple Search / Siri / Spotlight
User-agent: Applebot
# Amazon Alexa / product discovery
User-agent: Amazonbot
Crawl-delay: 2
Disallow: /cart
Disallow: /checkout/
Disallow: /my-list
Disallow: /my-account/
Disallow: /search?*

# ============================================================
# BLOCKED — SEO data harvesters (respect robots.txt)
# ============================================================
# SEMrush — not a subscribed tool
User-agent: SemrushBot
# Ahrefs — not a subscribed tool
User-agent: AhrefsBot
# Moz — not a subscribed tool
User-agent: DotBot
# Awario — not a subscribed tool
User-agent: AwarioBot
# BrightEdge — not a subscribed tool
User-agent: Brightbot
# SE Ranking — not a subscribed tool
User-agent: SERankingBacklinksBot
# Screaming Frog — not confirmed as internal tool
User-agent: Screaming
# Majestic SEO
User-agent: MJ12bot
# Babbar.tech — European market, no relevance
User-agent: Barkrowler
# DataForSEO — bulk data reseller
User-agent: DataForSeoBot
# Huawei PetalSearch — no market relevance
User-agent: PetalBot
# 360/Qihoo — Chinese search engine, no market relevance
User-agent: 360Spider
# Cốc Cốc — Vietnamese browser/search, no market relevance
User-agent: coccocbot
# Yandex — Russian search, no market relevance
User-agent: YandexBot
User-agent: YandexImages
# ByteDance - Chinese IT platform, no market relevance
User-agent: Bytespider
# Baidu — Chinese search, no market relevance [WAF REQUIRED]
User-agent: Baiduspider
# Sogou — Chinese search, no market relevance
User-agent: Sogou
# Yisouspider — Chinese search aggregator
User-agent: YisouSpider
# Mojeek — niche UK search, no relevance
User-agent: MojeekBot
# Yelp — consumer reviews, irrelevant for B2B distributor
User-agent: YelpBot
# HubSpot — not a subscribed tool
User-agent: HubSpot
# Yext — not a subscribed tool
User-agent: YextBot
# Automattic — no WordPress/Jetpack dependency
User-agent: Automattic
# ZoomInfo — B2B data broker; harvests contact data
User-agent: ZoominfoBot
# Dataprovider — B2B data broker
User-agent: Dataprovider
# Modernize.com — not a registered platform partner
User-agent: ModernizeBot
# SiteLock — uninvited security scanner
User-agent: SiteLockSpider
# Recorded Future — not a subscribed security intel tool
User-agent: RecordedFutureBot
# SiteScore — uninvited scoring tool
User-agent: SiteScoreBot
# Amazon comparison shopping — pricing intelligence risk
User-agent: Amzn-SearchBot
# Legacy MSN crawler — superseded by bingbot
User-agent: msnbot
# Gaisbot — obscure/defunct crawler
User-agent: Gaisbot
# 2ip.ru — Russian IP/domain probe
User-agent: 2ip
# Surdotly — content aggregator
User-agent: SurdotlyBot
# ShapBot — no known legitimate purpose
User-agent: ShapBot
# E2Microlink — developer scraping API
User-agent: E2MicrolinkBot
# Meta — Web Indexer and External Agent (AI/Indexing)
User-agent: meta-externalagent
User-agent: meta-webindexer
# Common Crawl
User-agent: CCBot
Disallow: /

Sitemap: https://elliottshardware.com/sitemap.xml