AI Crawlers Are Not Google's Crawler
Google has been crawling the web for over two decades. Its crawler (Googlebot) is well-understood, and most websites are designed to accommodate it. AI model crawlers are newer, behave differently, and have different priorities.
Understanding what AI crawlers actually read on your website is essential for answer engine optimisation (AEO). If they cannot access your content, nothing else in your AI visibility strategy matters.
Which AI Crawlers Exist
The major AI crawlers include GPTBot (OpenAI, for ChatGPT), ClaudeBot (Anthropic, for Claude), and PerplexityBot (Perplexity). Google's AI features (Gemini and AI Overviews) rely on Google's own crawlers, with the Google-Extended robots.txt token controlling whether your content is used for Gemini.
Each crawler identifies itself with its own user-agent string, and their operators state that they honour robots.txt directives. This means you can control which AI crawlers access your site and which pages they can read.
What AI Crawlers Read
AI crawlers read your HTML content, including text, headings, lists, tables, and other structured elements. They process your meta tags, including title, description, and Open Graph tags. They read your schema markup (JSON-LD). They may read your llms.txt file if you have one, although support for this proposed convention varies between crawlers. They follow internal links to discover additional pages.
AI crawlers generally do not interpret images (they read the alt text instead), do not execute JavaScript (content that only appears after scripts run may be invisible to them), do not parse PDFs reliably, and do not read content inside iframes loaded from external sources.
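To make the list above concrete, here is a minimal Python sketch of what a non-rendering crawler could extract from raw HTML: the title, meta tags, JSON-LD, and image alt text. It uses only the standard library. The class name CrawlerView and the sample page are hypothetical, and real crawlers will differ in the details of what they keep.

```python
import json
from html.parser import HTMLParser


class CrawlerView(HTMLParser):
    """Collects parts of a page that a crawler can read without running JavaScript."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}       # meta name/property -> content
        self.json_ld = []    # parsed JSON-LD objects
        self.alt_texts = []  # alt attributes of images
        self._in_title = False
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped in this sketch

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_json_ld:
            self._buffer.append(data)


# Hypothetical sample page, for illustration only.
SAMPLE_HTML = """<html><head>
<title>Acme Plumbing</title>
<meta name="description" content="Emergency plumbing in Singapore.">
<meta property="og:title" content="Acme Plumbing">
<script type="application/ld+json">{"@type": "Plumber", "name": "Acme Plumbing"}</script>
</head><body><img src="van.jpg" alt="Acme service van"></body></html>"""

view = CrawlerView()
view.feed(SAMPLE_HTML)
print(view.title)
print(view.meta)
print(view.json_ld)
print(view.alt_texts)
```

Running a parser like this against your own page source is a quick way to see whether the information you care about is present in the HTML itself.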
How to Check Your Robots.txt
Your robots.txt file (at yourdomain.com/robots.txt) controls which crawlers can access which pages. Some websites block AI crawlers by default, either intentionally or because their robots.txt was configured before AI crawlers existed.
Check your robots.txt for directives that block GPTBot, Anthropic's crawler (ClaudeBot), or PerplexityBot. If you want AI models to recommend your business, you need to allow their crawlers to access your content.
A common mistake is blocking all bots except Googlebot. This prevents AI models from reading your content while still allowing Google to index it. For businesses that want AI visibility, the correct approach is to allow both Google and AI crawlers.
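The check described above can be sketched with Python's standard urllib.robotparser module. The two robots.txt samples, the example URL, and the helper name crawler_access are hypothetical. The user-agent tokens are the ones the operators publish at the time of writing, so verify them against each operator's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; confirm against each operator's documentation.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def crawler_access(robots_txt, url, agents=AI_AGENTS):
    """Return {agent: allowed} for the given robots.txt text and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# The "Googlebot only" pattern: every other bot, including AI crawlers, is blocked.
GOOGLE_ONLY = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

# One way to allow both Google and the AI crawlers while blocking other bots.
ALLOW_AI = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
"""

url = "https://example.com/services"
print(crawler_access(GOOGLE_ONLY, url))
print(crawler_access(ALLOW_AI, url))
```

To test a live site, the same parser can load the file with set_url() and read() instead of parse(). Note that Python's parser is only an approximation of how each crawler interprets the file.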
The JavaScript Problem
Many modern websites use JavaScript frameworks (React, Vue, Angular) to render content. Googlebot can render JavaScript content, but most AI crawlers either do not execute JavaScript or do so unreliably.
If your website's main content is rendered by JavaScript on the client side, AI crawlers may see a blank page. This is a significant problem for businesses using single-page applications or heavy JavaScript frameworks without server-side rendering.
The solution is server-side rendering (SSR) or static site generation (SSG), which delivers the HTML content directly without requiring JavaScript execution. If your website uses a JavaScript framework, confirm that your content is accessible without JavaScript enabled, for example by viewing the page source or by disabling JavaScript in your browser and reloading the page.
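As a rough way to run that check in code, the Python sketch below extracts the text present in raw HTML without executing any scripts and flags pages that contain almost none. The helper names, the two sample pages, and the ten-word threshold are all assumptions chosen for illustration, not an established test.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text from raw HTML, ignoring script, style, template and title content."""

    SKIP = {"script", "style", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html):
    """Return the text a crawler would find without running JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html, min_words=10):
    """Heuristic: very little text in the raw HTML suggests client-side rendering."""
    return len(text_without_js(html).split()) < min_words


# Hypothetical single-page-app shell: the content only appears after bundle.js runs.
SPA_SHELL = (
    '<html><head><title>App</title></head><body>'
    '<div id="root"></div><script src="/bundle.js"></script></body></html>'
)

# Hypothetical server-rendered page: the content is already in the HTML.
SSR_PAGE = (
    "<html><body><h1>Emergency plumbing</h1>"
    "<p>We repair leaks, clear blocked drains and install water heaters "
    "across Singapore, seven days a week.</p></body></html>"
)

print(looks_client_rendered(SPA_SHELL))
print(looks_client_rendered(SSR_PAGE))
```

In practice you would pass in the HTML returned by a plain HTTP request to your own page, since that is closer to what a non-rendering crawler receives.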
Optimising for AI Crawlers
Beyond just allowing access, you can optimise what AI crawlers find.
Put your most important content on pages with clean URLs and clear navigation. AI crawlers follow links, so well-structured internal linking helps them discover all your content.
Use descriptive headings (H1, H2, H3) to structure your content. AI crawlers use heading hierarchy to understand content organisation and importance.
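One way to review a page's heading structure is to extract its outline and look for skipped levels. The Python sketch below does this with the standard library. The class and function names and the sample HTML are hypothetical, and treating a skipped level as a problem is a common convention assumed here, not a documented crawler requirement.

```python
from html.parser import HTMLParser

LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingOutline(HTMLParser):
    """Collects (level, text) pairs for every heading in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in LEVELS:
            self._level = LEVELS[tag]
            self._text = []

    def handle_endtag(self, tag):
        if tag in LEVELS and self._level is not None:
            text = " ".join("".join(self._text).split())
            self.headings.append((self._level, text))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)


def skipped_levels(headings):
    """Report headings that jump more than one level deeper than the previous one."""
    problems = []
    previous = None
    for level, text in headings:
        if previous is not None and level > previous + 1:
            problems.append(f"'{text}' jumps from H{previous} to H{level}")
        previous = level
    return problems


# Hypothetical page fragment with an H2 followed directly by an H4.
SAMPLE = "<h1>Plumbing services</h1><h2>Leak repair</h2><h4>Pricing</h4>"

outline = HeadingOutline()
outline.feed(SAMPLE)
print(outline.headings)
print(skipped_levels(outline.headings))
```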
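Implement an llms.txt file at your site root. This is a structured summary written specifically for AI models. It is a proposed convention and not a formal standard, so support varies between crawlers, but it gives you a direct way to tell AI models what matters on your site. Our llms.txt guide covers implementation.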
Keep your page load times fast. AI crawlers, like all crawlers, have limited time budgets. Slow pages may be partially crawled or skipped entirely.
Monitoring AI Crawler Activity
Check your server logs for AI crawler user-agent strings. This tells you whether AI crawlers are visiting your site, which pages they are accessing, and how frequently they crawl.
If you see no AI crawler activity, your robots.txt may be blocking them, or your site may not have enough external signals (backlinks, mentions) for AI crawlers to prioritise crawling it.
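A log check along these lines can be sketched in Python. The example assumes your server writes the common "combined" access log format, and the sample log lines, IP addresses, and user-agent strings are made up for illustration. Real user-agent strings are longer, so the sketch only looks for the crawler token inside them.

```python
import re
from collections import Counter

# Assumed user-agent tokens; confirm against each operator's documentation.
AI_CRAWLER_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

# Combined log format: ... "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def ai_crawler_hits(lines):
    """Count requests per (crawler token, path) across access log lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # line is not in the expected format
        agent = match.group("agent").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[(token, match.group("path"))] += 1
    return hits


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2025:08:15:02 +0800] "GET /services HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/Mar/2025:09:40:11 +0800] "GET /services HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [10/Mar/2025:10:02:47 +0800] "GET /pricing HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [10/Mar/2025:10:05:00 +0800] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for (token, path), count in sorted(ai_crawler_hits(SAMPLE_LOG).items()):
    print(token, path, count)
```

Because user-agent strings can be spoofed, treat counts like these as an indication and not as proof of which crawler made the request.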
For the full technical accessibility assessment, see our five-dimension audit framework.
Talk to Us
Chat with us on WhatsApp to discuss your website's AI accessibility. We reply within one Singapore business day.
