AI Crawler Tracking

Savri can show you exactly which AI bots fetch your site (GPTBot, ClaudeBot, PerplexityBot and more) once you install a small snippet on your server.

Why track AI crawlers?

AI bots generally don't run JavaScript, so they're invisible to regular JavaScript-based analytics. Yet they decide whether your pages can be cited by ChatGPT, Perplexity, Claude and others. Tracking them tells you:

Which AI models know about your site: Each platform runs its own bot. Seeing PerplexityBot but no GPTBot? Now you know where you stand with each AI.
Which pages are indexed most: Pages AI fetches often are the ones most likely to be cited. That tells you where to focus optimization energy.
When indexing happens: Daily GPTBot traffic means OpenAI is updating its view of your site. Long pauses can signal your content isn't being treated as fresh.

How it works

The snippet checks the User-Agent on every request to your server. When it matches a known AI crawler, the snippet makes an async POST to https://savri.io/api/ai-crawl with your site ID, the requested path and the User-Agent. Everything runs server-side, so there is zero impact on your visitors' experience.

We only store crawler name, URL, User-Agent and timestamp. No personal data, no cookies. Tracking applies only to bots; real visitors are ignored.
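
The matching step described above can be sketched as a small pure function. This is a minimal illustration, not Savri's actual implementation: the names detectAiCrawler and buildReport are hypothetical, the pattern list is a shortened subset of the one used in the installation snippets, and the payload fields mirror what those snippets send.

```typescript
// Shortened, illustrative subset of the crawler patterns (assumption: the
// full list is the one shown in the installation snippets).
const PATTERNS: string[] = [
  'GPTBot', 'ClaudeBot', 'PerplexityBot', 'Applebot-Extended', 'Applebot',
];

// Hypothetical helper: returns the first pattern found in the User-Agent
// (case-insensitive substring match), or null for ordinary visitors.
function detectAiCrawler(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  return PATTERNS.find((p) => ua.includes(p.toLowerCase())) ?? null;
}

// Same fields the installation snippets POST to the API.
interface CrawlReport {
  siteId: string;
  pathname: string;
  userAgent: string;
}

// Hypothetical helper: builds the report body, or null when nothing should be sent.
function buildReport(siteId: string, pathname: string, userAgent: string): CrawlReport | null {
  return detectAiCrawler(userAgent) ? { siteId, pathname, userAgent } : null;
}
```

Because more specific names such as Applebot-Extended are listed before Applebot, the first match is the most specific one. Requests from ordinary browsers yield null, so nothing is reported for real visitors.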

Installation

Pick your platform. Replace YOUR_SITE_ID with your site ID (the same one in your regular tracking snippet). The snippet runs in parallel with our regular JS tracker; the two don't interfere.

Tip: if you are logged in, you'll find the same snippets prefilled with your site ID under AI Insights on your site's dashboard, along with a test button that verifies the installation.

Next.js (App Router or Pages)

Add to your existing middleware.ts (or create one). On Vercel, middleware runs at the edge, so the added latency is minimal.

// middleware.ts (create in project root, or merge into your existing middleware)
import { NextResponse } from 'next/server';
import type { NextRequest, NextFetchEvent } from 'next/server';

const SITE_ID = 'YOUR_SITE_ID';
const AI_API = 'https://savri.io/api/ai-crawl';

const AI_CRAWLER_PATTERNS = [
  'OAI-SearchBot', 'ChatGPT-User', 'GPTBot', 'Claude-User',
  'Claude-SearchBot', 'Claude-Web', 'ClaudeBot', 'anthropic-ai',
  'Perplexity-User', 'PerplexityBot', 'Google-Extended', 'GoogleOther',
  'Google-CloudVertexBot', 'Google-NotebookLM', 'GoogleAgent-Mariner', 'MistralAI-User',
  'Grok-DeepSearch', 'xAI-Grok', 'GrokBot', 'Applebot-Extended',
  'Applebot', 'meta-externalfetcher', 'Meta-ExternalFetcher', 'meta-externalagent',
  'Meta-ExternalAgent', 'Amazonbot', 'cohere-ai', 'cohere-training-data-crawler',
  'Bytespider', 'CCBot', 'DuckAssistBot', 'Diffbot',
  'YouBot',
];

export function middleware(request: NextRequest, event: NextFetchEvent) {
  const ua = request.headers.get('user-agent') || '';
  const matched = AI_CRAWLER_PATTERNS.some(p =>
    ua.toLowerCase().includes(p.toLowerCase())
  );

  if (matched) {
    // waitUntil keeps the report alive after the response is sent
    // (a plain fire-and-forget fetch can be cancelled on serverless hosts)
    event.waitUntil(
      fetch(AI_API, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          siteId: SITE_ID,
          pathname: request.nextUrl.pathname,
          userAgent: ua,
        }),
      }).catch(() => {})
    );
  }

  return NextResponse.next();
}

export const config = {
  matcher: ['/((?!_next|favicon.ico).*)'],
};

WordPress

Easiest: use the WordPress plugin

The Savri Analytics plugin (version 1.1.0 or later), available on wordpress.org, reports AI bot visits automatically. No coding needed.

Manual alternative: add the snippet below to your theme's functions.php or create an mu-plugin. Calling wp_remote_post with 'blocking' => false sends the request without waiting for the response, so it doesn't hold up page load.

// Add to functions.php in your active theme (or as an mu-plugin)

add_action('template_redirect', function () {
    $site_id = 'YOUR_SITE_ID';
    $api = 'https://savri.io/api/ai-crawl';

    $patterns = array(
        'OAI-SearchBot', 'ChatGPT-User', 'GPTBot', 'Claude-User',
        'Claude-SearchBot', 'Claude-Web', 'ClaudeBot', 'anthropic-ai',
        'Perplexity-User', 'PerplexityBot', 'Google-Extended', 'GoogleOther',
        'Google-CloudVertexBot', 'Google-NotebookLM', 'GoogleAgent-Mariner', 'MistralAI-User',
        'Grok-DeepSearch', 'xAI-Grok', 'GrokBot', 'Applebot-Extended',
        'Applebot', 'meta-externalfetcher', 'Meta-ExternalFetcher', 'meta-externalagent',
        'Meta-ExternalAgent', 'Amazonbot', 'cohere-ai', 'cohere-training-data-crawler',
        'Bytespider', 'CCBot', 'DuckAssistBot', 'Diffbot',
        'YouBot',
    );

    $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
    foreach ($patterns as $p) {
        if (stripos($ua, $p) !== false) {
            wp_remote_post($api, array(
                'blocking' => false, // async, never slows down the page
                'timeout'  => 1,
                'headers'  => array('Content-Type' => 'application/json'),
                'body'     => wp_json_encode(array(
                    'siteId'    => $site_id,
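                    // path only, without the query string, matching the other snippets
                    'pathname'  => wp_parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH) ?: '/',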
                    'userAgent' => $ua,
                )),
            ));
            break;
        }
    }
});

Node.js / Express

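Middleware for Express; the same check ports to Fastify and similar frameworks. It uses the built-in fetch (Node 18+), so no extra dependency is needed. The code assumes app is your Express application.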

// Express middleware, add before your routes
const SITE_ID = 'YOUR_SITE_ID';
const AI_API = 'https://savri.io/api/ai-crawl';

const AI_CRAWLER_PATTERNS = [
  'OAI-SearchBot', 'ChatGPT-User', 'GPTBot', 'Claude-User',
  'Claude-SearchBot', 'Claude-Web', 'ClaudeBot', 'anthropic-ai',
  'Perplexity-User', 'PerplexityBot', 'Google-Extended', 'GoogleOther',
  'Google-CloudVertexBot', 'Google-NotebookLM', 'GoogleAgent-Mariner', 'MistralAI-User',
  'Grok-DeepSearch', 'xAI-Grok', 'GrokBot', 'Applebot-Extended',
  'Applebot', 'meta-externalfetcher', 'Meta-ExternalFetcher', 'meta-externalagent',
  'Meta-ExternalAgent', 'Amazonbot', 'cohere-ai', 'cohere-training-data-crawler',
  'Bytespider', 'CCBot', 'DuckAssistBot', 'Diffbot',
  'YouBot',
];

app.use((req, res, next) => {
  const ua = req.get('user-agent') || '';
  const matched = AI_CRAWLER_PATTERNS.some(p =>
    ua.toLowerCase().includes(p.toLowerCase())
  );

  if (matched) {
    fetch(AI_API, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ siteId: SITE_ID, pathname: req.path, userAgent: ua }),
    }).catch(() => {});
  }

  next();
});
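
For Fastify, the same check can be registered as an onRequest hook. The sketch below is an adaptation under stated assumptions rather than an official Savri snippet: makeAiCrawlerHook, MinimalRequest and the injectable send parameter are hypothetical names, and the pattern list is shortened for brevity (copy the full list from the Express snippet above).

```typescript
const SITE_ID = 'YOUR_SITE_ID';
const AI_API = 'https://savri.io/api/ai-crawl';
// Shortened for illustration; use the full list from the Express snippet.
const AI_CRAWLER_PATTERNS = ['GPTBot', 'ClaudeBot', 'PerplexityBot'];

type Report = { siteId: string; pathname: string; userAgent: string };

// Structural subset of Fastify's request object that this hook relies on.
type MinimalRequest = {
  url: string;
  headers: Record<string, string | string[] | undefined>;
};

// Default sender: fire-and-forget POST using the built-in fetch (Node 18+).
const postReport = (report: Report): void => {
  fetch(AI_API, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(report),
  }).catch(() => {});
};

// Hypothetical factory: `send` is injectable so the hook can be exercised
// without network access.
function makeAiCrawlerHook(send: (report: Report) => void = postReport) {
  return async (request: MinimalRequest): Promise<void> => {
    const raw = request.headers['user-agent'];
    const ua = Array.isArray(raw) ? (raw[0] ?? '') : (raw ?? '');
    const lower = ua.toLowerCase();
    if (AI_CRAWLER_PATTERNS.some((p) => lower.includes(p.toLowerCase()))) {
      // request.url includes the query string; keep only the path
      const pathname = request.url.split('?')[0] || '/';
      send({ siteId: SITE_ID, pathname, userAgent: ua });
    }
  };
}

// Usage (assumes `fastify` is your Fastify instance):
// fastify.addHook('onRequest', makeAiCrawlerHook());
```

The hook never awaits the report, so the response to the crawler is not delayed by the call to the API.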

Which AI bots are tracked?

We currently recognize crawlers from these AI providers. The list grows as new bots appear.

OpenAI: OAI-SearchBot, ChatGPT-User, GPTBot

Anthropic: Claude-User, Claude-SearchBot, Claude-Web, ClaudeBot, anthropic-ai

Perplexity: Perplexity-User, PerplexityBot

Google: Google-Extended, GoogleOther, Google-CloudVertexBot, Google-NotebookLM, GoogleAgent-Mariner

Mistral AI: MistralAI-User

xAI: Grok-DeepSearch, xAI-Grok, GrokBot

Apple: Applebot-Extended, Applebot

Meta: meta-externalfetcher, meta-externalagent

Amazon: Amazonbot

Cohere: cohere-ai, cohere-training-data-crawler

ByteDance: Bytespider

Common Crawl: CCBot

DuckDuckGo: DuckAssistBot

Diffbot: Diffbot

You.com: YouBot

FAQ

Does the snippet affect my site's performance?

No. All variants send the report asynchronously (fire-and-forget), so it runs in the background without blocking the response to either the visitor or the bot.

What if a bot is blocked by my CDN/firewall?

Then the snippet never sees the request, and we never get a report. That's correct behavior. If you want to see ALL bots including blocked ones, you'd need to log at the edge/CDN level instead.
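
If your CDN or reverse proxy writes access logs, edge-level visibility can be approximated by scanning those logs for the same User-Agent patterns. The following is a rough sketch under stated assumptions: it expects log lines in the common "combined" format (request line, status, size, referer, User-Agent), uses a shortened pattern list, and the name findAiCrawlerHits is hypothetical. It only reads logs and does not send anything to Savri.

```typescript
// Shortened for illustration; use the full list from the installation snippets.
const PATTERNS = ['GPTBot', 'ClaudeBot', 'PerplexityBot'];

interface Hit {
  crawler: string;
  path: string;
  status: number;
}

// Assumed "combined" log layout at the end of each line:
// "METHOD /path PROTOCOL" status bytes "referer" "user-agent"
const COMBINED = /"[A-Z]+ (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"\s*$/;

// Hypothetical helper: returns one entry per log line whose User-Agent
// contains a known crawler pattern; unparseable lines are skipped.
function findAiCrawlerHits(lines: string[]): Hit[] {
  const hits: Hit[] = [];
  for (const line of lines) {
    const m = COMBINED.exec(line);
    if (!m) continue;
    const [, path, status, ua] = m;
    const lower = ua.toLowerCase();
    const crawler = PATTERNS.find((p) => lower.includes(p.toLowerCase()));
    if (crawler) hits.push({ crawler, path, status: Number(status) });
  }
  return hits;
}
```

A hit with a 403 status, for example, would indicate a crawler that was turned away at the edge and therefore never reached the snippet.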

Can I report additional crawlers myself?

Right now the known-crawler list is centralized, so only bots we recognize are stored. If you spot a new AI crawler we miss, let us know and we'll add it.
