AEO / GEO · SEO · AI · Marketing

AEO Citation Monitor dataset

Track how AI search engines (ChatGPT, Claude, Gemini, Perplexity, Grok, Google AI Overviews) cite your brand for the prompts your customers ask.

Records: Per-run, schema-validated
Pricing: $0.01/prompt × provider record
Format: JSON · CSV · HTML

Fields you get

Field                Type           Description
recordId             string         Stable hash of (prompt + provider + runStartedAt). Deterministic.
runId                string         UUID grouping every record from one Actor invocation.
organizationId       string | null  Pass-through tenant tag from input.
promptText           string         The exact prompt sent to the provider, verbatim.
promptCategory       string | null  Buyer-side category tag (or template-supplied) for dashboard pivots.
provider             string         One of: openai, anthropic, google-gemini, perplexity, xai-grok, google-aio.
model                string         Provider-specific model identifier (gpt-5.5, claude-sonnet-4-6, etc.).
modelTier            string         "tracking" (the monitored model) or "utility" (sentiment/discovery helpers).
groundingUsed        boolean        True when the provider used live web grounding for this query.
queriedAt            string         ISO 8601 timestamp when the provider was called.
responseLatencyMs    number         Wall-clock latency from request to fully-parsed response.
costUsd              number         Upstream provider cost in USD (token-only; excludes Apify PPR).
transport            string         Which transport handled the call: direct | vercel | openrouter | serp-direct | serp-fallback.
locale               object | null  Country/language applied to this resolution, with method=native|system-prompt-instruction.
responseText         string         Full provider response text, verbatim. For audit and re-parse.
responseTokens       object | null  Token usage (input/output/cached) when the provider reports it.
brandMentions        array          Your brand's mentions in the response: count, list rank, up to 3 surrounding contexts.
competitorMentions   array          Competitor mentions, same shape as brandMentions.
citations            array          Cited URLs with isOwned/isCompetitor flags and rank position.
positioningSummary   string | null  Optional one-line summary of how the brand was positioned.
rawProviderResponse  object | null  Verbatim provider response payload (opt-in via emitRawProviderResponse).
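
To illustrate what a deterministic recordId buys you (re-ingesting the same record never creates a duplicate key), here is a minimal sketch. The join separator, the argument order, and the function name make_record_id are assumptions for illustration; the Actor's actual serialization is not documented here, so IDs from this sketch will not necessarily match the Actor's.

```python
import hashlib

def make_record_id(prompt_text: str, provider: str, run_started_at: str) -> str:
    """Return a SHA-256 hex digest over the three identifying fields."""
    # Assumption: fields are joined with a unit separator so that
    # ("ab", "c") and ("a", "bc") cannot produce the same key.
    key = "\x1f".join([prompt_text, provider, run_started_at])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

# Illustrative values only; the timestamp is made up for the example.
a = make_record_id("Best Clash Royale coaching apps", "perplexity", "2025-01-06T09:00:00Z")
b = make_record_id("Best Clash Royale coaching apps", "perplexity", "2025-01-06T09:00:00Z")
c = make_record_id("Best Clash Royale coaching apps", "anthropic", "2025-01-06T09:00:00Z")
```

Here a and b are identical, while c differs because the provider changed, so recordId can serve as an upsert key in a warehouse table.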

How it works

Track how the AI search ecosystem cites your brand. The Actor sends your prompts to ChatGPT, Claude, Gemini, Perplexity, xAI Grok, and Google AI Overviews, parses every response into a structured record, and emits the dataset for your dashboard / SQL / spreadsheet.

What does AEO Citation Monitor do?

For each prompt × provider combination, the Actor:

  1. Queries the AI engine via its sanctioned API
  2. Parses the response for your brand mentions (with surrounding context, list-rank position, optional sentiment)
  3. Captures competitor mentions in the same response
  4. Extracts every cited URL with isOwned / isCompetitor flags and rank position
  5. Records transport, cost, latency, and grounding state for full audit
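
Step 4 above can be sketched roughly as follows. This is not the Actor's parser: the helper names and the subdomain rule (a subdomain of an owned domain counts as owned) are assumptions, and only the output field names come from the schema.

```python
from urllib.parse import urlparse

def _domain(url: str) -> str:
    """Lower-cased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def _matches(host: str, domains) -> bool:
    # Assumption: blog.example.com counts as a match for example.com.
    return any(host == d or host.endswith("." + d) for d in domains)

def classify_citations(urls, owned_domains, competitor_domains):
    """Tag each cited URL in order of appearance (rankPosition starts at 1)."""
    out = []
    for rank, url in enumerate(urls, start=1):
        host = _domain(url)
        out.append({
            "url": url,
            "domain": host,
            "isOwned": _matches(host, owned_domains),
            "isCompetitor": _matches(host, competitor_domains),
            "rankPosition": rank,
        })
    return out

citations = classify_citations(
    ["https://royaleapi.com/blog/best-coaching-apps", "https://clashcoachai.com/features"],
    owned_domains=["clashcoachai.com"],
    competitor_domains=["royalebuddy.com", "metadecks.gg", "royaleapi.com"],
)
```

With these inputs the first citation is flagged as a competitor at rank 1 and the second as owned at rank 2.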

One record per (prompt × provider × run). Schema published on npm as @apify-portfolio/aeo-schema so dashboards consume the shape with type-safe ingestion.

Why use it?

The AEO/GEO category went from "emerging" to a venture-funded race in 12 months — Profound raised $155M, Scrunch $26M, OtterlyAI claims 20K marketing pros. They all sell vertically integrated SaaS at $99–$5K+/mo.

This Actor is the data layer underneath those products, priced per-result. Buyers who want the structured citation data without paying SaaS rent — agencies, in-house SEO teams, brand managers, comms teams, e-commerce retailers — pay $0.010–$0.575 per record (provider-tiered) and get the same primitives the SaaS dashboards rebuild on top of.

Example use cases

  • In-house SEO at SMB-mid-market — weekly tracking of brand citations in AI responses for $30/mo (vs $399 minimum SaaS tier).
  • Agencies — bulk-monitor 50 client brands × 100 prompts × 6 providers weekly ≈ 130K resolutions/mo at agency-grade unit economics.
  • Marketing analytics teams — pipe the Apify dataset into BigQuery / Snowflake / Looker for native AEO dashboards.
  • PR / comms teams — see whether AI engines cite your brand for sensitive topics, with sentiment and rank.
  • E-commerce / D2C — track product-discovery prompts ("best running shoes for marathon training") to see how AI ranks your brand.
  • Fintech / regulated verticals — built-in template covers compliance/regulation/security-question prompts.

How does it work?

For each LLM provider, the Actor uses a 3-tier transport chain with automatic fallback:

direct API → Vercel AI Gateway → OpenRouter

If the direct API call returns HTTP 429 or a 5xx error, the Actor falls back to the gateway routes automatically. Records carry transport: 'direct' | 'vercel' | 'openrouter' for audit.
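
The fallback behaviour can be pictured with a small sketch. TransportError, resolve_with_fallback, and the stub senders are hypothetical names invented for this example; the Actor's internals may differ, for instance in how it retries or backs off.

```python
class TransportError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""
    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status

def is_retryable(status: int) -> bool:
    # Rate limiting (429) and server errors (5xx) trigger the next transport.
    return status == 429 or 500 <= status <= 599

def resolve_with_fallback(prompt, transports):
    """Try each (name, send) pair in order; return the first success."""
    last_error = None
    for name, send in transports:
        try:
            return {"transport": name, "responseText": send(prompt)}
        except TransportError as err:
            if not is_retryable(err.status):
                raise  # e.g. a 400 or 401 will not be fixed by another route
            last_error = err
    raise RuntimeError("all transports failed") from last_error

# Stub senders standing in for real API clients.
def flaky_direct(prompt):
    raise TransportError(429)

def gateway(prompt):
    return f"answer to: {prompt}"

result = resolve_with_fallback(
    "Best Clash Royale coaching apps",
    [("direct", flaky_direct), ("vercel", gateway), ("openrouter", gateway)],
)
```

Because the direct stub raises a 429, the record ends up with transport set to "vercel", which is the value you would see in the dataset for that call.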

Google AI Overviews uses an independent SERP partner chain (DataForSEO → SerpAPI).

Vertical templates ship 7 starter prompt sets pre-grouped by intent (saas-b2b, ecommerce-d2c, local-services, agency, media-publisher, fintech, plus custom). Each template generates ~25 prompts grouped into 4-6 categories, so dashboard pivots work out of the box.

Locale targeting routes country/language hints natively where supported (Perplexity web_search_options.user_location, OpenAI web_search.user_location, Google AIO via DataForSEO) and falls back to a system-prompt instruction for Anthropic/Gemini/Grok. Each record's locale.method field tells you which approach was used.
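
As a rough sketch of how locale.method could be derived from the provider, assuming the provider split described above (the set name and function name are illustrative, not the Actor's code):

```python
# Providers the text above lists as supporting native locale parameters.
NATIVE_LOCALE_PROVIDERS = {"perplexity", "openai", "google-aio"}

def resolve_locale(provider: str, country: str, language: str) -> dict:
    """Build the locale object stored on a record."""
    if provider in NATIVE_LOCALE_PROVIDERS:
        method = "native"
    else:
        # anthropic, google-gemini, xai-grok: hint goes into the system prompt
        method = "system-prompt-instruction"
    return {"country": country, "language": language, "method": method}

perplexity_locale = resolve_locale("perplexity", "US", "en")
anthropic_locale = resolve_locale("anthropic", "US", "en")
```

When comparing results across providers, filtering on locale.method lets you separate natively localized answers from those that only received a prompt-level hint.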

Parallel execution runs multiple prompts concurrently, with all configured providers fanning out per-prompt. A 25-prompt × 6-provider sweep that takes 55 minutes sequentially completes in ~9 minutes parallel.
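
The fan-out pattern can be sketched with asyncio. This is an illustration of the idea, not the Actor's runner: query_provider is a stand-in for a real API call, and the concurrency limit of 4 prompts is an assumed value.

```python
import asyncio

async def query_provider(prompt: str, provider: str) -> dict:
    await asyncio.sleep(0.01)  # stand-in for the network round trip
    return {"promptText": prompt, "provider": provider}

async def run_sweep(prompts, providers, max_concurrent_prompts: int = 4):
    gate = asyncio.Semaphore(max_concurrent_prompts)

    async def run_prompt(prompt):
        # At most max_concurrent_prompts prompts are in flight at once;
        # within a prompt, every provider is queried concurrently.
        async with gate:
            return await asyncio.gather(
                *(query_provider(prompt, p) for p in providers)
            )

    per_prompt = await asyncio.gather(*(run_prompt(p) for p in prompts))
    return [record for group in per_prompt for record in group]

records = asyncio.run(run_sweep(["p1", "p2", "p3"], ["perplexity", "anthropic"]))
```

asyncio.gather preserves input order, so records come back grouped by prompt and then by provider regardless of which call finished first.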

Input

{
  "prompts": [
    "What is the best AI coach app for Clash Royale players?",
    "How can I improve my ladder ranking in Clash Royale?"
  ],
  "brand": {
    "name": "Clash Coach AI",
    "aliases": ["ClashCoachAI", "Clash Coach", "ClashCoach.ai"],
    "ownedDomains": ["clashcoachai.com"]
  },
  "competitors": [
    { "name": "Royale Buddy", "ownedDomains": ["royalebuddy.com"] },
    { "name": "MetaDecks", "ownedDomains": ["metadecks.gg"] },
    { "name": "RoyaleAPI", "ownedDomains": ["royaleapi.com"] }
  ],
  "providers": ["perplexity", "anthropic", "xai-grok", "google-aio"],
  "locale": { "country": "US", "language": "en" },
  "acknowledgePublicBrandsOnly": true
}

Output (one item per prompt × provider)

{
  "recordId": "<sha256>",
  "runId": "<uuid>",
  "promptText": "What is the best AI coach app for Clash Royale players?",
  "promptCategory": "discovery",
  "provider": "perplexity",
  "model": "sonar",
  "modelTier": "tracking",
  "groundingUsed": true,
  "transport": "direct",
  "locale": { "country": "US", "language": "en", "method": "native" },
  "costUsd": 0.0007,
  "responseLatencyMs": 11342,
  "responseText": "Several AI-powered tools can help Clash Royale players improve...",
  "brandMentions": [
    {
      "brand": "Clash Coach AI",
      "mentionCount": 1,
      "rankPosition": 4,
      "contexts": [
        { "text": "...Clash Coach AI offers AI-driven battle analysis...", "charStart": 312, "charEnd": 326 }
      ]
    }
  ],
  "competitorMentions": [
    {
      "brand": "Royale Buddy",
      "mentionCount": 2,
      "rankPosition": 1
    }
  ],
  "citations": [
    {
      "url": "https://clashcoachai.com/features",
      "domain": "clashcoachai.com",
      "title": "Clash Coach AI — Features",
      "citationType": "grounded",
      "isOwned": true,
      "isCompetitor": false,
      "rankPosition": 2
    },
    {
      "url": "https://royaleapi.com/blog/best-coaching-apps",
      "domain": "royaleapi.com",
      "title": "Best Clash Royale Coaching Apps",
      "citationType": "grounded",
      "isOwned": false,
      "isCompetitor": true,
      "rankPosition": 1
    }
  ]
}

Pricing

Pay-per-event with provider-tiered rates that track upstream cost basis:

Event                                 Price
aeo-resolve-perplexity                $0.010
aeo-resolve-aio (Google AI Overview)  $0.015
aeo-resolve-light (Anthropic, Grok)   $0.020
aeo-resolve-gemini                    $0.025
aeo-resolve-openai-base               $0.075
aeo-resolve-openai-grounding-light    +$0.05
aeo-resolve-openai-grounding-medium   +$0.20
aeo-resolve-openai-grounding-heavy    +$0.50
aeo-sentiment-tagged                  +$0.005
aeo-prompt-discovery                  $0.05
aeo-raw-response-passthrough          +$0.001

A 10-prompt × 6-provider sweep with default settings (OpenAI grounded at medium bracket, the typical case) costs $3.65 total — vs $99–$5K/mo SaaS minimums for the same data shape.
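
The $3.65 figure can be reproduced from the event prices in the pricing table. The sketch below is a budgeting aid under stated assumptions: it ignores the optional sentiment, discovery, and raw-response events, and the function and dictionary names are illustrative.

```python
from decimal import Decimal

# Base resolve price per provider, taken from the pricing table.
BASE_PRICE_USD = {
    "perplexity": Decimal("0.010"),
    "google-aio": Decimal("0.015"),
    "anthropic": Decimal("0.020"),      # aeo-resolve-light
    "xai-grok": Decimal("0.020"),       # aeo-resolve-light
    "google-gemini": Decimal("0.025"),
    "openai": Decimal("0.075"),         # aeo-resolve-openai-base
}
# Surcharge added on top of the OpenAI base price, by grounding bracket.
OPENAI_GROUNDING_USD = {
    "light": Decimal("0.05"),
    "medium": Decimal("0.20"),
    "heavy": Decimal("0.50"),
}

def estimate_sweep_cost(n_prompts: int, providers, openai_bracket: str = "medium") -> Decimal:
    per_prompt = sum((BASE_PRICE_USD[p] for p in providers), Decimal("0"))
    if "openai" in providers:
        per_prompt += OPENAI_GROUNDING_USD[openai_bracket]
    return per_prompt * n_prompts

total = estimate_sweep_cost(10, list(BASE_PRICE_USD))
```

Each prompt costs $0.365 across all six providers at the medium bracket, so ten prompts come to $3.65. Decimal is used instead of float so the cents add up exactly.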

Limits and notes

  • Word-boundary brand matching only in v1 — list every spelling variant in brand.aliases. No fuzzy matching to avoid false positives like "Clash" matching "Clashing."
  • OpenAI grounding cost varies dramatically. A single call ranges from ~$0.005 (answered from training data only) to $0.40+ (deep grounded with many sources). The bracketed pricing covers this; the maxCostUsdPerRecord guard (default $0.50) catches outliers.
  • No webhook output — Apify dataset is the v1 sink. Apify's own integrations (Zapier, Make, Slack via webhook) handle delivery to downstream systems.
  • Each run = one brand. Use Apify's scheduling + multiple Actor runs (one per brand) for portfolio monitoring.
  • Microsoft Copilot is reserved in the schema but not implemented in v1. It is on the roadmap for v1.4.
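
To make the word-boundary rule in the first note concrete, here is a minimal sketch of case-insensitive matching over a brand name and its aliases. It is an approximation written for this README, not the Actor's matcher; build_brand_pattern and find_mentions are hypothetical names.

```python
import re

def build_brand_pattern(name: str, aliases) -> re.Pattern:
    # Longest variants first so "Clash Coach AI" wins over "Clash Coach".
    variants = sorted({name, *aliases}, key=lambda v: (-len(v), v))
    alternation = "|".join(re.escape(v) for v in variants)
    # (?<!\w) and (?!\w) require that no word character touches either end
    # of the match, which is what rules out "Clash" inside "Clashing".
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)

def find_mentions(text: str, pattern: re.Pattern):
    """Return (charStart, charEnd, matched text) for every mention."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]

pattern = build_brand_pattern(
    "Clash Coach AI", ["ClashCoachAI", "Clash Coach", "ClashCoach.ai"]
)
mentions = find_mentions("Try clash coach ai or ClashCoach.ai today.", pattern)
no_match = find_mentions("Clashing swords", build_brand_pattern("Clash", []))
```

Both spellings in the sample sentence are found because they are listed as variants, while "Clashing" produces no match for the bare name "Clash". A spelling that is not listed, such as "ClashCoach" with these aliases, would be missed, which is why the alias list matters.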

Operator FAQ

Why isn't my brand mentioned?

If brandMentions is empty across most or all of your records, that is a real diagnostic signal, not a bug. Three things to check, in order:

  1. Are you matching the right name variants? AI engines may say "ClashCoach" when your brand.name is "Clash Coach AI". Word-boundary matching is exact (case-insensitive). List every spelling variant in brand.aliases: abbreviations, ticker, product name, common misspellings. This catches ~30% of "missing" mentions.
  2. Is the prompt too narrow or too vague? "How do I improve my ladder ranking?" produces strategy advice; "Best Clash Royale coaching apps" produces product mentions. Run both prompt styles to see which surfaces your brand.
  3. Is the AI's grounding pulling from the wrong sources? Look at citations[].domain. If the AI is citing g2.com, capterra.com, and a competitor's blog but never your own site, that's the problem. AI engines surface brands based on what their grounding sources say. Your AEO work is to produce content for those sources to cite, rather than to target the AI directly.

If all three check out and you still see zero mentions, your brand may genuinely have low AI-search presence — that's the signal AEO content marketing fixes. Re-run after publishing more content; see if the count moves.
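
For the third check, a quick tally of citations[].domain across a run shows which sources the engines lean on. The sketch below assumes records shaped like the output example; the helper names and the sample records are illustrative.

```python
from collections import Counter

def citation_domain_counts(records) -> Counter:
    """Count how often each domain is cited across all records."""
    counts = Counter()
    for record in records:
        for citation in record.get("citations", []):
            counts[citation["domain"]] += 1
    return counts

def owned_share(records) -> float:
    """Fraction of all citations that point at an owned domain."""
    cites = [c for r in records for c in r.get("citations", [])]
    if not cites:
        return 0.0
    return sum(1 for c in cites if c["isOwned"]) / len(cites)

# Illustrative records, trimmed to the fields this check needs.
sample = [
    {"citations": [
        {"domain": "royaleapi.com", "isOwned": False},
        {"domain": "clashcoachai.com", "isOwned": True},
    ]},
    {"citations": [
        {"domain": "royaleapi.com", "isOwned": False},
        {"domain": "g2.com", "isOwned": False},
    ]},
]
top_domains = citation_domain_counts(sample).most_common(2)
share = owned_share(sample)
```

A low owned share next to a short list of dominant third-party domains points at where new content or outreach is most likely to change what the engines cite.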

How do I run this weekly?

Apify Console → Schedules → New Schedule. Pick this Actor, set cron 0 9 * * 1 (Mondays 9am UTC), and use your filled-in input. Apify runs it weekly and stores each week's dataset.

For agencies tracking multiple clients: create one Schedule per brand. Each Schedule is independent so a slow run for client A doesn't block client B.

To see week-over-week deltas (only emit changed records), set "deltaMode": true in the input. The Actor stores per-prompt-provider state in the KeyValueStore and after the first run only emits records where the response changed — much smaller datasets, easier to spot real movement.
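
The delta idea can be sketched as follows. How the Actor decides that a response "changed" is not specified here, so hashing responseText is an assumption, and a plain dict stands in for the KeyValueStore.

```python
import hashlib

def response_fingerprint(record: dict) -> str:
    # Assumption: a response counts as changed when its text differs at all.
    return hashlib.sha256(record["responseText"].encode("utf-8")).hexdigest()

def filter_changed(records, state: dict):
    """Return records whose response differs from the stored one.

    state maps (promptText, provider) to the last fingerprint seen and is
    updated in place, so the same dict can be reused run after run.
    """
    changed = []
    for record in records:
        key = (record["promptText"], record["provider"])
        fingerprint = response_fingerprint(record)
        if state.get(key) != fingerprint:
            changed.append(record)
            state[key] = fingerprint
    return changed

state = {}
week1 = [{"promptText": "p1", "provider": "perplexity", "responseText": "A"},
         {"promptText": "p1", "provider": "anthropic", "responseText": "B"}]
week2 = [{"promptText": "p1", "provider": "perplexity", "responseText": "A"},
         {"promptText": "p1", "provider": "anthropic", "responseText": "B2"}]
first = filter_changed(week1, state)
second = filter_changed(week2, state)
```

The first run emits both records because nothing is stored yet; the second emits only the Anthropic record, whose text changed. An exact-text comparison is strict, so in practice small wording differences between runs would also count as changes under this assumption.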

How do I pipe results into Sheets / Looker / BigQuery?

Google Sheets — easiest. Apify Console → run → Storage → Dataset → Export → Google Sheets. Or use the dataset's signed share URL with format=csv in IMPORTDATA: =IMPORTDATA("https://api.apify.com/v2/datasets/<id>/items?format=csv&clean=true").

Looker / Looker Studio — connect to the same Apify dataset URL as a CSV data source, or schedule a daily ETL into BigQuery via Apify's BigQuery integration (Console → Integrations → BigQuery).

BigQuery direct — Apify ships a native integration: Console → Integrations → BigQuery → connect → pick the dataset to mirror. Records flow into a flat table; the JSON columns (brandMentions, citations) become BigQuery STRUCT/ARRAY columns you can query with UNNEST.

Custom ETL — the dataset is just JSON over HTTP. Pull with curl, jq, or any HTTP client. The schema is published at @apify-portfolio/aeo-schema on npm.
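
A minimal custom-ETL sketch using only the Python standard library is shown below. It builds the same dataset items URL used in the Sheets example and flattens citations into one row per citation. The function names are illustrative, and a private dataset may need additional authentication that is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"

def dataset_items_url(dataset_id: str, fmt: str = "json", clean: bool = True) -> str:
    query = urlencode({"format": fmt, "clean": str(clean).lower()})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"

def fetch_records(dataset_id: str):
    """Download all items of a dataset as a list of dicts (network call)."""
    with urlopen(dataset_items_url(dataset_id)) as resp:
        return json.load(resp)

def citation_rows(records):
    """One flat row per (record, citation), ready for a CSV or SQL table."""
    return [
        {"recordId": r.get("recordId"), "provider": r.get("provider"), **c}
        for r in records
        for c in r.get("citations", [])
    ]

url = dataset_items_url("abc123", fmt="csv")
rows = citation_rows([
    {"recordId": "r1", "provider": "perplexity",
     "citations": [{"domain": "royaleapi.com", "rankPosition": 1}]},
])
```

Calling fetch_records with your dataset ID returns the same records you see in the Console; citation_rows then gives a flat shape similar to what UNNEST produces in BigQuery.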

Other questions

Can I use the Actor without supplying my own API keys? Yes. The Actor uses pre-configured upstream API keys; you pay PPR per record and the underlying API costs are handled.

Why isn't the ChatGPT.com web UI a provider? OpenAI's terms prohibit automated access to ChatGPT.com. We use the OpenAI API, which is the sanctioned path. API responses differ slightly from the web UI, but the data is far cleaner and auditable.

Can I monitor a public figure? Only with documented authorization (journalism, authorized research) and the bypassToSGuard: true flag. Default policy blocks honorific-style prompts because providers' usage policies forbid surveillance of individuals.

Where's the schema published? @apify-portfolio/aeo-schema on npm — semver-stable Zod definitions of AICitationRecord and AeoActorInput.

Support

Issues, feature requests, or buyer questions: open a support ticket in the Apify Console, or email the contact listed on the Actor's Apify page.

Changelog

  • 0.2.1 — v1.3.1: operator-focused README rewrite. 5-minute quickstart with pre-pasted starter input, fully annotated example record explaining every field, three operator FAQs ("Why isn't my brand mentioned?", "How do I run this weekly?", "How do I pipe results into Sheets/Looker/BigQuery?"). Help-them-use content above the fold.
  • 0.2 — v1.3: form mode for non-technical buyers, vertical templates pre-grouped by intent, locale routing, HTML report download, Slack/email-friendly status digest, lite wizard variant, parallel runner. maxResolutions enforcement so demo runs stay under Apify's 5-min maintenance ceiling.
  • 0.1.9 — v1.2: parallel runner + lazy provider construction. 6× speedup on 25-prompt template runs.
  • 0.1.5 — v1.1.1: bracketed OpenAI grounding pricing — base + light/medium/heavy events scale with actual upstream cost.
  • 0.1 — Initial release. Cheerio-based parsing of 6 LLM/SERP responses; bracketed mention contexts; deterministic recordId; pay-per-event PPR; ToS attestation enforced.
