This article establishes AI search visibility benchmarks based on 265 brand checks run through aeogeoai.net across ChatGPT, Claude and Gemini. Key findings: the average AI visibility score was 72.4 for brands that received any response; 32% of checks had at least one model return zero; AI models disagree by an average of 20.9 points on the same brand; searching a brand name produces scores approximately 19 points higher than searching the same brand's domain name; and Gemini scored the world's most recognised sportswear brand zero in every single check.
The benchmark question nobody has answered
Every AI visibility tool produces a score. Almost none of them tell you what the score means.
If your brand scores 62 across ChatGPT, Claude and Gemini — is that good? Is it average? Is it cause for concern? Without a reference point, the number is decoration.
We built aeogeoai.net to answer the question "does AI know about my brand?" — and after 265 brand checks across three AI models, we have enough data to start answering the follow-up question: what does the score actually mean?
Methodology
Dataset: 265 brand checks with at least one non-zero model score, drawn from checks run through aeogeoai.net between launch and June 2026.
Models tested: ChatGPT (GPT-4o-mini via OpenAI API), Claude (claude-haiku-4-5 via Anthropic API), Gemini (gemini-2.5-flash-lite via Google API).
Scoring: Each model returns a visibility score from 0–100 based on whether the brand is mentioned, how prominently, how accurately, and in what context. A score of 0 means the model did not mention the brand in its response.
Important caveat: This dataset is skewed toward well-known brands — Semrush, Nike, Booking.com, Reddit, Google — because these are the examples used on the aeogeoai.net homepage. Real-world scores for unknown or local businesses will be significantly lower. Treat the averages here as benchmarks for established brands, not typical businesses.
The AI visibility benchmark scale
Based on 265 checks, here is what each score range means in practice:
| Score range | Label | What it means | Real example |
|---|---|---|---|
| 80–100 | Dominant | AI recommends this brand confidently and accurately. It appears in answers unprompted and is described correctly. | Google (93.9), Semrush (82.5), Booking.com (80.3) |
| 60–79 | Visible | AI knows this brand and mentions it in relevant contexts. May not always be the first recommendation but appears consistently. | Reddit (76.1) |
| 40–59 | Weak | AI has partial knowledge. Mentions the brand inconsistently — present in some model responses, absent in others. | Gymshark (58.1), Surfer SEO (57.5) |
| 0–39 | Invisible | AI rarely or never mentions this brand. May confuse it with others or return no information at all. | Robert A.M. Stern Architects (45 on Claude, zero on Gemini and ChatGPT), most local businesses |
These averages are skewed upward by well-known brands. Most local businesses and unknown brands would score significantly lower — many would return zero across all three models.
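For readers who want to apply the scale programmatically, here is a minimal sketch of the banding. The function name `visibility_band` is hypothetical, and one detail is an assumption: the table lists whole-number ranges, so a fractional score such as 79.5 is placed in the lower band here.

```python
def visibility_band(score: float) -> str:
    """Map a 0-100 visibility score to a label from the benchmark scale.

    Assumption: fractional scores between two ranges (e.g. 79.5)
    fall into the lower band, since the scale only lists whole numbers.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "Dominant"
    if score >= 60:
        return "Visible"
    if score >= 40:
        return "Weak"
    return "Invisible"


# Scores quoted in this article
for brand, score in [("Google", 93.9), ("Semrush", 82.5), ("Gymshark", 58.1)]:
    print(f"{brand}: {score} -> {visibility_band(score)}")
```

A score of exactly zero also maps to "Invisible" in this sketch; whether to report zeros separately is a choice left to the reader.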
Five findings that surprised us
AI models disagree by an average of 20.9 points on the same brand
Across 59 checks where all three models returned non-zero scores, the average spread between the highest and lowest score for the same brand on the same query was 20.9 points. The largest single disagreement was 50 points — Reddit.com scored 45 on Claude, 95 on Gemini, and 70 on ChatGPT in the same check.
This has a direct implication for AI visibility measurement: picking one model to represent your AI visibility produces a misleading picture. A brand that scores 45 on Claude but 95 on Gemini is not a 45-score brand — it has a model-specific visibility problem that requires a model-specific intervention.
This is why aeogeoai.net tests across all three models simultaneously and reports scores individually rather than averaging them.
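The spread calculation described above can be sketched as follows. This is an illustration only, not the aeogeoai.net implementation: the function names and the dictionary layout are assumptions. Checks where any model returned zero are skipped, matching the rule that only checks with three non-zero scores were counted.

```python
def model_spread(scores: dict[str, float]) -> float:
    """Spread = highest minus lowest model score for a single check."""
    if not scores:
        raise ValueError("need at least one model score")
    return max(scores.values()) - min(scores.values())


def average_spread(checks: list[dict[str, float]]) -> float:
    """Mean spread over checks where every model returned a non-zero score."""
    eligible = [c for c in checks if c and all(v > 0 for v in c.values())]
    if not eligible:
        raise ValueError("no checks with all models non-zero")
    return sum(model_spread(c) for c in eligible) / len(eligible)


# The Reddit.com check quoted above
reddit_com = {"Claude": 45, "Gemini": 95, "ChatGPT": 70}
print(model_spread(reddit_com))  # prints 50
```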
Searching "Reddit" vs "Reddit.com" produces an 18.8 point score difference
Reddit (the brand name) averaged 76.1 across all checks. Reddit.com (the domain name) averaged 57.2 — an 18.8 point gap for the same entity, described differently.
The same pattern held for Dropbox vs Dropbox.com: a consistent 18.9 point gap across three checks per variant.
A likely explanation is that AI systems are trained predominantly on text that refers to brands by name, not by domain. The entity "Reddit" appears in training data far more often than the string "Reddit.com", which tends to show up in technical or navigational content rather than in brand discussions.
Practical implication: when checking your AI visibility, enter your brand name as it is most commonly referenced in writing — not your domain name. The results will be materially different.
Gemini scored Nike zero in 100% of checks
Nike was checked 11 times. Claude averaged 67.7. ChatGPT averaged 56.4. Gemini scored Nike zero in every single check — 11 out of 11, a 100% zero rate for one of the most recognisable brand names on earth.
This is not a data quality issue — the same checks returned non-zero Claude and ChatGPT scores. Something about how Gemini processes the Nike brand in the context of these specific queries produces consistent non-responses.
This finding illustrates a broader point: a brand can be globally famous and still score zero on a specific AI model for a specific query type. AI visibility is not a proxy for brand awareness. It is a separate and distinct measure of how AI systems process and respond to brand-related queries.
The three models score remarkably similarly on average — but diverge wildly on individual brands
Claude averaged 70.6, Gemini averaged 75.3, and ChatGPT averaged 71.7 across all non-zero checks. At the aggregate level, the three models are almost indistinguishable.
At the individual brand level, they diverge dramatically. Semrush: Claude 88.8, Gemini 82.9, ChatGPT 75.8 — Claude rates it highest. Reddit: Claude 59.1, Gemini 83.2, ChatGPT 85.9 — Claude rates it significantly lower than the other two. Nike: Claude 67.7, Gemini 0.0, ChatGPT 56.4 — Gemini drops out entirely.
The implication: aggregate model averages tell you very little. Brand-level model breakdown tells you where the specific visibility gap is — and which model you need to optimise for.
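As a rough illustration of a brand-level breakdown, the sketch below takes the per-model figures quoted above and reports the weakest model for each brand, along with the mean across all models and the mean across non-zero models only. The helper names are hypothetical, and treating a zero as "no response" when computing the non-zero mean is an assumption.

```python
from statistics import mean

# Per-model averages quoted in this article
brand_scores = {
    "Semrush": {"Claude": 88.8, "Gemini": 82.9, "ChatGPT": 75.8},
    "Reddit": {"Claude": 59.1, "Gemini": 83.2, "ChatGPT": 85.9},
    "Nike": {"Claude": 67.7, "Gemini": 0.0, "ChatGPT": 56.4},
}


def summarize(scores: dict[str, float]) -> dict:
    """Return the weakest model plus means with and without zero scores."""
    values = list(scores.values())
    nonzero = [v for v in values if v > 0]
    return {
        "weakest_model": min(scores, key=scores.get),
        "mean_all": mean(values),
        # Assumption: a zero means "no response", so it is excluded here
        "mean_nonzero": mean(nonzero) if nonzero else None,
    }


for brand, scores in brand_scores.items():
    s = summarize(scores)
    print(f"{brand}: weakest={s['weakest_model']}, mean_all={s['mean_all']:.1f}")
```

For Nike, the two means differ sharply because Gemini's zero drags the all-model mean down, which is why the choice of averaging rule should be stated alongside any headline figure.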
Professional services firms have the lowest AI visibility of any category tested
The lowest-scoring entries in our dataset were architecture firms. Robert A.M. Stern Architects — one of the most prominent architecture firms in the United States — scored 45 on Claude and zero on Gemini and ChatGPT. The brand "Stern" scored 55 on Claude and zero on both other models.
By contrast, Semrush, a software tool, averaged 82.5 overall. Even taking the firm's best result (Claude's 45) and ignoring the two zeros, the gap between a globally recognised architecture firm and an SEO software tool is about 37.5 points in AI visibility.
Professional services firms — architects, lawyers, accountants, consultants — operate in categories where AI systems have sparse, inconsistent training data. Their work is not heavily documented in the text corpora that train AI models. Their clients do not write extensively about them online. Their expertise is embedded in buildings, documents, and relationships rather than content.
For professional services firms, low AI visibility is not a minor gap. It is a significant and growing competitive disadvantage relative to digital-native brands in adjacent categories.
Score distribution across all checks
Across all 261 individual model scores recorded in this dataset:
| Score range | Count | Percentage |
|---|---|---|
| 80–100 Dominant | 99 | 37.9% |
| 60–79 Visible | 92 | 35.2% |
| 40–59 Weak | 32 | 12.3% |
| 1–39 Invisible (non-zero) | 1 | 0.4% |
| Scored zero (no mention) | 37 | 14.2% |
The 14.2% zero rate is significant — and remember this dataset is biased toward well-known brands. For a dataset of typical businesses, the zero rate would be substantially higher.
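The percentages in the distribution table can be reproduced from the counts with a few lines of arithmetic. This is a minimal check, with shortened labels chosen here for illustration:

```python
# Counts from the score distribution table
counts = {
    "Dominant": 99,
    "Visible": 92,
    "Weak": 32,
    "Invisible": 1,
    "Zero": 37,
}

total = sum(counts.values())

# Share of all recorded model scores, rounded to one decimal place
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The counts sum to 261, and the rounded shares match the table: 37.9%, 35.2%, 12.3%, 0.4% and 14.2%.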
What a good score looks like by brand category
| Category | Representative brands | Avg score | Claude | Gemini | ChatGPT |
|---|---|---|---|---|---|
| Digital platform / search | Google, Reddit | 86.4 | 75.0 | 95.0 | 88.3 |
| SaaS / SEO tools | Semrush, Ahrefs | 82.5 | 88.8 | 82.9 | 75.8 |
| Travel / booking | Booking.com | 80.3 | 79.5 | 81.5 | 80.0 |
| Consumer / sportswear | Nike, Adidas, Gymshark | 65.0 | 62.6 | 16.7 | 63.8 |
| Cloud / productivity SaaS | Dropbox, Trello, ClickUp | 63.7 | 56.7 | 68.0 | 56.7 |
| Professional services | Architecture firms | 45.0 | 50.0 | 0.0 | 0.0 |
What to do with your score
If you scored 80 or above: Your AI visibility is strong. The priority is maintaining accuracy — AI systems should be describing you correctly, not just mentioning you. Check the excerpts to verify the AI's description matches your current positioning.
If you scored 60–79: You have visibility, but it is inconsistent. One or more models are likely scoring lower than the others. Check which model is the weak point and focus on building evidence in the sources that model trusts. For Gemini, this typically means Google-indexed content. For ChatGPT, it means Bing-indexed content and third-party citations.
If you scored 40–59: You have partial entity recognition. The AI knows your brand exists but cannot describe it accurately or consistently. A structured brand profile on indexed third-party publications typically moves scores in this range significantly.
If you scored below 40 or zero: The AI has insufficient evidence to form an opinion about your brand. This is the most common situation for local businesses, professional services firms, and early-stage companies. The fix is building the third-party evidence layer — directory profiles, editorial publications, structured content — that gives AI systems a trusted source to cite.
Check your AI visibility score — free
See what ChatGPT, Claude and Gemini currently say about your brand. Score from 0–100 per model. No account required.