AI Search · 10 min read

How to run an AI visibility audit.

Most discussion of AI search optimisation is forward-looking: what to build, what to fix, what signals to invest in. The visibility audit asks a different question that comes earlier in the cycle: how visible is your brand actually right now, across ChatGPT, Gemini, Perplexity, and Claude, on the queries that matter? The answer is almost always more uneven than people expect, with one engine citing the brand frequently and others barely mentioning it. The audit converts that into measurable baseline data that you can track over time and use to prioritise optimisation work. The conceptual framing of AI search optimisation as a discipline is in what is answer engine optimization; this piece is the diagnostic methodology that sits inside that framing.

By Tomer Shiri · Published June 6, 2026 · Updated June 6, 2026

A sample AI visibility audit scorecard for a hypothetical Thai SaaS brand. ChatGPT cites the brand on 31 percent of 200 relevant queries (62 citations). Gemini on 24 percent (48 citations). Perplexity on 12 percent (24 citations). Claude on 7 percent (14 citations). The cross-engine variance signals where optimisation work should focus.

The reason this audit matters as a distinct workstream is that AI visibility behaves differently from traditional search rankings in two important ways. First, the variance across engines is much larger than the variance across positions one through five on Google for the same keyword: a brand can be cited heavily on ChatGPT and ignored on Claude for what looks like the same query intent. Second, visibility moves more often than rankings do, because the engines update their underlying corpora and retrieval systems more frequently than Google updates its index. A baseline measurement followed by a tracked cadence is the only way to know whether the optimisation work is producing results.

What an AI visibility audit measures versus an AI readiness audit

The two audits are easy to confuse and worth distinguishing clearly.

A readiness audit asks whether the website is structured to be cited by AI search engines. It evaluates the on-site signals (structured content, schema markup, named authorship, entity strength, citation-friendly claims) against best practice, identifies gaps, and produces a remediation backlog. The output is a list of changes to make to the site. The existing readiness audit framework on our blog is in the AI readiness audit.

A visibility audit asks how visible the brand is right now. It runs a structured set of queries across each major AI engine, captures the actual answers, scores them against citation and accuracy criteria, and produces a baseline scorecard. The output is data about current performance, not a list of changes (though the audit findings naturally suggest changes).

Both audits are useful. The clean sequence is to run the visibility audit first (to establish the baseline), then the readiness audit (to identify why the baseline is what it is and what to change), then the visibility audit again 90 to 180 days later (to validate whether the changes moved the needle).

The four steps of a visibility audit

The four-step methodology for running an AI visibility audit: define the query set with 50 to 200 relevant queries across category, competitor, and solution intent; run each query across ChatGPT, Gemini, Perplexity, and Claude capturing verbatim responses; score each result on citation rate, accuracy of facts, sentiment, and source visibility; report findings with a prioritised backlog of follow-up actions.
Four steps. Most audits stop at step three. The report makes the audit actionable.

Step one: Define the query set

The query set is the foundation of the audit. A weak query set produces unreliable conclusions regardless of how careful the rest of the methodology is. The right number of queries depends on business breadth. Smaller, single-product businesses can run a useful audit with 50 to 80 queries; larger businesses with multiple product lines typically need 150 to 200 queries to capture the full search surface. Below 50 queries the sample is too small to draw reliable conclusions; above 200 queries the marginal insight diminishes and the cost increases without proportional value.
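To see why small query sets are noisy, it can help to put a plausible range around an observed citation rate. The sketch below uses the Wilson score interval, a standard approximation for a proportion. Treating each query as an independent yes/no trial is a simplifying assumption, and the function name and the 30 percent example rate are illustrative rather than part of the methodology.

```python
import math

def wilson_interval(mentions: int, queries: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a citation rate."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    p = mentions / queries
    denom = 1 + z**2 / queries
    centre = (p + z**2 / (2 * queries)) / denom
    half = z * math.sqrt(p * (1 - p) / queries + z**2 / (4 * queries**2)) / denom
    return centre - half, centre + half

# Same observed rate (30 percent) at three hypothetical query-set sizes.
for n in (20, 50, 200):
    mentions = round(0.30 * n)
    low, high = wilson_interval(mentions, n)
    print(f"{n:>3} queries: observed {mentions / n:.0%}, plausible range {low:.0%} to {high:.0%}")
```

The range narrows as the query set grows, which is the practical argument for staying at or above 50 queries.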

The query set should cover three intent categories in roughly equal proportions. Category queries are searches where the user is looking for any provider in the category ("best CRM for Thai SMBs," "Bangkok SEO agencies," "Thai dental clinic for veneers"). Competitor queries are searches naming specific competitors where you would want your brand to appear in a comparison ("Salesforce alternatives Thailand," "Bumrungrad vs Bangkok Hospital"). Solution queries are searches describing the problem your brand solves rather than naming a category ("how to grow organic traffic in Bangkok," "where to get implants done abroad cheaper than Australia").

Each intent type tests a different aspect of brand visibility. Strong category coverage indicates broad market presence. Strong competitor coverage indicates the brand is part of the recognised consideration set. Strong solution coverage indicates depth of problem-specific content that gets matched to user intent. A brand that scores well on one but not the others has a specific kind of optimisation gap.
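One way to keep the three intent categories in roughly equal proportions is to tag every query and check the mix before running anything. This is a minimal sketch: the `Query` class, the `is_balanced` helper, and the 10-point tolerance are assumptions made for illustration, and the example queries are taken from the text above.

```python
from collections import Counter
from dataclasses import dataclass

INTENTS = ("category", "competitor", "solution")

@dataclass(frozen=True)
class Query:
    text: str
    intent: str  # expected to be one of INTENTS

def intent_shares(queries: list[Query]) -> dict[str, float]:
    """Share of the query set that falls into each intent category."""
    if not queries:
        raise ValueError("query set is empty")
    counts = Counter(q.intent for q in queries)
    return {intent: counts[intent] / len(queries) for intent in INTENTS}

def is_balanced(queries: list[Query], tolerance: float = 0.10) -> bool:
    """True when every intent is within `tolerance` of an equal split."""
    target = 1 / len(INTENTS)
    return all(abs(share - target) <= tolerance
               for share in intent_shares(queries).values())

sample = [
    Query("best CRM for Thai SMBs", "category"),
    Query("Salesforce alternatives Thailand", "competitor"),
    Query("how to grow organic traffic in Bangkok", "solution"),
]
print(intent_shares(sample), is_balanced(sample))
```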

Step two: Run the queries

Each query in the set is run across each of the four major AI engines: ChatGPT, Gemini, Perplexity, and Claude. The mechanics matter for repeatability. Use a clean session for each query (avoid the engine remembering context from previous queries that would bias the answer). Capture the verbatim response as text, not as a summary or interpretation; the exact wording is needed for scoring. Note the date and which engine version produced the response (where the engine surfaces this information) because comparison across audit cycles needs to account for model updates.

Manual query running is reasonable for audits up to 100 queries. For larger sets, API-based automation is more efficient where the engines support API access. ChatGPT and Claude both offer APIs that produce reliable visibility audit data; Gemini's API equivalent works similarly. Perplexity has API access but the answer format differs slightly from the consumer product so some normalisation is needed when comparing.
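For automated runs, the capture step can be sketched as a loop that stores one record per query and engine. Everything named here is hypothetical: `Capture`, `run_queries`, and the `ask` callables are stand-ins for whatever client code wraps each engine, and no real vendor API call is shown. Each `ask` function is assumed to start a clean session and return the verbatim answer plus a model version when the engine exposes one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional

@dataclass(frozen=True)
class Capture:
    query: str
    engine: str
    run_date: date
    model_version: Optional[str]  # None when the engine does not surface it
    response_text: str            # verbatim answer, never a summary

# Hypothetical signature: query in, (verbatim answer, model version) out.
AskFn = Callable[[str], tuple[str, Optional[str]]]

def run_queries(queries: list[str], engines: dict[str, AskFn],
                run_date: Optional[date] = None) -> list[Capture]:
    """Run every query on every engine and keep the raw responses."""
    run_date = run_date or date.today()
    captures = []
    for query in queries:
        for engine, ask in engines.items():
            text, version = ask(query)  # assumed to use a fresh session per call
            captures.append(Capture(query, engine, run_date, version, text))
    return captures

def stub_engine(query: str) -> tuple[str, Optional[str]]:
    """Placeholder used only to make the sketch runnable."""
    return f"Stub answer for: {query}", None

captures = run_queries(["Bangkok SEO agencies"],
                       {"ChatGPT": stub_engine, "Claude": stub_engine})
print(len(captures), captures[0].engine)
```

Passing the engine clients in as plain functions keeps the loop identical whether a response came from an API or was pasted in from a manual run.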

The engine-by-engine differences in how citations work are unpacked in ChatGPT vs Gemini vs Perplexity for SEO, which is useful context for interpreting raw audit data.

Step three: Score the results

Each captured response is scored against four core metrics.

Citation rate. Did the brand get mentioned at all? Binary scoring per query, aggregated across the query set per engine. A 31 percent citation rate on ChatGPT means the brand was mentioned in 31 of every 100 relevant queries on that engine. This is the headline number most audits report.
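The binary scoring and per-engine aggregation can be sketched in a few lines. The helper names are illustrative, and the substring check in `mentions_brand` is a deliberately naive assumption: it can over-match short brand names, so manual review of borderline responses is still worthwhile. The counts reuse the ChatGPT and Claude figures from the sample scorecard.

```python
from collections import defaultdict

def mentions_brand(response: str, aliases: list[str]) -> bool:
    """Naive check: does any brand alias appear in the verbatim response?"""
    text = response.casefold()
    return any(alias.casefold() in text for alias in aliases)

def citation_rates(scored: list[tuple[str, bool]]) -> dict[str, float]:
    """Aggregate (engine, mentioned) rows into a citation rate per engine."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for engine, mentioned in scored:
        totals[engine] += 1
        hits[engine] += int(mentioned)
    return {engine: hits[engine] / totals[engine] for engine in totals}

# 62 and 14 mentions out of 200 queries, as in the sample scorecard.
scored = ([("ChatGPT", True)] * 62 + [("ChatGPT", False)] * 138
          + [("Claude", True)] * 14 + [("Claude", False)] * 186)
rates = citation_rates(scored)
for engine, rate in rates.items():
    print(f"{engine}: {rate:.0%}")
```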

Mention accuracy. When the brand was mentioned, were the facts about it accurate? AI engines sometimes attribute incorrect products, wrong locations, made-up pricing, or outdated positioning to brands. Inaccurate mentions can be worse than no mentions because they actively misinform potential customers. Track this as the percentage of mentions that contain at least one significant factual error; the accuracy rate is the complement of that figure.
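As a small worked example, the accuracy figures can be derived from one flag per mention. The function name and the counts (62 mentions, 12 of them flagged by a reviewer) are hypothetical and only illustrate the arithmetic.

```python
from typing import Optional

def accuracy_summary(error_flags: list[bool]) -> Optional[dict[str, float]]:
    """Each flag is True when a mention contains a significant factual error."""
    if not error_flags:
        return None  # no mentions, so accuracy is undefined
    error_share = sum(error_flags) / len(error_flags)
    return {
        "mentions": len(error_flags),
        "error_share": error_share,
        "accuracy": 1 - error_share,
    }

# Hypothetical review result: 62 mentions, 12 with at least one significant error.
flags = [False] * 50 + [True] * 12
summary = accuracy_summary(flags)
print(f"error share {summary['error_share']:.1%}, accuracy {summary['accuracy']:.1%}")
```

Only queries where the brand was mentioned enter the denominator, so an engine with few mentions will produce a less stable accuracy figure.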

Sentiment. Are the mentions positive, neutral, or negative? Most mentions in AI search are neutral (factual descriptions of products and services), but some are positively framed (the brand is recommended or praised) and some are negatively framed (the brand is criticised or dismissed in favour of alternatives). Sentiment scoring at the mention level gives a more complete picture than citation rate alone.

Source visibility. When the brand was mentioned, did the AI engine cite the brand's own website as a source, or only third-party sources mentioning the brand? Own-source citations indicate the website is being indexed and used directly by the engine. Third-party-only citations indicate the brand exists in the training data through external mentions but the website itself is not being retrieved. The mechanics of why sources get cited specifically are in what AI search engines look for when citing sources.
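Where an engine returns cited URLs, the own-source versus third-party split can be checked mechanically. This sketch assumes the citations have already been extracted as full URLs with a scheme; the function name, the three labels, and the example domains are illustrative.

```python
from urllib.parse import urlparse

def source_visibility(cited_urls: list[str], own_domain: str) -> str:
    """Classify a response's citations relative to the brand's own domain."""
    own = own_domain.lower().removeprefix("www.")
    hosts = [(urlparse(url).hostname or "").lower() for url in cited_urls]
    if not hosts:
        return "no_sources"
    # Match the domain itself or any subdomain, but not look-alike domains.
    if any(host == own or host.endswith("." + own) for host in hosts):
        return "own_source"
    return "third_party_only"

print(source_visibility(
    ["https://www.example.com/pricing", "https://reviews.example.org/brand"],
    "example.com",
))
```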

Step four: Report the findings

This is the step where most audits underdeliver. A spreadsheet of citation rates is data, not a report. A useful audit report has four components: the headline scorecard (citation rate, accuracy, sentiment, source visibility per engine), an analysis section explaining what the patterns mean and why they exist, a prioritised backlog of follow-up actions ranked by expected impact, and a recommended cadence for the next audit cycle.

The prioritised backlog is where the audit becomes actionable. Findings naturally suggest specific actions: low citation rate on a specific engine suggests engine-specific optimisation work; high citation rate with low accuracy suggests claim correction is needed; strong third-party citations with weak own-source visibility suggests technical SEO and structured content work. The wider AI readiness layer that connects the findings to action is in the AI readiness audit.
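The mapping from findings to actions can be written down as simple rules, which makes the backlog reproducible between audit cycles. The thresholds below are placeholder assumptions, not recommended values, and the `EngineScore` fields are a hypothetical way to hold the scorecard.

```python
from dataclasses import dataclass

@dataclass
class EngineScore:
    engine: str
    citation_rate: float     # share of queries where the brand is mentioned
    error_share: float       # share of mentions with a significant factual error
    own_source_share: float  # share of mentions citing the brand's own website

def suggest_actions(score: EngineScore, low_citation: float = 0.15,
                    high_error: float = 0.20, low_own_source: float = 0.30) -> list[str]:
    """Translate one engine's scorecard into candidate backlog items."""
    if score.citation_rate < low_citation:
        return ["engine-specific optimisation work"]
    actions = []
    if score.error_share > high_error:
        actions.append("claim correction")
    if score.own_source_share < low_own_source:
        actions.append("technical SEO and structured content work")
    return actions

# Hypothetical scores for two engines.
for score in (EngineScore("ChatGPT", 0.31, 0.25, 0.10),
              EngineScore("Claude", 0.07, 0.00, 0.50)):
    print(score.engine, suggest_actions(score))
```

Rules like these only produce candidates; ranking them by expected impact remains a judgment call for the analysis section of the report.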

Query design in practice

The single largest determinant of audit quality is the query set design. Three patterns produce better audits.

Mix branded and unbranded queries deliberately. A query like "is SEO Bangkok a good agency" tests whether the engine knows about your specific brand. A query like "best SEO agencies in Bangkok" tests whether your brand appears unprompted. Most audits should weight unbranded queries more heavily because they measure actual discovery rather than brand recall confirmation.

Use real customer language, not internal jargon. Customers do not search using the terminology your team uses internally. They search with the words they use in conversations, support tickets, and reviews. Where possible, source query language from real customer touchpoints rather than constructing queries based on what you assume people search for.

Test across the full buyer journey. Awareness-stage queries (broad category questions), consideration-stage queries (specific feature or comparison questions), and decision-stage queries (vendor selection or product specification questions) each test different visibility aspects. A brand strong only at the awareness stage but weak at decision stage has a specific kind of optimisation gap that the audit can surface.

Common mistakes in AI visibility audits

  • Query sets too small. Audits with fewer than 50 queries produce noise rather than signal. A handful of queries can swing the citation rate by 10 percentage points purely by chance.
  • Query sets too biased toward branded queries. "Is Brand X good?" tests recall, not discovery. Healthy audits weight unbranded queries heavily.
  • Single-shot queries without repetition. AI engine answers can vary across runs of the same query because of model temperature and retrieval randomness. Single-shot results are point estimates with implicit uncertainty; the better methodology runs each query two to three times and aggregates.
  • Scoring only citation rate. Citation rate alone misses critical diagnostic information about accuracy, sentiment, and source visibility.
  • No baseline before optimisation. Running the audit only after optimisation work makes it impossible to measure whether the work moved the needle. The baseline measurement must precede the optimisation.
  • One-time audits. A single audit produces a snapshot. Tracking over time produces actionable trend data and is much more useful for prioritising ongoing work.
  • Ignoring the report layer. A scorecard alone does not drive action; the analysis and prioritised backlog are what turn data into work.
  • Mixing languages without controls. Thai-language and English-language queries produce different visibility patterns. Audits that mix them without separation produce misleading aggregate numbers.
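The repetition point in the list above can be made concrete. In this sketch each query is run several times and scored as the fraction of runs that mentioned the brand. Averaging the runs is one possible aggregation choice and an assumption here, as are the function name and the example results.

```python
from collections import defaultdict
from statistics import mean

def aggregate_runs(runs: list[tuple[str, bool]]) -> dict[str, float]:
    """Turn (query, mentioned) rows, one per run, into a mention rate per query."""
    per_query: dict[str, list[float]] = defaultdict(list)
    for query, mentioned in runs:
        per_query[query].append(1.0 if mentioned else 0.0)
    return {query: mean(values) for query, values in per_query.items()}

# Hypothetical results: three runs per query on one engine.
runs = [
    ("best SEO agencies in Bangkok", True),
    ("best SEO agencies in Bangkok", True),
    ("best SEO agencies in Bangkok", False),
    ("Bangkok SEO agencies", False),
    ("Bangkok SEO agencies", False),
    ("Bangkok SEO agencies", False),
]
per_query = aggregate_runs(runs)
overall = mean(per_query.values())
print(per_query, f"overall citation rate {overall:.0%}")
```

The same grouping approach works for the language control: compute the rate separately for Thai-language and English-language queries instead of pooling them.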

Cadence and follow-up

Quarterly audit cadence is typically the right balance for most businesses. The AI engines update their models and indexes frequently enough that audit results six months apart can look very different even with no underlying business changes; monthly audits are usually overkill for the cost. Quarterly audits across a stable query set provide enough data to identify trends, validate optimisation outcomes, and surface unexpected changes in brand representation.
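Tracking across cycles comes down to comparing scorecards built from the same query set. A minimal sketch follows: the baseline rates come from the sample scorecard, while the follow-up rates and the function name are invented for illustration.

```python
def rate_changes(previous: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Change in citation rate per engine, in percentage points."""
    return {
        engine: round((current[engine] - previous[engine]) * 100, 1)
        for engine in previous
        if engine in current
    }

baseline = {"ChatGPT": 0.31, "Gemini": 0.24, "Perplexity": 0.12, "Claude": 0.07}
# Hypothetical follow-up audit one quarter later, same query set.
follow_up = {"ChatGPT": 0.35, "Gemini": 0.24, "Perplexity": 0.10, "Claude": 0.07}

for engine, change in rate_changes(baseline, follow_up).items():
    print(f"{engine}: {change:+.1f} percentage points")
```

Small movements should be read against the sampling noise of the query set before they are treated as a trend.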

Between audits, lighter-touch monitoring of specific high-value queries (top ten to twenty queries by commercial intent) can run more frequently without the cost of a full audit cycle. The lighter cadence catches significant changes early without overinvesting in measurement work.

The Thai-market specifics on running quarterly audits across multiple engines simultaneously are part of our AI search visibility audit workstream, which includes both the initial baseline audit and the ongoing tracking. For businesses wanting the broader AI search optimisation programme rather than the audit alone, the LLM visibility optimization service handles the full cycle.

The honest version of AI visibility auditing

AI visibility auditing done well is methodical, repeatable, and produces actionable insight. Done badly it produces noise that gets misinterpreted as signal. The difference is mostly in the query design (the right size and mix) and the report layer (the data has to translate into actions). Most brands underinvest in the report layer specifically, which is why most audits sit unread on shared drives after the initial scorecard reveal.

Our work as an SEO agency in Bangkok focused on AI search visibility includes the audit methodology as the starting point for most engagements, because optimisation without a measured baseline is guesswork. Our wider Thailand SEO services include AI visibility audits as an integrated component, not a standalone product. A short discovery conversation with one of our SEO experts in Bangkok usually identifies whether the audit alone is the right starting point or whether it should run alongside readiness and optimisation work.

Common questions

What is an AI visibility audit?

An AI visibility audit measures how often a brand currently appears in answers from the major AI search engines (ChatGPT, Gemini, Perplexity, and Claude) across a defined set of relevant queries. The audit produces a quantitative baseline (citation rate per engine, mention frequency, mention accuracy) that can be tracked over time and used to prioritise optimisation work. It differs from a readiness audit, which is forward-looking; the visibility audit measures what is actually happening right now.

How many queries should an AI visibility audit include?

Between 50 and 200 queries depending on business breadth. Smaller single-product businesses can produce useful audits with 50 to 80 queries; larger businesses with multiple product lines typically need 150 to 200. Below 50 queries the sample is too small to draw reliable conclusions; above 200 queries the marginal value diminishes and the audit becomes expensive without producing materially better insight.

What metrics should an AI visibility audit track?

Four core metrics. Citation rate is the share of queries where the brand is mentioned at all. Mention accuracy is the share of those mentions that contain accurate facts. Sentiment is the tone of the mentions (positive, neutral, negative). Source visibility is whether the brand's own website appears as a cited source versus only third-party sources mentioning the brand. Tracking all four produces a complete picture; tracking only citation rate misses important diagnostic information.

How often should you run an AI visibility audit?

Quarterly cadence is typically the right balance. The AI engines update their models and indexes frequently enough that results six months apart can look very different even with no underlying business changes. Monthly audits are usually overkill for the cost. Quarterly audits across a stable query set provide enough data to identify trends, validate optimisation outcomes, and surface unexpected changes in brand representation.
