An Honest Talk on AI Prompt Volumes: Why It's Flawed and What to Do Instead
- Thought Leadership
- By multiple authors
- 10 minute read
Don’t rely solely on an unproven metric like AI prompt volume. Learn why today's "AI MSV" is a strategic risk and how to build a durable AEO strategy on real business goals.
The Search for Certainty in an Anxious Market
Let's start with a truth: the AI search landscape is chaotic.
Organizations are facing immense pressure to "figure out AI." Your board is asking, "What's our AI strategy?" and your team is asking, "How do we even measure this?"
And it's normal—even smart—to ask, "What is the AI equivalent of Monthly Search Volume (MSV)?"
Right on cue, many technology vendors have emerged promising exactly that. They're offering "AI MSV" and "AI prompt volume" metrics, packaged in confident dashboards, promising to give you the certainty you're hungry for.
As your strategic partner, our job is to give you the honest talk others won't.
Relying on this new “AI MSV” metric without fully understanding it can be a significant strategic risk. It can lead you to misallocate resources and make key decisions based on an incomplete and unreliable picture, undermining your AI strategy before it even begins.
Let's break down exactly why: first by recapping what made traditional MSV a cornerstone of SEO for so long, then by exposing the fundamental problems with how AI prompt volume is being positioned today, from its "gray market" data sources to its mathematically flawed methodology.
The Previous "Gold Standard" of Traditional SEO MSV
In the past, keyword MSV was the cornerstone of SEO for a reason. We all trusted it because it was a reliable system built on three pillars:
- Pillar 1: Direct Data Access: We had a trustworthy source. Google provided this data directly via Google Ads to help advertisers plan campaigns.
- Pillar 2: Massive, Consolidated Data: Google was the market. With over 90% market share, their dataset was the entire dataset. It was statistically sound.
- Pillar 3: Uniform Search Behavior: We were all "trained" to search in simple, similar ways. Queries like "pizza near me" or "best running shoes" were common, making it easy to aggregate behavior.
The Pitfalls of Today's AI Search Volume
Today's AI search volume metrics fail all three of those pillars.
The Data Behind AI Prompt Volume is Statistically Flawed
Why Pillar 1 Fails.
First, there is no direct access to reliable data today. The LLM providers (OpenAI, Anthropic, Google with Gemini) are "black boxes." Their prompt logs are private, and none of them publishes MSV data.
Why Pillar 2 Fails.
Because the data can't be sourced directly from the LLMs, solution providers must instead rely on statistically questionable approaches, such as paid panels and browser extension data, that reflect neither the full market nor a sound sample.
These panels are made up of people who are compensated to share their chat history, respond to surveys, or write prompts. While this can be effective for certain user research, it has key shortcomings for measuring MSV. These panels are opt-in, so the demographics are immediately biased toward more tech-savvy people who are monetarily incentivized. You also get significant, unavoidable coverage gaps based on geography, device types, and a user's machine settings.
Browser extension data, meanwhile, is purchased from data brokers and aggregators: clickstream activity scraped from users who may have unknowingly installed extensions or plugins that monitor their browsing. As with paid panels, this has some legitimate use cases, but it is exposed to serious weaknesses:
- Ethical & privacy concerns: The data is often collected from users who don't understand what they've agreed to, or who unknowingly click "Accept" on terms and conditions they've never read. This isn't a theoretical risk. There have been several instances where sensitive user information was collected and was found to be surprisingly simple to de-anonymize. For example, the FTC recently banned Avast from selling browsing data for advertising, and the "DataSpii" debacle exposed private data from millions of people. While browsers have taken steps to be more transparent, prominent cases still emerge of companies selling sensitive data in ways that violate consumer privacy.
- Stability risks: The data flow is extremely unstable. Rules change, tech evolves, and user behavior adapts. A single compliance change or browser update can force an extension to be delisted or change its collection methods, instantly cutting off the data source without warning. Data brokers often rely on dozens of these extensions, so the "representative sample" is in a constant, unknown state of flux.
- Untraceable origins: It is not uncommon for data brokers to resell and recombine different data sets, making the "chain-of-custody" unknown. Without knowing the data's full origin, it's impossible to know if it has been double-counted or what filtering mechanisms were used, making it impossible to trust whether you’re looking at a good representative sample when extrapolating your MSV metric.
Even when combining both of these approaches, this still results in a tiny, statistically insignificant sample of true market behavior. With total prompt volumes estimated around 2.5 billion inputs daily, a product or sales pitch claiming to provide a sample of "tens of millions" of prompts per month still represents far less than 1% of the total market (and by some estimates, as low as 0.15%).
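The scale mismatch above is simple arithmetic. A minimal sketch, using the estimates cited in this article (the specific sample size of 50 million is an illustrative assumption standing in for "tens of millions"):

```python
# Back-of-envelope check on how small a "tens of millions per month"
# prompt sample really is. All figures are estimates from the text,
# not measured values.

DAILY_PROMPTS_EST = 2.5e9                       # estimated prompts per day, market-wide
MONTHLY_PROMPTS_EST = DAILY_PROMPTS_EST * 30    # ~75 billion per month

vendor_sample = 50e6                            # a generous "tens of millions" monthly sample

coverage = vendor_sample / MONTHLY_PROMPTS_EST
print(f"Sample coverage: {coverage:.4%}")       # ~0.067% of monthly volume
```

Even under generous assumptions, the sample covers well under a tenth of one percent of the market, which is consistent with the 0.15% upper-bound estimate above.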
Furthermore, brands have limited (or nonexistent) control over sample demographics. This means you can't be confident that your target audience is actually included in your sample. Paid panel participants skew tech-savvy and may be biased by the program's incentives. Browser extension data is limited to desktop users, is likely to miss most enterprise/business users entirely, and is often blocked by security settings or managed environments.
Some may argue that this imperfect data is "better than nothing." But in most cases, it’s more likely that unreliable MSV data is worse than no data at all. It could lead you to prioritize the wrong content, misallocate resources, and chase metrics that don't correlate with business outcomes. It can’t provide what made traditional search MSV so reliable: trust in the source and the sample.
The Behavior Behind AI Prompt Volume is Fundamentally Different
Why Pillar 3 Fails.
The very nature of AI search makes the traditional way of measuring volume obsolete. The old methodology relied on millions of people searching in simple, similar ways. This is no longer true. We don't "search" in LLMs; we have conversations. Prompts are longer, more detailed, and deeply contextual.
- AI generates answers, not lists: AI generates a paragraph of text, not a predictable list of blue links. Your brand might be mentioned, cited, or ignored within that conversational response. This generative format doesn't have a "position" in the traditional sense.
- AI is probabilistic, not deterministic: Ask ChatGPT the same prompt three times, and you may get three different answers. The results vary by user, by chat history, and by the model itself. This means there is no single, stable, or repeatable "result" to measure.
- Personalization is ingrained: LLMs often consider context from previous conversations or personal info stored in a user's account, even if the user doesn't include it in the prompt. A user might type "best running shoes," and the LLM may automatically alter its response because it knows the user is a woman in her 50s living in a city.
- MSV of any prompt = 1: These long, context-laden prompts will rarely match exactly. This makes traditional, keyword-level aggregation nearly meaningless and makes it incredibly difficult to count searches for a given topic, even with broader matching.
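That last point is easy to see in practice. A minimal sketch, using made-up prompts, of what happens when you try to count exact-match "volume" on conversational queries:

```python
from collections import Counter

# Illustrative (made-up) prompt log. Short keyword-style queries repeat;
# long, contextual prompts almost never match verbatim, so their
# exact-match "volume" collapses to 1.
prompts = [
    "best running shoes",
    "best running shoes",
    "I'm a 52-year-old woman in a city, mostly pavement runs - "
    "what running shoes should I look at under $150?",
    "Training for a first half marathon; history of shin splints. "
    "Recommend running shoes and explain the trade-offs.",
]

volume = Counter(p.lower().strip() for p in prompts)
for prompt, count in volume.items():
    print(count, "|", prompt[:60])
```

The keyword-style query aggregates to a count of 2, while every conversational prompt lands at 1, which is exactly why keyword-level MSV math breaks down here.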
The New Playbook: Start with Business Goals, Not Vanity Metrics
Instead of starting with the question "What are people searching for?", we believe the strategic starting point should be "What do we want to be known for?"
“It’s about going back to ‘What is the goal?’ ‘What are you trying to accomplish with AEO in general?’” says Wei Zheng, Conductor’s Chief Product Officer.
Many best practices carry over from SEO to AEO. But a foundational change is that it's no longer enough to pick whichever keyword is easiest to rank for and call that a strategy.
That's how SEO experts have been conditioned to think for so long, but from a business perspective it's a flawed approach. At the end of the day, nobody is blindly trying to rank for things; they're trying to build toward specific goals.
How to Prioritize Using the Right Data and Questions
Once you're grounded in your goals, you can optimize by prioritizing with data you can actually trust.
Stop asking, "Did we appear for this one prompt?" Start asking these four, strategic questions:
- What is our competitive share of voice? Understand how often AI connects competitors to certain topics or categories vs. your brand to identify key gaps and opportunities.
- Where are our persona gaps? Searches in LLMs are more personalized than ever. Analyze your performance data by persona to surface which of your key personas you are failing to engage.
- Where are our journey gaps? Analyze your visibility by intent stage to understand where you may be missing opportunities to connect with your audience as they move from awareness to decision.
- Where are our performance gaps? Look at your own analytics. See which pages get AI referral traffic or where your Google Search Console (GSC) impressions are declining to find content that needs to be optimized.
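Finding AI referral traffic in your own analytics can be as simple as tagging sessions by referrer domain. A minimal sketch, where the referrer list and the session rows are illustrative assumptions to adapt to your own analytics export:

```python
from collections import Counter

# Known AI assistant referrer domains (illustrative, not exhaustive).
AI_REFERRERS = {
    "chatgpt.com",
    "perplexity.ai",
    "gemini.google.com",
    "copilot.microsoft.com",
}

def is_ai_referral(referrer_host: str) -> bool:
    """Tag a session as AI referral traffic based on its referrer host."""
    return referrer_host.lower().removeprefix("www.") in AI_REFERRERS

# Sessions as (landing_page, referrer_host) pairs - made-up sample rows.
sessions = [
    ("/pricing", "chatgpt.com"),
    ("/blog/aeo-guide", "perplexity.ai"),
    ("/blog/aeo-guide", "google.com"),
    ("/pricing", "www.gemini.google.com"),
]

# Rank pages by how many AI-referred sessions they receive.
ai_traffic = Counter(page for page, ref in sessions if is_ai_referral(ref))
print(ai_traffic.most_common())
```

Pairing this page-level view with declining GSC impressions surfaces the content where AI answers are likely replacing traditional clicks.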
The Conductor POV: How We're Building a Smarter Solution
While you shouldn’t use AI prompt volume as the sole source of your AEO strategy, that is not to say it has no value. Collecting real-world prompt data can be useful, and as Conductor builds its own solution around this need, we aim to do it right by:
- Using reliable proxies for topical interest. While prompt-level volume is unreliable, topical-level interest is still measurable. People’s core interests don’t suddenly change, even if they phrase their queries differently in an AI chat. We can leverage and combine data from Google, such as Google Trends, GSC, and Keyword Planner, to provide more accurate signals for what your audience actually cares about and help inform the topics you should be tracking.
- Separating the value of prompts from the flaw of MSV. The problem isn't the data itself; it's the misleading approach of using a tiny, unrepresentative sample to project some kind of statistically sound outcome and call it MSV. The real value isn't in the volume (how many are asking), but in the construction (how they are asking). It provides qualitative insights into your audience's language, pain points, and thought processes.
Our future R&D will be focused on combining reliable, topical data with AI-driven learnings from real-world prompts to better understand how people prompt. This allows us to help you construct better content based on how your audience thinks, not on a flawed volume metric.
Building a Durable Strategy for What's Next
Eventually, it is likely that LLM providers will release reliable, first-party data, especially as on-platform advertising and commerce are introduced.
But today, any "AI prompt volume" metric or "AI rank tracker" you are being sold is fundamentally misleading. Making critical resource decisions based on these mathematically flawed, unrepresentative, and methodologically broken concepts is a high-risk gamble.
Focus on your goals. Use real data. And build a strategy that's durable enough to outlast the hype.

![Wei Zheng, Chief Product Officer, Conductor](https://cdn.sanity.io/images/tkl0o0xu/production/dcfa62c0fe34ba0c31f910b818874cd160ad8839-3542x3542.png?fit=min&w=100&h=100&dpr=1&q=95)




