An Honest Talk on AI Prompt Volumes: Why It's Flawed and What to Do Instead
- Thought Leadership
- By multiple authors
- 10 minute read
Don’t rely solely on an unproven metric like AI prompt volume. Learn why today's "AI MSV" is a strategic risk and how to build a durable AEO strategy on real business goals.
The Search for Certainty in an Anxious Market
Let's start with a truth: the AI search landscape is chaotic.
Organizations are facing immense pressure to "figure out AI." Your board is asking, "What's our AI strategy?" and your team is asking, "How do we even measure this?"
And it's normal—even smart—to ask, "What is the AI equivalent of Monthly Search Volume (MSV)?"
Right on cue, many technology vendors have emerged promising exactly that. They're offering "AI MSV" and "AI prompt volume" metrics, packaged in confident dashboards, promising to give you the certainty you're hungry for.
As your strategic partner, our job is to give you the honest talk others won't.
Relying on this new “AI MSV” metric without fully understanding it can be a significant strategic risk. It can lead you to misallocate resources and make key decisions based on an incomplete and unreliable picture, undermining your AI strategy before it even begins.
Let's break down exactly why: first by recapping what made traditional MSV a cornerstone of SEO for so long, then by exposing the fundamental problems with how AI prompt volume is being positioned today, from its "gray market" data sources to its mathematically flawed methodology.
The Previous "Gold Standard" of Traditional SEO MSV
In the past, keyword MSV was the cornerstone of SEO for a reason. We all trusted it because it was a reliable system built on three pillars:
- Pillar 1: Direct Data Access: We had a trustworthy source. Google provided this data directly via Google Ads to help advertisers plan campaigns.
- Pillar 2: Massive, Consolidated Data: Google was the market. With over 90% market share, their dataset was the entire dataset. It was statistically sound.
- Pillar 3: Uniform Search Behavior: We were all "trained" to search in simple, similar ways. Queries like "pizza near me" or "best running shoes" were common, making it easy to aggregate behavior.
The Pitfalls of Today's AI Search Volume
Today's AI search volume metrics fail all three of those pillars.
The Data Behind AI Prompt Volume is Statistically Flawed
Why Pillar 1 Fails.
First, there is no direct access to reliable data today. The LLM providers (OpenAI, Anthropic, Google with Gemini) are "black boxes." Their prompt logs are private, and none of them publishes MSV data.
Why Pillar 2 Fails.
Because the data can't be sourced directly from the LLMs, solution providers must instead rely on statistically questionable approaches, such as paid panels and browser extension data, that reflect neither the full market nor a sound sample.
These panels are made up of people who are compensated to share their chat history, respond to surveys, or write prompts. While this can be effective for certain user research, it has key shortcomings for measuring MSV. These panels are opt-in, so the demographics are immediately biased toward more tech-savvy people who are monetarily incentivized. You also get significant, unavoidable coverage gaps based on geography, device types, and a user's machine settings.
Browser extension data, meanwhile, is purchased from data brokers and aggregators: clickstream activity scraped from users who may have unknowingly installed extensions or plugins that monitor their browsing. As with paid panels, this has some legitimate use cases, but it is exposed to serious weaknesses:
- Ethical & privacy concerns: The data is often collected from users who don't understand what they've agreed to, or who unknowingly click "Accept" on terms and conditions they've never read. This isn't a theoretical risk. There have been several instances where sensitive user information was collected and was found to be surprisingly simple to de-anonymize. For example, the FTC recently banned Avast from selling browsing data for advertising, and the "DataSpii" debacle exposed private data from millions of people. While browsers have taken steps to be more transparent, prominent cases still emerge of companies selling sensitive data in ways that violate consumer privacy.
- Stability risks: The data flow is extremely unstable. Rules change, tech evolves, and user behavior adapts. A single compliance change or browser update can force an extension to be delisted or change its collection methods, instantly cutting off the data source without warning. Data brokers often rely on dozens of these extensions, so the "representative sample" is in a constant, unknown state of flux.
- Untraceable origins: It is not uncommon for data brokers to resell and recombine different data sets, making the "chain-of-custody" unknown. Without knowing the data's full origin, it's impossible to know if it has been double-counted or what filtering mechanisms were used, making it impossible to trust whether you’re looking at a good representative sample when extrapolating your MSV metric.
Even when combining both of these approaches, this still results in a tiny, statistically insignificant sample of true market behavior. With total prompt volumes estimated around 2.5 billion inputs daily, a product or sales pitch claiming to provide a sample of "tens of millions" of prompts per month still represents far less than 1% of the total market (and by some estimates, as low as 0.15%).
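The scale mismatch above is simple arithmetic. A minimal sketch, using the estimates cited in this article (the specific sample size of 50 million is an illustrative assumption standing in for "tens of millions"):

```python
# Back-of-envelope check on how small a "tens of millions per month"
# prompt sample really is. All figures are estimates from the text,
# not measured values.

DAILY_PROMPTS_EST = 2.5e9                       # estimated prompts per day, market-wide
MONTHLY_PROMPTS_EST = DAILY_PROMPTS_EST * 30    # ~75 billion per month

vendor_sample = 50e6                            # a generous "tens of millions" monthly sample

coverage = vendor_sample / MONTHLY_PROMPTS_EST
print(f"Sample coverage: {coverage:.4%}")       # ~0.067% of monthly volume
```

Even under generous assumptions, the sample covers well under a tenth of one percent of the market, which is consistent with the 0.15% upper-bound estimate above.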
Furthermore, brands have limited (or nonexistent) control over sample demographics. This means you can't be confident that your target audience is actually included in your sample. Paid panel participants skew tech-savvy and may be biased by the program's incentives. Browser extension data is limited to desktop users, is likely to miss most enterprise/business users entirely, and is often blocked by security settings or managed environments.
Some may argue that this imperfect data is "better than nothing." But in most cases, it’s more likely that unreliable MSV data is worse than no data at all. It could lead you to prioritize the wrong content, misallocate resources, and chase metrics that don't correlate with business outcomes. It can’t provide what made traditional search MSV so reliable: trust in the source and the sample.
The Behavior Behind AI Prompt Volume is Fundamentally Different
Why Pillar 3 Fails.
The very nature of AI search makes the traditional way of measuring volume obsolete. The old methodology relied on millions of people searching in simple, similar ways. This is no longer true. We don't "search" in LLMs; we have conversations. Prompts are longer, more detailed, and deeply contextual.
- AI generates answers, not lists: AI generates a paragraph of text, not a predictable list of blue links. Your brand might be mentioned, cited, or ignored within that conversational response. This generative format doesn't have a "position" in the traditional sense.
- AI is probabilistic, not deterministic: Ask ChatGPT the same prompt three times, and you may get three different answers. The results vary by user, by chat history, and by the model itself. This means there is no single, stable, or repeatable "result" to measure.
- Personalization is ingrained: LLMs often consider context from previous conversations or personal info stored in a user's account, even if the user doesn't include it in the prompt. A user might type "best running shoes," and the LLM may automatically alter its response because it knows the user is a woman in her 50s living in a city.
- MSV of any prompt = 1: These long, context-laden prompts will rarely match exactly. This makes traditional, keyword-level aggregation nearly meaningless and makes it incredibly difficult to count searches for a given topic, even with broader matching.
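That last point is easy to see in practice. A minimal sketch, using made-up prompts, of what happens when you try to count exact-match "volume" on conversational queries:

```python
from collections import Counter

# Illustrative (made-up) prompt log. Short keyword-style queries repeat;
# long, contextual prompts almost never match verbatim, so their
# exact-match "volume" collapses to 1.
prompts = [
    "best running shoes",
    "best running shoes",
    "I'm a 52-year-old woman in a city, mostly pavement runs - "
    "what running shoes should I look at under $150?",
    "Training for a first half marathon; history of shin splints. "
    "Recommend running shoes and explain the trade-offs.",
]

volume = Counter(p.lower().strip() for p in prompts)
for prompt, count in volume.items():
    print(count, "|", prompt[:60])
```

The keyword-style query aggregates to a count of 2, while every conversational prompt lands at 1, which is exactly why keyword-level MSV math breaks down here.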
The New Playbook: Start with Business Goals, Not Vanity Metrics
Instead of starting with the question "What are people searching for?", we believe the strategic starting point should be "What do we want to be known for?"
“It’s about going back to ‘What is the goal?’ ‘What are you trying to accomplish with AEO in general?’” says Wei Zheng, Conductor’s Chief Product Officer.
Many best practices carry over from SEO to AEO. But a foundational change is that it's no longer enough to pick whichever keyword is easiest to rank for and call that a strategy.
That's how SEO experts have been conditioned to think for so long, but from a business perspective it's a flawed approach. At the end of the day, nobody is blindly trying to rank for things; they're trying to build toward specific goals.
How to Prioritize Using the Right Data and Questions
Once you're grounded in your goals, you can optimize by prioritizing with data you can actually trust.
Stop asking, "Did we appear for this one prompt?" Start asking these four, strategic questions:
- What is our competitive share of voice? Understand how often AI connects competitors to certain topics or categories vs. your brand to identify key gaps and opportunities.
- Where are our persona gaps? Searches in LLMs are more personalized than ever. Analyze your performance data by persona to surface which of your key personas you are failing to engage.
- Where are our journey gaps? Analyze your visibility by intent stage to understand where you may be missing opportunities to connect with your audience as they move from awareness to decision.
- Where are our performance gaps? Look at your own analytics. See which pages get AI referral traffic or where your Google Search Console (GSC) impressions are declining to find content that needs to be optimized.
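Finding AI referral traffic in your own analytics can be as simple as tagging sessions by referrer domain. A minimal sketch, where the referrer list and the session rows are illustrative assumptions to adapt to your own analytics export:

```python
from collections import Counter

# Known AI assistant referrer domains (illustrative, not exhaustive).
AI_REFERRERS = {
    "chatgpt.com",
    "perplexity.ai",
    "gemini.google.com",
    "copilot.microsoft.com",
}

def is_ai_referral(referrer_host: str) -> bool:
    """Tag a session as AI referral traffic based on its referrer host."""
    return referrer_host.lower().removeprefix("www.") in AI_REFERRERS

# Sessions as (landing_page, referrer_host) pairs - made-up sample rows.
sessions = [
    ("/pricing", "chatgpt.com"),
    ("/blog/aeo-guide", "perplexity.ai"),
    ("/blog/aeo-guide", "google.com"),
    ("/pricing", "www.gemini.google.com"),
]

# Rank pages by how many AI-referred sessions they receive.
ai_traffic = Counter(page for page, ref in sessions if is_ai_referral(ref))
print(ai_traffic.most_common())
```

Pairing this page-level view with declining GSC impressions surfaces the content where AI answers are likely replacing traditional clicks.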
The Conductor POV: How We're Building a Smarter Solution
While you shouldn’t use AI prompt volume as the sole source of your AEO strategy, that is not to say it has no value. Collecting real-world prompt data can be useful, and as Conductor builds its own solution around this need, we aim to do it right by:
- Using reliable proxies for topical interest. While prompt-level volume is unreliable, topical-level interest is still measurable. People’s core interests don’t suddenly change, even if they phrase their queries differently in an AI chat. We can leverage and combine data from Google, such as Google Trends, GSC, and Keyword Planner, to provide more accurate signals for what your audience actually cares about and help inform the topics you should be tracking.
- Separating the value of prompts from the flaw of MSV. The problem isn't the data itself; it's the misleading approach of using a tiny, unrepresentative sample to project some kind of statistically sound outcome and call it MSV. The real value isn't in the volume (how many are asking), but in the construction (how they are asking). It provides qualitative insights into your audience's language, pain points, and thought processes.
Our future R&D will be focused on combining reliable, topical data with AI-driven learnings from real-world prompts to better understand how people prompt. This allows us to help you construct better content based on how your audience thinks, not on a flawed volume metric.
Building a Durable Strategy for What's Next
Eventually, it is likely that LLM providers will release reliable, first-party data, especially as on-platform advertising and commerce are introduced.
But today, any "AI prompt volume" metric or "AI rank tracker" you are being sold is fundamentally misleading. Making critical resource decisions based on these mathematically flawed, unrepresentative, and methodologically broken concepts is a high-risk gamble.
Focus on your goals. Use real data. And build a strategy that's durable enough to outlast the hype.

![Wei Zheng, Chief Product Officer, Conductor](https://cdn.sanity.io/images/tkl0o0xu/production/dcfa62c0fe34ba0c31f910b818874cd160ad8839-3542x3542.png?fit=min&w=100&h=100&dpr=1&q=95)




