If your team has tried ChatGPT, Claude, Gemini, or a few other AI tools for brainstorming, you have probably had the same weird feeling.
You ask for campaign ideas, blog topics, ad angles, or positioning. The answers look polished. They also feel suspiciously similar.
That is not your imagination.
A NeurIPS 2025 Best Paper gives that pattern a name: Artificial Hivemind. The researchers tested more than 70 language models across real open-ended questions and found that different models often collapse toward the same answers. Not just similar in tone. Similar in substance.
For small businesses, that explains a lot. If every tool is trained to converge on the same safe, high-scoring output, then switching models will not magically produce original thinking.
What the research found
The study used Infinity-Chat, a dataset with roughly 26,000 real-world open-ended prompts and more than 31,000 human preference annotations. The researchers compared outputs from a wide range of major models, including systems from OpenAI, Anthropic, Google, DeepSeek, Qwen, and Meta.
Two findings matter most.
First, there was intra-model repetition. Ask the same model the same open-ended question five times, and you often get nearly the same answer five times.
Second, there was inter-model homogeneity. Ask competing models from different companies the same question, and they often produce near-identical responses.
That is the Artificial Hivemind in plain English: many AI systems now cluster around the same safe middle.
Why this happens
The likely culprit is alignment training, especially tuning that optimizes models against broad human preference signals.
That sounds reasonable on paper. Businesses want useful answers, not chaos. But there is a tradeoff.
When training rewards the answer most people tend to prefer, it also punishes answers that are unusual, sharp, risky, or genuinely novel. Over time, models learn to avoid standing out. They optimize for approval.
The paper also points to a second problem: reward models and LLM-as-judge systems appear miscalibrated. In other words, the systems used to score outputs may rate diverse answers lower than bland but familiar ones.
That matters because many AI pipelines now use model-based judges to rank, filter, or reinforce outputs. If the judge prefers safe averages, the whole system drifts toward safe averages.
Why SMBs should care
This is not an academic footnote.
If you use AI for marketing, content planning, product ideas, strategy work, or customer communication, homogeneity creates two immediate business problems.
1. Your ideas get generic fast
A business owner asks five AI tools for social campaign ideas and gets versions of the same list:
- behind-the-scenes content
- customer success stories
- educational tips
- limited-time offers
- user-generated content
None of that is wrong. It is just obvious.
If your competitors use the same tools the same way, they get the same list too. Now everyone sounds like they attended the same bland marketing workshop.
2. Errors can become correlated
The scarier part is not boring copy. It is shared failure.
If many models are converging on the same answer patterns, then mistakes can spread in parallel. A bad strategic assumption, weak recommendation, or misleading summary may not be caught by checking a second model if that second model learned the same habits.
That means "I cross-checked it with another AI" is not the safety blanket people think it is.
The practical lesson: stop treating first drafts like strategy
Most SMB teams use AI like a fast junior assistant. That is fine. The mistake is treating the first clean answer as if it came from a creative director or senior strategist.
The new research says you should assume the opposite.
Your first AI answer is probably the statistical center of the internet's taste. It is designed to be acceptable. It is not designed to surprise you.
That does not make AI useless. It just changes how you should use it.
What to do instead
Here are five ways small businesses can get more value without getting trapped in the hivemind.
1. Generate volume, then kill the obvious ideas
Do not ask for five ideas and pick one.
Ask for 30. Then aggressively cut the ones that feel familiar. The first batch usually contains the safest patterns. Originality tends to show up only after the obvious answers are exhausted.
A good working rule: if you could imagine three competitors posting it next week, throw it out.
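If you collect the 30 ideas as a list of strings, the "cut the familiar ones" step can even be roughed out in code. This is a minimal sketch, not a real deduplication tool: the `cut_the_obvious` helper and the 0.5 overlap threshold are illustrative choices, and simple word overlap is a crude stand-in for actual similarity.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two idea strings (0 to 1)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def cut_the_obvious(ideas: list[str], threshold: float = 0.5) -> list[str]:
    """Keep each idea only if it is not too close to one already kept."""
    kept: list[str] = []
    for idea in ideas:
        if all(jaccard(idea, k) < threshold for k in kept):
            kept.append(idea)
    return kept

ideas = [
    "behind-the-scenes content from the shop floor",
    "behind-the-scenes content from the shop",
    "a pricing teardown your competitors would never publish",
]
survivors = cut_the_obvious(ideas)  # the near-duplicate second idea is dropped
```

A filter like this only removes repeats within your own batch; the "would three competitors post this?" test still needs a human.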
2. Turn up temperature and sampling
Most teams leave model settings at the default. That usually means cleaner output, but also more predictable output.
When you are brainstorming, use higher temperature or more aggressive sampling settings. You can always tighten later when you need polish.
Default settings are built for reliability. Idea generation needs range.
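To see what the temperature knob actually does, here is a toy illustration (not any vendor's implementation): a model scores candidate next words, and temperature rescales those scores before they become probabilities. Low temperature makes the top choice dominate; higher temperature keeps alternatives in play. The logit values below are made up for the example.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, with temperature controlling sharpness."""
    scaled = [score / temperature for score in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate words

low = softmax_with_temperature(logits, 0.3)   # sharp: top choice dominates
high = softmax_with_temperature(logits, 1.5)  # flatter: runners-up survive
```

In practice you do not compute this yourself; most chat APIs expose it directly as a `temperature` setting, often alongside related sampling controls.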
3. Change the prompt strategy, not just the model
Using three models with the same prompt often gives you three polished versions of the same answer.
Instead, vary the framing:
- ask one model for contrarian ideas
- ask another for industry-specific angles
- ask a third to avoid all common best practices
- ask a fourth to write for a niche customer segment
Model diversity helps a little. Prompt diversity helps more.
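The four framings above can be sketched as a small helper that fans one brief out into deliberately different prompts. Everything here is illustrative: the `FRAMINGS` wording and the `divergent_prompts` name are examples to adapt, not a fixed recipe.

```python
# Hypothetical prompt templates mirroring the four framings above.
FRAMINGS = [
    "List contrarian campaign ideas for: {brief}",
    "List campaign ideas specific to this industry, no generic tactics: {brief}",
    "List campaign ideas that break every common best practice for: {brief}",
    "List campaign ideas aimed at one niche customer segment of: {brief}",
]

def divergent_prompts(brief: str) -> list[str]:
    """One brief in, several deliberately different prompts out."""
    return [framing.format(brief=brief) for framing in FRAMINGS]

prompts = divergent_prompts("a local bakery")  # four distinct prompts to send
```

Sending each prompt to the same model often buys more variety than sending one prompt to four models.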
4. Add constraints that force novelty
AI loves middle-of-the-road output unless you box it into a corner.
Try constraints like:
- "Give me campaign ideas a traditional investor would hate"
- "Make these ideas feel specific to a local service business, not a SaaS company"
- "Avoid anything that sounds like generic LinkedIn advice"
- "Give me ideas that would polarize my audience in a useful way"
Constraints create tension. Tension creates better ideas.
5. Put human taste back in the loop
This is the big one.
AI can help you explore options faster. It still cannot replace judgment. Someone on your team has to decide what is sharp, what is true for your market, and what actually sounds like your business.
The winning workflow is not "AI replaces creativity." It is "AI produces raw material, then humans apply taste."
That is slower than one-click content generation. It is also how you avoid sounding like everyone else.
Where this research points next
The researchers suggest pluralistic alignment as a better direction. The short version: instead of forcing every model toward one average preferred answer, future systems may need to preserve a wider range of valid styles, values, and responses.
That is the right instinct.
Because the problem with today's AI is not that it talks too differently. It is that it increasingly talks the same.
The bottom line
If your AI outputs feel repetitive, stale, or weirdly interchangeable, the problem is not just your prompt writing. The models themselves may be converging on the same safe center.
The good news is you can work around it.
Use AI to expand the option set, not to pick the winner for you. Push for volume. Force novelty. Change the framing. And keep human judgment in the final seat.
That is how small businesses get useful leverage from AI without ending up with the same boring ideas as everybody else.
Contact Barista Labs if you want help building AI workflows that produce sharper thinking instead of more average content.
