AI Hallucinations - Hidden Dangers With AI Generated Content

Alarming Inconsistencies in AI Search Tools: A Recent Study

In the rapidly evolving digital landscape, AI search tools have gained significant popularity as potential replacements for traditional search engines. However, a recent study by the Tow Center for Digital Journalism has revealed concerning inconsistencies and inaccuracies in these AI-driven search tools, challenging their reliability.

The study meticulously analysed eight prominent AI search tools, including ChatGPT Search, Gemini, Perplexity, Perplexity Pro, DeepSeek Search, Microsoft's Copilot, Grok-2 Search, and Grok-3 Search. Researchers examined 200 randomly selected news articles from 20 different publishers, ensuring each article appeared within the top three Google search results when using an exact excerpt from the story.

Key Findings

The findings were alarming. Major AI search engines frequently fabricated reference links, failed to provide sources when requested, and delivered incorrect information, particularly when citing news articles. "Overall, the chatbots provided incorrect answers to more than 60% of queries," the study states.

The AI-generated responses were graded on a scale from "completely correct" to "completely incorrect." Only Perplexity and Perplexity Pro performed at a relatively acceptable level. The rest failed at an alarming rate, with some AI tools confidently reinforcing misinformation.

Performance of AI Search Tools

X's Grok-3 Search was incorrect 96% of the time. Microsoft's Copilot also fared poorly, refusing to answer 104 out of 200 queries. Among the 96 queries it did answer, only 16 responses were "completely correct," 14 were "partially correct," and 66 were "completely incorrect," an inaccuracy rate of about 70% among the queries it answered.

ChatGPT Search, while one of the more responsive AI tools, also struggled with accuracy. It provided answers for all 200 queries but only achieved a "completely correct" rating 28% of the time, while it was "completely incorrect" 57% of the time.

The Issue of AI Hallucinations

The study supports ongoing concerns that AI models not only fabricate information but do so with unwavering confidence. These so-called "hallucinations" are an acknowledged flaw in large language models (LLMs), but the extent to which they occur in AI search engines is now quantifiably evident.

This issue was highlighted in a 2023 article by Ted Gioia of The Honest Broker, where he documented ChatGPT's tendency to generate incorrect information with complete certainty. Even when the AI admitted to being wrong, it would sometimes follow up with more false claims.

Implications for Future Generations

The Tow Center's research, published in the Columbia Journalism Review, warns that the perception of AI as a shortcut to knowledge, particularly among younger users, could lead to a generation ill-equipped with research and analytical skills. The study calls for a shift in perspective, advocating for AI to be understood as a tool for extending human capabilities, rather than replacing them.

Conclusion

As AI search tools continue to evolve, it is crucial to address these inconsistencies and inaccuracies to ensure they can be trusted as reliable sources of information. The findings of this study serve as a reminder that while AI has the potential to enhance our capabilities, it should not be seen as a replacement for thorough research and critical thinking.
