Misleading Information in AI Search Engines: A Critical Analysis

Search tools built on generative AI models have become a popular alternative to traditional search engines, attracting a growing segment of users who want to find information quickly and easily. However, a recent study by the Tow Center for Digital Journalism at Columbia University revealed a significant flaw in the accuracy of these models, especially when they are used as a source of news.

The study showed that artificial intelligence models, including those developed by leading companies such as OpenAI and xAI, tend to fabricate stories and provide incorrect information when asked about current events.

This raises serious concerns about the accuracy of information, especially as the public increasingly relies on artificial intelligence as a source of news. Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar noted in the study that about 25% of Americans now use AI models as alternatives to traditional search engines. That share is not marginal; it reflects a fundamental change in how people search for information, which makes the mistakes these tools produce more dangerous, since the circulation of misleading information can distort public opinion or lead to poor decisions based on fabricated data.

Study details and results:

In the study, the researchers tested eight generative AI tools equipped with live search, namely: ChatGPT Search, Perplexity, Perplexity Pro, DeepSeek Search, Gemini, Grok-2 Search, Grok-3 Search, and Copilot, running 1,600 queries based on real news articles.

The tests involved feeding each model excerpts taken directly from real news articles and then asking it to identify the article's headline, original publisher, publication date, and URL. The results showed that the models answered more than 60% of the news-source queries incorrectly.
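As an illustration only, the kind of evaluation the study describes could be approximated by a harness like the sketch below. The query_model function, the field names, and the scoring rule are hypothetical placeholders for whatever interface each tool exposes; this is not the researchers' actual code.

```python
# Hypothetical sketch: feed an article excerpt to a search-enabled model
# and check whether it returns the correct headline, publisher, date, and URL.
from dataclasses import dataclass

@dataclass
class Article:
    excerpt: str
    headline: str
    publisher: str
    date: str
    url: str

PROMPT = (
    "Identify the headline, original publisher, publication date, "
    "and URL of the article this excerpt comes from:\n\n{excerpt}"
)

def query_model(model_name: str, prompt: str) -> dict:
    """Placeholder: call the tool's search interface and parse its answer
    into the fields 'headline', 'publisher', 'date', and 'url'."""
    raise NotImplementedError

def evaluate(models: list[str], articles: list[Article]) -> dict[str, float]:
    """Return the fraction of fully correct answers per model."""
    scores = {}
    for model in models:
        correct = 0
        for art in articles:
            answer = query_model(model, PROMPT.format(excerpt=art.excerpt))
            if (answer.get("headline") == art.headline
                    and answer.get("publisher") == art.publisher
                    and answer.get("date") == art.date
                    and answer.get("url") == art.url):
                correct += 1
        scores[model] = correct / len(articles)
    return scores
```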

The study also revealed a worrying trend: rather than admitting that they lack reliable information, the models tend to give fabricated answers that sound plausible.

The researchers confirmed that this behavior was not limited to a single model but was common to all of the models tested, which points to an inherent pattern in how these systems work.

More surprising still was the performance of the paid versions of these tools, which turned out to be more inclined to provide incorrect information than the free versions. For example, Perplexity Pro (a $20-per-month subscription) and the paid Grok 3 service (a $40-per-month subscription) delivered incorrect answers with more confidence than their free counterparts.

The paid versions did answer a larger number of queries correctly, but their tendency to give definitive answers rather than decline to respond when unsure produced a higher overall error rate. This suggests that paid versions may be tuned to deliver confident answers regardless of accuracy, raising questions about their reliability as sources of accurate information.

Challenges for publishers: 

Other problems surfaced in the study as well: the researchers found evidence that some AI tools bypassed the Robots Exclusion Protocol, the standard publishers use to tell crawlers which content they may not access. For example, the free version of Perplexity correctly identified all ten excerpts from National Geographic's paywalled content, even though National Geographic explicitly disallows Perplexity's web crawlers.
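For context, the Robots Exclusion Protocol works through a plain-text robots.txt file at the root of a site. The snippet below is a generic illustration of how a publisher might disallow an AI crawler; the domain and paths are hypothetical, and it is not National Geographic's actual configuration (PerplexityBot is the user-agent name Perplexity documents for its crawler).

```
# robots.txt served at https://example-publisher.com/robots.txt
# Block a specific AI crawler from the entire site
User-agent: PerplexityBot
Disallow: /

# Allow other crawlers, but keep them out of paywalled sections
User-agent: *
Disallow: /premium/
```

The study's point is that these directives are advisory: a crawler that chooses to ignore them can still fetch the content.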

The problem did not stop there: AI search engines often directed users to republished copies of content on platforms such as Yahoo News instead of to the original publishers' sites, and this happened even where official licensing agreements existed between publishers and AI companies.

These issues leave publishers facing difficult choices. On the one hand, blocking AI crawlers can mean losing attribution entirely, depriving them of credit for their work. On the other hand, allowing the crawlers in can lead to widespread reuse of their content without traffic flowing back to the original sites, which hurts their revenue.

The responsibility of users in the age of artificial intelligence:


Mark Howard, Chief Operating Officer of Time magazine, expressed serious concern about how little control publishers have over the way AI models ingest and display their content. He pointed out that this could harm publishers' brands, especially if the tools present false information to users, and cited a recent example: BBC News took issue with Apple after Apple Intelligence notification summaries inaccurately rephrased its news alerts.

Still, Howard expressed optimism that these tools will improve, arguing that today is the worst the product will ever be and pointing to the large investments and engineering effort going into the technology.

At the same time, Howard directed criticism at users, placing on them the responsibility of verifying the information produced by free AI tools. "If any consumer now thinks that any of these free products will be 100% accurate, then shame on them," he said.

Howard added that the study's results were not surprising, because large language models struggle to grasp the actual meaning of information and rely heavily on autocomplete-style prediction, which leads them to improvise answers.

OpenAI and Microsoft provided statements to the Columbia Journalism Review (CJR) acknowledging the study's findings, but neither directly addressed the concerns it raised.

Microsoft, for its part, emphasized that it complies with the Robots Exclusion Protocol and respects publishers' directives.
