New metric tracks AI model "hallucination" errors
Researchers have created a new way to measure and diagnose "hallucinations" — confidently generated false information — in multimodal reasoning models, which could help improve AI accuracy. The new metric, called RH-AUC, and an accompanying diagnostic benchmark, RH-Bench, assess how a model's accuracy changes as its reasoning chains grow longer. The study found that longer reasoning chains often lead to more hallucinations, because models rely increasingly on language priors rather than on what they perceive, and that larger models tend to strike a better balance between reasoning and perception. These tools give researchers a way to evaluate and improve multimodal large language models on this trade-off.
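The core idea — scoring accuracy as a function of reasoning length — can be sketched as an area-under-curve computation. The sketch below is a hypothetical illustration only: the function name, the trapezoidal rule, and the sample numbers are assumptions, not the paper's exact RH-AUC definition.

```python
def auc_over_lengths(lengths, accuracies):
    """Trapezoidal area under an accuracy-vs-reasoning-length curve,
    normalized by the length range so the score lies in [0, 1].
    Hypothetical sketch; not the paper's exact RH-AUC formula."""
    pairs = sorted(zip(lengths, accuracies))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid segment
    span = pairs[-1][0] - pairs[0][0]
    return area / span if span else pairs[0][1]

# Illustrative data: a model whose accuracy degrades as reasoning grows
lengths = [64, 128, 256, 512]       # reasoning-chain lengths (tokens)
accuracies = [0.82, 0.78, 0.65, 0.50]
score = auc_over_lengths(lengths, accuracies)
```

A model that keeps its accuracy high across reasoning lengths would score closer to 1, capturing the reasoning-perception balance the article describes.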