Study finds large language models struggle with real-world accuracy and adaptability
A new study from researchers at MIT, Harvard, and Cornell finds that large language models (LLMs) like GPT-4 can produce impressively accurate outputs without forming a coherent model of the world those outputs describe. To probe this gap, the researchers evaluated models on turn-by-turn driving directions in New York City. The models gave near-perfect directions under normal conditions, but when the researchers introduced unexpected changes, such as street closures and detours, accuracy dropped sharply, suggesting the models had never internalized an accurate map of the city.

The finding raises concerns about the reliability of LLMs in real-world applications, like autonomous vehicles, where conditions change unpredictably. The authors argue that new approaches are needed to build coherent world models into LLMs, and that further research is necessary before such systems can be trusted in dynamic environments.
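To make the evaluation idea concrete, here is a minimal sketch, not the study's actual code: it assumes a toy street graph and a hypothetical `model_route` stand-in for querying a model, and checks whether the model's route remains valid after a street is closed.

```python
# Illustrative sketch only -- not the authors' evaluation pipeline.
# Idea: a model with a coherent "map" should still produce valid routes
# after a street closes; a model that memorized routes will not.

# Toy street graph: intersection -> set of directly reachable neighbors.
STREETS = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def route_is_valid(route, graph):
    """A route is valid if every consecutive step follows an open street."""
    return all(b in graph.get(a, set()) for a, b in zip(route, route[1:]))

def close_street(graph, a, b):
    """Return a copy of the graph with the street between a and b closed."""
    g = {node: set(nbrs) for node, nbrs in graph.items()}
    g[a].discard(b)
    g[b].discard(a)
    return g

def model_route(start, goal):
    """Hypothetical stand-in for asking an LLM for directions.
    Returns a memorized 'usual' route regardless of road conditions."""
    return ["A", "B", "D"]

baseline = route_is_valid(model_route("A", "D"), STREETS)
detoured = route_is_valid(model_route("A", "D"), close_street(STREETS, "B", "D"))
print(f"valid before closure: {baseline}, after closure: {detoured}")
# -> valid before closure: True, after closure: False
```

In this toy setup, a drop from "valid" to "invalid" after a closure plays the role of the accuracy drop the study reports: high performance on familiar routes can mask the absence of an underlying map.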