Research shows AI sucks at freelance work, news and real-life tasks: AI Eye

Researchers found that AI is hopeless at most Upwork tasks and gets the news wrong half the time, while humans crush AI on world model tests. AI Eye.

AI agents can't complete 97% of tasks on Upwork to even a basic standard.

Researchers at Scale AI and the Center for AI Safety got six different AI models to attempt 240 Upwork projects across categories including writing, design and data analysis, and then compared the results to the work of the real freelancers.

The overwhelming majority of the time, the AI models were unable to complete the tasks successfully, with the best AI model, Manus, completing just 2.5% of tasks and earning $1,810 out of $143,991 on offer. Claude Sonnet and Grok 4 managed to finish 2.1% of the tasks.

While AI agents are good at simple, well-defined tasks like “generate a logo,” the research found they are bad at multi-step workflows, taking initiative or using judgment.

So they won't be causing mass unemployment for a while yet.

This backs up research from August at MIT, which found that 95% of organizations had seen zero return on the collective $30 billion they'd invested in AI.

AIs are good at pattern matching and predicting words. But they're currently pretty bad at building internal models of the world, according to WorldTest from MIT and Basis Research.

For example, humans have an internal model of their own kitchen in their minds, which allows them to determine where the knives are, estimate how long it will take for the pot to boil, and plan a sequence of actions resulting in a meal. But the testing showed that three frontier reasoning AI models suck at this kind of world modeling.

The researchers created 129 tasks across 43 interactive worlds (spot the difference, physics puzzles, etc.). The tasks required the AIs to predict hidden aspects of the world, plan sequences of actions to achieve a goal, and determine when the rules of the environment changed. Then they tested 517 humans on the same problems.

The researchers concluded:
