A new study by METR highlights some daunting hurdles for AI coding assistants, even in the hands of veteran developers. Researchers studied the use of these tools on real programming tasks, working with 16 experienced open-source developers. Though limited, the findings suggest that AI tools may help in targeted contexts but may not actually increase productivity for advanced programmers.
The core research consisted of 246 tasks across large code repositories. On tasks where AI was permitted, developers could use tools such as Cursor Pro. Of the participants, 94% reported having used web-based large language models (LLMs) in their coding workflows, but only 56% had used Cursor specifically. Developers predicted that AI would cut task completion time by 24%, but the reality was just the opposite.
Researchers reported a particularly surprising finding: when AI was permitted, completion time increased by 19%. In other words, developers were slower, not faster, when using AI tooling. This finding should give us pause about the general effectiveness of AI-assisted coding tools, especially for users who are already advanced coders.
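To make the gap between expectation and outcome concrete, here is a minimal arithmetic sketch. The 24% predicted decrease and the 19% observed increase come from the study as reported above; the 100-minute baseline and the function name `adjusted_time` are hypothetical, chosen only for illustration.

```python
def adjusted_time(baseline_minutes: float, percent_change: float) -> float:
    """Apply a signed percent change to a baseline task time."""
    return baseline_minutes * (1 + percent_change / 100)


# Hypothetical baseline: a task that takes 100 minutes without AI.
baseline = 100.0

predicted = adjusted_time(baseline, -24)  # developers expected a 24% decrease
observed = adjusted_time(baseline, 19)    # the study reported a 19% increase
gap = observed - predicted                # distance between forecast and result

print(f"Predicted with AI: {predicted:.1f} min")
print(f"Observed with AI:  {observed:.1f} min")
print(f"Gap:               {gap:.1f} min")
```

Under that assumed baseline, a task forecast to take about 76 minutes with AI would instead take about 119 minutes, so the forecast misses by roughly 43 minutes per 100 minutes of unassisted work.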
Developers spent much of their time prompting the AI and waiting for its responses rather than actually coding. Beyond making the developer experience frustrating, this inefficiency can leave code more prone to bugs and security flaws, which undermines the very purpose of AI assistance. As developers face increasingly complicated tasks, introduced errors can have dire consequences, particularly in high-stakes settings.
METR acknowledged these drawbacks of AI coding tools. Yet the tools have also been shown to be useful in helping developers tackle complex, long-horizon tasks. AI technology has advanced rapidly in the past few years, and although hurdles remain, continued progress could produce more capable tools down the line.
The study used a randomized controlled trial (RCT) design. Developers were trained on how to use Cursor before they started their work, to ensure that participants were comfortable enough to use the AI tool in a meaningful way.
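For readers unfamiliar with the term, the core idea of a randomized design can be sketched in a few lines. This is an illustration only, not METR's actual procedure: the article says only that the study was an RCT with 246 tasks, so the per-task random draw, the condition labels, the task names, and the fixed seed below are all assumptions.

```python
import random
from collections import Counter

# Hypothetical condition labels; the study's own terminology may differ.
CONDITIONS = ("ai_allowed", "ai_disallowed")


def assign_conditions(task_ids, seed=0):
    """Randomly assign each task to one condition (assumed scheme)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {task_id: rng.choice(CONDITIONS) for task_id in task_ids}


# 246 placeholder task names, matching the task count reported in the study.
tasks = [f"task-{i}" for i in range(1, 247)]
assignment = assign_conditions(tasks)

# Tally how many tasks landed in each condition.
counts = Counter(assignment.values())
print(dict(counts))
```

Because chance, not the developer, decides which tasks allow AI, differences in completion time between the two groups can be attributed to the tool rather than to developers picking easier tasks for one condition.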