A remarkable change is underway in AI tools that write code. Until recently, code editors such as Cursor, Windsurf, and GitHub’s Copilot dominated AI-assisted development. Now the terminal is enjoying a renaissance as a general-purpose environment for more complicated programming tasks, and a growing number of developers are turning to command-line interfaces (CLIs) for AI-assisted coding. The shift is visible in the command-line coding tools released by companies like Anthropic, DeepMind, and OpenAI.
Code editors such as Cursor have largely been oriented around SWE-Bench, a benchmark built from real GitHub issues. In that setup, a model works iteratively on broken code until it functions properly. The approach gives developers a powerful yet intuitive way to debug and improve their projects.
Zach Lloyd, the founder of Warp, believes the terminal has tremendous potential. He sees it as a tool that can address challenges outside the scope of conventional code editors. He emphasizes that “the terminal occupies a very low level in the developer stack, so it’s the most versatile place to be running agents.” His view reflects a broader recognition of what the terminal can do in today’s software development.
Command-line coding tools are gaining popularity at a dizzying pace. Prominent products such as Claude Code by Anthropic, Gemini CLI by DeepMind, and Codex CLI by OpenAI are leading the way. These tools have quickly taken off with developers, making them some of the companies’ most popular products yet.
Alex Shaw, co-creator of TerminalBench, a benchmark for AI agents working in the terminal, describes the difficulties these tools must navigate. “What makes TerminalBench hard is not just the questions that we’re giving the agents,” he states. Among the benchmark’s challenges is reverse-engineering a corresponding compression algorithm. That is no easy task given the current state of AI-driven coding tools.
Warp has distinguished itself in this competitive landscape. It did very well on TerminalBench, solving over 50% of the problems given to it. The result speaks to Warp’s ability to steer developers through the complexities of command-line programming.
Beyond its benchmark results, Warp seeks to simplify the mundane tasks that every programmer faces. Lloyd notes that “if you think of the daily work of setting up a new project, figuring out the dependencies and getting it runnable, Warp can pretty much do that autonomously.” Handing off that setup work frees up developer time and energy for the creative aspects of the job.
As more developers adopt the terminal as a coding interface, some expect it to play a much larger role in software development. Shaw anticipates that “there’s a future in which 95% of LLM-computer interaction is through a terminal-like interface.” If he is right, command-line tools could change the way we build software.
We’ve come a long way from the 90s hacker-movie image of the terminal as a menacing, minimalist black-and-white screen. Today it is a flexible environment for building and running programs. As generative AI coding tools improve, the command line will likely become even more central to software development workflows.