- Researchers in a new study tasked an AI-powered tech company with developing 70 different programs.
- They found AI could develop software in under seven minutes for less than $1, on average.
- AI bots were assigned roles and were able to talk, make logical decisions, and troubleshoot bugs.
The findings came after researchers published another study in which AI agents powered by large language models were able to run a virtual town on their own.
In the recent paper, a team of researchers from Brown University and multiple Chinese universities tested whether AI bots powered by a version of GPT-3.5, the model behind ChatGPT, could complete the software-development process without prior training.
To test this, researchers created a hypothetical software-development company named ChatDev. Based on the waterfall model — a sequential approach to creating software — the company was broken down into four stages in chronological order: designing, coding, testing, and documenting.
From there, researchers assigned AI bots specific roles by prompting each one with "vital details" that described the "designated task and roles, communication protocols, termination criteria, and constraints."
Once the researchers gave the AI bots their roles, each bot was assigned to its respective stage. The "CEO" and "CTO" of ChatDev, for instance, worked in the "designing" stage, and the "programmer" and "art designer" worked in the "coding" stage.
During each stage, the AI workers chatted with one another with minimal human input to complete specific parts of the software-development process — from deciding which programming language to use to identifying bugs in the code — until the software was complete.
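For readers who want a feel for the setup, the staged, role-based chat described above can be roughly sketched in Python. This is an illustrative sketch only, not the researchers' code: the `scripted_reply` function is a hypothetical stand-in for a language-model call, the `<DONE>` marker is an assumed termination signal, and the role pairs for the testing and documenting stages are placeholders, since only the designing and coding pairs are named here.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stage order follows the waterfall model: each stage finishes before the next starts.
STAGES = ["designing", "coding", "testing", "documenting"]

# Two roles chat within each stage. The first two pairs come from the article;
# the last two are hypothetical placeholders used only for illustration.
STAGE_ROLES = {
    "designing": ("CEO", "CTO"),
    "coding": ("programmer", "art designer"),
    "testing": ("reviewer", "tester"),        # assumption
    "documenting": ("CEO", "programmer"),     # assumption
}

DONE = "<DONE>"  # assumed termination marker, not taken from the paper


@dataclass
class Message:
    stage: str
    speaker: str
    text: str


def scripted_reply(role: str, stage: str, turn: int, task: str) -> str:
    """Hypothetical stand-in for a language-model call; returns canned text."""
    if turn == 0:
        return f"{role}: let's handle the {stage} stage for '{task}'"
    return DONE


def run_chat_chain(
    task: str,
    reply: Callable[[str, str, int, str], str] = scripted_reply,
    max_turns: int = 4,
) -> List[Message]:
    """Run each stage in order, alternating speakers until one signals DONE."""
    log: List[Message] = []
    for stage in STAGES:
        first, second = STAGE_ROLES[stage]
        for turn in range(max_turns):
            speaker = first if turn % 2 == 0 else second
            text = reply(speaker, stage, turn, task)
            log.append(Message(stage, speaker, text))
            if text == DONE:
                break  # termination criterion met; move to the next stage
    return log


if __name__ == "__main__":
    for message in run_chat_chain("design a basic Gomoku game"):
        print(f"[{message.stage}] {message.speaker}: {message.text}")
```

Swapping `scripted_reply` for a function that calls a real model would be the obvious next step, but the control flow stays the same: stages run in a fixed order, and each stage ends when one of its two bots signals that it is done or the turn limit is reached.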
The researchers ran the experiment across different software scenarios and analyzed how long ChatDev took to complete each type of software and how much each one cost.
Researchers, for example, tasked ChatDev to "design a basic Gomoku game," an abstract strategy board game also known as "Five in a Row."
At the designing stage, the CEO asked the CTO to "propose a concrete programming language" that would "satisfy the new user's demand," to which the CTO responded with Python. In turn, the CEO said, "Great!" and explained that the programming language's "simplicity and readability make it a popular choice for beginners and experienced developers alike."
After the CTO replied with, "Let's get started," ChatDev moved on to the coding stage, where the CTO asked the programmer to write a file, followed by the programmer asking the designer to give the software a "beautiful graphical user interface." The chat chain was repeated at each stage until the software was developed.
After assigning ChatDev 70 tasks, the study found the AI-powered company was able to complete the full software-development process "in under seven minutes at a cost of less than one dollar," on average — all while identifying and troubleshooting "potential vulnerabilities" through its "memory" and "self-reflection" capabilities.
The paper said 86.66% of the generated software systems were "executed flawlessly."
"Our experimental results demonstrate the efficiency and cost-effectiveness of the automated software development process driven by CHATDEV," the researchers wrote in the paper.
The researchers didn't immediately respond to a request for comment from Insider before publication.
The study's findings point to one of the many ways powerful generative-AI technologies such as ChatGPT can perform specific job functions. Since the AI chatbot came out in November 2022, workers across industries have used it on the job to save time and boost productivity.
Coders, in particular, may find generative-AI tools beneficial to their personal and professional lives. Daniel Dippold, a coder in Berlin, used ChatGPT to develop a program that helped him find an apartment, and Amazon employees were found to use ChatGPT for software development.
The study wasn't perfect, however: Researchers identified limitations, such as errors and biases in the language models, that could cause issues in the creation of software. Still, the researchers said the findings "may potentially help junior programmers or engineers in the real world" down the line.