Building your own software development team with ChatGPT and AutoGen

 

Stanley Jovel

Full Stack Developer interested in making lives better through software

Updated Aug 1, 2024

Recently, we’ve seen rapid advancements in the programming capabilities of modern LLMs. AutoGPT showed an impressive ability to create entire applications from scratch. Cognition’s Devin was then hailed by some as “the first AI software engineer,” though its capabilities were soon challenged by Anthropic’s Claude 3 model family and its remarkable coding prowess. The landscape of AI-assisted software development is evolving quickly.

In this article, I explore how to leverage existing LLMs to create an engineering team. Rather than having a single model work through tasks independently, I propose using a team of LLMs that interact with each other. One serves as a Project Manager that oversees the software development and translates requirements into manageable tasks; another serves as the Engineer responsible for writing the code to complete these tasks.

multi_agent_diagram

Figure 1. Multi Agent Diagram

The diagram above illustrates the proposed framework, built on ReAct (introduced by Yao et al.), which combines reasoning and acting in LLMs to improve their performance on interactive tasks. A key advantage of this framework is that agents can think through a problem, create a list of tasks to solve it step by step, and add more tasks if necessary, which makes their responses more accurate and useful.
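To make the loop concrete, here is a minimal sketch in plain Python with no LLM involved. The `plan` and `attempt` functions are hypothetical stand-ins for the Project Manager and the Engineer, and the task names are made up for illustration. The point is only that a failed action produces an observation that feeds back into the task list.

```python
from collections import deque

def plan(requirement):
    # Stand-in for the Project Manager: turn a requirement into ordered tasks.
    return deque(["set up project", "write components/TodoList.js", "wire up App.js"])

def attempt(task, state):
    # Stand-in for the Engineer: act on a task and return (success, observation).
    if task == "create components folder":
        state["folders"].add("components")
        return True, "created components/"
    if task.startswith("write components/") and "components" not in state["folders"]:
        return False, "missing folder: components"
    return True, f"finished: {task}"

def run(requirement):
    tasks = plan(requirement)
    state = {"folders": set()}
    log = []
    while tasks:
        task = tasks.popleft()
        ok, observation = attempt(task, state)
        log.append((task, ok))
        if not ok and observation.startswith("missing folder"):
            # Reason over the observation: queue a prerequisite, then retry.
            tasks.appendleft(task)
            tasks.appendleft("create components folder")
    return log

for task, ok in run("simple todo list"):
    print("ok  " if ok else "FAIL", task)
```

In the real setup, both stand-ins are LLM agents and the observation is the tool output or error message that appears in the group chat.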

Building my team using AutoGen

There are several multi-agent frameworks available, including AutoGen, LangGraph, CrewAI, and Langroid. I decided to work with AutoGen because it is well established and has a very active community.

This article won’t cover the implementation details of AutoGen. For those interested in the technical aspects, here’s the project_repository.

Make sure to rename the file OAI_CONFIG_LIST.json.sample to OAI_CONFIG_LIST.json and add your OpenAI API key. For this project I’ll be using OpenAI’s GPT-4o.
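The agent definitions that follow all take an `llm_config` argument. Here is a minimal sketch of how it might be built from the config file, using only the standard library. The `model` and `api_key` keys follow AutoGen's usual config-list layout, but the exact contents of the sample file in the repository are an assumption, as are the `load_llm_config` helper and the `temperature` value.

```python
import json
import os
import tempfile

# Assumed shape of OAI_CONFIG_LIST.json: a JSON list with one entry per model.
sample = [{"model": "gpt-4o", "api_key": "sk-REPLACE-ME"}]

def load_llm_config(path, model="gpt-4o"):
    # Hypothetical helper: read the list and keep entries for the requested model.
    with open(path, encoding="utf-8") as f:
        config_list = [c for c in json.load(f) if c.get("model") == model]
    if not config_list:
        raise ValueError(f"no entry for model {model!r} in {path}")
    # temperature=0 is an illustrative choice, not taken from the article.
    return {"config_list": config_list, "temperature": 0}

# Demo with a throwaway file so the snippet runs without a real key.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "OAI_CONFIG_LIST.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    llm_config = load_llm_config(path)

print(llm_config["config_list"][0]["model"])
```

AutoGen also ships a `config_list_from_json` helper that handles the loading and filtering; the repository shows how the project actually builds `llm_config`.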

Defining the agents

Admin

This is you, the human in the loop. You will run the code manually and provide feedback if something is not working as expected.

user_proxy = autogen.UserProxyAgent(
  name="Admin",
  human_input_mode="ALWAYS",
  code_execution_config=False,
)

Project Manager

In charge of breaking down the initial requirements into manageable tasks, adding tasks on the fly if needed, and assessing whether the project is finished.

project_manager = autogen.AssistantAgent(
  name="Project_Manager",
  system_message="""
  I am an experienced Project Manager overseeing the development of a software product. My responsibilities include translating the requirements provided by the Admin into manageable tasks, which I then assign to the Engineer.

  Key Responsibilities:
  - Break down project requirements into detailed, manageable tasks.
  - Assign tasks to the Engineer in a logical and efficient sequence.
  - Monitor the progress of the project and ensure timely delivery of each task.

  Workflow:
  - For each completed task, I will assess if further tasks are required.
  - I will ensure that any identified issues are resolved before moving on to the next task.
  - When the project is complete, I will confirm the project's completion with a "DONE" message.

  Restrictions:
  - I never suggest, review, or test code; my focus is purely on task management and project oversight.
  - I do not get involved in the technical implementation details.

  My goal is to ensure the smooth progression of the project from start to finish, maintaining clear communication and efficient task management.
  """,
  llm_config=llm_config,
)

Engineer

In charge of creating and editing source files with code to complete tasks.

engineer = autogen.AssistantAgent(
  name="Engineer",
  llm_config=llm_config,
  system_message="""
  I am a seasoned Software Engineer, responsible for writing clean, efficient, and maintainable code based on the tasks assigned by the Project Manager. 
  My focus is exclusively on coding and ensuring that the code meets the specified requirements.

  Key Responsibilities:
  - Implement features and fixes according to the provided task specifications.
  - Write code that adheres to best practices and is optimized for performance and readability.
  - Ensure that the code is well-documented to facilitate future maintenance and collaboration.

  Restrictions:
  - I NEVER manage or assign tasks.
  - I do not make decisions regarding the project scope or task prioritization.

  My goal is to deliver high-quality code that meets the requirements set forth by the Project Manager and the Admin.
  """,
)

Creating the group chat and chat manager

groupchat = autogen.GroupChat(
  agents=[user_proxy, project_manager, engineer],
  messages=[],
  max_round=100,
  send_introductions=True,
  enable_clear_history=True,
)
manager = autogen.GroupChatManager(
  groupchat=groupchat, 
  llm_config=llm_config, 
  # content can be None (for example on tool-call messages), so fall back to ""
  is_termination_msg=lambda msg: "done" in (msg.get("content") or "").lower(),
)
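One thing to watch in the termination check: a substring match on “done” fires whenever any agent uses the word, for example in “Task 1 is done”. A stricter variant, sketched here as an assumption rather than what the repository does, only stops when the last line of the message is the DONE keyword, and it tolerates messages whose content is empty. The name `is_project_done` is hypothetical.

```python
def is_project_done(msg):
    # Messages can carry content=None (e.g. tool calls), so fall back to "".
    content = (msg.get("content") or "").strip()
    lines = content.splitlines()
    # Terminate only when the final line is DONE, ignoring a trailing period or quotes.
    return bool(lines) and lines[-1].strip().strip('."') == "DONE"

print(is_project_done({"content": "Task 1 is done, moving on."}))
print(is_project_done({"content": "All tasks are complete.\nDONE"}))
print(is_project_done({"content": None}))
```

It would be passed as `is_termination_msg=is_project_done` in place of the lambda.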

Registering tools for coding

Equipping the Engineer with the tools it needs to interface with the local file system is crucial. These tools allow it to list the contents of directories, view, create, and edit files with code, and run bash commands:

  • list_dir: Lists the contents of a directory.
  • see_file: Displays the contents of a file.
  • modify_code: Edits the contents of a file.
  • create_file_with_code: Creates a new file with code.
  • execute_command: Runs a bash command.

Check the repo to learn how these were implemented.
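For readers who want a rough idea before opening the repository, here is a sketch of what such tools could look like using only the standard library. The function names match the list above, but the signatures, return values, and line-based editing scheme are assumptions and may differ from the actual implementation.

```python
import os
import subprocess
import tempfile

def list_dir(directory):
    # Return the names in a directory, sorted for stable output.
    return sorted(os.listdir(directory))

def see_file(path):
    # Return the file with 1-based line numbers so edits can target lines.
    with open(path, encoding="utf-8") as f:
        return "".join(f"{i}: {line}" for i, line in enumerate(f, start=1))

def create_file_with_code(path, code):
    # Write a new file; raises FileNotFoundError if the parent folder is missing.
    with open(path, "w", encoding="utf-8") as f:
        f.write(code)
    return f"created {path}"

def modify_code(path, start_line, end_line, new_code):
    # Replace lines start_line..end_line (1-based, inclusive) with new_code.
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    lines[start_line - 1:end_line] = [new_code.rstrip("\n") + "\n"]
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
    return f"modified {path}"

def execute_command(command, cwd):
    # Run a command through the system shell; return exit code and combined output.
    result = subprocess.run(command, shell=True, cwd=cwd,
                            capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr

# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "hello.js")
    create_file_with_code(target, "const a = 1;\nconsole.log(a);\n")
    modify_code(target, 1, 1, "const a = 2;")
    listing = list_dir(workdir)
    contents = see_file(target)
    code, output = execute_command("echo ok", workdir)

print(listing)
print(contents)
```

In AutoGen, functions like these are exposed to an agent through its tool-registration API, with the Engineer proposing the call and another agent executing it. Since `execute_command` runs arbitrary shell commands, it is worth keeping the human in the loop or running inside a sandboxed directory.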

Trying it out

I start the multi-agent chat and ask the team to create a simple todo list with React.js:

chat_result = user_proxy.initiate_chat(
  manager,
  message="""
Create a React app that is a simple todo list.
""",
)

The following are the highlights of the group chat history after running my setup. Here is a link with the complete trace of execution.
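If you want to skim a run yourself, a small helper like the one sketched here can condense a chat history into one line per message. It assumes each message is a dictionary with `name` and `content` keys; the `summarize_history` function and the example messages are hypothetical and not taken from the real trace.

```python
def summarize_history(history, width=60):
    # One line per message: speaker name plus a truncated preview.
    lines = []
    for msg in history:
        speaker = msg.get("name", msg.get("role", "unknown"))
        # Collapse whitespace; empty content is shown as a tool-call marker.
        text = " ".join((msg.get("content") or "[tool call]").split())
        if len(text) > width:
            text = text[: width - 3] + "..."
        lines.append(f"{speaker}: {text}")
    return lines

# Hypothetical messages in the assumed shape.
example_history = [
    {"name": "Admin", "content": "Create a React app that is a simple todo list."},
    {"name": "Project_Manager", "content": "Task 1: Set up the React environment."},
    {"name": "Engineer", "content": None},
]
for line in summarize_history(example_history):
    print(line)
```

With a real run, `chat_result.chat_history` would be the input, assuming it holds a list of message dictionaries in this shape.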

trace1

After the Project Manager receives the initial requirement from the Admin, it creates a comprehensive list of tasks to complete the requested project, with the first task being: “Set up the React environment.” The PM then assigns it to the Engineer, who decides to run the command npx create-react-app todo-list-app to complete it. I open a second terminal, navigate to the output folder with the newly created project, and start it with npm run start. I hit enter in the group chat to continue without feedback.

After the first task is done, the PM assigns the next task to the Engineer, which requires the creation of new files. Here’s an example of how the Engineer uses the tools we defined earlier to create new files with code:

trace2

Here is where AutoGen shines. The Engineer encounters a problem: there is no folder called “components” to write a file into. The Project Manager notices and adds new tasks to create the missing folder.
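The failure itself is easy to reproduce outside the chat. The sketch below uses only the standard library and a hypothetical `write_file` helper (not the repository's tool) to show the kind of error the Engineer ran into and what the added task amounts to: creating the folder before writing into it. The `src/components/TodoList.js` path is an assumption for illustration.

```python
import os
import tempfile

def write_file(path, code):
    # Plain write: raises FileNotFoundError if the parent folder is missing.
    with open(path, "w", encoding="utf-8") as f:
        f.write(code)

with tempfile.TemporaryDirectory() as project:
    # Assumed location of the new component inside the project.
    target = os.path.join(project, "src", "components", "TodoList.js")

    try:
        write_file(target, "export default function TodoList() {}\n")
        first_attempt = "ok"
    except FileNotFoundError:
        first_attempt = "missing folder"

    # The extra task: create the folder, then retry the original task.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    write_file(target, "export default function TodoList() {}\n")
    second_attempt = "ok" if os.path.isfile(target) else "failed"

print(first_attempt, "->", second_attempt)
```

A file-creation tool could also call `os.makedirs` itself and avoid the round trip, at the cost of hiding the kind of recovery shown in this trace.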

trace3

trace4

Finally, after all tasks are completed, the Project Manager suggests that I test the app. When I do, I notice a problem: App.js, the file that contains the main component, hasn’t been changed to integrate the new components. I provide this feedback, and my change request is quickly incorporated.

trace5

trace6

I notice one last problem: the header takes up 100% of the height, pushing the new to-do UI a few scrolls down the page.

 

trace7

Final result

And voilà, the end result is a usable to-do app built in React.js. I’m impressed with the quality of the code and the overall look and feel of the app, considering it’s such a simple example.

result

Conclusion

The advantage of using multiple agents is the synergy between reasoning and acting, an advantage demonstrated by the ReAct framework. Combined, the two let agents reach more accurate and practical solutions: reasoning traces help an agent plan, monitor, and update its actions, while acting lets it interact with the external world and obtain the information it needs. This collaboration lets the agents handle complicated tasks and their exceptions, leading to more reliable and interpretable results.

In this article, I have demonstrated how LLMs like ChatGPT and frameworks like AutoGen can be combined to build an efficient, interactive software development team. This experiment highlights the significant potential of multi-agent frameworks, which can extend far beyond software development. For instance, in customer service, teams of AI agents can handle inquiries and resolve issues, providing seamless support. In education, multi-agent frameworks can offer personalized tutoring and assistance, enhancing learning experiences. The versatility of these tools opens up exciting possibilities across various domains. As these technologies continue to evolve and mature, their applications will expand, changing how we approach complex tasks and problem-solving: not by querying a single LLM, but by querying a network of them.

Future Work

Looking ahead, a valuable next step would be to challenge my AI team with more complex problems. Future projects could add specialized agents with different roles, such as QA testers or additional engineers working on parallel tasks. Exploring a setup without a human in the loop could also make the system more autonomous. These experiments would provide valuable insight into the agents’ capabilities and limitations and help drive further improvement.
