# A First Try at AutoGen

## Introduction to AutoGen

AutoGen was first created within FLAML (A Fast Library for Automated Machine Learning & Tuning) on May 29, 2023. It is an open-source framework for building AI agents and simplifying cooperation among multiple agents. It is currently developed mainly by Microsoft, Pennsylvania State University, and the University of Washington. The goal of AutoGen is to provide an easy-to-use and flexible framework for the development and research of generative AI.

In its official introduction, AutoGen positions itself relative to generative AI the way PyTorch relates to deep learning.

There are many definitions of an Agent. In AutoGen, an Agent is an entity that can receive messages, send messages, and generate replies using one or more of the following: large language models, tools, and human input.
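To make this definition concrete, here is a minimal sketch of a single agent that only uses a large language model to generate replies. This snippet assumes a plain OpenAI-compatible endpoint and a placeholder model name, unlike the Azure configuration used later in this article:

```python
import os

from autogen import ConversableAgent

# A minimal agent backed only by an LLM: no tools, no human input.
# The model name and environment variable are placeholders; adjust them to your setup.
agent = ConversableAgent(
    "helper",
    llm_config={"model": "gpt-4o", "api_key": os.environ.get("OPENAI_API_KEY")},
    human_input_mode="NEVER",  # never pause to ask a human for input
)

# The agent receives a message and generates a reply with the LLM.
reply = agent.generate_reply(messages=[{"role": "user", "content": "Say hello in one sentence."}])
print(reply)
```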

AutoGen is developed by numerous researchers and engineers. It incorporates recent results from multi-agent research and has already been applied in many practical scenarios, such as investment, AI employees, writing, blockchain, customer service, cybersecurity, and data analysis.

## Initial Attempt

Before using AutoGen, install it first:

```sh
pip install pyautogen
```
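If the installation succeeded, you should be able to import the package. A quick sanity check (not part of the official quickstart, just a convenience) is to print the installed version:

```python
# Verify that pyautogen is importable and show which version was installed.
import autogen

print(autogen.__version__)
```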

The following is a sample program. In this code I use an Azure OpenAI gpt-4o deployment; you can run it by replacing the environment variables with your real values.

```python
import os
import autogen
from autogen import AssistantAgent, UserProxyAgent
from autogen import ConversableAgent


llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "base_url": os.environ.get("OPENAI_API_BASE"),
    "api_type": "azure",
    "api_version": "2023-03-15-preview",
    "temperature": 0.9,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

user_proxy.initiate_chat(
    assistant,
    message="Introduce something about multi-agent collaboration",
)
```

The execution result is as follows:

user_proxy (to assistant):

Introduce something about multi-agent collaboration

--------------------------------------------------------------------------------
assistant (to user_proxy):

Multi-Agent Collaboration is an important topic in artificial intelligence research. It involves multiple autonomous agents working together to complete a task or achieve a goal. The following are some key points about multi-agent collaboration:

### 1. Basic Concepts

- **Agent**: An agent is an autonomous entity that can perceive its environment and take actions to achieve its goals. Agents can be software programs, robots, sensors, etc.
- **Multi-Agent System (MAS)**: This refers to a system in which multiple agents interact, communicate, and collaborate with each other to complete complex tasks.

### 2. Types of Collaboration

- **Centralized Collaboration**: In this model, there is a central control unit to coordinate the activities of each agent. The advantage of this method is that it is easy to control and manage, but the disadvantage is that the failure of the central control unit will affect the entire system.
- **Distributed Collaboration**: Agents achieve collaboration through distributed communication and decision-making. Each agent has a certain degree of autonomy and can make decisions based on local information and feedback from other agents. This method is more robust and scalable.

### 3. Collaboration Strategies

- **Task Allocation**: Decompose large tasks into multiple subtasks and assign these subtasks to different agents.
- **Resource Sharing**: Agents share resources to improve efficiency and performance.
- **Communication and Coordination**: Agents exchange information and coordinate actions through communication.

### 4. Challenges of Collaboration

- **Communication Overhead**: In a distributed system, communication between agents may generate significant overhead.
- **Consistency**: How to ensure the consistency of decisions among agents and avoid conflicts.
- **Robustness**: How the system can maintain normal operation in the face of the failure of individual agents.
- **Learning and Adaptation**: How agents can improve collaboration efficiency through learning and adaptation.

### 5. Application Fields

- **Robot Soccer**: Multiple robots collaborate in a team to play a game.
- **Autonomous Driving Fleets**: Multiple autonomous vehicles drive in coordination to improve road safety and traffic efficiency.
- **Supply Chain Management**: Collaboration between different enterprises or departments to optimize the efficiency of the entire supply chain.

### 6. Related Algorithms and Technologies

- **Reinforcement Learning**: Used to train agents to learn the best strategies through trial and error in complex environments.
- **Game Theory**: Analyze the strategies and payoffs among agents.
- **Auction Algorithms**: Used to solve problems such as task allocation.

Here is a simple Python example code showing the basic concept of two agents collaborating in a shared environment:

```python
# filename: multi_agent_collaboration.py

class Agent:
    def __init__(self, name):
        self.name = name

    def act(self, environment):
        raise NotImplementedError

class Environment:
    def __init__(self):
        self.state = 0
        self.goal = 10

    def update(self, action):
        self.state += action
        return self.state

    def is_goal_reached(self):
        return self.state >= self.goal

class CooperativeAgent(Agent):
    def act(self, environment):
        if environment.state < environment.goal:
            action = 1
        else:
            action = 0
        print(f"{self.name} performs action: {action}")
        return action

def main():
    env = Environment()
    agents = [CooperativeAgent("Agent 1"), CooperativeAgent("Agent 2")]

    while not env.is_goal_reached():
        for agent in agents:
            action = agent.act(env)
            env.update(action)
            print(f"Environment state: {env.state}")
            if env.is_goal_reached():
                break

if __name__ == "__main__":
    main()
```

This code defines a simple environment and two cooperative agents. Each agent performs a simple action until the goal state of the environment is reached. In this way, the basic concept of multi-agent collaboration is demonstrated.

I hope this information is helpful to you! If you have any other questions, feel free to continue asking. TERMINATE

> What I originally wanted to ask about was specifically multi-agent systems in the field of large language models, but that is apparently not how the model interpreted the question...

Judging from the demo above, this is no different from chatting with a large language model through a browser. However, AutoGen can also execute code. Take a look at the following example.

```python
import os
import autogen
from autogen import AssistantAgent, UserProxyAgent
from autogen import ConversableAgent


llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "base_url": os.environ.get("OPENAI_API_BASE"),
    "api_type": "azure",
    "api_version": "2023-03-15-preview",
    "temperature": 0.9,
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    code_execution_config={
        "executor": autogen.coding.LocalCommandLineCodeExecutor(work_dir="coding")
    },
)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="The system you are in is Ubuntu. Please check the current system version, CPU, and memory-related information",
)
```

The returned output is as follows:

user_proxy (to assistant):

The system you are in is Ubuntu. Please check the current system version, CPU, and memory-related information

--------------------------------------------------------------------------------
assistant (to user_proxy):

Okay, I will obtain the current system version, CPU, and memory information through a shell script. Please execute the following script:

```sh
# filename: system_info.sh
#!/bin/bash

echo "System version information:"
lsb_release -a

echo -e "\nCPU information:"
lscpu

echo -e "\nMemory information:"
free -h
```

This script will output the system version information, CPU information, and memory information. Please execute the script and let me know the result.

--------------------------------------------------------------------------------
Replying as user_proxy. Provide feedback to assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: 
NO HUMAN INPUT RECEIVED.
USING AUTO REPLY...
EXECUTING CODE BLOCK (inferred language is sh)...
user_proxy (to assistant):
exitcode: 0 (execution succeeded)
Code output: No LSB modules are available.
System version information:
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:        20.04
Codename:       focal
-e 
CPU information:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           142
Model name:                      Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
Stepping:                        12
CPU MHz:                         2304.007
BogoMIPS:                        4608.01
Virtualization:                  VT-x
Hypervisor vendor:               Microsoft
Virtualization type:             full
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        8 MiB
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds:             Mitigation; TSX disabled
Vulnerability Tsx async abort:   Mitigation; TSX disabled
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves md_clear flush_l1d arch_capabilities
-e 
Memory information:
              total        used        free      shared  buff/cache   available
Mem:          7.6Gi       2.6Gi       3.5Gi        34Mi       1.5Gi       4.7Gi
Swap:         2.0Gi          0B       2.0Gi


--------------------------------------------------------------------------------
assistant (to user_proxy):

Based on the output information you provided, I can summarize the current system version, CPU, and memory information as follows:

### System Version Information
- **Publisher**: Ubuntu
- **Version Description**: Ubuntu 20.04.4 LTS
- **Release Version**: 20.04
- **Codename**: focal

### CPU Information
- **Architecture**: x86_64
- **Operation Mode**: 32-bit, 64-bit
- **Byte Order**: Little Endian
- **Number of CPUs**: 8
- **Online CPU List**: 0-7
- **Number of Threads per Core**: 2
- **Number of Cores per Socket**: 4
- **Number of Sockets**: 1
- **Vendor ID**: GenuineIntel
- **CPU Family**: 6
- **Model**: 142
- **Model Name**: Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
- **Stepping**: 12
- **CPU Frequency**: 2304.007 MHz
- **Virtualization**: VT-x

### Memory Information
- **Total Memory**: 7.6 GiB
- **Used Memory**: 2.6 GiB
- **Free Memory**: 3.5 GiB
- **Shared Memory**: 34 MiB
- **Buffered/Cached**: 1.5 GiB
- **Available Memory**: 4.7 GiB
- **Swap Partition**: 2.0 GiB (unused)

This information can help you understand the basic situation of the current system. Please let me know if there is anything else I can help with.

TERMINATE

As you can see, AutoGen can indeed execute code. In this way, **AutoGen gains the ability to interact with the physical world to a certain extent**, and can therefore do much more.

AutoGen offers two ways to execute code. One is to execute directly in the current environment (local execution), as in the case above; the other is to execute inside a container (Docker execution), which exists mainly for security reasons, since code generated by a large language model may not be safe to run directly on the host.
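As a rough sketch of the Docker option (based on the same `autogen.coding` module used above; the image name, work directory, and prompt below are illustrative, and parameter details may vary across pyautogen versions), the user proxy can be given a `DockerCommandLineCodeExecutor` instead of the local executor:

```python
import os

from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import DockerCommandLineCodeExecutor

# Same style of Azure OpenAI configuration as in the earlier examples.
llm_config = {
    "model": "gpt-4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "base_url": os.environ.get("OPENAI_API_BASE"),
    "api_type": "azure",
    "api_version": "2023-03-15-preview",
}

# Run generated code inside a Docker container instead of on the host.
# Requires a running Docker daemon; image and work_dir are illustrative choices.
executor = DockerCommandLineCodeExecutor(
    image="python:3.11-slim",  # container image used to run the generated code
    timeout=60,                # seconds before a code block is killed
    work_dir="coding",         # local directory shared with the container
)

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    code_execution_config={"executor": executor},
)

user_proxy.initiate_chat(
    assistant,
    message="Print the Python version of the environment the code runs in.",
)
executor.stop()  # stop and clean up the container when finished
```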

## Multi Agents

Both cases above are single-agent (one assistant plus a user proxy). AutoGen also supports multiple agents. The following is a multi-agent example: I set up two agents, one playing a Python developer role and the other a front-end developer role, and let them develop a personal blog website together. The code is as follows:
```python
import os
import autogen
from autogen import AssistantAgent, UserProxyAgent
from autogen import ConversableAgent


llm_config = {
    "model": "gpt - 4o",
    "api_key": os.environ.get("OPENAI_API_KEY"),
    "base_url": os.environ.get("OPENAI_API_BASE"),
    "api_type": "azure",
    "api_version": "2023 - 03 - 15 - preview",
    "temperature": 0.9,
}

cathy = ConversableAgent(
    "cathy",
    system_message="""
    You are an expert in Python development, proficient in Python syntax, good at writing high-performance and easy-to-maintain Python code, and proficient in frameworks such as Django, Django Rest Framework, FastAPI, as well as VUE3 and ElementPlus development. Please complete the development tasks I assign to you.
    You are good at choosing and selecting the best tools and try your best to avoid unnecessary repetition and complexity.
    Please complete the tasks I assign to you. When solving problems, you will break the problem into small problems and improvement items, and suggest small tests after each step to ensure that things are on the right track.
    If there is anything unclear or ambiguous, you will always ask for clarification. You will pause the discussion to weigh and consider implementation options if there is a need to make a choice.
    It is important that you follow this method and do your best to teach your interlocutor how to make effective decisions. You avoid unnecessary apologies and do not repeat earlier mistakes when reviewing the conversation.
    """,
    llm_config=llm_config,
    human_input_mode="NEVER",  # Never ask for human input.
)

joe = ConversableAgent(
    "joe",
    system_message="""
    You are an expert in web development, including CSS, JavaScript, React, Tailwind, Node.JS, and Hugo/Markdown. You are good at choosing and selecting the best tools and try your best to avoid unnecessary repetition and complexity.
    When making suggestions, you break things down into discrete changes and suggest small tests after each stage to ensure that things are on the right track.
    Write code to illustrate examples or write code when instructed in the conversation. If you can answer without code, this is preferred, and you will be asked to elaborate when necessary.