PROGRAMMING LANGUAGES Dec. 21, 2025, 5:30 a.m.

Building Multi-Agent Systems with CrewAI and AutoGen

Imagine a system where dozens of specialized AI agents collaborate seamlessly to solve complex tasks—booking flights, answering support tickets, or even orchestrating supply‑chain logistics. With the rise of open‑source frameworks like CrewAI and AutoGen, building such multi‑agent ecosystems has become far more approachable. In this guide we’ll walk through the core concepts, set up a development environment, and stitch together two real‑world agents to demonstrate a powerful, extensible workflow.

Why Multi‑Agent Architectures Matter

Single‑purpose language models excel at answering isolated queries, but they often stumble when a problem spans multiple domains. A multi‑agent architecture distributes responsibilities, allowing each agent to specialize—one might handle data extraction, another could manage scheduling, while a third focuses on user interaction.

Beyond specialization, independent agents can run concurrently, which shortens long-running pipelines whenever steps do not depend on one another. Dedicated agents also give you a natural place to enforce privacy or compliance rules, because sensitive operations can be sandboxed inside them.

Both CrewAI and AutoGen embrace these principles, offering abstractions that let you define roles, communication patterns, and execution policies with minimal boilerplate.

Getting Started: Install the Toolkits

First, make sure Python 3.10 or newer is installed. The following commands create and activate a virtual environment, then install the latest releases of CrewAI and AutoGen along with AutoGen’s optional OpenAI dependencies.

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install crewai "autogen[openai]"  # quotes keep shells like zsh from expanding the brackets; add the anthropic extra if needed

After installation, verify the packages load without errors.

python -c "import crewai, autogen; print('All good!')"

If you plan to use OpenAI models such as gpt-4o-mini, which the examples below rely on, set the OPENAI_API_KEY environment variable now.
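A small fail-fast check at startup can save a confusing stack trace later. The sketch below uses only the standard library; require_env is a hypothetical helper name, not part of either framework.

```python
import os


def require_env(name: str, env=None) -> str:
    """Return the value of an environment variable or raise a clear error."""
    # Allow a plain dict to be passed in so the check is easy to test
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the agents")
    return value
```

Calling require_env("OPENAI_API_KEY") once at the top of your entry point surfaces a missing key before any agent is constructed.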

Core Concepts in CrewAI

CrewAI models a team as a crew: a collection of agents that share a common goal. Each agent has a role, a goal, and a backstory that together shape its prompt, plus optional tools it can invoke.

By default, a crew runs its tasks sequentially and hands each task’s output to the next task as context. Because agents are coupled only through task inputs and outputs, you can plug in new agents without rewriting existing logic.

Below is a minimal crew definition that extracts flight details from a user query.

from crewai import Agent, Crew, Task

# Define the extraction agent
extractor = Agent(
    role="FlightInfoExtractor",
    goal="Pull departure, destination, and dates from free‑text input",
    backstory="You are a meticulous data analyst with a knack for spotting dates and locations.",
    tools=[],
)

# Define the task that uses the agent
extract_task = Task(
    # CrewAI fills {user_input} from the inputs dict passed to kickoff()
    description="Extract flight details from: {user_input}",
    agent=extractor,
    expected_output="JSON with keys: departure, destination, start_date, end_date"
)

# Assemble the crew
flight_crew = Crew(
    agents=[extractor],
    tasks=[extract_task],
    verbose=True
)

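Running flight_crew.kickoff(inputs={"user_input": "I need a round-trip from NYC to Tokyo next month"}) should return a result whose raw attribute holds the extracted details as a JSON string, ready for downstream processing. Because that string comes from a language model, validate it before relying on it.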
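Model output is not guaranteed to be bare JSON; it is often wrapped in extra prose or a code fence, or it is missing a key. Here is a minimal sketch of a defensive parser, assuming the four keys named in expected_output above. The names parse_flight_json and REQUIRED_KEYS are hypothetical, not CrewAI APIs.

```python
import json
import re

REQUIRED_KEYS = {"departure", "destination", "start_date", "end_date"}


def parse_flight_json(raw: str) -> dict:
    """Extract the first JSON object from model output and check its keys."""
    # Take everything from the first '{' to the last '}' so that surrounding
    # prose or code-fence markers are ignored.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))  # raises ValueError on malformed JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Raising ValueError in every failure case gives the caller a single exception type to catch before retrying the extraction or asking the user to rephrase.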

Core Concepts in AutoGen

AutoGen focuses on conversation‑driven agent interaction. It treats each agent as a chat participant that can call functions, spawn sub‑agents, or even modify its own prompt at runtime.

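The central class, AssistantAgent, wraps a language model that is configured through an llm_config dictionary. In the classic autogen API assumed here, a second agent (typically a UserProxyAgent) stands in for the user and executes tool calls: register_function attaches a plain Python function as a tool that the assistant may propose and the proxy runs, and initiate_chat starts the conversation.

Here’s a concise AutoGen snippet that creates a customer-support agent capable of looking up order status. Class and function names differ between AutoGen releases, so check them against your installed version.

import os
from autogen import AssistantAgent, UserProxyAgent, register_function

# Mock order lookup function (type hints are required for tool registration)
def get_order_status(order_id: str) -> str:
    # In real life, query your DB or ERP system
    return f"Order {order_id} has shipped and will arrive in 2 days."

llm_config = {"config_list": [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}]}

support_agent = AssistantAgent(
    name="SupportBot",
    system_message=(
        "You are a helpful support assistant. Use the provided tools when needed. "
        "Reply TERMINATE when the question is answered."
    ),
    llm_config=llm_config,
)

# Stands in for the user and executes the tool calls the assistant proposes
user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config=False,
    max_consecutive_auto_reply=3,
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
)

# Register the function as a tool: the assistant proposes calls, the proxy runs them
register_function(
    get_order_status,
    caller=support_agent,
    executor=user_proxy,
    name="get_order_status",
    description="Retrieve the shipping status of an order given its ID.",
)

# Example interaction
chat_result = user_proxy.initiate_chat(
    support_agent,
    message="Hey, can you tell me where my order 12345 is?",
)
print(chat_result.summary)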

The model decides for itself whether to call get_order_status; when it does, the function’s result is fed back into the conversation and folded into the reply.
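Under the hood, tool calling follows a simple loop: the model emits the name of a tool plus JSON-encoded arguments, and the host program looks the function up, runs it, and sends the result back. Here is a framework-free sketch of that dispatch step. The request dictionary is a simplified, hypothetical stand-in for what a model would return, and TOOLS and dispatch_tool_call are illustrative names, not AutoGen APIs.

```python
import json


def get_order_status(order_id: str) -> str:
    # Mock lookup standing in for a real database query
    return f"Order {order_id} has shipped and will arrive in 2 days."


# Registry of the functions the model is allowed to call
TOOLS = {"get_order_status": get_order_status}


def dispatch_tool_call(tool_call: dict) -> str:
    """Run the requested tool and return its result as text for the model."""
    name = tool_call["name"]
    if name not in TOOLS:
        return f"Error: unknown tool {name!r}"
    try:
        kwargs = json.loads(tool_call["arguments"])
    except json.JSONDecodeError:
        return "Error: arguments were not valid JSON"
    if not isinstance(kwargs, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(TOOLS[name](**kwargs))
    except TypeError as exc:
        # Wrong or missing parameters end up here
        return f"Error: bad arguments ({exc})"


# Simplified stand-in for a tool-call request produced by the model
request = {"name": "get_order_status", "arguments": '{"order_id": "12345"}'}
print(dispatch_tool_call(request))
```

Returning errors as text, instead of raising, lets the model see what went wrong and try again with corrected arguments.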

Bridging CrewAI and AutoGen

While CrewAI shines at orchestrating linear pipelines, AutoGen excels in dynamic, conversational loops. Combining them lets you build a system where a crew extracts structured data, and an AutoGen agent engages the user to refine or act on that data.

Consider a travel‑booking assistant: CrewAI parses the user’s itinerary request, then AutoGen negotiates flight options, handles payment, and sends confirmations.

The integration point is a simple Python function that feeds CrewAI’s output into AutoGen’s message stream.

def crew_to_autogen(crew_output: dict, user_id: str) -> str:
    # Convert CrewAI JSON into a natural language prompt for AutoGen
    prompt = (
        f"User {user_id} wants to book a flight from {crew_output['departure']} "
        f"to {crew_output['destination']} departing on {crew_output['start_date']} "
        f"and returning on {crew_output['end_date']}. Propose the best options."
    )
    return prompt

Now we can wire the two systems together in a single workflow.

Full End‑to‑End Example: Travel Planner Bot

Below is a script that combines the earlier extraction crew with a conversational AutoGen agent; it reuses the crew_to_autogen helper defined in the previous section. It demonstrates how to handle user input, extract flight details, and then let the AutoGen agent propose flight options.

import json
import os
from crewai import Agent, Crew, Task
from autogen import AssistantAgent, UserProxyAgent, register_function

# ---------- CrewAI: Flight Info Extraction ----------
extractor = Agent(
    role="FlightInfoExtractor",
    goal="Parse free-text travel requests into structured JSON",
    backstory="You are a seasoned travel analyst with an eye for dates and locations.",
)

extract_task = Task(
    # CrewAI fills {user_input} from the inputs dict passed to kickoff()
    description="Extract flight details from: {user_input}",
    agent=extractor,
    expected_output="JSON with keys: departure, destination, start_date, end_date",
)

flight_crew = Crew(agents=[extractor], tasks=[extract_task], verbose=False)

# ---------- AutoGen: Booking Conversation ----------
def mock_flight_search(departure: str, destination: str, start_date: str, end_date: str) -> str:
    # Placeholder for an actual flight-search API; returns JSON text for the model
    return json.dumps([
        {"airline": "AirX", "price": "$450", "duration": "12h"},
        {"airline": "SkyFly", "price": "$480", "duration": "11h 45m"},
    ])

llm_config = {"config_list": [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}]}

booking_agent = AssistantAgent(
    name="TravelBot",
    system_message=(
        "You are a friendly travel assistant. Use the search_flights tool to "
        "provide at most three options, then ask the user which they prefer. "
        "End your final message with TERMINATE."
    ),
    llm_config=llm_config,
)

# Stands in for the user and executes the tool calls TravelBot proposes
user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config=False,
    max_consecutive_auto_reply=3,
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
)

register_function(
    mock_flight_search,
    caller=booking_agent,
    executor=user_proxy,
    name="search_flights",
    description="Search for flights given origin, destination, and dates.",
)

def run_travel_flow(user_input: str, user_id: str = "U123") -> str:
    # Step 1: CrewAI extracts structured data
    crew_result = flight_crew.kickoff(inputs={"user_input": user_input})
    flight_data = json.loads(crew_result.raw)  # assumes the model returned bare JSON

    # Step 2: Convert to AutoGen prompt (crew_to_autogen is defined in the previous section)
    autogen_prompt = crew_to_autogen(flight_data, user_id)

    # Step 3: Start the conversation with AutoGen and return its final message
    chat_result = user_proxy.initiate_chat(booking_agent, message=autogen_prompt)
    return chat_result.summary

# Demo
if __name__ == "__main__":
    user_msg = "I want to fly from San Francisco to Berlin next Friday and return the following Monday."
    print(run_travel_flow(user_msg))

Running the script prints a friendly list of flight options, letting the user pick their favorite. In a production setting you would replace mock_flight_search with a real API call and persist the conversation state in a database.
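Persisting conversation state can start very small. Here is a minimal sketch using the standard-library sqlite3 module. The table layout and the ConversationStore class are assumptions for illustration, not something either framework prescribes.

```python
import sqlite3


class ConversationStore:
    """Append-only store of chat messages, keyed by user ID."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "user_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def append(self, user_id: str, role: str, content: str) -> None:
        # Parameterized query: never interpolate user text into SQL
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (user_id, role, content) VALUES (?, ?, ?)",
                (user_id, role, content),
            )

    def history(self, user_id: str) -> list[dict]:
        """Return the user's messages in insertion order."""
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE user_id = ? ORDER BY id",
            (user_id,),
        ).fetchall()
        return [{"role": role, "content": content} for role, content in rows]


store = ConversationStore()
store.append("U123", "user", "Fly me from San Francisco to Berlin.")
store.append("U123", "assistant", "Here are two options: AirX or SkyFly.")
print(store.history("U123"))
```

The list returned by history has the same role/content shape as chat messages, so it can be replayed into a new conversation when the user comes back.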

Real‑World Use Cases

Enterprise Helpdesks: CrewAI extracts ticket metadata while AutoGen handles back‑and‑forth clarification, reducing average resolution time.
Supply‑Chain Coordination: One crew monitors inventory levels, another forecasts demand; an AutoGen agent negotiates with vendors in real time.
Personal Finance Advisors: A crew aggregates spending data, then an AutoGen chat agent suggests budgeting actions and answers follow‑up questions.

These scenarios share a common pattern: data collection → structured representation → interactive decision making.

Pro Tips for Scaling Multi‑Agent Systems

1. Keep prompts DRY. Store reusable role descriptions and system messages in external JSON/YAML files. This prevents drift when you update a role across dozens of agents.

2. Leverage async execution. Both frameworks expose asynchronous entry points (for example, kickoff_async on a CrewAI crew and a_initiate_chat on an AutoGen agent); pass the resulting coroutines to asyncio.gather to run independent agents in parallel.

3. Monitor token usage. Token counts in multi-agent chats grow quickly because each turn resends the conversation history. Log the usage field of each OpenAI response per agent and enforce a hard budget.

4. Version your agents. Treat each agent’s prompt as code—store it in Git, tag releases, and use CI pipelines to validate output against regression tests.

5. Secure tool execution. When exposing functions as AutoGen tools, validate inputs rigorously to avoid injection attacks or accidental data leaks.
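Tips 2 and 3 can be sketched together. The example below uses only asyncio and a stand-in coroutine (fake_agent_call, hypothetical) in place of real model calls, so the token numbers are made up. In practice you would await your framework's async entry point and read the token counts from the provider's response.

```python
import asyncio


class TokenBudget:
    """Tracks tokens per agent and refuses work past a hard limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used: dict[str, int] = {}

    def charge(self, agent: str, tokens: int) -> None:
        total = sum(self.used.values()) + tokens
        if total > self.limit:
            raise RuntimeError(f"token budget exceeded: {total} > {self.limit}")
        self.used[agent] = self.used.get(agent, 0) + tokens


async def fake_agent_call(name: str, delay: float, tokens: int, budget: TokenBudget) -> str:
    # Stand-in for a real async model call; sleeps instead of hitting an API
    await asyncio.sleep(delay)
    budget.charge(name, tokens)
    return f"{name} done"


async def run_parallel(budget: TokenBudget) -> list[str]:
    # Independent agents run concurrently; results keep the argument order
    return await asyncio.gather(
        fake_agent_call("inventory", 0.02, 120, budget),
        fake_agent_call("forecast", 0.01, 200, budget),
    )


budget = TokenBudget(limit=1000)
print(asyncio.run(run_parallel(budget)))
print(budget.used)
```

Because both coroutines sleep at the same time, the total wait is close to the longer delay and not the sum of the two, which is the benefit gather provides for independent agents.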

Testing and Debugging Strategies

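Unit-testing the code around your agents is easier than you might think. Rather than calling a live model, stub the crew’s output with unittest.mock and assert that the rest of the pipeline sees the expected JSON schema. The test below patches Crew.kickoff itself, so it exercises your glue code and never contacts the model.

import json
from types import SimpleNamespace
from unittest.mock import patch
from crewai import Crew

FAKE_RAW = '{"departure":"NYC","destination":"Paris","start_date":"2025-05-01","end_date":"2025-05-07"}'

# Patch the class rather than the instance, since Crew is a pydantic model
@patch.object(Crew, "kickoff", return_value=SimpleNamespace(raw=FAKE_RAW))
def test_extractor(mock_kickoff):
    result = flight_crew.kickoff(inputs={"user_input": "Fly me to Paris next May"})
    data = json.loads(result.raw)
    assert set(data) == {"departure", "destination", "start_date", "end_date"}
    assert data["destination"] == "Paris"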

For AutoGen, watch the conversation transcript that the framework prints to the console by default; it shows each message and tool call step by step. This transparency is invaluable when agents misinterpret user intent.
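Whatever the framework prints, it also helps to log tool calls yourself so they end up in your own log pipeline. Here is a minimal sketch using the standard logging module. The decorator log_tool_call is a hypothetical name; it wraps a plain Python function before you register it as a tool.

```python
import functools
import logging

logger = logging.getLogger("agent.tools")


def log_tool_call(func):
    """Log arguments, result, and failures of a tool function."""

    @functools.wraps(func)  # keeps the original name and docstring
    def wrapper(*args, **kwargs):
        logger.info("tool=%s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback; the error is re-raised unchanged
            logger.exception("tool=%s failed", func.__name__)
            raise
        logger.info("tool=%s result=%r", func.__name__, result)
        return result

    return wrapper


@log_tool_call
def get_order_status(order_id: str) -> str:
    return f"Order {order_id} has shipped."


logging.basicConfig(level=logging.INFO)
get_order_status("12345")
```

Each call then leaves two log lines, one with the arguments the model chose and one with the result, which is usually enough to spot a misread order ID or date.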

Deploying to Production

Containerize your application with Docker, exposing a single HTTP endpoint that receives user messages, runs the CrewAI‑AutoGen pipeline, and returns the agent’s reply. Below is a minimal Flask wrapper.

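from flask import Flask, request, jsonify

# run_travel_flow comes from the travel planner script above
app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    # Tolerate a missing or malformed JSON body instead of raising
    data = request.get_json(silent=True) or {}
    user_msg = data.get("message")
    if not user_msg:
        return jsonify({"error": "Missing 'message' field"}), 400
    reply = run_travel_flow(user_msg, user_id=data.get("user_id", "anon"))
    return jsonify({"reply": reply})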

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

Scale horizontally with a Kubernetes deployment, and use a sidecar container to cache LLM responses for repeated queries; the savings grow with how often identical prompts recur.
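The caching idea can be prototyped in-process before moving it to a sidecar. Here is a minimal sketch, assuming exact-match caching keyed on a hash of the model name and prompt. ResponseCache and the call_model callback are hypothetical names, and a shared store would have to replace the dictionary once several replicas are running.

```python
import hashlib
import time
from typing import Callable


class ResponseCache:
    """Exact-match cache for LLM replies with a time-to-live."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models never share entries
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, call_model: Callable[[str], str]) -> str:
        key = self._key(model, prompt)
        hit = self._entries.get(key)
        now = time.monotonic()
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh cache hit: no model call
        reply = call_model(prompt)
        self._entries[key] = (now, reply)
        return reply
```

Exact-match caching only pays off for prompts that repeat verbatim, such as frequently asked support questions; prompts that embed user-specific details will mostly miss.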

Future Directions

Both CrewAI and AutoGen are evolving rapidly. Areas to watch include richer agent memory (persistent state across sessions) and dynamic role swapping (agents that can change their purpose mid-conversation).

Integrating retrieval‑augmented generation (RAG) will let agents pull in up‑to‑date knowledge from internal document stores, making them even more useful for compliance‑heavy industries.

Keeping an eye on the community repositories and contributing bug fixes or new tool wrappers is a great way to stay ahead of the curve.

Conclusion

Building multi‑agent systems no longer requires reinventing the wheel. By leveraging CrewAI’s pipeline orchestration and AutoGen’s conversational tooling, you can prototype sophisticated assistants in a matter of hours.

The key is to think of each agent as a microservice with a clear contract: a well‑defined role, a concise prompt, and a set of safe tools. When these contracts are respected, the agents cooperate naturally, delivering robust, scalable solutions for real‑world problems.

Start experimenting today—extract data with CrewAI, engage users with AutoGen, and watch your applications become smarter, faster, and more collaborative.
