Building AI Agents with Claude 3.5 and Python
RELEASES Dec. 1, 2025, 5:31 p.m.

Building AI Agents with Claude 3.5 and Python

Ever dreamed of building your own AI sidekick that can tackle real tasks autonomously? With Anthropic's Claude 3.5 Sonnet and Python, you can create powerful AI agents that reason, use tools, and remember conversations—all without needing a PhD in machine learning.

These agents go beyond simple chatbots. They plan steps, call external functions, and adapt to new info, making them perfect for automation in apps, research, or daily workflows. Let's dive in and build some together.

What Are AI Agents?

AI agents are like digital butlers: they observe their environment, reason about goals, and take actions to achieve them. Claude 3.5 shines here thanks to its top-tier reasoning and tool-using abilities.

The secret sauce? Patterns like ReAct (Reason + Act), where the agent thinks aloud, decides on tools, observes results, and repeats until done. No more rigid scripts—pure flexibility.

Pro Tip: Start simple. Claude 3.5 handles complex chains of thought natively, so focus on clear goals in your prompts.

Setting Up Claude 3.5 with Python

First, grab your Anthropic API key from console.anthropic.com. It's free to start, with generous limits.

Install the SDK:

pip install anthropic python-dotenv

Create a .env file:

ANTHROPIC_API_KEY=your_key_here

Load it in Python:

import os
from dotenv import load_dotenv
load_dotenv()

Now you're ready. Import the client: import anthropic and client = anthropic.Anthropic(). Boom—Claude awaits.

Building Your First Agent: Task Solver

Let's create a basic agent that solves everyday tasks by breaking them down. It uses Claude's reasoning to plan and execute mentally first—no tools yet.

Here's a working example: an agent that plans a trip itinerary.

import anthropic
import os
from dotenv import load_dotenv

load_dotenv()
client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def simple_task_agent(task: str, model: str = "claude-3-5-sonnet-20240620"):
    system_prompt = """
    You are a helpful AI agent. For any task, think step-by-step:
    1. Understand the goal.
    2. List subtasks.
    3. Execute or simulate each.
    4. Summarize the solution.
    Respond concisely but completely.
    """
    
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": task}]
    )
    return message.content[0].text

# Example usage
task = "Plan a 3-day trip to Paris for a family of 4, budget $2000, including flights from NYC."
result = simple_task_agent(task)
print(result)

This spits out a detailed plan: flights, hotels, sights. Run it—Claude 3.5 nails the reasoning chain every time.

Why it works: The system prompt enforces structure. Claude's context window (200K tokens) handles long plans effortlessly.

Pro Tip: Always specify output format in prompts, like JSON for parsing: "Output as JSON with keys: plan, budget_breakdown, risks."

Level Up: Agents with Tools

Tools make agents unstoppable. Claude 3.5 supports function calling—define JSON schemas, let it decide when/how to call.

Real power: Integrate calculators, web search, or APIs. Agent observes tool outputs and iterates.

Defining Tools

Tools are dicts with name, description, input_schema. Claude picks them via XML tags in responses.

Example 2: Math & Weather Agent

Build an agent that calculates budgets and checks weather via a mock API (replace with real like OpenWeather).

import json
import requests  # For real APIs later

def calculator_tool(input_str: str) -> str:
    """Simple eval for math (use safely!)."""
    try:
        return str(eval(input_str))
    except:
        return "Error in calculation."

def weather_tool(city: str) -> str:
    """Mock weather API."""
    # Real: requests.get(f"https://api.openweathermap.org/data/2.5/weather?q={city}")
    return f"Weather in {city}: Sunny, 22°C."  # Mock response

TOOLS = [
    {
        "name": "calculator",
        "description": "Calculate math expressions.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
        }
    },
    {
        "name": "get_weather",
        "get_weather": "Get current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        }
    }
]

def tool_agent(task: str, max_steps: int = 5):
    messages = [{"role": "user", "content": task}]
    system = "You are an agent with tools. Use <tool> tags to call, observe results."
    
    for step in range(max_steps):
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            system=system,
            messages=messages,
            tools=TOOLS
        )
        
        content = response.content[0]
        messages.append({"role": "assistant", "content": content})
        
        if content.type == "text":
            return content.text
        
        # Handle tool calls
        for tool in content.tool_calls or []:
            func = {"calculator": calculator_tool, "get_weather": weather_tool}[tool.name]
            args = json.loads(tool.input)
            result = func(**args)
            
            messages.append({
                "role": "user",
                "content": [{"type": "tool_result", "tool_use_id": tool.id, "content": result}]
            })
    
    return "Max steps reached."

# Usage
result = tool_agent("What's the weather in Paris? Budget for 4 people: hotel $800 + food $400 * 1.1 tax.")
print(result)

This agent calls weather, then calculator for tax—outputs: "Weather: Sunny 22C. Total budget: 880 + 440 = $1320."

Loop handles multi-step: Claude reasons after each tool result. Scale by adding real APIs like SerpAPI for search.

Pro Tip: Parse tool calls robustly. Claude's XML is precise, but validate args with Pydantic for safety.

Adding Memory for Stateful Agents

Agents forget without memory. Use a list of messages to persist context.

Enhance our task agent: Track user prefs across turns.

class MemoryAgent:
    def __init__(self):
        self.memory = [{"role": "system", "content": """
        You are a personal assistant with memory. Recall past convos.
        Track user prefs like budget, allergies, etc.
        """}]
    
    def chat(self, user_input: str):
        self.memory.append({"role": "user", "content": user_input})
        
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            messages=self.memory,
            tools=TOOLS  # Optional
        )
        
        ai_msg = response.content[0]
        self.memory.append({"role": "assistant", "content": ai_msg})
        return ai_msg.text

# Usage
agent = MemoryAgent()
print(agent.chat("Plan Paris trip, family of 4, $2000 budget."))
print(agent.chat("Adjust for vegetarian food only."))

Second response remembers budget/family—adapts meals. Persist memory to file/DB for production.

Word count so far? Building up. This class is reusable for bots.

Pro Tip: Trim old messages if context nears 200K tokens: keep last N exchanges + summaries.

Real-World Use Cases

Customer Support Bot: Route queries, check DB (tool), escalate. Saves hours for teams.

  • Integrate with Slack/Telegram via webhooks.
  • Claude classifies intent, pulls CRM data.

Code Assistant: Generate/debug code. Tool: exec() sandbox for testing.

Feeds: "Write a Flask app for user auth"—reviews, iterates.

Research Agent: Summarizes papers, searches web. Tools: Tavily/SerpAPI + PDF parsers.

  1. Query: "Latest on quantum computing."
  2. Calls search, reads top results, synthesizes report.

Data Analyst: Upload CSV (tool parses), asks viz/query via Pandas/SQL tools.

These scale to enterprise: Deploy on AWS Lambda for cheap, fast inference.

Pro Tip: Monitor costs—Claude 3.5 is $3/million input tokens. Cache common queries.

Handling Errors and Best Practices

Agents fail gracefully? Add retries, fallback prompts.

Validate tool inputs: Use schemas strictly.

Security: Sanitize evals/execs. Never expose raw API keys.

Scaling to Production

Wrap in FastAPI: POST /agent endpoint.

Async: asyncio for parallel tools.

Observability: Log traces with LangSmith or custom.

Pro Tip: Test with adversarial inputs. Claude resists jailbreaks, but prompt defensively: "Stay on task, no hypotheticals."

Conclusion

From simple tasks to tool-wielding pros, Claude 3.5 + Python unlocks agent magic. You've got three battle-tested examples—tweak, deploy, iterate.

Real-world wins await: Automate your workflow today. What's your first agent? Share on Codeyaan forums!

Happy coding—agents are the future.

Share this article