Handmade Oasis

Agent at Home Part 1b

Alright, we covered the conceptual stuff in Part 1a. Now it’s time to actually look at some code. In this post I’m going to walk through the entire agent.py file line by line. By the end you’ll see that there really is no magic here, just a while loop, some API calls, and tool execution. Let’s dig in.

Getting Started

Before we dive into the code, let’s get the environment set up. I’m using uv for Python project management because it’s fast and just works. If you don’t have it installed, check out their docs; it’s a single curl command.

To set everything up from scratch:

# Create a new project folder
mkdir agent-at-home
cd agent-at-home

# Initialize a new uv project
uv init

Now replace the contents of the generated pyproject.toml with:

[project]
name = "agent-at-home"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "litellm>=1.72.0",
    "python-dotenv>=1.2.1",
]

Then install dependencies and set up your API key:

# Install dependencies
uv sync

# Set up your API key based on which provider you want to use
export ANTHROPIC_API_KEY="your-key-here"  # For Claude models
export OPENAI_API_KEY="your-key-here"     # For GPT models
export GEMINI_API_KEY="your-key-here"     # For Gemini models

# Run the agent - this command is for the end when agent.py is done
uv run python agent.py

You only need to export the key for the provider you’re actually using. If you set MODEL_NAME = "anthropic/claude-haiku-4-5-20251001" in the code, you need ANTHROPIC_API_KEY. If you switch to "openai/gpt-4o", you need OPENAI_API_KEY. And so on.

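That’s it. Two dependencies: litellm for the unified LLM interface across providers, and python-dotenv if you prefer keeping your API key in a .env file instead of exporting it. Note that agent.py as shown in this post never imports dotenv, so if you go the .env route you’ll need to load the file yourself. Nothing else needed.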
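The provider-to-key mapping described above can also be checked at startup, so a missing key fails fast instead of surfacing as an API error mid-conversation. This is a minimal sketch and not part of agent.py: the REQUIRED_KEYS table and the missing_api_key helper are hypothetical names, and the table only covers the three providers mentioned in this post.

```python
import os

# Hypothetical mapping from the provider prefix in MODEL_NAME to the
# environment variable named in the setup steps above.
REQUIRED_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_key(model_name, env=None):
    """Return the name of the env var that is missing, or None if all is well."""
    env = os.environ if env is None else env
    provider = model_name.split("/", 1)[0]
    key_name = REQUIRED_KEYS.get(provider)
    if key_name is None:
        return None  # Unknown provider: nothing we can check here.
    return None if env.get(key_name) else key_name


missing = missing_api_key("anthropic/claude-haiku-4-5-20251001")
if missing:
    print(f"Set {missing} before running the agent.")
```

Passing env explicitly is only there to make the helper easy to try out without touching your real environment.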

The Imports

import asyncio
import json
from pathlib import Path

from litellm import acompletion

MODEL_NAME = "anthropic/claude-haiku-4-5-20251001"
MAX_STEPS = 10

Nothing fancy here. We’re using asyncio because we want to stream responses from the LLM and print tokens as they arrive, which gives that nice “typing” effect and makes the agent feel more responsive. Streaming doesn’t strictly require async (litellm’s synchronous completion call can stream too), but the async version leaves room to do other work while waiting on the network. json is for parsing tool arguments, and pathlib is there because it’s just nicer than string manipulation for file paths.

Quick note on pathlib if you haven’t used it much: Path objects overload the / operator for path joining. So self.working_dir / "somefile.txt" isn’t division, it’s equivalent to os.path.join(self.working_dir, "somefile.txt") but reads much cleaner. You’ll see this pattern throughout the code.
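To make the / operator concrete, here is a small standalone sketch. The directory and file names are made up for illustration; nothing is read from or written to disk.

```python
import os
from pathlib import Path

working_dir = Path("project")  # hypothetical directory, for illustration only

# The / operator joins path segments, just like os.path.join would.
joined = working_dir / "src" / "somefile.txt"
same_as = Path(os.path.join("project", "src", "somefile.txt"))

print(joined == same_as)   # True
print(joined.name)         # somefile.txt
print(joined.parent.name)  # src
```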

The interesting import is litellm. This is the one dependency I’m allowing myself because it abstracts away the differences between OpenAI, Anthropic, Google, and other providers into a single unified interface. The acompletion function is the async version of their completion call. This means I can swap MODEL_NAME to "openai/gpt-4o" or "gemini/gemini-2.0-flash" and everything just works. No framework magic, just a thin wrapper over HTTP calls.

MAX_STEPS is our safety valve. We don’t want the agent running forever if it gets stuck in some weird loop. Ten iterations is plenty for most tasks.

The Agent Class

class CodingAgent:
    def __init__(self, working_dir="."):
        self.working_dir = Path(working_dir).resolve()
        self.messages = []
        self.running = True
        self.system_prompt = self._load_system_prompt()

Four pieces of state: the working directory, the message history, a running flag, and the cached system prompt. That’s it.

A note on messages: this follows the OpenAI chat format that most LLM APIs use. Each message is a dict with a role (user, assistant, system, or tool) and content. The model expects this specific structure, and litellm handles translating it for different providers.
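As a rough picture of what that list looks like in practice, here is a hypothetical short conversation in the same format. The message contents are invented; only the role/content shape comes from the description above.

```python
# A hypothetical short history in the chat format described above.
messages = [
    {"role": "user", "content": "What does main.py do?"},
    {"role": "assistant", "content": "It starts the agent and runs the input loop."},
    {"role": "user", "content": "Thanks. Can you list the other files?"},
]

# The system prompt is a message too; it goes first when the request is built.
system = {"role": "system", "content": "You are a coding assistant."}
request_messages = [system] + messages

for m in request_messages:
    print(f"{m['role']:>9}: {m['content']}")
```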

The Main Loop

async def run(self):
    print("\nCoding Agent Started")
    print("Commands: /clear /exit")
    print("Tip: Press enter twice to submit\n")

    while self.running:
        try:
            user_input = self.get_input()
            if not user_input:
                continue
            if user_input.startswith("/"):
                self.handle_command(user_input)
            else:
                await self.react_loop(user_input)
        except KeyboardInterrupt:
            print("\n\nGoodbye!")
            break
        except Exception as e:
            print(f"\nError: {e}")

This is the outer loop. Not the ReAct loop, just the user interaction loop. Get input, check if it’s a command (like /clear or /exit), otherwise hand it off to the ReAct loop. The KeyboardInterrupt catch is so you can ctrl+c out gracefully.

Getting User Input

def get_input(self):
    print("\nYou:")
    lines = []
    while True:
        line = input("> " if not lines else "  ")
        if not line and lines:
            break
        if line:
            lines.append(line)
    return "\n".join(lines)

This is a small quality-of-life thing. Instead of single-line input, you can type multiple lines and submit by pressing enter on an empty line. The prompt changes from "> " to a two-space indent after the first line so you know you’re in multi-line mode. Nothing revolutionary but makes it nicer to paste in code blocks or longer prompts.
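The same logic can be pulled apart from input() so it can be tried without a terminal. This is a sketch, not the code in agent.py: collect_lines and its read_line parameter are hypothetical names introduced here.

```python
def collect_lines(read_line):
    """Gather lines until an empty line follows at least one non-empty line.

    read_line is any callable that takes a prompt and returns a string,
    so the built-in input works, and so does a stub.
    """
    lines = []
    while True:
        line = read_line("> " if not lines else "  ")
        if not line and lines:
            break  # Blank line after some content: submit.
        if line:
            lines.append(line)
        # A blank line before any content is ignored and we keep waiting.
    return "\n".join(lines)


# Simulate a user who hits enter once by accident, types two lines, then submits.
typed = iter(["", "def add(a, b):", "    return a + b", ""])
text = collect_lines(lambda prompt: next(typed))
print(text)
```

One consequence of this design worth knowing: pasted text that contains a blank line gets submitted at that blank line, so anything after it arrives as a separate message.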

Commands

def handle_command(self, command):
    if command == "/clear":
        self.messages = []
        print("Context cleared.")
    elif command == "/exit":
        print("\n\nGoodbye!")
        self.running = False
    else:
        print(f"Unknown command: {command}")

Two commands for now. /clear wipes the conversation history, which is useful when context gets too long or you want to start fresh. /exit does what you’d expect. We’ll add more commands as the series progresses, things like /undo, /save, etc.
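Once the list of commands grows, one possible alternative to the if/elif chain is a dispatch table. The sketch below is not how agent.py does it; CommandHandler and its method names are hypothetical, and it returns strings instead of printing so the behavior is easy to inspect.

```python
class CommandHandler:
    """Hypothetical dispatch-table version of handle_command."""

    def __init__(self):
        self.messages = [{"role": "user", "content": "hello"}]
        self.running = True
        # Map command strings to bound methods; adding a command is one line.
        self.commands = {"/clear": self.clear, "/exit": self.exit}

    def clear(self):
        self.messages = []
        return "Context cleared."

    def exit(self):
        self.running = False
        return "Goodbye!"

    def handle(self, command):
        action = self.commands.get(command)
        return action() if action else f"Unknown command: {command}"


handler = CommandHandler()
print(handler.handle("/clear"))  # Context cleared.
print(handler.handle("/undo"))   # Unknown command: /undo
```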

The System Prompt

def _load_system_prompt(self):
    prompt_path = Path(__file__).parent / "system_prompt.txt"
    return prompt_path.read_text()

The system prompt lives in a separate file and is loaded once during initialization (note the _ prefix indicating it’s a private method called from __init__). This is intentional for two reasons: first, you’ll be tweaking the prompt constantly as you figure out what instructions work best, and having it in a separate file means you don’t have to touch Python code to iterate. Second, caching it avoids re-reading from disk on every LLM call. The current system prompt is minimal:

You are a coding assistant that uses tools to help with file operations.

Use the ReAct pattern:
1. Think through the problem
2. Use tools to gather information or make changes
3. Explain what you did

Available tools:
- read_tool: Read a file's contents
- list_files: List files in a directory
- write_tool: Create or update a file

Be concise and clear. Ask for clarification when needed.

We explicitly tell the model to use the ReAct pattern. This is that Chain of Thought prompting technique I mentioned in Part 1a: we’re asking it to show its reasoning steps.

Defining Tools

def get_tools(self):
    return [
        {
            "type": "function",
            "function": {
                "name": "read_tool",
                "description": "Read the contents of a file at the given path",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "write_tool",
                "description": "Write content to a file at the given path",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "path": {"type": "string"},
                        "content": {"type": "string"},
                    },
                    "required": ["path", "content"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "list_files",
                "description": "List files and directories at the given path",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": [],
                },
            },
        },
    ]

This is the OpenAI function calling format (which litellm normalizes across providers). Each tool has a name, description, and a JSON Schema for its parameters. The description matters a lot since that’s what the model uses to decide when to call each tool.

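Three tools to start:

  1. read_tool reads the contents of a file.
  2. write_tool creates or updates a file.
  3. list_files lists the files and directories at a path (the path is optional).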

These are the basics you need for a coding agent. We’ll add more sophisticated tools later (grep, glob, bash execution, etc.) but this is enough to be useful.
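Since the three definitions share the same shape, the repetition could be folded into a small builder as more tools get added. This is only a sketch: make_tool is a hypothetical helper that is not in agent.py, and it assumes every parameter is a string, which happens to hold for the three tools here.

```python
def make_tool(name, description, properties, required=()):
    """Build one tool definition in the function-calling format shown above.

    Assumes every parameter is a string, which is true for the tools in this post.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {p: {"type": "string"} for p in properties},
                "required": list(required),
            },
        },
    }


tools = [
    make_tool("read_tool", "Read the contents of a file at the given path",
              ["path"], required=["path"]),
    make_tool("write_tool", "Write content to a file at the given path",
              ["path", "content"], required=["path", "content"]),
    make_tool("list_files", "List files and directories at the given path",
              ["path"]),
]

print([t["function"]["name"] for t in tools])
```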

Executing Tools

async def execute_tool(self, name, input_data):
    try:
        if name == "read_tool":
            path = self.working_dir / input_data["path"]
            return {"content": path.read_text()}
        elif name == "write_tool":
            print(f"\nWrite to {input_data['path']}? (y/n): ", end="")
            if input().strip().lower() != "y":
                return {"error": "Cancelled"}
            path = self.working_dir / input_data["path"]
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(input_data["content"])
            return {"success": True}
        elif name == "list_files":
            dir_path = self.working_dir / input_data.get("path", ".")
            files = [
                {"name": item.name, "type": "dir" if item.is_dir() else "file"}
                for item in dir_path.iterdir()
            ]
            return {"files": files}
        return {"error": f"Unknown tool: {name}"}
    except Exception as e:
        return {"error": str(e)}

A few things to note:

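  1. All paths are joined with working_dir. This is a basic sandboxing measure, and a weak one: joining alone doesn’t stop an absolute path or a ../ sequence from reaching files outside the working directory.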
  2. write_tool asks for confirmation before writing. This is important! You don’t want an agent silently overwriting your files. We’ll explore more sophisticated approaches later (diffs, undo, etc.) but for now a simple y/n works.
  3. The line path.parent.mkdir(parents=True, exist_ok=True) creates any missing parent directories before writing. So if you ask the agent to write src/utils/helper.py and src/utils/ doesn’t exist, it creates it automatically.
  4. list_files returns structured data, not just strings. This makes it easier for the model to reason about the results.
  5. Everything is wrapped in try/except. Tool execution failing shouldn’t crash the agent, we just return an error message and let the model figure out what to do.
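Point 1 can be tightened, since a plain join still lets an absolute path or a ../ segment land outside working_dir. Below is a minimal sketch of a stricter check. It is not part of agent.py: resolve_inside is a hypothetical helper, and it relies on Path.is_relative_to, which needs Python 3.9 or newer.

```python
import tempfile
from pathlib import Path


def resolve_inside(working_dir, user_path):
    """Resolve user_path against working_dir, refusing anything outside it."""
    base = Path(working_dir).resolve()
    # Joining with an absolute path discards base, and ".." can climb out,
    # so resolve first and then check where we actually ended up.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to: Python 3.9+
        raise ValueError(f"Path escapes working directory: {user_path}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    print(resolve_inside(tmp, "src/utils/helper.py").name)  # helper.py
    for bad in ("../outside.txt", "/etc/passwd"):
        try:
            resolve_inside(tmp, bad)
        except ValueError as e:
            print(f"blocked: {e}")
```

In execute_tool, a helper like this would replace the bare self.working_dir / input_data["path"] join, and the existing try/except would turn the ValueError into an error result for the model.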

The ReAct Loop (The Heart of It All)

async def react_loop(self, user_input):
    self.messages.append({"role": "user", "content": user_input})

    for step in range(1, MAX_STEPS + 1):
        print(f"\n[Step {step}]")
        print("Assistant: ", end="", flush=True)

        messages_with_system = [
            {"role": "system", "content": self.system_prompt}
        ] + self.messages

        response = await acompletion(
            model=MODEL_NAME,
            max_tokens=4000,
            tools=self.get_tools(),
            messages=messages_with_system,
            stream=True,
        )

Here’s where the magic (that isn’t magic) happens. Remember from Part 1a: Thought → Action → Observation → repeat.

  1. Add the user message to history
  2. Loop up to MAX_STEPS times
  3. Each iteration: send the full conversation (system prompt + message history) to the model with the tool definitions. Notice we prepend the system prompt fresh each time rather than storing it in self.messages. This keeps the system instructions separate from the rolling conversation history, which makes it easier to manage context length later.

The stream=True is why we need all that chunk handling code coming next. Streaming means we get tokens as they’re generated instead of waiting for the full response.

        collected_content = ""
        tool_calls = []

        async for chunk in response:
            delta = chunk.choices[0].delta
            if delta.content:
                print(delta.content, end="", flush=True)
                collected_content += delta.content
            if delta.tool_calls:
                for tc in delta.tool_calls:
                    if tc.index >= len(tool_calls):
                        tool_calls.append({"id": tc.id or "", "name": "", "arguments": ""})
                    if tc.function and tc.function.name:
                        tool_calls[tc.index]["name"] = tc.function.name
                        print(f"\n[Tool: {tc.function.name}]")
                    if tc.function and tc.function.arguments:
                        tool_calls[tc.index]["arguments"] += tc.function.arguments
                    if tc.id:
                        tool_calls[tc.index]["id"] = tc.id
        print()

This chunk handling code is a bit gnarly, I’ll admit. Tool calls come in pieces across multiple chunks, so we need to accumulate them. The tc.index tells us which tool call this chunk belongs to (models can request multiple tools at once). We print content as it streams in for that nice typing effect.

Here’s what’s happening: when you stream a response, each chunk contains a delta with partial data. For text content, you might get a few words at a time. For tool calls, it’s trickier. The first chunk might contain the tool name and ID, then subsequent chunks stream in the JSON arguments piece by piece. That’s why we do arguments += tc.function.arguments to accumulate them. The tc.index exists because models can call multiple tools in parallel, so we need to know which tool call each chunk belongs to. We check tc.index >= len(tool_calls) to detect when a new tool call starts and initialize a fresh dict for it.
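To watch the accumulation work without calling a model, the same logic can be run over hand-built fragments. This is a sketch under assumptions: the fragments are fakes made with SimpleNamespace that only mimic the attributes the loop reads (index, id, function.name, function.arguments), and the way the arguments are split is invented. accumulate_tool_calls and frag are hypothetical names.

```python
import json
from types import SimpleNamespace as NS


def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    tool_calls = []
    for tc in deltas:
        if tc.index >= len(tool_calls):
            # First fragment for this index: start a fresh record.
            tool_calls.append({"id": tc.id or "", "name": "", "arguments": ""})
        if tc.function and tc.function.name:
            tool_calls[tc.index]["name"] = tc.function.name
        if tc.function and tc.function.arguments:
            # Arguments arrive as pieces of a JSON string; concatenate them.
            tool_calls[tc.index]["arguments"] += tc.function.arguments
        if tc.id:
            tool_calls[tc.index]["id"] = tc.id
    return tool_calls


def frag(index, id=None, name=None, arguments=None):
    """Build a fake fragment shaped like the tc objects in the loop above."""
    return NS(index=index, id=id, function=NS(name=name, arguments=arguments))


# Made-up stream: two parallel calls, with one argument string split mid-JSON.
stream = [
    frag(0, id="call_a", name="read_tool"),
    frag(0, arguments='{"path": "ag'),
    frag(0, arguments='ent.py"}'),
    frag(1, id="call_b", name="list_files"),
    frag(1, arguments='{"path": "."}'),
]

calls = accumulate_tool_calls(stream)
for call in calls:
    print(call["id"], call["name"], json.loads(call["arguments"]))
```

The arguments only become valid JSON once every piece has arrived, which is why parsing waits until the stream is finished.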

        assistant_message = {"role": "assistant", "content": collected_content}
        if tool_calls:
            assistant_message["tool_calls"] = [
                {
                    "id": tc["id"],
                    "type": "function",
                    "function": {"name": tc["name"], "arguments": tc["arguments"]},
                }
                for tc in tool_calls
            ]
        self.messages.append(assistant_message)

        if not tool_calls:
            return

After collecting the full response, we add it to message history. If there were no tool calls, we’re done, the model has finished reasoning and given us a final answer. This is the exit condition for our ReAct loop.

        for tc in tool_calls:
            arguments = json.loads(tc["arguments"]) if tc["arguments"] else {}
            result = await self.execute_tool(tc["name"], arguments)
            result_str = str(result)
            display = f"{result_str[:200]}..." if len(result_str) > 200 else result_str
            print(f"[Result: {display}]")
            self.messages.append({
                "role": "tool",
                "tool_call_id": tc["id"],
                "content": json.dumps(result),
            })

    print(f"\n[Warning: Reached max steps ({MAX_STEPS}) without completion]")

If there were tool calls, execute them and add the results to message history with the special "role": "tool" format. The tool_call_id links the result back to the specific tool call. Then we loop back to the top and let the model reason about these new observations.
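To see how tool_call_id ties things together, here is a hypothetical history after one round trip, plus a small checker that reports tool calls still waiting for a result. The file name, ids, and contents are invented, and unanswered_tool_calls is a helper made up for this illustration.

```python
import json

# Hypothetical history after one tool round trip.
messages = [
    {"role": "user", "content": "What is in notes.txt?"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "read_tool", "arguments": '{"path": "notes.txt"}'},
        }],
    },
    {"role": "tool", "tool_call_id": "call_1",
     "content": json.dumps({"content": "buy milk"})},
]


def unanswered_tool_calls(history):
    """Return the ids of tool calls that have no matching tool result yet."""
    requested = {
        tc["id"]
        for m in history if m["role"] == "assistant"
        for tc in m.get("tool_calls", [])
    }
    answered = {m["tool_call_id"] for m in history if m["role"] == "tool"}
    return requested - answered


print(unanswered_tool_calls(messages))      # set()
print(unanswered_tool_calls(messages[:2]))  # {'call_1'}
```

A check like this can be handy while debugging, because a history where a tool call never got its result is a likely source of rejected requests on the next turn.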

Note the result display logic: we only show ... when the result actually exceeds 200 characters. And if the loop exhausts MAX_STEPS without the model finishing naturally, we print a warning so you know something might be off.
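The display rule is easy to check at the boundary. The sketch below lifts the same expression into a hypothetical preview helper (the name and the limit parameter are introduced here, not taken from agent.py).

```python
def preview(text, limit=200):
    """Shorten text for display, adding an ellipsis only when something was cut."""
    return f"{text[:limit]}..." if len(text) > limit else text


print(preview("short result"))   # short result
print(len(preview("x" * 200)))   # 200
print(len(preview("x" * 201)))   # 203
```

Only the printed preview is shortened; the model still receives the full result through the tool message.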

That’s the ReAct loop. Thought (model response) → Action (tool execution) → Observation (tool result added to context) → repeat until no more actions needed.
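The whole cycle can be watched end to end with the network taken out. This is a simplified sketch, not the real react_loop: fake_model and fake_execute_tool are scripted stand-ins invented here, there is no streaming, and the assistant message leaves out the tool_calls field that the real code stores.

```python
import json

MAX_STEPS = 10


def fake_model(messages):
    """Scripted stand-in for the LLM: ask for one tool call, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"content": "I should look at the directory first.",
                "tool_calls": [{"id": "call_1", "name": "list_files",
                                "arguments": "{}"}]}
    return {"content": "There are two files: a.py and b.py.", "tool_calls": []}


def fake_execute_tool(name, arguments):
    """Stand-in for execute_tool that returns a canned observation."""
    if name == "list_files":
        return {"files": ["a.py", "b.py"]}
    return {"error": f"Unknown tool: {name}"}


def react_loop(user_input):
    messages = [{"role": "user", "content": user_input}]
    for step in range(1, MAX_STEPS + 1):
        reply = fake_model(messages)  # Thought
        messages.append({"role": "assistant", "content": reply["content"]})
        if not reply["tool_calls"]:
            return reply["content"], step  # No action requested: done.
        for tc in reply["tool_calls"]:
            result = fake_execute_tool(tc["name"], json.loads(tc["arguments"]))  # Action
            messages.append({"role": "tool", "tool_call_id": tc["id"],
                             "content": json.dumps(result)})  # Observation
    return None, MAX_STEPS  # Safety valve tripped.


answer, steps = react_loop("What files are here?")
print(f"{answer} (finished in {steps} steps)")
```

Step 1 ends with a tool call and step 2 ends with plain text, so the loop exits after two iterations, well under MAX_STEPS.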

Entry Point

async def main():
    print("=" * 50)
    print(f"Coding Agent ({MODEL_NAME})")
    print("=" * 50)
    agent = CodingAgent()
    await agent.run()

if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        pass

Standard async entry point. Create agent, run it. The outer KeyboardInterrupt catch is a fallback in case ctrl+c happens before the agent’s main loop starts.

Wrapping Up

And that’s it. Around 250 lines of Python, much of it boilerplate, and you have a working coding agent. No frameworks, no abstractions you don’t understand. Every line is here, every decision is visible.

Is this production ready? No. Is it competitive with Claude Code or Cursor? Not even close. But that was never the point. The point is that you now understand exactly what’s happening when an agent “thinks” and “acts”. There’s no magic. It’s a while loop, some API calls, and executing tools based on the model’s decisions.

Here’s the fun part: you can now use the agent to improve itself. Point it at its own codebase and ask it to add a new tool, refactor the streaming logic, or improve error handling. It can read agent.py, understand what’s there, and write changes. There’s something satisfying about that recursive loop, using the thing you built to make the thing you built better.

In the next part we’ll start adding more sophisticated features. Better tools, smarter context management, maybe a nicer TUI. But the core loop you’ve seen here? That stays the same.

The full code is available on GitHub: github.com/ramtinJ95/agent-at-home (branch: Version-1-base). Clone it, run it, break it, improve it.

What I cannot create, I do not understand. Now you understand.
