Engineering Agents

Deconstructing Potpie’s Debugging agent

Software development is fundamentally an iterative process, and debugging is no exception. What makes our debugging agent unique is its ability to mirror how developers actually debug code. The traditional debugging process can be broken down into clear steps:

Understanding the stacktrace
Understanding the code around the stacktrace
Coming up with a hypothesis
Testing the hypothesis
Repeating until the bug is fixed

To automate this process effectively, Potpie builds a comprehensive knowledge graph of your codebase that tracks relationships between functions, files, classes, etc.

This graph serves as the agent's "mental model" of your codebase. For each node in the graph, we generate and embed inferences for each node that can be retrieved through similarity search based on the user's query. This allows the agent to:

Understand code flow and project structure
Curate relevant context for any debugging scenario
Reason about code relationships and dependencies
Make informed debugging decisions

Here is a breakdown of how the Debugging agent works:

Classification

The Debugging Agent flow starts with a classification system that decides whether the query can be answered by the history and base LLM training data or whether it needs additional context. The system classifies every query + history combination into:

LLM_SUFFICIENT Handles general debugging concepts using base knowledge and history
AGENT_REQUIRED Requires diving into project-specific code context

The classification prompt embodies different personas when making this decision:

Error Analyst Deciphers stack traces
Code Detective Investigates specific code implementations
Context Evaluator Understands when broader project knowledge is needed or if the chat history is enough

Knowledge graph tooling

The agent possesses a well-equipped toolbox. It’s core capabilities include:

Knowledge Graph Queries Perform similarity search over embeddings of docstrings in the knowledge graph. This is immensely useful to find exact code context from natural language. E.g “Which function handles authentication” is an amazing query for the docstring “The purpose of this function is to handle authentication using Firebase”
Code Retrieval Fetching code implementations of relevant nodes in the knowledge graph. Most of the other tools return the node id and docstring so that the agent is able to determine which ones are the most relevant. This tool is able to fetch exact code for each node so that the agent has precise code context of relevant files and functions.
Node Neighbour Analysis Understanding end to end code context by fetching every node’s neighbours. This helps it look at broader edge cases and not be myopic to only the function mentioned in the stack trace. E.g The input to the function in the trace could be null, fetching the neighbours context helps the agent see why that occurs.
Tag-based Retrieval Finding relevant code sections from the knowledge graph that were tagged with a specific label like “API” or “Database”. This helps get precise context needed to solve a bug. For e.g. the query might be something like ‘Fix authentication in POST /document API’ , this tag can help the agent figure out where that API is defined in your code.

Agent configuration

In practice, the agent utilizes these tools in order to act like an experienced pair programmer with you. The prompts for the agent include directives for:

Query Processing
- Classifies the debugging issue based on the query - Extract relevant filenames from the stack trace, understand the problem from the error message.
- Determines needed tools - Based on the above analysis, determines the best tools to gather the necessary context to debug.
- Reviews chat history for any hints or previous work - For example, it should be able to make sense of an input with just logs, if the previous message from the agent was adding a bunch of print statements to pivotal steps in the code.
- Transforms the query and history into a format that is closer to the format of the knowledge graph embeddings for similarity search.
Analysis & Response Steps
- Forms hypotheses about root causes - Uses input context and retrieved code context to arrive at a hypothesis.
- Validates assumptions using tools - Retrieves precise context that helps in validating whether the hypothesis holds any ground.
- Provides step-by-step solutions - Generates a debugging plan to validate and fix the root cause.
- Streams responses in real-time.

Your Pair Programmer

The agent's utility isn't just limited to its technical capabilities - it's also about how it communicates. Having a good user experience is important for the user to want to use the agent. Through carefully crafted prompts for the chat chain, it maintains a conversation with you about your problem statement:

System Prompt
- Establishes persona as a debugging assistant
- Sets guidelines for accuracy and transparency i.e. reviews answers for hallucinations and tries not to make assumptions without data.
- Asks for additional information when it is not able to gather enough data.
Human Interaction Prompt
- Guides conversation flow - It should feel like a pair programming session with a peer
- Adapts to developer expertise - Based on your response, its answers should adapt to your level of expertise.
- Maintains context throughout debugging - Message history is extremely important context when it comes to debugging using techniques like print debugging.

How to use the debugging agent

As we saw above, this agent specialized in analyzing stacktraces and errors and relate it to your codebase. It will help you by providing debugging directions, add print debugging statements for you and help you come to the root cause iteratively in a conversation.

Use it for asking questions like:

✅ I’m getting a 401 unauthorized error from @update_document API , help me debug

✅ Why am I getting this error. Stacktrace :

‍

Traceback (most recent call last):
File "/app/training_unit.py", line 125, in delete_training_unit
raise HTTPException(
fastapi.exceptions.HTTPException: Could not delete training unit. 
Error: (sqlite3.DatabaseError) Failed to delete record

‍

✅ Help me fix the TypeError in this line query.lower().split() of @query_vector_store

‍

Liked what you read? See it in action!