Agent Execution Loop¶
Understanding how agents process messages, call tools, and manage state.
Overview¶
The agent execution loop handles:
- Message preparation (context + new prompt)
- Model inference (via pydantic-ai)
- Tool execution (if requested)
- Response processing
- State updates (context, usage tracking)
Basic Execution Flow¶
sequenceDiagram
participant User
participant Agent
participant Context
participant PydanticAI
participant Model
participant Tools
User->>Agent: run(prompt)
Agent->>Context: get_messages()
Context-->>Agent: message_history
loop Until complete or max_iterations
Agent->>PydanticAI: run(messages)
PydanticAI->>Model: API request
Model-->>PydanticAI: response
alt Has tool calls
PydanticAI->>Tools: execute_tool()
Tools-->>PydanticAI: tool_result
Note over PydanticAI: Add result to messages
else No tool calls
Note over PydanticAI: Response complete
end
end
PydanticAI-->>Agent: RunResult
Agent->>Context: add_messages(new)
Agent->>Agent: track_usage()
Agent-->>User: AgentResult
Iteration Behavior¶
Tool Calling Loop¶
When the model requests tools, execution continues:
# Simplified execution logic
for iteration in range(max_iterations):
response = await model.complete(messages)
if response.has_tool_calls:
for tool_call in response.tool_calls:
result = await execute_tool(tool_call)
messages.append(tool_result_message(result))
else:
# No more tool calls, done
return response.content
Max Iterations¶
Prevents infinite loops:
config = AgentConfig(max_iterations=10) # Default
# If reached:
# - Execution stops
# - Current response returned
# - Warning logged
Context Integration¶
Message History¶
Context is automatically included:
# Before each run
messages = [
system_prompt,
*context_manager.get_messages(), # History
new_user_message, # Current prompt
]
After Execution¶
New messages are added:
# After run completes
context_manager.add_messages([
user_message,
assistant_response,
*tool_messages, # If any
])
Token Tracking¶
Per-Request Tracking¶
# After each model call
usage_tracker.record_usage(
prompt_tokens=response.usage.prompt_tokens,
completion_tokens=response.usage.completion_tokens,
model=model_name,
)
Aggregate Access¶
# Available after run
usage = agent.get_usage() # TokenUsage
cost = agent.get_cost() # float
history = agent.get_usage_history() # list[UsageRecord]
Auto-Compaction¶
Threshold Check¶
# After adding messages
if config.auto_compact and context_manager.should_compact():
await context_manager.compact()
Compaction Triggers¶
# should_compact() returns True when:
current_tokens = context_manager.get_token_count()
threshold = config.context.trigger_threshold_tokens
should_compact = current_tokens >= threshold
Execution Methods¶
Synchronous¶
Asynchronous¶
Streaming¶
Tool Execution¶
Tool Discovery¶
Tools are registered at agent creation:
Tool Call Flow¶
graph TD
A[Model Response] --> B{Has Tool Calls?}
B -->|Yes| C[Extract Tool Calls]
C --> D[Execute Each Tool]
D --> E[Collect Results]
E --> F[Add to Messages]
F --> G[Continue Loop]
B -->|No| H[Return Response]
Error Handling¶
# Tool errors don't stop execution
try:
result = await tool.execute(args)
return ToolResult(success=True, content=result)
except Exception as e:
return ToolResult(success=False, error=str(e))
# Model receives error and can retry or adjust
State Management¶
Context State¶
state = agent.get_context_state()
# ContextState(
# token_count=1500,
# message_count=12,
# system_prompt="...",
# compaction_history=[...]
# )
Usage State¶
usage = agent.get_usage()
# TokenUsage(
# prompt_tokens=1000,
# completion_tokens=500,
# total_tokens=1500,
# request_count=3
# )
Reset Operations¶
# Clear context only
agent.clear_context()
# Reset usage tracking only
agent.reset_tracking()
# Reset everything
agent.reset_all()
Multi-Turn Conversations¶
Automatic Context¶
# Turn 1
agent.run_sync("My name is Alice")
# Turn 2 - context includes Turn 1
result = agent.run_sync("What's my name?")
# Agent remembers "Alice"
Manual Context Control¶
# Start fresh conversation
agent.clear_context()
# Or with custom history
messages = [...]
agent.context_manager.add_messages(messages)
Performance Considerations¶
Token Efficiency¶
- Context grows with each turn
- Auto-compaction prevents overflow
- Choose appropriate compaction strategy
Latency¶
- Each iteration adds latency
- Tool execution time varies
- Consider timeouts for long operations
Cost¶
- More tokens = higher cost
- Tool calls add iterations
- Monitor with
agent.get_cost()
Debugging¶
Enable Debug Logging¶
Inspect Messages¶
# After a run
messages = agent.get_messages()
for msg in messages:
print(f"{msg['role']}: {msg['content'][:100]}...")