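LangChain-09: Agents Module
Module Overview
Responsibilities and Positioning
The Agents module is the core architectural layer in LangChain for autonomous task execution. It has two layers:
- Core data layer (langchain_core.agents): defines the Agent data structures and protocols
- Execution framework layer (langchain.agents): provides the Agent executor and concrete Agent implementations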
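An Agent is the highest-level abstraction in LangChain. It can:
- Plan a sequence of actions dynamically from the user input
- Select and call tools on its own
- Adjust its strategy based on intermediate observations
- Loop until the goal is reached
Core responsibilities:
- Define the Agent input and output data structures (AgentAction, AgentFinish, AgentStep)
- Provide the Agent execution loop (AgentExecutor)
- Support multiple Agent strategies (ReAct, OpenAI Functions, Structured Chat, and others)
- Manage intermediate steps (intermediate_steps) and error handling
- Provide streaming output and asynchronous execution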
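Inputs and Outputs
Upstream input (user / application layer):
{
    "input": str,  # user task description
    "intermediate_steps": List[Tuple[AgentAction, str]],  # previous steps (optional)
    "chat_history": List[BaseMessage],  # conversation history (optional)
}
Agent decision output (AgentAction):
AgentAction(
    tool: str,  # name of the selected tool
    tool_input: dict | str,  # tool arguments
    log: str,  # reasoning trace (Thought)
)
Final output (AgentFinish):
AgentFinish(
    return_values: dict,  # final return values, e.g. {"output": "..."}
    log: str,  # full reasoning trace
)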
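The executor tells the two output shapes above apart with a type check. The following is a minimal sketch of that dispatch. It uses standard-library dataclasses as stand-ins for the real `langchain_core.agents` classes, and `describe` is a hypothetical helper introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class AgentAction:  # stand-in, not the real langchain_core class
    tool: str
    tool_input: Union[str, dict]
    log: str


@dataclass
class AgentFinish:  # stand-in, not the real langchain_core class
    return_values: dict
    log: str


def describe(decision: Union[AgentAction, AgentFinish]) -> str:
    """Hypothetical helper: summarize what an executor would do next."""
    if isinstance(decision, AgentFinish):
        return f"finish: {decision.return_values['output']}"
    return f"call {decision.tool} with {decision.tool_input!r}"


action = AgentAction(tool="search", tool_input={"query": "Beijing weather"}, log="Thought: look it up")
finish = AgentFinish(return_values={"output": "Sunny, 25°C"}, log="Final Answer: Sunny, 25°C")
print(describe(action))
print(describe(finish))
```

An AgentFinish ends the loop, while an AgentAction leads to a tool call and another planning round.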
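Upstream and Downstream Dependencies
Core dependencies:
- langchain_core.runnables: the Runnable interface; Agents are implemented as Runnables
- langchain_core.messages: message formats (AIMessage, HumanMessage, and others)
- langchain_core.tools: tool definition and execution
- langchain_core.language_models: the LLM/ChatModel used for decisions
- langchain_core.prompts: prompt templates (agent_scratchpad)
- langchain_core.output_parsers: parse LLM output into AgentAction/AgentFinish
- langchain_core.callbacks: callbacks and tracing
- pydantic: data validation
Depended on by:
- langchain.agents: concrete Agent implementations (ReAct, OpenAI Functions, Structured Chat, and others)
- langgraph: the newer graph-based Agent framework (recommended)
- Application-layer Agent systems (multi-Agent collaboration, workflows, and so on)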
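Architecture Evolution
⚠️ Important: LangChain's Agent architecture is migrating to LangGraph:
| Architecture | Generation | Status | Use case |
|---|---|---|---|
| langchain_core.agents | Classic | ⚠️ Maintained | Core data structures (still in use) |
| langchain.agents.AgentExecutor | Classic | ⚠️ Maintained | Simple single-Agent scenarios |
| langgraph | Modern | ✅ Recommended | Complex Agents, multi-Agent systems, workflows |
This document focuses on the core data structures and the execution mechanism, which apply to both the old and the new architecture.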
Overall Architecture Diagram
System-Level Architecture
flowchart TB
subgraph UserApp["User Application Layer"]
USER[User submits task]
OUTPUT[Receives final result]
end
subgraph ExecutorLayer["Executor Layer (langchain.agents)"]
EXECUTOR[AgentExecutor<br/>execution loop manager]
ITERATOR[AgentExecutorIterator<br/>iterator support]
end
subgraph AgentLayer["Agent Decision Layer"]
subgraph AgentTypes["Agent implementation types"]
REACT[ReActAgent<br/>chain-of-thought reasoning]
OPENAI[OpenAIFunctionsAgent<br/>function calling]
STRUCT[StructuredChatAgent<br/>structured input]
CUSTOM[RunnableAgent<br/>custom Runnable]
end
AGENT_RUNNABLE[Agent Runnable<br/>unified execution interface]
end
subgraph CoreDataLayer["Core Data Layer (langchain_core.agents)"]
ACTION[AgentAction<br/>tool call decision]
FINISH[AgentFinish<br/>task completion marker]
STEP[AgentStep<br/>execution step record]
end
subgraph IntegrationLayer["Integration Layer"]
LLM[Language Models<br/>decision engine]
PROMPT[Prompts<br/>agent_scratchpad]
PARSER[OutputParsers<br/>response parsing]
TOOLS[Tools<br/>tool registry]
CALLBACKS[Callbacks<br/>tracing callbacks]
end
subgraph ExecutionFlow["Execution Flow"]
PLAN[plan]
EXEC[execute]
OBS[observe]
LOOP[loop]
end
%% User interaction
USER -->|invoke/stream| EXECUTOR
EXECUTOR -->|returns result| OUTPUT
%% Executor to Agent
EXECUTOR -->|calls plan| AGENT_RUNNABLE
AGENT_RUNNABLE -->|implemented by| AgentTypes
%% Agent decision flow
AGENT_RUNNABLE -->|outputs| ACTION
AGENT_RUNNABLE -->|outputs| FINISH
%% Execution steps
EXECUTOR -->|runs tool| EXEC
EXEC -->|produces| STEP
STEP -->|contains| ACTION
%% Integration calls
AGENT_RUNNABLE -->|uses| LLM
AGENT_RUNNABLE -->|uses| PROMPT
LLM -->|output parsing| PARSER
PARSER -->|converts to| ACTION
PARSER -->|converts to| FINISH
ACTION -->|looks up tool| TOOLS
EXECUTOR -->|event notification| CALLBACKS
%% Loop
PLAN -.decision phase.-> EXEC
EXEC -.execution phase.-> OBS
OBS -.observation phase.-> LOOP
LOOP -.continuation check.-> PLAN
style ACTION fill:#e1f5ff
style FINISH fill:#e8f5e9
style STEP fill:#fff4e1
style EXECUTOR fill:#ffe1f5
style LLM fill:#f5e1ff
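Architecture Layers
1. User Application Layer
Responsibility: accept the user task and return the final result
Interfaces:
# Synchronous call
result = agent_executor.invoke({"input": "user task"})
# Streaming call
for chunk in agent_executor.stream({"input": "user task"}):
    print(chunk)
# Iterator call: each item is a dict; intermediate steps arrive under "intermediate_step"
for step in agent_executor.iter({"input": "user task"}):
    if output := step.get("intermediate_step"):
        action, observation = output[0]
        print(f"Step {action.tool}: {observation}")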
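2. Executor Layer
Core component: AgentExecutor
Responsibilities:
- Manage the Agent execution loop (a while loop)
- Cap the number of iterations (max_iterations, default 15)
- Enforce a time limit (max_execution_time)
- Handle parsing errors (handle_parsing_errors)
- Trim intermediate steps (trim_intermediate_steps)
- Maintain the tool mapping (name_to_tool_map)
- Fire callback events
Core code path:
libs/langchain/langchain_classic/agents/agent.py
|- class AgentExecutor(Chain)
    |- _call(): synchronous execution loop
    |- _acall(): asynchronous execution loop
    |- _iter_next_step(): a single iteration step
    |- _perform_agent_action(): runs the tool call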
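3. Agent Decision Layer
Core components: Agent implementation classes + the Runnable interface
Agent types:
| Agent type | Implementation | Decision mechanism | Use case |
|---|---|---|---|
| ReAct | create_react_agent (returns a Runnable) | Chain-of-thought prompt (Thought/Action/Observation) | General reasoning tasks that need explainability |
| OpenAI Functions | OpenAIFunctionsAgent | Native Function Calling API | Models with an OpenAI-style function-calling API that need structured tool calls |
| Structured Chat | StructuredChatAgent | JSON Schema + prompt | Tools with complex, multi-field arguments |
| Runnable | RunnableAgent | Custom Runnable chain | Fully custom decision logic |
Unified interface:
class BaseSingleActionAgent:
    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any,
    ) -> AgentAction | AgentFinish:
        """Single-step decision: return the next action or the finish marker."""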
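4. Core Data Layer
Defined in: langchain_core.agents
Data structures:
class AgentAction(Serializable):
    """Output when the Agent decides to call a tool."""
    tool: str  # tool name
    tool_input: str | dict  # tool arguments
    log: str  # reasoning trace (full LLM output)
    type: Literal["AgentAction"] = "AgentAction"
class AgentFinish(Serializable):
    """Output when the Agent has completed the task."""
    return_values: dict  # return values, e.g. {"output": "..."}
    log: str  # final reasoning log
    type: Literal["AgentFinish"] = "AgentFinish"
class AgentStep(Serializable):
    """Record of one execution step."""
    action: AgentAction  # the Agent's decision
    observation: Any  # the tool's result
Extended types:
class AgentActionMessageLog(AgentAction):
    """Action that also carries the message history (used with chat models)."""
    message_log: Sequence[BaseMessage]
# Note: treat the class below as illustrative; check that your installed
# langchain_core.agents actually exports it before relying on it.
class AgentFinishMessageLog(AgentFinish):
    """Finish that also carries the message history (used with chat models)."""
    message_log: Sequence[BaseMessage]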
5. Integration Layer
Language Models:
- Act as the Agent's "brain" and handle reasoning and decisions
- Input: prompt + agent_scratchpad (previous steps)
- Output: text or tool_calls (depending on the Agent type)
Prompts:
- Core placeholder: {agent_scratchpad}, the formatted text of intermediate_steps
- ReAct format: Thought: ... Action: tool_name Action Input: {...} Observation: tool_result
- OpenAI Functions format: MessagesPlaceholder("agent_scratchpad")
OutputParsers:
- ReActSingleInputOutputParser: parses ReAct-format output
- OpenAIFunctionsAgentOutputParser: parses a function_call into an AgentAction
- Both convert the LLM's text or structured output into AgentAction/AgentFinish
Tools:
- The list of tools registered with the Agent
- AgentExecutor maintains the name_to_tool_map mapping
- The tool is looked up by AgentAction.tool and then executed
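The name-based lookup can be sketched as follows. This is a pure-Python illustration, not the executor's code: `search`, `run_tool`, and the wording of the error message are assumptions. Returning an error string for an unknown tool name, instead of raising, lets the model see the mistake in its next round.

```python
from typing import Callable, Dict


def search(query: str) -> str:
    """Hypothetical tool used only for this sketch."""
    return f"results for {query}"


# Mapping from tool name to callable, in the spirit of name_to_tool_map
name_to_tool_map: Dict[str, Callable[[str], str]] = {"search": search}


def run_tool(name: str, tool_input: str) -> str:
    """Look up a tool by name and run it; report unknown names as text."""
    tool = name_to_tool_map.get(name)
    if tool is None:
        valid = ", ".join(sorted(name_to_tool_map))
        return f"{name} is not a valid tool, try one of [{valid}]."
    return tool(tool_input)


print(run_tool("search", "Beijing weather"))
print(run_tool("translate", "hello"))
```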
Callbacks:
- on_agent_action: fired when the Agent decides to call a tool
- on_agent_finish: fired when the Agent completes the task
- on_tool_start / on_tool_end: fired before and after a tool runs
6. Execution Flow
┌──────────────────────────────────────────────────────┐
│ Plan (decision phase)                                │
│ - Call agent.plan(intermediate_steps, **inputs)      │
│ - LLM reasoning + OutputParser parsing               │
│ - Return AgentAction | AgentFinish                   │
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
│ Execute (execution phase)                            │
│ - If AgentFinish: end the loop and return            │
│ - If AgentAction: look up the tool and run it        │
│ - tool.invoke(agent_action.tool_input)               │
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
│ Observe (observation phase)                          │
│ - Record AgentStep(action, observation)              │
│ - Append to intermediate_steps                       │
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
│ Loop (continuation check)                            │
│ - Check iterations < max_iterations                  │
│ - Check elapsed time < max_execution_time            │
│ - If both hold: go back to Plan                      │
│ - Otherwise: force stop or generate a final answer   │
└──────────────────────────────────────────────────────┘
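The four phases above fit in a few lines of plain Python. The sketch below is not the AgentExecutor implementation: `scripted_plan` is a hypothetical planner that replaces the LLM, actions are simple `(tool, input)` tuples, and a finish is a plain dict.

```python
from typing import Callable, Dict, List, Tuple, Union

Action = Tuple[str, str]  # (tool name, tool input)
Finish = Dict[str, str]   # return values, e.g. {"output": "..."}
Step = Tuple[Action, str]  # (action, observation)


def scripted_plan(steps: List[Step]) -> Union[Action, Finish]:
    """Hypothetical planner: search once, then answer with the observation."""
    if not steps:
        return ("search", "Beijing weather")
    return {"output": steps[-1][1]}


def run_loop(
    plan: Callable[[List[Step]], Union[Action, Finish]],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 15,
) -> Finish:
    steps: List[Step] = []
    for _ in range(max_iterations):            # Loop: bounded number of rounds
        decision = plan(steps)                 # Plan
        if isinstance(decision, dict):         # a finish ends the loop
            return decision
        name, tool_input = decision
        observation = tools[name](tool_input)  # Execute
        steps.append((decision, observation))  # Observe
    return {"output": "Agent stopped due to iteration limit"}


tools = {"search": lambda q: f"{q}: sunny, 25°C"}
result = run_loop(scripted_plan, tools)
print(result)
```

A planner that never finishes is cut off by `max_iterations`, which is the role the same-named setting plays in the executor.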
Key Design Patterns
1. Strategy Pattern (Agent Types)
Different Agent types implement different decision strategies, but all follow the same plan() interface:
# ReAct strategy: chain-of-thought prompt
agent = create_react_agent(llm, tools, prompt)
# OpenAI Functions strategy: function calling
agent = create_openai_functions_agent(llm, tools, prompt)
# Unified execution
agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke({"input": "..."})
2. Iterator Pattern (Execution Loop)
AgentExecutor runs its iterations in a while loop (simplified):
def _call(self, inputs):
    intermediate_steps = []
    iterations = 0
    time_elapsed = 0.0
    start_time = time.time()
    while self._should_continue(iterations, time_elapsed):
        # Decide
        output = self._action_agent.plan(intermediate_steps, **inputs)
        # Check for completion
        if isinstance(output, AgentFinish):
            return self._return(output, intermediate_steps)
        # Run the tool
        tool = name_to_tool_map[output.tool]
        observation = tool.run(output.tool_input)
        # Record the step
        intermediate_steps.append((output, observation))
        iterations += 1
        time_elapsed = time.time() - start_time
3. Composite Pattern (Runnable Composition)
Agents are implemented as Runnables and compose with LCEL:
# Runnable chain of the ReAct Agent
agent_runnable = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
    )
    | prompt
    | llm_with_stop
    | output_parser
)
4. Observer Pattern (Callbacks)
Observability comes from the callback system:
from langchain_core.callbacks import StdOutCallbackHandler
agent_executor.invoke(
    {"input": "task"},
    config={"callbacks": [StdOutCallbackHandler()]}
)
# Output:
# > Entering new AgentExecutor chain...
# Thought: ...
# Action: search
# Observation: ...
Module Interaction Sequence Diagrams
Sequence Diagram 1: Full Lifecycle of the Agent Executor
sequenceDiagram
autonumber
participant User as User/Application
participant Executor as AgentExecutor
participant Agent as Agent Runnable
participant Prompt as ChatPromptTemplate
participant LLM as Language Model
participant Parser as OutputParser
participant Tool as Tool Registry
participant Callback as CallbackManager
Note over User,Callback: Stage 1: Initialization (at construction)
User->>Executor: AgentExecutor(agent, tools, max_iterations=15)
Executor->>Executor: Build name_to_tool_map<br/>Initialize callback manager
Note over User,Callback: Stage 2: Start execution
User->>Executor: invoke({"input": "Look up the weather and summarize it"})
Executor->>Callback: on_chain_start("AgentExecutor")
Executor->>Executor: intermediate_steps = []<br/>iterations = 0
Note over User,Callback: Stage 3: Round 1 decision, call the search tool
Executor->>Agent: plan(intermediate_steps=[], input="...")
Agent->>Prompt: format({<br/> input: "...",<br/> agent_scratchpad: ""<br/>})
Prompt-->>Agent: formatted_messages
Agent->>LLM: invoke(formatted_messages)
Note right of LLM: Model decides:<br/>weather data is needed first
LLM-->>Agent: AIMessage(content="Thought: need to look up the weather<br/>Action: search<br/>Action Input: {query: 'Beijing weather'}")
Agent->>Parser: parse(llm_output)
Parser-->>Agent: AgentAction(tool="search", tool_input={"query":"Beijing weather"})
Agent-->>Executor: AgentAction
Executor->>Callback: on_agent_action(AgentAction)
Executor->>Tool: lookup_tool("search")
Tool-->>Executor: search_tool
Executor->>Callback: on_tool_start("search", {"query":"Beijing weather"})
Executor->>Tool: search_tool.invoke({"query":"Beijing weather"})
Note right of Tool: Run the search:<br/>call the weather API
Tool-->>Executor: observation = "Beijing is sunny today, 25°C"
Executor->>Callback: on_tool_end(observation)
Executor->>Executor: intermediate_steps.append(<br/> (AgentAction, observation)<br/>)
Executor->>Executor: iterations = 1
Note over User,Callback: Stage 4: Round 2 decision, call the summarize tool
Executor->>Agent: plan(intermediate_steps=[step1], input="...")
Agent->>Prompt: format({<br/> input: "...",<br/> agent_scratchpad: "Action: search\nObservation: Beijing is sunny today, 25°C"<br/>})
Prompt-->>Agent: formatted_messages
Agent->>LLM: invoke(formatted_messages)
Note right of LLM: Model decides to summarize<br/>based on the observation
LLM-->>Agent: AIMessage(content="Thought: now summarize the weather<br/>Action: summarize<br/>Action Input: {text: 'Beijing is sunny today, 25°C'}")
Agent->>Parser: parse(llm_output)
Parser-->>Agent: AgentAction(tool="summarize", tool_input={"text":"..."})
Agent-->>Executor: AgentAction
Executor->>Callback: on_agent_action(AgentAction)
Executor->>Tool: lookup_tool("summarize")
Tool-->>Executor: summarize_tool
Executor->>Callback: on_tool_start("summarize", {...})
Executor->>Tool: summarize_tool.invoke({"text":"..."})
Note right of Tool: Run the summary:<br/>call an LLM to produce it
Tool-->>Executor: observation = "Beijing is sunny today with mild temperatures..."
Executor->>Callback: on_tool_end(observation)
Executor->>Executor: intermediate_steps.append(<br/> (AgentAction, observation)<br/>)
Executor->>Executor: iterations = 2
Note over User,Callback: Stage 5: Round 3 decision, finish the task
Executor->>Agent: plan(intermediate_steps=[step1,step2], input="...")
Agent->>Prompt: format({<br/> input: "...",<br/> agent_scratchpad: "...(all previous steps)"<br/>})
Prompt-->>Agent: formatted_messages
Agent->>LLM: invoke(formatted_messages)
Note right of LLM: Model judges the information<br/>sufficient and returns the final answer
LLM-->>Agent: AIMessage(content="Thought: the task is done<br/>Final Answer: Beijing is sunny today...")
Agent->>Parser: parse(llm_output)
Parser-->>Agent: AgentFinish(return_values={"output":"Beijing is sunny today..."})
Agent-->>Executor: AgentFinish
Executor->>Callback: on_agent_finish(AgentFinish)
Executor->>Executor: Build final_output<br/>optionally attach intermediate_steps
Executor->>Callback: on_chain_end(final_output)
Executor-->>User: {"output": "Beijing is sunny today with mild temperatures..."}
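Sequence Diagram Walkthrough
Stage 1: Initialization (steps 1-2)
Step 1: The user creates an AgentExecutor
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI
# Create the Agent
agent = create_openai_functions_agent(ChatOpenAI(), tools, prompt)
# Create the executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=15,  # maximum number of iterations
    max_execution_time=300,  # time limit in seconds
    handle_parsing_errors=True,  # handle parsing errors
    return_intermediate_steps=False,  # whether to return intermediate steps
)
Step 2: AgentExecutor internal setup
- Build name_to_tool_map: {"search": search_tool, "summarize": summarize_tool}
- Validate that the Agent and the Tools are compatible
- Initialize the callback manager
Key code (simplified; AgentExecutor is a Pydantic-based Chain, so fields are declared on the class and the tool mapping is derived from self.tools):
# libs/langchain/langchain_classic/agents/agent.py
class AgentExecutor(Chain):
    agent: BaseSingleActionAgent | BaseMultiActionAgent | Runnable
    tools: Sequence[BaseTool]
    def _call(self, inputs, run_manager=None):
        # Build the tool mapping
        name_to_tool_map = {tool.name: tool for tool in self.tools}
        ...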
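Stage 2: Start Execution (steps 3-5)
Step 3: The user calls invoke
result = agent_executor.invoke({
    "input": "Look up the weather in Beijing and summarize it"
})
Step 4: The on_chain_start callback fires
- Records the start time
- Prints debug information (if verbose=True)
- Notifies tracing systems such as LangSmith or LangFuse
Step 5: Initialize the loop variables
intermediate_steps: List[Tuple[AgentAction, str]] = []
iterations = 0
time_elapsed = 0.0
start_time = time.time()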
Stage 3: Round 1 Decision, Tool Call (steps 6-20)
Steps 6-7: Agent decision entry point
# AgentExecutor._call()
output = self._action_agent.plan(
    intermediate_steps,  # currently empty: []
    callbacks=run_manager.get_child(),
    **inputs  # {"input": "Look up the weather in Beijing and summarize it"}
)
Steps 8-9: Format the prompt
# Agent Runnable processing (ReAct as the example)
# libs/langchain/langchain_classic/agents/react/agent.py
agent_runnable = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
    )
    | prompt
    | llm_with_stop
    | output_parser
)
# In round 1, agent_scratchpad = "" (because intermediate_steps is empty)
Steps 10-11: Call the LLM
# Prompt received by the LLM (ReAct format):
"""
Answer the following questions as best you can. You have access to the following tools:
search: Search the internet for information
summarize: Summarize a piece of text
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [search, summarize]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: Look up the weather in Beijing and summarize it
Thought:
"""
# LLM output:
"""
Thought: I need to look up the weather in Beijing first
Action: search
Action Input: {"query": "Beijing weather"}
"""
Steps 12-14: Parse the LLM output
# libs/langchain/langchain_classic/agents/output_parsers/react_single_input.py
class ReActSingleInputOutputParser(AgentOutputParser):
    def parse(self, text: str) -> AgentAction | AgentFinish:
        # Look for "Action:" and "Action Input:"
        if "Final Answer:" in text:
            return AgentFinish(...)
        else:
            action = extract_action(text)  # "search"
            action_input = extract_action_input(text)  # {"query": "Beijing weather"}
            return AgentAction(
                tool=action,
                tool_input=action_input,
                log=text  # full LLM output
            )
Step 15: Return the AgentAction to the executor
Steps 16-19: Run the tool
# AgentExecutor._perform_agent_action()
if agent_action.tool in name_to_tool_map:
    tool = name_to_tool_map[agent_action.tool]  # search_tool
    # Fire the on_tool_start callback
    run_manager.on_tool_start(...)
    # Run the tool
    observation = tool.run(
        agent_action.tool_input,  # {"query": "Beijing weather"}
        verbose=self.verbose,
        callbacks=run_manager.get_child()
    )
    # observation = "Beijing is sunny today, 25°C, air quality good"
    # Fire the on_tool_end callback
    run_manager.on_tool_end(observation)
Steps 20-21: Record the execution step
step = AgentStep(
    action=agent_action,
    observation=observation
)
intermediate_steps.append((agent_action, observation))
iterations += 1  # iterations = 1
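Stage 4: Round 2 Decision, Another Tool Call (steps 22-35)
Key difference: agent_scratchpad now contains the history from round 1
# Step 23: format agent_scratchpad
agent_scratchpad = """
Action: search
Action Input: {"query": "Beijing weather"}
Observation: Beijing is sunny today, 25°C, air quality good
"""
# Full prompt received by the LLM:
"""
Question: Look up the weather in Beijing and summarize it
Thought:
Action: search
Action Input: {"query": "Beijing weather"}
Observation: Beijing is sunny today, 25°C, air quality good
Thought:
"""
# LLM output (step 26):
"""
Thought: I now have the weather data and need to summarize it
Action: summarize
Action Input: {"text": "Beijing is sunny today, 25°C, air quality good"}
"""
Steps 27-35: Repeat the tool execution flow
- Parse into AgentAction(tool="summarize")
- Look up summarize_tool
- Run the summarize tool
- Record the second AgentStep
- iterations = 2
Stage 5: Round 3 Decision, Finish the Task (steps 36-45)
Step 37: agent_scratchpad contains two rounds of history
agent_scratchpad = """
Action: search
Action Input: {"query": "Beijing weather"}
Observation: Beijing is sunny today, 25°C, air quality good
Action: summarize
Action Input: {"text": "Beijing is sunny today, 25°C, air quality good"}
Observation: Beijing is sunny today with mild temperatures, good for outdoor activities
"""
Step 40: The LLM judges the task complete
# LLM output:
"""
Thought: I have looked up and summarized the Beijing weather and can give the final answer
Final Answer: Beijing is sunny today, 25°C, air quality good, suitable for outdoor activities
"""
Step 42: Parse into AgentFinish
parser.parse(llm_output)
# "Final Answer:" detected, so the parser returns:
AgentFinish(
    return_values={"output": "Beijing is sunny today, 25°C..."},
    log=llm_output
)
Steps 43-45: End the loop
# AgentExecutor._call()
if isinstance(output, AgentFinish):
    run_manager.on_agent_finish(output)
    final_output = output.return_values
    if self.return_intermediate_steps:
        final_output["intermediate_steps"] = intermediate_steps
    return final_output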
Boundary Conditions
1. Maximum iteration limit
# AgentExecutor._should_continue()
def _should_continue(self, iterations: int, time_elapsed: float) -> bool:
    if self.max_iterations is not None and iterations >= self.max_iterations:
        return False
    if self.max_execution_time is not None and time_elapsed >= self.max_execution_time:
        return False
    return True
# When a limit is reached (simplified)
if not self._should_continue(iterations, time_elapsed):
    if self.early_stopping_method == "force":
        # Stop immediately with a fixed message
        return AgentFinish(
            return_values={"output": "Agent stopped due to iteration limit"},
            log=""
        )
    elif self.early_stopping_method == "generate":
        # Let the LLM produce a final answer from what it has so far
        final_output = self._action_agent.return_stopped_response(...)
        return final_output
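The two limits can be exercised in isolation. The sketch below restates the check as a free function (the name `should_continue` and its keyword defaults are choices made for this example) and drives it with a wall-clock timer.

```python
import time
from typing import Optional


def should_continue(
    iterations: int,
    time_elapsed: float,
    max_iterations: Optional[int] = 15,
    max_execution_time: Optional[float] = None,
) -> bool:
    """Return False once either configured limit has been reached."""
    if max_iterations is not None and iterations >= max_iterations:
        return False
    if max_execution_time is not None and time_elapsed >= max_execution_time:
        return False
    return True


start = time.monotonic()
iterations = 0
while should_continue(
    iterations, time.monotonic() - start, max_iterations=3, max_execution_time=5.0
):
    iterations += 1  # a real loop would plan and run a tool here

print(iterations)
```

Setting a limit to `None` disables that check, so at least one of the two should normally stay set to keep a looping Agent bounded.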
2. Parsing error handling
try:
    output = self._action_agent.plan(intermediate_steps, **inputs)
except OutputParserException as e:
    if self.handle_parsing_errors is False:
        raise
    if self.handle_parsing_errors is True:
        # Feed a generic error back to the Agent as the observation
        observation = "Invalid or incomplete response"
    elif isinstance(self.handle_parsing_errors, str):
        observation = self.handle_parsing_errors
    elif callable(self.handle_parsing_errors):
        observation = self.handle_parsing_errors(e)
    else:
        raise ValueError("Got unexpected type of `handle_parsing_errors`") from e
    # Build the pseudo-action once so it is defined in every branch
    output = AgentAction("_Exception", observation, str(e))
    yield AgentStep(action=output, observation=observation)
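The three accepted forms of handle_parsing_errors (bool, str, callable) reduce to one rule for choosing the observation text. A standalone sketch follows; `ParseError` stands in for OutputParserException and `error_observation` is a hypothetical helper, not a LangChain function.

```python
from typing import Callable, Union


class ParseError(Exception):
    """Stand-in for OutputParserException in this sketch."""


Handler = Union[bool, str, Callable[[Exception], str]]


def error_observation(handler: Handler, error: Exception) -> str:
    """Pick the observation text to send back to the model, or re-raise."""
    if handler is False:
        raise error
    if handler is True:
        return "Invalid or incomplete response"
    if isinstance(handler, str):
        return handler
    if callable(handler):
        return handler(error)
    raise ValueError("unsupported handler type")


err = ParseError("could not parse")
print(error_observation(True, err))
print(error_observation("Please follow the Action/Action Input format.", err))
print(error_observation(lambda e: f"Parser said: {e}", err))
```

The bool cases are checked with `is` first, so `True` is never mistaken for a callable or a string.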
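3. Tool execution failure
# Simplified illustration: surface the failure as the observation.
# In practice a tool usually needs handle_tool_error configured for a
# ToolException to be returned as text instead of being raised.
try:
    observation = tool.run(agent_action.tool_input)
except Exception as e:
    # Use the exception message as the observation
    observation = f"Tool execution failed: {str(e)}"
# The Agent sees this error in the next round and can retry or pick another tool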
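Performance Notes
| Dimension | Driver | Optimization |
|---|---|---|
| Latency | Each iteration needs one LLM call | Use a faster model (such as GPT-3.5); cut unnecessary iterations |
| Token usage | agent_scratchpad grows with every iteration | Set trim_intermediate_steps; summarize earlier steps |
| Concurrency | Tools run serially | Use a multi-action Agent to request several tool calls per round |
| Reliability | The LLM may emit an invalid format | Set handle_parsing_errors=True; use structured output |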
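Trimming is the simplest way to cap scratchpad growth. The sketch below keeps only the most recent steps. `trim_steps` is a hypothetical helper, and treating a non-positive `keep_last` as "no trimming" is a convention chosen for this example.

```python
from typing import List, Tuple

Step = Tuple[Tuple[str, str], str]  # ((tool, tool_input), observation)


def trim_steps(steps: List[Step], keep_last: int) -> List[Step]:
    """Keep the last `keep_last` steps; non-positive means keep everything."""
    if keep_last <= 0:
        return list(steps)
    return steps[-keep_last:]


steps = [(("search", f"query {i}"), f"obs{i}") for i in range(5)]
recent = trim_steps(steps, 2)
print([obs for _, obs in recent])
```

Dropping old steps saves tokens but also removes context, so summarizing the early observations is the alternative when they still matter.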
Error Flow and Fallback
stateDiagram-v2
[*] --> Plan
Plan --> ParseOK: LLM output valid
Plan --> ParseFailed: LLM output invalid
ParseOK --> AgentAction: not a Final Answer
ParseOK --> AgentFinish: Final Answer
ParseFailed --> ErrorHandling: handle_parsing_errors=True
ParseFailed --> RaiseException: handle_parsing_errors=False
ErrorHandling --> AgentAction: build an _Exception action
AgentAction --> RunTool
RunTool --> ToolOK: tool returns normally
RunTool --> ToolFailed: tool raises
ToolOK --> RecordStep: AgentStep
ToolFailed --> RecordStep: AgentStep(observation=error)
RecordStep --> LoopCheck
LoopCheck --> Plan: iterations < max
LoopCheck --> ForcedStop: iterations >= max
AgentFinish --> [*]
ForcedStop --> [*]
RaiseException --> [*]
Sequence Diagram 2: ReAct vs OpenAI Functions Agent
sequenceDiagram
autonumber
participant User
participant Executor
box lightblue ReAct Agent
participant ReactAgent as ReAct Agent
participant ReactParser as ReActParser
end
box lightgreen OpenAI Functions Agent
participant OpenAIAgent as OpenAI Agent
participant OpenAIParser as FunctionParser
end
participant LLM as Language Model
participant Tool
Note over User,Tool: Scenario 1: ReAct Agent, text parsing
User->>Executor: invoke({"input": "check the weather"})
Executor->>ReactAgent: plan(intermediate_steps=[])
ReactAgent->>ReactAgent: Build ReAct prompt:<br/>"Thought: ...<br/>Action: ...<br/>Action Input: ..."
ReactAgent->>LLM: invoke(prompt + stop=["\\nObservation"])
LLM-->>ReactAgent: "Thought: need to check the weather<br/>Action: search<br/>Action Input: {'query':'weather'}"
ReactAgent->>ReactParser: parse(text)
ReactParser->>ReactParser: Extract Action and Action Input with a regex
ReactParser-->>ReactAgent: AgentAction(tool="search", tool_input={'query':'weather'})
ReactAgent-->>Executor: AgentAction
Executor->>Tool: search.invoke({'query':'weather'})
Tool-->>Executor: "Sunny today, 25°C"
Executor->>Executor: intermediate_steps.append(...)
Executor->>ReactAgent: plan(intermediate_steps=[step1])
ReactAgent->>ReactAgent: Format scratchpad:<br/>"Action: search<br/>Observation: Sunny today, 25°C"
ReactAgent->>LLM: invoke(prompt + scratchpad)
LLM-->>ReactAgent: "Thought: enough information<br/>Final Answer: Sunny today, 25°C"
ReactAgent->>ReactParser: parse(text)
ReactParser-->>ReactAgent: AgentFinish(output="Sunny today, 25°C")
ReactAgent-->>Executor: AgentFinish
Executor-->>User: {"output": "Sunny today, 25°C"}
Note over User,Tool: Scenario 2: OpenAI Functions Agent, native function calling
User->>Executor: invoke({"input": "check the weather"})
Executor->>OpenAIAgent: plan(intermediate_steps=[])
OpenAIAgent->>OpenAIAgent: Build messages + functions schema:<br/>[{"name":"search", "parameters":{...}}]
OpenAIAgent->>LLM: invoke(messages, functions=[...])
Note right of LLM: Natively supported by OpenAI/Anthropic<br/>returns structured tool_calls
LLM-->>OpenAIAgent: AIMessage(tool_calls=[{<br/> "id": "call_123",<br/> "function": {"name":"search", "arguments":"{\"query\":\"weather\"}"}}<br/>])
OpenAIAgent->>OpenAIParser: parse(ai_message)
OpenAIParser->>OpenAIParser: Extract the tool_calls field
OpenAIParser-->>OpenAIAgent: AgentActionMessageLog(tool="search", ...)
OpenAIAgent-->>Executor: AgentAction
Executor->>Tool: search.invoke({'query':'weather'})
Tool-->>Executor: "Sunny today, 25°C"
Executor->>Executor: intermediate_steps.append(...)
Executor->>OpenAIAgent: plan(intermediate_steps=[step1])
OpenAIAgent->>OpenAIAgent: Format as message history:<br/>AIMessage(tool_calls=[...])<br/>ToolMessage(content="Sunny today, 25°C")
OpenAIAgent->>LLM: invoke(messages + history)
LLM-->>OpenAIAgent: AIMessage(content="Sunny today, 25°C", tool_calls=None)
Note right of LLM: No tool_calls means done
OpenAIAgent->>OpenAIParser: parse(ai_message)
OpenAIParser-->>OpenAIAgent: AgentFinish(output="Sunny today, 25°C")
OpenAIAgent-->>Executor: AgentFinish
Executor-->>User: {"output": "Sunny today, 25°C"}
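Comparison of the Two Agent Types
| Dimension | ReAct Agent | OpenAI Functions Agent |
|---|---|---|
| Decision mechanism | Text parsing (regular expressions) | Native Function Calling API |
| Prompt format | Fixed template (Thought/Action/Observation) | System message + functions schema |
| LLM output | Free text (must follow the format strictly) | Structured tool_calls field |
| Parsing reliability | ⚠️ Depends on the LLM following the format | ✅ High (structured) |
| Model support | ✅ General (any LLM) | ⚠️ Limited to OpenAI/Anthropic and some open-source models |
| Stop sequence | Must be set to ["\nObservation"] | Not needed (handled at the API level) |
| Token efficiency | ⚠️ Long prompt template | ✅ More compact |
| Debugging | ⚠️ Text format is error-prone | ✅ Structured, easier to debug |
| Extensibility | ⚠️ Change the prompt and the parser | ✅ Change the functions schema |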
ReAct Agent Key Code
Prompt template:
# libs/langchain/langchain_classic/agents/react/agent.py
template = '''Answer the following questions as best you can. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}'''
Formatting intermediate_steps:
# libs/langchain/langchain_classic/agents/format_scratchpad/log.py
def format_log_to_str(intermediate_steps: List[Tuple[AgentAction, str]]) -> str:
    """Format intermediate steps as text."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log  # contains "Thought: ...\nAction: ...\nAction Input: ..."
        # Append the observation, then open the next round's Thought.
        # Doing this inside the loop keeps the result empty when there are no steps.
        thoughts += f"\nObservation: {observation}\nThought: "
    return thoughts
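How the scratchpad grows from round to round is easiest to see by running the formatter without LangChain. In the sketch below each step is a plain `(log, observation)` pair of strings, and `format_scratchpad` is a stand-in written for this example.

```python
from typing import List, Tuple


def format_scratchpad(steps: List[Tuple[str, str]]) -> str:
    """Concatenate each step's log and observation, then reopen a Thought."""
    text = ""
    for log, observation in steps:
        text += log
        text += f"\nObservation: {observation}\nThought: "
    return text


round_one: List[Tuple[str, str]] = []
round_two = [("I should search\nAction: search\nAction Input: Beijing weather", "sunny, 25°C")]

print(repr(format_scratchpad(round_one)))
print(format_scratchpad(round_two))
```

The first round yields an empty string, so the prompt ends at the template's own "Thought:"; later rounds end with a fresh "Thought: " for the model to continue.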
Output parsing (simplified):
# libs/langchain/langchain_classic/agents/output_parsers/react_single_input.py
class ReActSingleInputOutputParser(AgentOutputParser):
    def parse(self, text: str) -> AgentAction | AgentFinish:
        # Check for completion
        if "Final Answer:" in text:
            return AgentFinish(
                return_values={"output": text.split("Final Answer:")[-1].strip()},
                log=text,
            )
        # Extract Action and Action Input
        action_match = re.search(r"Action\s*\d*\s*:(.*?)\n", text, re.DOTALL)
        action_input_match = re.search(
            r"Action\s*\d*\s*Input\s*\d*\s*:(.*?)($|\n)", text, re.DOTALL
        )
        if not action_match or not action_input_match:
            raise OutputParserException(f"Could not parse LLM output: `{text}`")
        action = action_match.group(1).strip()
        action_input = action_input_match.group(1).strip()
        # Try to parse the input as JSON
        try:
            action_input = json.loads(action_input)
        except json.JSONDecodeError:
            pass  # keep it as a string
        return AgentAction(tool=action, tool_input=action_input, log=text)
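The text-parsing step can be tried without any LangChain import. The following sketch is an independent re-implementation of the idea, not the library's parser: `parse_react` returns plain tuples, and the single combined regex is one reasonable choice.

```python
import re
from typing import Tuple

FINAL_ANSWER = "Final Answer:"
# Capture the tool name after "Action:" and everything after "Action Input:"
ACTION_RE = re.compile(
    r"Action\s*\d*\s*:[\s]*(.*?)[\s]*Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)",
    re.DOTALL,
)


def parse_react(text: str) -> Tuple[str, ...]:
    """Return ("finish", answer) or ("action", tool, tool_input)."""
    if FINAL_ANSWER in text:
        return ("finish", text.split(FINAL_ANSWER)[-1].strip())
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return ("action", match.group(1).strip(), match.group(2).strip())


print(parse_react("Thought: need weather\nAction: search\nAction Input: Beijing weather"))
print(parse_react("Thought: done\nFinal Answer: Sunny, 25°C"))
```

Any output that matches neither shape raises, which is the case handle_parsing_errors exists to absorb.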
Runnable chain:
# libs/langchain/langchain_classic/agents/react/agent.py
def create_react_agent(llm, tools, prompt):
    prompt = prompt.partial(
        tools=render_text_description(tools),
        tool_names=", ".join([t.name for t in tools]),
    )
    llm_with_stop = llm.bind(stop=["\nObservation"])
    return (
        RunnablePassthrough.assign(
            agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
        )
        | prompt
        | llm_with_stop
        | ReActSingleInputOutputParser()
    )
OpenAI Functions Agent Key Code
Functions schema conversion:
# libs/core/langchain_core/utils/function_calling.py
def convert_to_openai_function(tool: BaseTool) -> dict:
    """Convert a Tool to the OpenAI function format.

    Illustrative simplification: the real helper accepts many more input
    types (dicts, Pydantic models, plain callables) than shown here.
    """
    # Derive the JSON Schema from the tool's input schema
    schema = tool.get_input_schema().model_json_schema()
    return {
        "name": tool.name,
        "description": tool.description,
        "parameters": {
            "type": "object",
            "properties": schema.get("properties", {}),
            "required": schema.get("required", []),
        },
    }
Formatting intermediate_steps as messages:
# libs/langchain/langchain_classic/agents/format_scratchpad/openai_functions.py
def format_to_openai_function_messages(
    intermediate_steps: List[Tuple[AgentAction, str]]
) -> List[BaseMessage]:
    """Convert intermediate steps to message format."""
    messages = []
    for action, observation in intermediate_steps:
        # If the action carries a message_log, reuse the original messages
        if isinstance(action, AgentActionMessageLog):
            messages.extend(action.message_log)
        else:
            # Otherwise build an AIMessage
            messages.append(
                AIMessage(
                    content="",
                    additional_kwargs={
                        "function_call": {
                            "name": action.tool,
                            "arguments": json.dumps(action.tool_input),
                        }
                    },
                )
            )
        # Append the tool result message
        messages.append(
            FunctionMessage(
                name=action.tool,
                content=observation,
            )
        )
    return messages
Output parsing (simplified):
# libs/langchain/langchain_classic/agents/output_parsers/openai_functions.py
class OpenAIFunctionsAgentOutputParser(AgentOutputParser):
    def parse(self, ai_message: AIMessage) -> AgentAction | AgentFinish:
        # Check for a function_call
        if "function_call" in ai_message.additional_kwargs:
            function_call = ai_message.additional_kwargs["function_call"]
            tool_name = function_call["name"]
            tool_input = json.loads(function_call["arguments"])
            return AgentActionMessageLog(
                tool=tool_name,
                tool_input=tool_input,
                log=str(function_call),
                message_log=[ai_message],  # keep the full message
            )
        # Check for the newer tool_calls format
        if "tool_calls" in ai_message.additional_kwargs:
            tool_calls = ai_message.additional_kwargs["tool_calls"]
            if tool_calls:
                tool_call = tool_calls[0]  # a single-action agent takes only the first
                return AgentActionMessageLog(
                    tool=tool_call["function"]["name"],
                    tool_input=json.loads(tool_call["function"]["arguments"]),
                    log=str(tool_call),
                    message_log=[ai_message],
                )
        # No function_call: return AgentFinish
        return AgentFinish(
            return_values={"output": ai_message.content},
            log=ai_message.content,
        )
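The structured branch is simple enough to test with plain dicts. In this sketch a message is modeled as a dict with `content` and `additional_kwargs` keys, mirroring the fields used above; `parse_function_message` is a hypothetical function, and raising `ValueError` on malformed arguments is a choice made for the example.

```python
import json
from typing import Any, Dict, Tuple


def parse_function_message(message: Dict[str, Any]) -> Tuple[Any, ...]:
    """Return ("action", name, args) if a function_call is present, else ("finish", content)."""
    call = message.get("additional_kwargs", {}).get("function_call")
    if call:
        raw_args = call.get("arguments") or "{}"
        try:
            args = json.loads(raw_args)
        except json.JSONDecodeError as exc:
            raise ValueError(f"arguments are not valid JSON: {raw_args!r}") from exc
        return ("action", call["name"], args)
    return ("finish", message.get("content", ""))


tool_msg = {
    "content": "",
    "additional_kwargs": {
        "function_call": {"name": "search", "arguments": '{"query": "weather"}'}
    },
}
final_msg = {"content": "Sunny today, 25°C", "additional_kwargs": {}}

print(parse_function_message(tool_msg))
print(parse_function_message(final_msg))
```

The arguments arrive as a JSON string rather than a dict, so the decode step is where a malformed model response surfaces.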
Runnable chain:
# libs/langchain/langchain_classic/agents/openai_functions_agent/base.py
def create_openai_functions_agent(llm, tools, prompt):
    # Convert the tools to the functions format and bind them to the LLM
    llm_with_tools = llm.bind(
        functions=[convert_to_openai_function(t) for t in tools]
    )
    return (
        RunnablePassthrough.assign(
            agent_scratchpad=lambda x: format_to_openai_function_messages(
                x["intermediate_steps"]
            )
        )
        | prompt
        | llm_with_tools
        | OpenAIFunctionsAgentOutputParser()
    )
Sequence Diagram 3: Streaming Agent Output
sequenceDiagram
autonumber
participant User
participant Executor as AgentExecutor
participant Agent
participant LLM
participant Tool
Note over User,Tool: Scenario: streaming Agent execution (stream mode)
User->>Executor: stream({"input": "check the weather"})
loop Each execution step
Note over Executor,Agent: Step 1: round 1 decision
Executor->>Agent: plan(intermediate_steps=[])
Agent->>LLM: invoke(...)
LLM-->>Agent: AgentAction(tool="search")
Agent-->>Executor: AgentAction
Executor-->>User: {"actions": [AgentAction], "messages": [...]}
Note left of User: Receives the Agent's decision in real time
Note over Executor,Tool: Step 2: run the tool
Executor->>Tool: search.invoke(...)
Tool-->>Executor: observation
Executor-->>User: {"steps": [AgentStep]}
Note left of User: Receives the tool result in real time
Note over Executor,Agent: Step 3: round 2 decision
Executor->>Agent: plan(intermediate_steps=[step1])
Agent->>LLM: invoke(...)
LLM-->>Agent: AgentFinish
Agent-->>Executor: AgentFinish
Executor-->>User: {"output": "final answer"}
Note left of User: Receives the final result
end
Streaming Implementation
Using the stream method:
from langchain.agents import AgentExecutor, create_openai_functions_agent
agent_executor = AgentExecutor(agent=agent, tools=tools)
# Streaming execution
for chunk in agent_executor.stream({"input": "Check the weather and summarize it"}):
    if "actions" in chunk:
        # Agent decision phase
        for action in chunk["actions"]:
            print(f"🤔 Decided to call tool: {action.tool}")
            print(f"📝 Reasoning: {action.log}")
    if "steps" in chunk:
        # Tool execution phase
        for step in chunk["steps"]:
            print(f"🔧 Tool {step.action.tool} returned: {str(step.observation)[:100]}...")
    if "output" in chunk:
        # Final result
        print(f"✅ Final answer: {chunk['output']}")
Using the iter iterator:
# Iterator mode: step-by-step execution; each item is a dict
for step_output in agent_executor.iter({"input": "check the weather"}):
    if output := step_output.get("intermediate_step"):
        action, observation = output[0]
        print(f"Step: {action.tool} -> {observation}")
    elif "output" in step_output:
        print(f"Finish: {step_output['output']}")
How it works (a simplified sketch of the control flow; the actual stream() delegates to AgentExecutorIterator):
# libs/langchain/langchain_classic/agents/agent.py
class AgentExecutor(Chain):
    def _stream(
        self,
        inputs: dict[str, Any],
        run_manager: CallbackManagerForChainRun | None = None,
        **kwargs: Any,
    ) -> Iterator[dict[str, Any]]:
        """Stream output from agent execution."""
        intermediate_steps: list[tuple[AgentAction, str]] = []
        iterations = 0
        while self._should_continue(iterations, ...):
            # Decide
            output = self._action_agent.plan(intermediate_steps, **inputs)
            if isinstance(output, AgentFinish):
                # Stream the final result
                yield {"output": output.return_values["output"]}
                return
            # Stream the Agent's decision
            actions = [output] if isinstance(output, AgentAction) else output
            yield {"actions": actions}
            # Run the tools
            steps = []
            for action in actions:
                step = self._perform_agent_action(..., action, ...)
                steps.append(step)
                intermediate_steps.append((action, step.observation))
            # Stream the tool results
            yield {"steps": steps}
            iterations += 1
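The same control flow can be written as a plain generator, which makes the chunk order easy to verify. This sketch involves no LangChain code: `stream_steps` and `scripted_plan` are hypothetical, actions are `(tool, input)` tuples, and a finish is a dict with an "output" key.

```python
from typing import Callable, Dict, Iterator, List, Tuple, Union

Action = Tuple[str, str]
Finish = Dict[str, str]
Step = Tuple[Action, str]


def scripted_plan(steps: List[Step]) -> Union[Action, Finish]:
    """Hypothetical planner: one search, then a final answer."""
    if not steps:
        return ("search", "weather")
    return {"output": steps[-1][1]}


def stream_steps(
    plan: Callable[[List[Step]], Union[Action, Finish]],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 15,
) -> Iterator[dict]:
    steps: List[Step] = []
    for _ in range(max_iterations):
        decision = plan(steps)
        if isinstance(decision, dict):
            yield {"output": decision["output"]}   # final chunk
            return
        yield {"actions": [decision]}              # decision chunk
        name, tool_input = decision
        observation = tools[name](tool_input)
        steps.append((decision, observation))
        yield {"steps": [(decision, observation)]}  # tool result chunk


chunks = list(stream_steps(scripted_plan, {"search": lambda q: "sunny, 25°C"}))
for chunk in chunks:
    print(chunk)
```

Because each chunk is yielded as soon as it exists, a consumer can stop iterating at any point, which is what gives streaming its interruptibility.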
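Advantages of Streaming
| Dimension | Batch execution | Streaming execution |
|---|---|---|
| User experience | ⏳ Wait for every step to finish | ✅ See progress in real time |
| Observability | ⚠️ Black-box execution | ✅ Transparent execution |
| Debugging | ⚠️ Problems surface only after failure | ✅ Anomalies are visible as they happen |
| Interruptibility | ❌ Cannot stop midway | ✅ Can be interrupted at any time |
| Memory usage | ⚠️ Holds all intermediate results | ✅ Produced on demand |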
Core Data Structures
classDiagram
class AgentAction {
+tool: str
+tool_input: dict|str
+log: str
+type: str
+messages: list~BaseMessage~
}
class AgentActionMessageLog {
+tool: str
+tool_input: dict|str
+log: str
+message_log: list~BaseMessage~
}
class AgentFinish {
+return_values: dict
+log: str
+type: str
+messages: list~BaseMessage~
}
%% AgentFinishMessageLog is shown for symmetry; verify that your installed langchain_core exports it
class AgentFinishMessageLog {
+return_values: dict
+log: str
+message_log: list~BaseMessage~
}
class AgentStep {
+action: AgentAction
+observation: Any
}
AgentAction <|-- AgentActionMessageLog
AgentFinish <|-- AgentFinishMessageLog
AgentStep o-- AgentAction
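Data Structure Reference
AgentAction fields
| Field | Type | Required | Description |
|---|---|---|---|
| tool | str | Yes | Name of the tool to call |
| tool_input | dict\|str | Yes | Input arguments for the tool (dict recommended) |
| log | str | Yes | The Agent's reasoning trace (full LLM output) |
| type | Literal["AgentAction"] | Yes | Type tag used for serialization |
| messages | list[BaseMessage] | No | Message-format view (read-only property) |
AgentFinish fields
| Field | Type | Required | Description |
|---|---|---|---|
| return_values | dict | Yes | Values returned to the user, usually with an "output" key |
| log | str | Yes | The final reasoning trace |
| type | str | Yes | Always "AgentFinish" |
| messages | list[BaseMessage] | No | Message-format view (read-only property) |
AgentStep fields
| Field | Type | Description |
|---|---|---|
| action | AgentAction | The Agent's decision |
| observation | Any | The result of running the tool |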
核心API详解
API-1: AgentAction创建
基本信息
- 名称:
AgentAction - 类型: Pydantic数据类
- 幂等性: 幂等
功能说明
表示Agent决定调用工具的决策。
请求结构体
tool: str # 工具名称
tool_input: dict | str # 工具参数
log: str # 推理日志
入口函数与关键代码
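from typing import Literal, Union
from langchain_core.load.serializable import Serializable
from langchain_core.messages import AIMessage, BaseMessage
class AgentAction(Serializable):
    """Output of an agent that has decided to call a tool."""
    tool: str
    tool_input: Union[str, dict]
    log: str
    type: Literal["AgentAction"] = "AgentAction"
    @property
    def messages(self) -> list[BaseMessage]:
        """Convert to message format (for use in chat history)."""
        if self.log:
            return [AIMessage(content=self.log)]
        return []
    def dict(self, **kwargs) -> dict:
        """Serialize to a dictionary."""
        return {
            "tool": self.tool,
            "tool_input": self.tool_input,
            "log": self.log,
        }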
Usage example:
from langchain_core.agents import AgentAction
# Create an AgentAction
action = AgentAction(
    tool="search_web",
    tool_input={"query": "LangChain tutorial"},
    log="Thought: I need to search for the latest information on LangChain\nAction: search_web"
)
print(action.tool)  # "search_web"
print(action.tool_input)  # {"query": "LangChain tutorial"}
# Convert to messages
messages = action.messages
# [AIMessage(content="Thought: I need to search...")]
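API-2: Creating an AgentFinish
Basic information
- Name: AgentFinish
- Type: Pydantic model (subclass of Serializable)
- Idempotency: idempotent
Description
Represents the agent completing its task and returning the final result.
Request structure
return_values: dict  # return values
log: str  # reasoning log
Entry point and key code (simplified)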
class AgentFinish(Serializable):
    """Output of an agent that has finished its task."""
    return_values: dict
    log: str
    type: Literal["AgentFinish"] = "AgentFinish"
    @property
    def messages(self) -> list[BaseMessage]:
        """Convert to message format."""
        if self.log:
            return [AIMessage(content=self.log)]
        return []
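Usage example:
from langchain_core.agents import AgentFinish
# Create an AgentFinish
finish = AgentFinish(
    return_values={"output": "LangChain is a framework for building LLM applications..."},
    log="Thought: I have gathered enough information\nFinal Answer: LangChain is..."
)
print(finish.return_values["output"])
# "LangChain is a framework for building LLM applications..."
# Check whether the agent has finished (result is whatever the agent returned)
def extract_output(result):
    """Return the final answer if result is an AgentFinish, otherwise None."""
    if isinstance(result, AgentFinish):
        print("Agent finished the task")
        return result.return_values["output"]
    return None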
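API-3: Composing an AgentStep
Basic information
- Name: AgentStep
- Type: Pydantic model (subclass of Serializable)
- Idempotency: idempotent
Description
Represents one complete step of agent execution (action plus observation).
Request structure
action: AgentAction  # the agent's decision
observation: Any  # result of running the tool
Entry point and key code (simplified)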
class AgentStep(Serializable):
    """One executed step: the action taken and what was observed."""
    action: AgentAction
    observation: Any
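Usage example:
from langchain_core.agents import AgentAction, AgentStep
# Create an execution step
action = AgentAction(
    tool="calculator",
    tool_input={"expression": "25 * 4"},
    log="Thought: I need to multiply 25 by 4"
)
# Run the tool (result assumed here)
tool_result = 100
# Record the step
step = AgentStep(action=action, observation=tool_result)
print(step.action.tool)  # "calculator"
print(step.observation)  # 100
# Build intermediate_steps as (action, observation) tuples, the shape listed under inputs
intermediate_steps = []
intermediate_steps.append((step.action, step.observation))
# Use it in the next iteration (agent is assumed to be an existing agent Runnable)
agent_response = agent.invoke({
    "input": "user question",
    "intermediate_steps": intermediate_steps
})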
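Between iterations, the recorded steps have to be turned back into prompt text (the agent_scratchpad). The sketch below shows one way to do that in the ReAct text convention. It uses plain (log, observation) string pairs instead of the real classes so that it runs without LangChain installed; format_scratchpad and its prefixes are illustrative assumptions, not the library's own helper.

```python
def format_scratchpad(
    steps: list[tuple[str, str]],
    observation_prefix: str = "Observation: ",
    llm_prefix: str = "Thought: ",
) -> str:
    """Join (action log, observation) pairs into one text block for the prompt."""
    parts = []
    for log, observation in steps:
        # Each step contributes the model's own text, then the tool result,
        # then a prefix inviting the model to continue reasoning.
        parts.append(f"{log}\n{observation_prefix}{observation}\n{llm_prefix}")
    return "".join(parts)


steps = [
    ("Thought: I need to multiply 25 by 4\nAction: calculator\nAction Input: 25 * 4", "100"),
]
scratchpad = format_scratchpad(steps)
print(scratchpad)
```

With the real classes, the first element of each pair would be action.log and the second str(observation).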
Typical usage scenarios
Scenario 1: ReAct agent loop
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.tools import tool
# Define the tools
@tool
def search_web(query: str) -> str:
    """Search the internet."""
    return f"Search results: {query}"
@tool
def calculator(expression: str) -> float:
    """Evaluate a math expression."""
    # WARNING: eval on model-generated text is unsafe; acceptable only in a demo
    return eval(expression)
# Simplified agent execution loop
def run_agent(agent, tools, user_input, max_iterations=10):
    intermediate_steps = []
    for i in range(max_iterations):
        # 1) Agent reasoning
        result = agent.invoke({
            "input": user_input,
            "intermediate_steps": intermediate_steps
        })
        # 2) Check whether the agent has finished
        if isinstance(result, AgentFinish):
            return result.return_values["output"]
        # 3) Run the selected tool
        selected_tool = next(t for t in tools if t.name == result.tool)
        observation = selected_tool.invoke(result.tool_input)
        # 4) Record the step as an (action, observation) tuple
        intermediate_steps.append((result, observation))
    return "Reached the maximum number of iterations"
# Usage (my_agent is assumed to be an existing agent Runnable)
output = run_agent(my_agent, [search_web, calculator], "Where were the 2024 Olympic Games held?")
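The calculator tool above hands model-generated text to eval, which can execute arbitrary code. One possible replacement is a small evaluator built on the standard ast module that accepts only numeric literals and basic arithmetic. This is a sketch under that assumption: safe_eval and the operator whitelist are illustrative names, and exponentiation is left out on purpose because very large powers can exhaust memory.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / over numeric literals and reject everything else."""

    def _eval(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    # mode="eval" parses a single expression; malformed input raises SyntaxError.
    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("25 * 4"))
print(safe_eval("2024*365"))
```

Names, attribute access and function calls all fall through to the ValueError branch, so an input such as __import__('os').getcwd() is rejected instead of executed.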
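Scenario 2: Streaming agent output
import asyncio
from langchain_core.agents import AgentActionMessageLog, AgentFinish
# Stream the agent's decisions
# (assumes agent is an agent Runnable whose stream yields these objects)
async def stream_agent_actions(agent, user_input):
    async for chunk in agent.astream({"input": user_input}):
        if isinstance(chunk, AgentActionMessageLog):
            print(f"🔧 Calling tool: {chunk.tool}")
            print(f"📝 Reasoning: {chunk.log}")
            # Show the message history attached to the action
            for msg in chunk.message_log:
                print(f" {msg.type}: {msg.content[:100]}...")
        elif isinstance(chunk, AgentFinish):
            print(f"✅ Finished: {chunk.return_values['output']}")
            # Show the final log rendered as messages
            for msg in chunk.messages:
                print(f" {msg.type}: {msg.content[:100]}...")
# Usage (my_agent is assumed to exist; inside a running event loop, use await instead)
asyncio.run(stream_agent_actions(my_agent, "Analyze this question and give an answer"))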
Scenario 3: Custom agent logic
from typing import Union
from langchain_core.agents import AgentAction, AgentFinish
class CustomAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {t.name: t for t in tools}
    def invoke(self, inputs: dict) -> Union[AgentAction, AgentFinish]:
        """Run a single reasoning step."""
        user_input = inputs["input"]
        intermediate_steps = inputs.get("intermediate_steps", [])
        # Build the prompt (including earlier steps); _build_prompt is left to the implementer
        prompt = self._build_prompt(user_input, intermediate_steps)
        # LLM reasoning
        response = self.llm.invoke(prompt)
        # Parse the response
        return self._parse_output(response.content)
    def _parse_output(self, text: str) -> Union[AgentAction, AgentFinish]:
        """Parse the LLM output into an AgentAction or an AgentFinish."""
        if "Final Answer:" in text:
            # Task complete
            answer = text.split("Final Answer:")[1].strip()
            return AgentFinish(
                return_values={"output": answer},
                log=text
            )
        else:
            # A tool call is needed
            # Simplified parsing; the two _extract_* helpers are not shown here
            tool_name = self._extract_tool_name(text)
            tool_input = self._extract_tool_input(text)
            return AgentAction(
                tool=tool_name,
                tool_input=tool_input,
                log=text
            )
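The CustomAgent above calls _extract_tool_name and _extract_tool_input without defining them. One possible shape for that parsing step is sketched below, assuming the model writes ReAct-style text with an "Action:" line followed by an "Action Input:" line. extract_tool_call and the regular expression are illustrative assumptions, not part of LangChain.

```python
import json
import re
from typing import Union

# Assumed layout: "Action: <tool name>" then "Action Input: <payload>".
_ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action\s+Input\s*:\s*(.*)", re.DOTALL
)


def extract_tool_call(text: str) -> tuple[str, Union[str, dict]]:
    """Return (tool_name, tool_input) parsed from ReAct-style LLM output."""
    match = _ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"could not find an Action block in: {text!r}")
    tool_name = match.group(1).strip()
    raw_input = match.group(2).strip()
    # Prefer a dict when the payload is a JSON object; otherwise keep the raw text.
    try:
        parsed = json.loads(raw_input)
    except json.JSONDecodeError:
        return tool_name, raw_input
    return tool_name, parsed if isinstance(parsed, dict) else raw_input


llm_text = 'Thought: need math\nAction: calculator\nAction Input: {"expression": "2024*365"}'
print(extract_tool_call(llm_text))
```

Raising on unparseable text, rather than guessing, gives the surrounding loop a clear signal to feed the error back to the model or stop.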
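Sequence diagram
Agent execution loop
sequenceDiagram
autonumber
participant User
participant Executor as Agent Executor
participant Agent as Agent
participant Tool
User->>Executor: invoke("Calculate 2024*365")
loop Agent loop
Executor->>Agent: invoke({input, intermediate_steps})
Agent->>Agent: LLM reasoning
alt Returns AgentAction
Agent-->>Executor: AgentAction(tool="calculator")
Executor->>Tool: invoke("2024*365")
Tool-->>Executor: 738760
Executor->>Executor: Record AgentStep
else Returns AgentFinish
Agent-->>Executor: AgentFinish(output="The answer is 738760")
Executor-->>User: "The answer is 738760"
end
end
Notes on the sequence diagram:
Purpose: shows how the agent executor alternates between calling the agent and calling tools until the task is complete.
Boundary conditions:
- max_iterations caps the number of loop iterations
- A time limit (max_execution_time on AgentExecutor) stops runs that would otherwise keep looping
- Error handling, where configured (for example handle_parsing_errors on the executor or handle_tool_error on a tool), lets the loop continue after a failed step instead of aborting
Performance notes:
- Each iteration needs one LLM call, which is usually the dominant cost
- Accumulated intermediate_steps lengthen the prompt on every iteration
- Tool execution time adds directly to the overall response time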
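The three boundary conditions can be combined in one small loop. The sketch below is not AgentExecutor's implementation: Action and Finish are stand-ins for AgentAction and AgentFinish so that the example runs on the standard library alone, and run_with_guards, decide and flaky are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class Action:  # stand-in for AgentAction
    tool: str
    tool_input: Any


@dataclass
class Finish:  # stand-in for AgentFinish
    output: str


def run_with_guards(
    decide: Callable[[list], Union[Action, Finish]],
    tools: dict[str, Callable[[Any], Any]],
    max_iterations: int = 5,
    max_seconds: float = 10.0,
) -> str:
    """Run the loop with an iteration cap, a time limit and tool error capture."""
    steps: list[tuple[Action, Any]] = []
    deadline = time.monotonic() + max_seconds
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            return "stopped: time limit reached"
        decision = decide(steps)
        if isinstance(decision, Finish):
            return decision.output
        try:
            observation = tools[decision.tool](decision.tool_input)
        except Exception as exc:  # unknown tool or tool failure
            # Feed the error back as an observation instead of aborting the run.
            observation = f"error: {type(exc).__name__}: {exc}"
        steps.append((decision, observation))
    return "stopped: iteration limit reached"


def flaky(_: Any) -> Any:
    raise RuntimeError("service unavailable")


def decide(steps: list) -> Union[Action, Finish]:
    # First call a tool; once an observation exists, finish with it.
    if not steps:
        return Action(tool="flaky", tool_input="ping")
    return Finish(output=f"gave up after: {steps[-1][1]}")


result = run_with_guards(decide, {"flaky": flaky})
looping = run_with_guards(lambda steps: Action("flaky", "ping"), {"flaky": flaky}, max_iterations=3)
print(result)
print(looping)
```

Returning the error text as an observation gives the model a chance to pick a different tool on the next iteration, while the two limits bound the cost of an agent that never does.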
Best practices
1. Structure tool inputs
# Recommended: dictionary format
action = AgentAction(
    tool="send_email",
    tool_input={
        "to": "user@example.com",
        "subject": "Hello",
        "body": "Message content"
    },
    log="..."
)
# Not recommended: string format (hard to parse)
action = AgentAction(
    tool="send_email",
    tool_input="to=user@example.com,subject=Hello,body=...",
    log="..."
)
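A short experiment shows why the string form is fragile. The sketch below assumes a naive parser for the key=value,key=value format shown above; parse_flat, check_required and REQUIRED are hypothetical names used only for this illustration.

```python
def parse_flat(raw: str) -> dict[str, str]:
    """Naively parse 'key=value,key=value' text by splitting on commas."""
    return dict(part.split("=", 1) for part in raw.split(","))


def check_required(tool_input: dict, required: set[str]) -> dict:
    """Return tool_input unchanged, or raise if a required key is missing."""
    missing = required - set(tool_input)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return tool_input


REQUIRED = {"to", "subject", "body"}

# Dict input: a comma inside the body is harmless and keys are easy to check.
structured = check_required(
    {"to": "user@example.com", "subject": "Hello", "body": "Hi, there"}, REQUIRED
)

# String input: the same comma splits the body in two, so parsing fails.
try:
    parse_flat("to=user@example.com,subject=Hello,body=Hi, there")
    flat_ok = True
except ValueError:
    flat_ok = False

print(structured["body"], flat_ok)
```

A more careful string format could escape commas, but every such rule has to be taught to the model and enforced by the parser, whereas a dict needs neither.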
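2. Keep the full reasoning log
# Recommended: a detailed log
action = AgentAction(
    tool="calculator",
    tool_input={"expression": "2024*365"},
    log="""Thought: The user asked for the product of 2024 and 365
I need to calculate 2024 multiplied by 365
Action: calculator
Action Input: {"expression": "2024*365"}"""
)
# This makes it easier to:
# - debug the agent's behavior
# - show the reasoning to the user
# - evaluate the quality of the agent's decisions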
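3. Manage intermediate_steps
from typing import Any
from langchain_core.agents import AgentAction
# Recommended: cap the number of past steps (keeps the prompt from growing too long)
def trim_intermediate_steps(
    steps: list[tuple[AgentAction, Any]], max_steps: int = 5
) -> list[tuple[AgentAction, Any]]:
    """Keep only the most recent max_steps steps (max_steps must be positive)."""
    return steps[-max_steps:]
# Usage (all_steps, agent and user_input are assumed to exist)
intermediate_steps = trim_intermediate_steps(all_steps, max_steps=5)
result = agent.invoke({
    "input": user_input,
    "intermediate_steps": intermediate_steps
})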
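Counting steps is a coarse proxy for prompt length, since a single observation can be very long. An alternative is to keep the most recent steps that fit a size budget. The sketch below measures size in characters as a stand-in for tokens and uses (log, observation) string pairs; trim_by_budget is a hypothetical helper, not a LangChain function.

```python
def trim_by_budget(
    steps: list[tuple[str, str]], max_chars: int
) -> list[tuple[str, str]]:
    """Keep the newest steps whose combined text fits within max_chars."""
    kept: list[tuple[str, str]] = []
    used = 0
    # Walk from newest to oldest and stop at the first step that does not fit.
    for log, observation in reversed(steps):
        cost = len(log) + len(observation)
        if used + cost > max_chars:
            break
        kept.append((log, observation))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [("a" * 10, "b" * 10), ("c" * 5, "d" * 5), ("e" * 3, "f" * 3)]
recent = trim_by_budget(history, max_chars=16)
print(len(recent))
```

Stopping at the first step that does not fit keeps the retained history contiguous, so the model never sees an observation whose preceding steps were dropped from the middle.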