LangChain-09-Agents模块

模块概览

职责与定位

Agents模块是LangChain中实现自主任务执行的核心架构层。该模块包含两个层次:

  1. 核心数据层 (langchain_core.agents): 定义Agent的数据结构和协议
  2. 执行框架层 (langchain.agents): 提供Agent执行器和具体Agent实现

Agent是LangChain中最高层的抽象,能够:

  • 根据用户输入动态规划行动序列
  • 自主选择和调用工具
  • 根据中间观察结果调整策略
  • 循环执行直到达成目标

核心职责:

  • 定义Agent的输入输出数据结构(AgentAction、AgentFinish、AgentStep)
  • 提供Agent执行循环框架(AgentExecutor)
  • 支持多种Agent策略(ReAct、OpenAI Functions、Structured Chat等)
  • 管理中间步骤(intermediate_steps)和错误处理
  • 提供流式输出和异步执行能力

输入输出

上游输入(用户/应用层):

{
    "input": str,  # 用户任务描述
    "intermediate_steps": List[Tuple[AgentAction, str]],  # 历史步骤(可选)
    "chat_history": List[BaseMessage],  # 对话历史(可选)
}

Agent决策输出(AgentAction):

AgentAction(
    tool: str,  # 选择的工具名称
    tool_input: dict | str,  # 工具参数
    log: str,  # 推理过程(Thought)
)

最终输出(AgentFinish):

AgentFinish(
    return_values: dict,  # 最终返回值 {"output": "..."}
    log: str,  # 完整推理过程
)

上下游依赖

核心依赖:

  • langchain_core.runnables: Runnable接口,Agent实现为Runnable
  • langchain_core.messages: 消息格式(AIMessage、HumanMessage等)
  • langchain_core.tools: 工具定义和执行
  • langchain_core.language_models: LLM/ChatModel用于决策
  • langchain_core.prompts: 提示模板(agent_scratchpad)
  • langchain_core.output_parsers: 解析LLM输出为AgentAction/AgentFinish
  • langchain_core.callbacks: 回调和追踪机制
  • pydantic: 数据验证

被依赖:

  • langchain.agents: 具体Agent实现(ReAct、OpenAI Functions、Structured Chat等)
  • langgraph: 新一代图式Agent框架(推荐)
  • 应用层Agent系统(多Agent协作、工作流等)

架构演进说明

⚠️ 重要: LangChain的Agent架构正在向LangGraph迁移:

| 架构 | 版本 | 状态 | 适用场景 |
| --- | --- | --- | --- |
| langchain_core.agents | Classic | ⚠️ 维护中 | 核心数据结构(持续使用) |
| langchain.agents.AgentExecutor | Classic | ⚠️ 维护中 | 简单单Agent场景 |
| langgraph | Modern | ✅ 推荐 | 复杂Agent、多Agent、工作流 |

本文档重点介绍核心数据结构与执行机制,这些概念在新旧架构中都适用。
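
作为参考,下面给出一个基于 langgraph 预构建接口的最小示例草图(假设已安装 langgraph 与 langchain-openai;get_weather 为演示用的假设工具,模型名称仅作示意):

# 最小示例:使用 langgraph 预构建的 ReAct Agent(新架构)
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """查询指定城市天气(演示用,返回固定文本)"""
    return f"{city}今日晴,25°C"

# create_react_agent 返回可直接invoke的编译图,输入输出均为消息列表
agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), [get_weather])
result = agent.invoke({"messages": [("user", "北京今天天气怎么样?")]})
print(result["messages"][-1].content)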

整体架构图

系统级架构

flowchart TB
    subgraph UserApp["用户应用层"]
        USER[用户输入任务]
        OUTPUT[接收最终结果]
    end

    subgraph ExecutorLayer["执行器层 (langchain.agents)"]
        EXECUTOR[AgentExecutor<br/>执行循环管理器]
        ITERATOR[AgentExecutorIterator<br/>迭代器支持]
    end

    subgraph AgentLayer["Agent决策层"]
        subgraph AgentTypes["Agent实现类型"]
            REACT[ReActAgent<br/>思维链推理]
            OPENAI[OpenAIFunctionsAgent<br/>函数调用]
            STRUCT[StructuredChatAgent<br/>结构化输入]
            CUSTOM[RunnableAgent<br/>自定义Runnable]
        end

        AGENT_RUNNABLE[Agent Runnable<br/>统一执行接口]
    end

    subgraph CoreDataLayer["核心数据层 (langchain_core.agents)"]
        ACTION[AgentAction<br/>工具调用决策]
        FINISH[AgentFinish<br/>任务完成标记]
        STEP[AgentStep<br/>执行步骤记录]
    end

    subgraph IntegrationLayer["集成层"]
        LLM[Language Models<br/>决策引擎]
        PROMPT[Prompts<br/>agent_scratchpad]
        PARSER[OutputParsers<br/>响应解析]
        TOOLS[Tools<br/>工具注册表]
        CALLBACKS[Callbacks<br/>追踪回调]
    end

    subgraph ExecutionFlow["执行流转"]
        PLAN[plan 决策]
        EXEC[execute 执行]
        OBS[observe 观察]
        LOOP[loop 循环]
    end

    %% 用户交互流
    USER -->|invoke/stream| EXECUTOR
    EXECUTOR -->|返回结果| OUTPUT

    %% 执行器到Agent
    EXECUTOR -->|调用plan| AGENT_RUNNABLE
    AGENT_RUNNABLE -->|继承实现| AgentTypes

    %% Agent决策流
    AGENT_RUNNABLE -->|输出| ACTION
    AGENT_RUNNABLE -->|输出| FINISH

    %% 执行步骤
    EXECUTOR -->|工具执行| EXEC
    EXEC -->|生成| STEP
    STEP -->|包含| ACTION

    %% 集成调用
    AGENT_RUNNABLE -->|使用| LLM
    AGENT_RUNNABLE -->|使用| PROMPT
    LLM -->|输出解析| PARSER
    PARSER -->|转换为| ACTION
    PARSER -->|转换为| FINISH
    ACTION -->|查找工具| TOOLS
    EXECUTOR -->|事件通知| CALLBACKS

    %% 循环流转
    PLAN -.决策阶段.-> EXEC
    EXEC -.执行阶段.-> OBS
    OBS -.观察阶段.-> LOOP
    LOOP -.循环判断.-> PLAN

    style ACTION fill:#e1f5ff
    style FINISH fill:#e8f5e9
    style STEP fill:#fff4e1
    style EXECUTOR fill:#ffe1f5
    style LLM fill:#f5e1ff

架构层次说明

1. 用户应用层(User App Layer)

职责: 接收用户任务,返回最终结果

交互接口:

# 同步调用
result = agent_executor.invoke({"input": "用户任务"})

# 流式调用
for chunk in agent_executor.stream({"input": "用户任务"}):
    print(chunk)

# 迭代器调用
for step in agent_executor.iter({"input": "用户任务"}):
    print(f"Step {step.action.tool}: {step.observation}")

2. 执行器层(Executor Layer)

核心组件: AgentExecutor

职责:

  • 管理Agent执行循环(while循环)
  • 控制最大迭代次数(max_iterations,默认15)
  • 管理超时时间(max_execution_time)
  • 处理解析错误(handle_parsing_errors)
  • 裁剪中间步骤(trim_intermediate_steps)
  • 管理工具映射(name_to_tool_map)
  • 触发回调事件
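
上述参数可在构造 AgentExecutor 时统一配置。下面是一个配置草图(仅作示意:handle_parsing_errors 可传入 bool/str/自定义函数,trim_intermediate_steps 可传入整数,均为 AgentExecutor 支持的参数形式;agent 与 tools 假设已按后文方式创建):

# AgentExecutor 配置示例草图(假设 agent、tools 已创建)
from langchain.agents import AgentExecutor
from langchain_core.exceptions import OutputParserException

def on_parse_error(error: OutputParserException) -> str:
    # 解析失败时,该返回值会作为observation反馈给Agent
    return f"输出格式有误,请按要求的格式重试。错误: {error}"

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=15,                     # 最大迭代次数
    max_execution_time=300,                # 超时时间(秒)
    handle_parsing_errors=on_parse_error,  # bool | str | Callable 均可
    trim_intermediate_steps=5,             # 只保留最近5个中间步骤,控制prompt长度
    early_stopping_method="force",         # 达到限制时强制返回
    verbose=True,
)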

核心代码路径:

libs/langchain/langchain_classic/agents/agent.py
  |- class AgentExecutor(Chain)
      |- _call(): 同步执行循环
      |- _acall(): 异步执行循环
      |- _iter_next_step(): 单步迭代
      |- _perform_agent_action(): 执行工具调用

3. Agent决策层(Agent Decision Layer)

核心组件: Agent实现类 + Runnable接口

Agent类型:

| Agent类型 | 实现类 | 决策机制 | 适用场景 |
| --- | --- | --- | --- |
| ReAct | ReActAgent | 思维链Prompt(Thought/Action/Observation) | 通用推理任务,需要可解释性 |
| OpenAI Functions | OpenAIFunctionsAgent | 原生Function Calling API | OpenAI/Anthropic模型,需要结构化工具调用 |
| Structured Chat | StructuredChatAgent | JSON Schema + Prompt | 复杂工具参数,多模态输入 |
| Runnable | RunnableAgent | 自定义Runnable链 | 完全自定义决策逻辑 |

统一接口:

class BaseSingleActionAgent:
    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any,
    ) -> AgentAction | AgentFinish:
        """单步决策:返回下一步动作或完成标记"""

4. 核心数据层(Core Data Layer)

定义位置: langchain_core.agents

数据结构:

class AgentAction(Serializable):
    """Agent决定调用工具的输出"""
    tool: str  # 工具名称
    tool_input: str | dict  # 工具参数
    log: str  # 推理过程(完整LLM输出)
    type: Literal["AgentAction"] = "AgentAction"

class AgentFinish(Serializable):
    """Agent完成任务的输出"""
    return_values: dict  # 返回值 {"output": "..."}
    log: str  # 最终推理日志
    type: Literal["AgentFinish"] = "AgentFinish"

class AgentStep(Serializable):
    """执行步骤记录"""
    action: AgentAction  # Agent决策
    observation: Any  # 工具执行结果

扩展类型:

class AgentActionMessageLog(AgentAction):
    """带消息历史的Action(用于ChatModel)"""
    message_log: Sequence[BaseMessage]

class AgentFinishMessageLog(AgentFinish):
    """带消息历史的Finish(用于ChatModel)"""
    message_log: Sequence[BaseMessage]

5. 集成层(Integration Layer)

Language Models:

  • 作为Agent的"大脑",负责推理和决策
  • 输入:prompt + agent_scratchpad(历史步骤)
  • 输出:文本/tool_calls(取决于Agent类型)

Prompts:

  • 核心占位符:{agent_scratchpad} - 包含intermediate_steps的格式化文本
  • ReAct格式:
    Thought: ...
    Action: tool_name
    Action Input: {...}
    Observation: tool_result
    
  • OpenAI Functions格式:MessagesPlaceholder("agent_scratchpad")(两种占位方式的构造示例见下方)
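
下面是两种 agent_scratchpad 占位方式的构造草图(提示词文本仅作示意):

# 两种 agent_scratchpad 占位方式(示意)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, PromptTemplate

# ReAct:scratchpad以纯文本形式拼接在prompt末尾
react_prompt = PromptTemplate.from_template(
    "Answer the following questions as best you can...\n"
    "Question: {input}\n"
    "Thought:{agent_scratchpad}"
)

# OpenAI Functions / Tool Calling:scratchpad以消息列表形式注入
functions_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])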

OutputParsers:

  • ReActSingleInputOutputParser: 解析ReAct格式输出
  • OpenAIFunctionsAgentOutputParser: 解析function_call为AgentAction
  • 负责将LLM文本/结构化输出转换为AgentAction/AgentFinish

Tools:

  • 注册到Agent的可用工具列表
  • AgentExecutor维护name_to_tool_map映射
  • 根据AgentAction.tool查找并执行

Callbacks:

  • on_agent_action: Agent决定调用工具时触发
  • on_agent_finish: Agent完成任务时触发
  • on_tool_start/on_tool_end: 工具执行前后触发
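
一个最小的自定义回调处理器草图,演示上述钩子的用法(类名为演示用的假设名称,方法名与 langchain_core 的 BaseCallbackHandler 一致):

# 自定义回调处理器草图:打印Agent决策与工具执行事件
from typing import Any
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.callbacks import BaseCallbackHandler

class AgentTraceHandler(BaseCallbackHandler):
    def on_agent_action(self, action: AgentAction, **kwargs: Any) -> None:
        print(f"[action] tool={action.tool} input={action.tool_input}")

    def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> None:
        print(f"[finish] {finish.return_values}")

    def on_tool_start(self, serialized: dict, input_str: str, **kwargs: Any) -> None:
        print(f"[tool_start] {serialized.get('name')} <- {input_str}")

    def on_tool_end(self, output: Any, **kwargs: Any) -> None:
        print(f"[tool_end] {output}")

# 使用:agent_executor.invoke({"input": "..."}, config={"callbacks": [AgentTraceHandler()]})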

6. 执行流转(Execution Flow)

┌──────────────────────────────────────────────────────┐
  Plan (决策阶段)                                      
  - 调用 agent.plan(intermediate_steps, **inputs)     
  - LLM推理 + OutputParser解析                        
  - 返回 AgentAction | AgentFinish                    
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
  Execute (执行阶段)                                   
  - 如果是AgentFinish: 结束循环,返回结果              
  - 如果是AgentAction: 查找工具并执行                 
  - tool.invoke(agent_action.tool_input)              
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
  Observe (观察阶段)                                   
  - 记录 AgentStep(action=action, observation=result) 
  - 添加到 intermediate_steps                         
└──────────────┬───────────────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────────────┐
  Loop (循环判断)                                      
  - 检查迭代次数 < max_iterations                     
  - 检查执行时间 < max_execution_time                 
  - 如果满足条件: 返回Plan阶段                        
  - 否则: 强制停止或生成最终答案                      
└──────────────────────────────────────────────────────┘

关键设计模式

1. 策略模式(Agent Types)

不同Agent类型实现不同的决策策略,但都遵循统一的plan()接口:

# ReAct策略:思维链Prompt
agent = create_react_agent(llm, tools, prompt)

# OpenAI Functions策略:Function Calling
agent = create_openai_functions_agent(llm, tools, prompt)

# 统一执行
agent_executor = AgentExecutor(agent=agent, tools=tools)
result = agent_executor.invoke({"input": "..."})

2. 迭代器模式(Execution Loop)

AgentExecutor通过while循环实现迭代执行:

def _call(self, inputs):
    intermediate_steps = []
    iterations = 0
    start_time = time.time()
    time_elapsed = 0.0

    while self._should_continue(iterations, time_elapsed):
        # 决策
        output = self._action_agent.plan(intermediate_steps, **inputs)

        # 判断结束
        if isinstance(output, AgentFinish):
            return self._return(output, intermediate_steps)

        # 根据工具名查找并执行工具
        tool = name_to_tool_map[output.tool]
        observation = tool.run(output.tool_input)

        # 记录步骤
        intermediate_steps.append((output, observation))
        iterations += 1
        time_elapsed = time.time() - start_time

3. 组合模式(Runnable Composition)

Agent实现为Runnable,支持LCEL组合:

# ReAct Agent的Runnable链
agent_runnable = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
    )
    | prompt
    | llm_with_stop
    | output_parser
)

4. 观察者模式(Callbacks)

通过回调系统实现可观测性:

from langchain.callbacks import StdOutCallbackHandler

agent_executor.invoke(
    {"input": "任务"},
    config={"callbacks": [StdOutCallbackHandler()]}
)
# 输出:
# > Entering new AgentExecutor chain...
# Thought: ...
# Action: search
# Observation: ...

模块交互时序图

时序图1: Agent执行器完整生命周期

sequenceDiagram
    autonumber
    participant User as 用户/应用
    participant Executor as AgentExecutor
    participant Agent as Agent Runnable
    participant Prompt as ChatPromptTemplate
    participant LLM as Language Model
    participant Parser as OutputParser
    participant Tool as Tool Registry
    participant Callback as CallbackManager

    Note over User,Callback: 阶段1: 初始化(构造时)
    User->>Executor: AgentExecutor(agent, tools, max_iterations=15)
    Executor->>Executor: 构建name_to_tool_map<br/>初始化回调管理器

    Note over User,Callback: 阶段2: 启动执行
    User->>Executor: invoke({"input": "帮我查询天气并总结"})
    Executor->>Callback: on_chain_start("AgentExecutor")
    Executor->>Executor: intermediate_steps = []<br/>iterations = 0

    Note over User,Callback: 阶段3: 第1轮决策 - 调用search工具
    Executor->>Agent: plan(intermediate_steps=[], input="...")
    Agent->>Prompt: format({<br/>  input: "...",<br/>  agent_scratchpad: ""<br/>})
    Prompt-->>Agent: formatted_messages
    Agent->>LLM: invoke(formatted_messages)
    Note right of LLM: 模型推理决策:<br/>需要先查询天气数据
    LLM-->>Agent: AIMessage(content="Thought: 需要查询天气<br/>Action: search<br/>Action Input: {query: '北京天气'}")
    Agent->>Parser: parse(llm_output)
    Parser-->>Agent: AgentAction(tool="search", tool_input={"query":"北京天气"})
    Agent-->>Executor: AgentAction

    Executor->>Callback: on_agent_action(AgentAction)
    Executor->>Tool: lookup_tool("search")
    Tool-->>Executor: search_tool
    Executor->>Callback: on_tool_start("search", {"query":"北京天气"})
    Executor->>Tool: search_tool.invoke({"query":"北京天气"})
    Note right of Tool: 执行搜索:<br/>调用天气API
    Tool-->>Executor: observation = "北京今日晴,25°C"
    Executor->>Callback: on_tool_end(observation)

    Executor->>Executor: intermediate_steps.append(<br/>  (AgentAction, observation)<br/>)
    Executor->>Executor: iterations = 1

    Note over User,Callback: 阶段4: 第2轮决策 - 调用summarize工具
    Executor->>Agent: plan(intermediate_steps=[step1], input="...")
    Agent->>Prompt: format({<br/>  input: "...",<br/>  agent_scratchpad: "Action: search\nObservation: 北京今日晴,25°C"<br/>})
    Prompt-->>Agent: formatted_messages
    Agent->>LLM: invoke(formatted_messages)
    Note right of LLM: 模型根据观察结果<br/>决定总结
    LLM-->>Agent: AIMessage(content="Thought: 现在总结天气信息<br/>Action: summarize<br/>Action Input: {text: '北京今日晴,25°C'}")
    Agent->>Parser: parse(llm_output)
    Parser-->>Agent: AgentAction(tool="summarize", tool_input={"text":"..."})
    Agent-->>Executor: AgentAction

    Executor->>Callback: on_agent_action(AgentAction)
    Executor->>Tool: lookup_tool("summarize")
    Tool-->>Executor: summarize_tool
    Executor->>Callback: on_tool_start("summarize", {...})
    Executor->>Tool: summarize_tool.invoke({"text":"..."})
    Note right of Tool: 执行总结:<br/>调用LLM生成摘要
    Tool-->>Executor: observation = "今天北京天气晴朗,气温适宜..."
    Executor->>Callback: on_tool_end(observation)

    Executor->>Executor: intermediate_steps.append(<br/>  (AgentAction, observation)<br/>)
    Executor->>Executor: iterations = 2

    Note over User,Callback: 阶段5: 第3轮决策 - 完成任务
    Executor->>Agent: plan(intermediate_steps=[step1,step2], input="...")
    Agent->>Prompt: format({<br/>  input: "...",<br/>  agent_scratchpad: "...(所有历史步骤)"<br/>})
    Prompt-->>Agent: formatted_messages
    Agent->>LLM: invoke(formatted_messages)
    Note right of LLM: 模型判断<br/>信息已足够,返回最终答案
    LLM-->>Agent: AIMessage(content="Thought: 已完成任务<br/>Final Answer: 今天北京天气晴朗...")
    Agent->>Parser: parse(llm_output)
    Parser-->>Agent: AgentFinish(return_values={"output":"今天北京天气晴朗..."})
    Agent-->>Executor: AgentFinish

    Executor->>Callback: on_agent_finish(AgentFinish)
    Executor->>Executor: 构建final_output<br/>可选添加intermediate_steps
    Executor->>Callback: on_chain_end(final_output)
    Executor-->>User: {"output": "今天北京天气晴朗,气温适宜..."}

时序图详解

阶段1: 初始化(第1-2步)

步骤1: 用户创建AgentExecutor

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

# 创建Agent
agent = create_openai_functions_agent(ChatOpenAI(), tools, prompt)

# 创建执行器
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=15,  # 最大迭代次数
    max_execution_time=300,  # 超时时间(秒)
    handle_parsing_errors=True,  # 处理解析错误
    return_intermediate_steps=False,  # 是否返回中间步骤
)

步骤2: AgentExecutor内部初始化

  • 构建name_to_tool_map: {"search": search_tool, "summarize": summarize_tool}
  • 验证Agent和Tools的兼容性
  • 初始化回调管理器

关键代码:

# libs/langchain/langchain_classic/agents/agent.py
class AgentExecutor(Chain):
    def __init__(self, agent, tools, **kwargs):
        super().__init__(**kwargs)
        self.agent = agent
        self.tools = tools
        # 构建工具映射
        self.name_to_tool_map = {tool.name: tool for tool in tools}

阶段2: 启动执行(第3-5步)

步骤3: 用户调用invoke

result = agent_executor.invoke({
    "input": "帮我查询北京天气并总结"
})

步骤4: 触发on_chain_start回调

  • 记录执行开始时间
  • 打印debug信息(如果verbose=True)
  • 通知LangSmith/LangFuse等追踪系统

步骤5: 初始化循环变量

intermediate_steps: List[Tuple[AgentAction, str]] = []
iterations = 0
time_elapsed = 0.0
start_time = time.time()

阶段3: 第1轮决策 - 工具调用(第6-20步)

步骤6-7: Agent决策入口

# AgentExecutor._call()
output = self._action_agent.plan(
    intermediate_steps,  # 当前为空[]
    callbacks=run_manager.get_child(),
    **inputs  # {"input": "帮我查询北京天气并总结"}
)

步骤8-9: 格式化prompt

# Agent Runnable处理(以ReAct为例)
# libs/langchain/langchain_classic/agents/react/agent.py
agent_runnable = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
    )
    | prompt
    | llm_with_stop
    | output_parser
)

# 第1轮时 agent_scratchpad = "" (因为intermediate_steps为空)

步骤10-11: 调用LLM

# LLM接收到的prompt(ReAct格式):
"""
Answer the following questions as best you can. You have access to the following tools:

search: 搜索互联网获取信息
summarize: 总结文本内容

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [search, summarize]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 帮我查询北京天气并总结
Thought:
"""

# LLM输出:
"""
Thought: 我需要先查询北京的天气信息
Action: search
Action Input: {"query": "北京天气"}
"""

步骤12-14: 解析LLM输出

# libs/langchain/langchain_classic/agents/output_parsers/react_single_input.py
class ReActSingleInputOutputParser(AgentOutputParser):
    def parse(self, text: str) -> AgentAction | AgentFinish:
        # 查找 "Action:" 和 "Action Input:"
        if "Final Answer:" in text:
            return AgentFinish(...)
        else:
            action = extract_action(text)  # "search"
            action_input = extract_action_input(text)  # {"query":"北京天气"}
            return AgentAction(
                tool=action,
                tool_input=action_input,
                log=text  # 完整LLM输出
            )

步骤15: 返回AgentAction到Executor

步骤16-19: 执行工具

# AgentExecutor._perform_agent_action()
if agent_action.tool in name_to_tool_map:
    tool = name_to_tool_map[agent_action.tool]  # search_tool

    # 触发on_tool_start回调
    run_manager.on_tool_start(...)

    # 执行工具
    observation = tool.run(
        agent_action.tool_input,  # {"query":"北京天气"}
        verbose=self.verbose,
        callbacks=run_manager.get_child()
    )
    # observation = "北京今日晴,25°C,空气质量良好"

    # 触发on_tool_end回调
    run_manager.on_tool_end(observation)

步骤20-21: 记录执行步骤

step = AgentStep(
    action=agent_action,
    observation=observation
)
intermediate_steps.append((agent_action, observation))
iterations += 1  # iterations = 1

阶段4: 第2轮决策 - 再次工具调用(第22-35步)

关键差异: agent_scratchpad现在包含第1轮的历史

# 步骤23: 格式化agent_scratchpad
agent_scratchpad = """
Action: search
Action Input: {"query": "北京天气"}
Observation: 北京今日晴,25°C,空气质量良好
"""

# LLM接收到的完整prompt:
"""
Question: 帮我查询北京天气并总结
Thought:
Action: search
Action Input: {"query": "北京天气"}
Observation: 北京今日晴,25°C,空气质量良好
Thought:
"""

# LLM输出(步骤26):
"""
Thought: 现在我有了天气数据,需要总结一下
Action: summarize
Action Input: {"text": "北京今日晴,25°C,空气质量良好"}
"""

步骤27-35: 重复工具执行流程

  • 解析为AgentAction(tool="summarize")
  • 查找summarize_tool
  • 执行总结工具
  • 记录第2个AgentStep
  • iterations = 2

阶段5: 第3轮决策 - 完成任务(第36-45步)

步骤37: agent_scratchpad包含前2轮历史

agent_scratchpad = """
Action: search
Action Input: {"query": "北京天气"}
Observation: 北京今日晴,25°C,空气质量良好
Action: summarize
Action Input: {"text": "北京今日晴,25°C,空气质量良好"}
Observation: 今天北京天气晴朗,气温适宜,适合户外活动
"""

步骤40: LLM判断任务已完成

# LLM输出:
"""
Thought: 我已经查询并总结了北京天气,可以给出最终答案了
Final Answer: 今天北京天气晴朗,气温25°C,空气质量良好,适合户外活动
"""

步骤42: 解析为AgentFinish

parser.parse(llm_output)
# 检测到 "Final Answer:"
return AgentFinish(
    return_values={"output": "今天北京天气晴朗,气温25°C..."},
    log=llm_output
)

步骤43-45: 结束循环

# AgentExecutor._call()
if isinstance(output, AgentFinish):
    run_manager.on_agent_finish(output)

    final_output = output.return_values
    if self.return_intermediate_steps:
        final_output["intermediate_steps"] = intermediate_steps

    return final_output

边界条件处理

1. 最大迭代次数限制

# AgentExecutor._should_continue()
def _should_continue(self, iterations: int, time_elapsed: float) -> bool:
    if self.max_iterations is not None and iterations >= self.max_iterations:
        return False
    if self.max_execution_time is not None and time_elapsed >= self.max_execution_time:
        return False
    return True

# 如果达到限制
if not self._should_continue(iterations, time_elapsed):
    if self.early_stopping_method == "force":
        # 强制返回
        return AgentFinish(
            return_values={"output": "Agent stopped due to iteration limit"},
            log=""
        )
    elif self.early_stopping_method == "generate":
        # 让LLM基于当前信息生成最终答案
        final_output = self._action_agent.return_stopped_response(...)
        return final_output

2. 解析错误处理

try:
    output = self._action_agent.plan(intermediate_steps, **inputs)
except OutputParserException as e:
    # handle_parsing_errors 可为 bool / str / Callable,决定反馈给Agent的observation内容
    if self.handle_parsing_errors is True:
        observation = "Invalid or incomplete response"
    elif isinstance(self.handle_parsing_errors, str):
        observation = self.handle_parsing_errors
    elif callable(self.handle_parsing_errors):
        observation = self.handle_parsing_errors(e)
    else:
        raise
    # 构造伪Action,将错误信息作为observation反馈给Agent下一轮决策
    output = AgentAction("_Exception", observation, str(e))
    yield AgentStep(action=output, observation=observation)

3. 工具执行失败

try:
    observation = tool.run(agent_action.tool_input)
except Exception as e:
    # 将异常信息作为observation
    observation = f"Tool execution failed: {str(e)}"
    # Agent将在下一轮看到这个错误信息,可以选择重试或改用其他工具

性能要点

| 维度 | 影响因素 | 优化建议 |
| --- | --- | --- |
| 延迟 | 每轮迭代需要1次LLM调用 | 使用更快的模型(如GPT-3.5);减少不必要的迭代 |
| Token消耗 | agent_scratchpad随迭代增长 | 设置trim_intermediate_steps;总结历史步骤 |
| 并发 | 串行执行工具 | 使用MultiActionAgent并行调用多个工具 |
| 可靠性 | LLM可能输出无效格式 | 设置handle_parsing_errors=True;使用结构化输出 |
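
针对Token消耗一项,trim_intermediate_steps 除传入整数外,也可以传入自定义函数。下面是一个只保留最近3步的草图(trim_steps 为假设的辅助函数名,agent、tools 假设已创建):

# trim_intermediate_steps 的可调用形式(草图)
from langchain.agents import AgentExecutor
from langchain_core.agents import AgentAction

def trim_steps(
    steps: list[tuple[AgentAction, str]],
) -> list[tuple[AgentAction, str]]:
    # 历史步骤过多时,只把最近3步拼进agent_scratchpad,避免prompt无限增长
    return steps[-3:]

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    trim_intermediate_steps=trim_steps,
)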

异常流与回退

stateDiagram-v2
    [*] --> Plan决策
    Plan决策 --> 解析成功: LLM输出合法
    Plan决策 --> 解析失败: LLM输出非法

    解析成功 --> AgentAction: 非Final Answer
    解析成功 --> AgentFinish: Final Answer

    解析失败 --> 错误处理: handle_parsing_errors=True
    解析失败 --> 抛出异常: handle_parsing_errors=False

    错误处理 --> AgentAction: 构造_Exception Action
    AgentAction --> 工具执行

    工具执行 --> 执行成功: Tool正常返回
    工具执行 --> 执行失败: Tool抛出异常

    执行成功 --> 记录步骤: AgentStep
    执行失败 --> 记录步骤: AgentStep(observation=error)

    记录步骤 --> 循环判断
    循环判断 --> Plan决策: iterations < max
    循环判断 --> 强制停止: iterations >= max

    AgentFinish --> [*]
    强制停止 --> [*]
    抛出异常 --> [*]

时序图2: ReAct vs OpenAI Functions Agent对比

sequenceDiagram
    autonumber
    participant User
    participant Executor

    box lightblue ReAct Agent
    participant ReactAgent as ReAct Agent
    participant ReactParser as ReActParser
    end

    box lightgreen OpenAI Functions Agent
    participant OpenAIAgent as OpenAI Agent
    participant OpenAIParser as FunctionParser
    end

    participant LLM as Language Model
    participant Tool

    Note over User,Tool: 场景1: ReAct Agent - 基于文本解析
    User->>Executor: invoke({"input": "查询天气"})
    Executor->>ReactAgent: plan(intermediate_steps=[])

    ReactAgent->>ReactAgent: 构建ReAct Prompt:<br/>"Thought: ...<br/>Action: ...<br/>Action Input: ..."
    ReactAgent->>LLM: invoke(prompt + stop=["\\nObservation"])
    LLM-->>ReactAgent: "Thought: 需要查询天气<br/>Action: search<br/>Action Input: {'query':'天气'}"

    ReactAgent->>ReactParser: parse(text)
    ReactParser->>ReactParser: 正则提取 Action 和 Action Input
    ReactParser-->>ReactAgent: AgentAction(tool="search", tool_input={'query':'天气'})
    ReactAgent-->>Executor: AgentAction

    Executor->>Tool: search.invoke({'query':'天气'})
    Tool-->>Executor: "今天晴天25°C"
    Executor->>Executor: intermediate_steps.append(...)

    Executor->>ReactAgent: plan(intermediate_steps=[step1])
    ReactAgent->>ReactAgent: 格式化scratchpad:<br/>"Action: search<br/>Observation: 今天晴天25°C"
    ReactAgent->>LLM: invoke(prompt + scratchpad)
    LLM-->>ReactAgent: "Thought: 信息足够<br/>Final Answer: 今天晴天25°C"
    ReactAgent->>ReactParser: parse(text)
    ReactParser-->>ReactAgent: AgentFinish(output="今天晴天25°C")
    ReactAgent-->>Executor: AgentFinish
    Executor-->>User: {"output": "今天晴天25°C"}

    Note over User,Tool: 场景2: OpenAI Functions Agent - 原生Function Calling
    User->>Executor: invoke({"input": "查询天气"})
    Executor->>OpenAIAgent: plan(intermediate_steps=[])

    OpenAIAgent->>OpenAIAgent: 构建Messages + functions schema:<br/>[{"name":"search", "parameters":{...}}]
    OpenAIAgent->>LLM: invoke(messages, functions=[...])
    Note right of LLM: OpenAI/Anthropic原生支持<br/>返回结构化tool_calls
    LLM-->>OpenAIAgent: AIMessage(tool_calls=[{<br/>  "id": "call_123",<br/>  "function": {"name":"search", "arguments":"{\"query\":\"天气\"}"}}<br/>])

    OpenAIAgent->>OpenAIParser: parse(ai_message)
    OpenAIParser->>OpenAIParser: 提取tool_calls字段
    OpenAIParser-->>OpenAIAgent: AgentActionMessageLog(tool="search", ...)
    OpenAIAgent-->>Executor: AgentAction

    Executor->>Tool: search.invoke({'query':'天气'})
    Tool-->>Executor: "今天晴天25°C"
    Executor->>Executor: intermediate_steps.append(...)

    Executor->>OpenAIAgent: plan(intermediate_steps=[step1])
    OpenAIAgent->>OpenAIAgent: 格式化为消息历史:<br/>AIMessage(tool_calls=[...])<br/>ToolMessage(content="今天晴天25°C")
    OpenAIAgent->>LLM: invoke(messages + history)
    LLM-->>OpenAIAgent: AIMessage(content="今天晴天25°C", tool_calls=None)
    Note right of LLM: tool_calls为空表示完成
    OpenAIAgent->>OpenAIParser: parse(ai_message)
    OpenAIParser-->>OpenAIAgent: AgentFinish(output="今天晴天25°C")
    OpenAIAgent-->>Executor: AgentFinish
    Executor-->>User: {"output": "今天晴天25°C"}

两种Agent类型对比

| 维度 | ReAct Agent | OpenAI Functions Agent |
| --- | --- | --- |
| 决策机制 | 文本解析(正则表达式) | 原生Function Calling API |
| Prompt格式 | 固定模板(Thought/Action/Observation) | 系统消息 + functions schema |
| LLM输出 | 自由文本(需严格遵循格式) | 结构化tool_calls字段 |
| 解析可靠性 | ⚠️ 依赖LLM输出格式 | ✅ 高可靠性(结构化) |
| 模型支持 | ✅ 通用(任何LLM) | ⚠️ 限OpenAI/Anthropic/部分开源模型 |
| Stop Sequence | 需设置["\nObservation"] | 不需要(API层面控制) |
| Token效率 | ⚠️ Prompt模板较长 | ✅ 更紧凑 |
| 调试难度 | ⚠️ 文本格式易出错 | ✅ 结构化易调试 |
| 扩展性 | ⚠️ 修改Prompt和Parser | ✅ 修改functions schema |

ReAct Agent关键代码

Prompt模板:

# libs/langchain/langchain_classic/agents/react/agent.py
template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''

格式化intermediate_steps:

# libs/langchain/langchain_classic/agents/format_scratchpad.py
def format_log_to_str(intermediate_steps: List[Tuple[AgentAction, str]]) -> str:
    """Format intermediate steps as text."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log  # 包含 "Thought: ...\nAction: ...\nAction Input: ..."
        thoughts += f"\nObservation: {observation}\n"
        thoughts += "Thought: "  # 为下一轮Thought留空
    return thoughts

输出解析:

# libs/langchain/langchain_classic/agents/output_parsers/react_single_input.py
class ReActSingleInputOutputParser(AgentOutputParser):
    def parse(self, text: str) -> AgentAction | AgentFinish:
        # 检查是否完成
        if "Final Answer:" in text:
            return AgentFinish(
                return_values={"output": text.split("Final Answer:")[-1].strip()},
                log=text,
            )

        # 提取Action和Action Input
        action_match = re.search(r"Action\s*\d*\s*:(.*?)\n", text, re.DOTALL)
        action_input_match = re.search(
            r"Action\s*\d*\s*Input\s*\d*\s*:(.*?)($|\n)", text, re.DOTALL
        )

        if not action_match or not action_input_match:
            raise OutputParserException(f"Could not parse LLM output: `{text}`")

        action = action_match.group(1).strip()
        action_input = action_input_match.group(1).strip()

        # 尝试解析为JSON
        try:
            action_input = json.loads(action_input)
        except json.JSONDecodeError:
            pass  # 保持字符串

        return AgentAction(tool=action, tool_input=action_input, log=text)

Runnable链:

# libs/langchain/langchain_classic/agents/react/agent.py
def create_react_agent(llm, tools, prompt):
    prompt = prompt.partial(
        tools=render_text_description(tools),
        tool_names=", ".join([t.name for t in tools]),
    )

    llm_with_stop = llm.bind(stop=["\nObservation"])

    return (
        RunnablePassthrough.assign(
            agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"])
        )
        | prompt
        | llm_with_stop
        | ReActSingleInputOutputParser()
    )

OpenAI Functions Agent关键代码

Functions Schema转换:

# libs/langchain_core/tools/convert.py
def convert_to_openai_function(tool: BaseTool) -> dict:
    """Convert Tool to OpenAI function format."""
    return {
        "name": tool.name,
        "description": tool.description,
        "parameters": {
            "type": "object",
            "properties": {
                # 从tool.args_schema生成JSON Schema
                param_name: {
                    "type": param_type,
                    "description": param_desc,
                }
                for param_name, param_type, param_desc in tool.args
            },
            "required": tool.required_params,
        },
    }

格式化intermediate_steps为消息:

# libs/langchain/langchain_classic/agents/format_scratchpad/openai_functions.py
def format_to_openai_function_messages(
    intermediate_steps: List[Tuple[AgentAction, str]]
) -> List[BaseMessage]:
    """Convert intermediate steps to message format."""
    messages = []
    for action, observation in intermediate_steps:
        # 如果action包含message_log,使用原始消息
        if isinstance(action, AgentActionMessageLog):
            messages.extend(action.message_log)
        else:
            # 否则构造AIMessage
            messages.append(
                AIMessage(
                    content="",
                    additional_kwargs={
                        "function_call": {
                            "name": action.tool,
                            "arguments": json.dumps(action.tool_input),
                        }
                    },
                )
            )

        # 添加工具结果消息
        messages.append(
            FunctionMessage(
                name=action.tool,
                content=observation,
            )
        )

    return messages
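
一个简单的调用示意,展示单步历史如何被转换为消息序列(导入路径以经典 langchain 包为例,属于假设的使用方式):

# 使用示意:把一条 (AgentAction, observation) 历史格式化为消息
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain_core.agents import AgentAction

steps = [(
    AgentAction(tool="search", tool_input={"query": "北京天气"}, log="..."),
    "北京今日晴,25°C",
)]

messages = format_to_openai_function_messages(steps)
# 大致得到:[AIMessage(function_call=search...), FunctionMessage(name="search", content="北京今日晴,25°C")]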

输出解析:

# libs/langchain/langchain_classic/agents/output_parsers/openai_functions.py
class OpenAIFunctionsAgentOutputParser(AgentOutputParser):
    def parse(self, ai_message: AIMessage) -> AgentAction | AgentFinish:
        # 检查是否有function_call
        if "function_call" in ai_message.additional_kwargs:
            function_call = ai_message.additional_kwargs["function_call"]
            tool_name = function_call["name"]
            tool_input = json.loads(function_call["arguments"])

            return AgentActionMessageLog(
                tool=tool_name,
                tool_input=tool_input,
                log=str(function_call),
                message_log=[ai_message],  # 保存完整消息
            )

        # 检查新版tool_calls格式
        if "tool_calls" in ai_message.additional_kwargs:
            tool_calls = ai_message.additional_kwargs["tool_calls"]
            if tool_calls:
                tool_call = tool_calls[0]  # 单action agent只取第一个
                return AgentActionMessageLog(
                    tool=tool_call["function"]["name"],
                    tool_input=json.loads(tool_call["function"]["arguments"]),
                    log=str(tool_call),
                    message_log=[ai_message],
                )

        # 无function_call,返回AgentFinish
        return AgentFinish(
            return_values={"output": ai_message.content},
            log=ai_message.content,
        )

Runnable链:

# libs/langchain/langchain_classic/agents/openai_functions_agent/base.py
def create_openai_functions_agent(llm, tools, prompt):
    # 将tools转换为functions格式并绑定到LLM
    llm_with_tools = llm.bind(
        functions=[convert_to_openai_function(t) for t in tools]
    )

    return (
        RunnablePassthrough.assign(
            agent_scratchpad=lambda x: format_to_openai_function_messages(
                x["intermediate_steps"]
            )
        )
        | prompt
        | llm_with_tools
        | OpenAIFunctionsAgentOutputParser()
    )

时序图3: 流式输出Agent

sequenceDiagram
    autonumber
    participant User
    participant Executor as AgentExecutor
    participant Agent
    participant LLM
    participant Tool

    Note over User,Tool: 场景: 流式Agent执行(stream模式)
    User->>Executor: stream({"input": "查询天气"})

    loop 每个执行步骤
        Note over Executor,Agent: 步骤1: 第1轮决策
        Executor->>Agent: plan(intermediate_steps=[])
        Agent->>LLM: invoke(...)
        LLM-->>Agent: AgentAction(tool="search")
        Agent-->>Executor: AgentAction
        Executor-->>User: {"actions": [AgentAction], "messages": [...]}
        Note left of User: 实时接收Agent决策

        Note over Executor,Tool: 步骤2: 执行工具
        Executor->>Tool: search.invoke(...)
        Tool-->>Executor: observation
        Executor-->>User: {"steps": [AgentStep]}
        Note left of User: 实时接收工具结果

        Note over Executor,Agent: 步骤3: 第2轮决策
        Executor->>Agent: plan(intermediate_steps=[step1])
        Agent->>LLM: invoke(...)
        LLM-->>Agent: AgentFinish
        Agent-->>Executor: AgentFinish
        Executor-->>User: {"output": "最终答案"}
        Note left of User: 接收最终结果
    end

流式输出实现

使用stream方法:

from langchain.agents import AgentExecutor, create_openai_functions_agent

agent_executor = AgentExecutor(agent=agent, tools=tools)

# 流式执行
for chunk in agent_executor.stream({"input": "查询天气并总结"}):
    if "actions" in chunk:
        # Agent决策阶段
        for action in chunk["actions"]:
            print(f"🤔 决定调用工具: {action.tool}")
            print(f"📝 推理过程: {action.log}")

    if "steps" in chunk:
        # 工具执行阶段
        for step in chunk["steps"]:
            print(f"🔧 工具 {step.action.tool} 返回: {step.observation[:100]}...")

    if "output" in chunk:
        # 最终结果
        print(f"✅ 最终答案: {chunk['output']}")

使用iter迭代器:

# 迭代器模式:逐步执行
for step_output in agent_executor.iter({"input": "查询天气"}):
    if isinstance(step_output, AgentStep):
        print(f"Step: {step_output.action.tool} -> {step_output.observation}")
    elif isinstance(step_output, AgentFinish):
        print(f"Finish: {step_output.return_values}")

实现原理:

# libs/langchain/langchain_classic/agents/agent.py
class AgentExecutor(Chain):
    def _stream(
        self,
        inputs: dict[str, Any],
        run_manager: CallbackManagerForChainRun | None = None,
        **kwargs: Any,
    ) -> Iterator[dict[str, Any]]:
        """Stream output from agent execution."""
        intermediate_steps: list[tuple[AgentAction, str]] = []
        iterations = 0

        while self._should_continue(iterations, ...):
            # 决策
            output = self._action_agent.plan(intermediate_steps, **inputs)

            if isinstance(output, AgentFinish):
                # 流式返回最终结果
                yield {"output": output.return_values["output"]}
                return

            # 流式返回Agent决策
            actions = [output] if isinstance(output, AgentAction) else output
            yield {"actions": actions}

            # 执行工具
            steps = []
            for action in actions:
                step = self._perform_agent_action(..., action, ...)
                steps.append(step)
                intermediate_steps.append((action, step.observation))

            # 流式返回工具结果
            yield {"steps": steps}

            iterations += 1

流式输出的优势

| 维度 | 批量执行 | 流式执行 |
| --- | --- | --- |
| 用户体验 | ⏳ 等待所有步骤完成 | ✅ 实时看到进度 |
| 可观测性 | ⚠️ 黑盒执行 | ✅ 透明执行过程 |
| 调试 | ⚠️ 失败后才知道问题 | ✅ 实时发现异常 |
| 中断能力 | ❌ 无法中途停止 | ✅ 可随时中断 |
| 内存占用 | ⚠️ 保存所有中间结果 | ✅ 按需生成 |
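
流式执行同样有异步版本(astream),适合在Web服务中边执行边向前端推送。下面是一个草图(假设 agent_executor 已创建,chunk 结构与上文 stream 一致):

# 异步流式执行草图:逐块接收 actions / steps / output
import asyncio

async def run_streaming(agent_executor, question: str) -> None:
    async for chunk in agent_executor.astream({"input": question}):
        if "actions" in chunk:
            for action in chunk["actions"]:
                print(f"决策: {action.tool}")
        elif "steps" in chunk:
            for step in chunk["steps"]:
                print(f"观察: {step.observation}")
        elif "output" in chunk:
            print(f"最终答案: {chunk['output']}")

# asyncio.run(run_streaming(agent_executor, "查询天气并总结"))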

核心数据结构

classDiagram
    class AgentAction {
        +tool: str
        +tool_input: dict|str
        +log: str
        +type: str
        +messages: list~BaseMessage~
    }

    class AgentActionMessageLog {
        +tool: str
        +tool_input: dict|str
        +log: str
        +message_log: list~BaseMessage~
    }

    class AgentFinish {
        +return_values: dict
        +log: str
        +type: str
        +messages: list~BaseMessage~
    }

    class AgentFinishMessageLog {
        +return_values: dict
        +log: str
        +message_log: list~BaseMessage~
    }

    class AgentStep {
        +action: AgentAction
        +observation: Any
    }

    AgentAction <|-- AgentActionMessageLog
    AgentFinish <|-- AgentFinishMessageLog
    AgentStep o-- AgentAction

数据结构说明

AgentAction字段

| 字段 | 类型 | 必填 | 说明 |
| --- | --- | --- | --- |
| tool | str | 是 | 要调用的工具名称 |
| tool_input | dict \| str | 是 | 工具的输入参数(推荐使用dict) |
| log | str | 是 | Agent的推理过程/思考链(完整LLM输出) |
| type | Literal["AgentAction"] | 否(默认值) | 类型标识,用于序列化 |
| messages | list[BaseMessage] | 否(只读属性) | 转换为消息格式 |

AgentFinish字段

| 字段 | 类型 | 必填 | 说明 |
| --- | --- | --- | --- |
| return_values | dict | 是 | 返回给用户的值,通常包含"output"键 |
| log | str | 是 | 最终的推理过程 |
| type | Literal["AgentFinish"] | 否(默认值) | 类型标识,用于序列化 |
| messages | list[BaseMessage] | 否(只读属性) | 完整的消息历史 |

AgentStep字段

| 字段 | 类型 | 说明 |
| --- | --- | --- |
| action | AgentAction | Agent的决策 |
| observation | Any | 工具执行的结果 |

核心API详解

API-1: AgentAction创建

基本信息

  • 名称: AgentAction
  • 类型: Pydantic数据类
  • 幂等性: 幂等

功能说明

表示Agent决定调用工具的决策。

请求结构体

tool: str  # 工具名称
tool_input: dict | str  # 工具参数
log: str  # 推理日志

入口函数与关键代码

class AgentAction(Serializable):
    """Agent决定调用工具的输出"""

    tool: str
    tool_input: Union[str, dict]
    log: str
    type: Literal["AgentAction"] = "AgentAction"

    @property
    def messages(self) -> list[BaseMessage]:
        """转换为消息格式(用于对话历史)"""
        if self.log:
            return [AIMessage(content=self.log)]
        return []

    def dict(self, **kwargs) -> dict:
        """序列化为字典"""
        return {
            "tool": self.tool,
            "tool_input": self.tool_input,
            "log": self.log,
        }

使用示例:

# 创建AgentAction
action = AgentAction(
    tool="search_web",
    tool_input={"query": "LangChain教程"},
    log="Thought: 需要搜索LangChain的最新信息\nAction: search_web"
)

print(action.tool)  # "search_web"
print(action.tool_input)  # {"query": "LangChain教程"}

# 转换为消息
messages = action.messages
# [AIMessage(content="Thought: 需要搜索...")]

API-2: AgentFinish创建

基本信息

  • 名称: AgentFinish
  • 类型: Pydantic数据类
  • 幂等性: 幂等

功能说明

表示Agent完成任务并返回最终结果。

请求结构体

return_values: dict  # 返回值
log: str  # 推理日志

入口函数与关键代码

class AgentFinish(Serializable):
    """Agent完成任务的输出"""

    return_values: dict
    log: str
    type: Literal["AgentFinish"] = "AgentFinish"

    @property
    def messages(self) -> list[BaseMessage]:
        """转换为消息格式"""
        if self.log:
            return [AIMessage(content=self.log)]
        return []

使用示例:

# 创建AgentFinish
finish = AgentFinish(
    return_values={"output": "LangChain是一个用于构建LLM应用的框架..."},
    log="Thought: 我已经收集到足够的信息\nFinal Answer: LangChain是..."
)

print(finish.return_values["output"])
# "LangChain是一个用于构建LLM应用的框架..."

# 检查是否完成
if isinstance(result, AgentFinish):
    print("Agent完成任务")
    return result.return_values["output"]

API-3: AgentStep组合

基本信息

  • 名称: AgentStep
  • 类型: Pydantic数据类(Serializable)
  • 幂等性: 幂等

功能说明

表示Agent执行的一个完整步骤(Action + Observation)。

请求结构体

action: AgentAction  # Agent决策
observation: Any  # 工具执行结果

入口函数与关键代码

class AgentStep(Serializable):
    """Agent执行步骤:一次决策及其对应的工具观察结果"""
    action: AgentAction
    observation: Any

使用示例:

# 创建执行步骤
action = AgentAction(
    tool="calculator",
    tool_input={"expression": "25 * 4"},
    log="Thought: 需要计算25乘以4"
)

# 执行工具(假设)
tool_result = 100

# 记录步骤
step = AgentStep(action=action, observation=tool_result)

print(step.action.tool)  # "calculator"
print(step.observation)  # 100

# 构建intermediate_steps列表(Agent的plan接口约定为 (AgentAction, observation) 元组列表)
intermediate_steps = []
intermediate_steps.append((step.action, step.observation))

# 在下一轮迭代中使用
agent_response = agent.invoke({
    "input": "用户问题",
    "intermediate_steps": intermediate_steps
})

典型使用场景

场景1: ReAct Agent循环

from langchain_core.agents import AgentAction, AgentFinish, AgentStep
from langchain_core.tools import tool

# 定义工具
@tool
def search_web(query: str) -> str:
    """搜索互联网"""
    return f"搜索结果: {query}"

@tool
def calculator(expression: str) -> float:
    """计算数学表达式(演示用,生产环境应避免直接eval)"""
    return eval(expression)

# 简化的Agent执行循环
def run_agent(agent, tools, user_input, max_iterations=10):
    intermediate_steps = []

    for i in range(max_iterations):
        # 1) Agent推理
        result = agent.invoke({
            "input": user_input,
            "intermediate_steps": intermediate_steps
        })

        # 2) 检查是否完成
        if isinstance(result, AgentFinish):
            return result.return_values["output"]

        # 3) 执行工具
        tool_name = result.tool
        tool_input = result.tool_input

        tool = next(t for t in tools if t.name == tool_name)
        observation = tool.invoke(tool_input)

        # 4) 记录步骤(以 (AgentAction, observation) 元组追加,与plan接口约定一致)
        intermediate_steps.append((result, observation))

    return "达到最大迭代次数"

# 使用
output = run_agent(my_agent, [search_web, calculator], "2024年奥运会在哪里举办?")

场景2: 流式Agent输出

from langchain_core.agents import AgentActionMessageLog, AgentFinishMessageLog

# 流式输出AgentAction
async def stream_agent_actions(agent, input):
    async for chunk in agent.astream({"input": input}):
        if isinstance(chunk, AgentActionMessageLog):
            print(f"🔧 调用工具: {chunk.tool}")
            print(f"📝 思考过程: {chunk.log}")

            # 显示消息历史
            for msg in chunk.message_log:
                print(f"  {msg.type}: {msg.content[:100]}...")

        elif isinstance(chunk, AgentFinishMessageLog):
            print(f"✅ 完成: {chunk.return_values['output']}")

            # 显示完整历史
            for msg in chunk.message_log:
                print(f"  {msg.type}: {msg.content[:100]}...")

# 使用
await stream_agent_actions(my_agent, "分析这个问题并给出答案")

场景3: 自定义Agent逻辑

from typing import Union

class CustomAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {t.name: t for t in tools}

    def invoke(self, inputs: dict) -> Union[AgentAction, AgentFinish]:
        """单步推理"""
        user_input = inputs["input"]
        intermediate_steps = inputs.get("intermediate_steps", [])

        # 构建prompt(包含历史步骤)
        prompt = self._build_prompt(user_input, intermediate_steps)

        # LLM推理
        response = self.llm.invoke(prompt)

        # 解析响应
        return self._parse_output(response.content)

    def _parse_output(self, text: str) -> Union[AgentAction, AgentFinish]:
        """解析LLM输出为AgentAction或AgentFinish"""
        if "Final Answer:" in text:
            # 任务完成
            answer = text.split("Final Answer:")[1].strip()
            return AgentFinish(
                return_values={"output": answer},
                log=text
            )
        else:
            # 需要调用工具
            # 简化的解析逻辑
            tool_name = self._extract_tool_name(text)
            tool_input = self._extract_tool_input(text)

            return AgentAction(
                tool=tool_name,
                tool_input=tool_input,
                log=text
            )

时序图

Agent执行循环

sequenceDiagram
    autonumber
    participant User
    participant Executor as Agent Executor
    participant Agent as Agent
    participant Tool

    User->>Executor: invoke("帮我计算2024*365")

    loop Agent循环
        Executor->>Agent: invoke({input, intermediate_steps})
        Agent->>Agent: LLM推理

        alt 返回AgentAction
            Agent-->>Executor: AgentAction(tool="calculator")
            Executor->>Tool: invoke("2024*365")
            Tool-->>Executor: 738760
            Executor->>Executor: 记录AgentStep
        else 返回AgentFinish
            Agent-->>Executor: AgentFinish(output="答案是738760")
            Executor-->>User: "答案是738760"
        end
    end

时序图说明:

图意: 展示Agent执行器如何循环调用Agent和工具,直到任务完成。

边界条件:

  • max_iterations限制最大迭代次数
  • 超时机制防止无限循环
  • 错误处理确保工具失败不会中断

性能要点:

  • 每次迭代需要一次LLM调用(成本较高)
  • intermediate_steps累积会增加prompt长度
  • 工具执行时间影响总体响应时间

最佳实践

1. 结构化工具输入

# 推荐: 使用字典格式
action = AgentAction(
    tool="send_email",
    tool_input={
        "to": "user@example.com",
        "subject": "Hello",
        "body": "Message content"
    },
    log="..."
)

# 不推荐: 字符串格式(难以解析)
action = AgentAction(
    tool="send_email",
    tool_input="to=user@example.com,subject=Hello,body=...",
    log="..."
)

2. 保留完整推理日志

# 推荐: 详细的log
action = AgentAction(
    tool="calculator",
    tool_input={"expression": "2024*365"},
    log="""Thought: 用户想知道2024年有多少天
I need to calculate 2024 multiplied by 365
Action: calculator
Action Input: {"expression": "2024*365"}"""
)

# 这样便于:
# - 调试Agent行为
# - 向用户展示推理过程
# - 分析Agent决策质量

3. 处理intermediate_steps

# 推荐: 限制历史步骤数量(避免prompt过长)
def trim_intermediate_steps(
    steps: list[tuple[AgentAction, str]], max_steps: int = 5
) -> list[tuple[AgentAction, str]]:
    """保留最近的N个步骤"""
    return steps[-max_steps:] if len(steps) > max_steps else steps

# 使用
intermediate_steps = trim_intermediate_steps(all_steps, max_steps=5)
result = agent.invoke({
    "input": user_input,
    "intermediate_steps": intermediate_steps
})