Original post: https://www.philschmid.de/agentic-pattern#reflection-pattern
AI Agents, Agentic AI, agentic architectures, agentic workflows, agentic patterns. Agents are everywhere. But what are they, and how do we build powerful, efficient agentic systems? Although the term "agent" is used broadly, its defining trait is the ability to dynamically plan and execute tasks, typically using external tools and memory to achieve complex goals.
This post explores common agentic design patterns. Think of them as blueprints or reusable templates for building AI applications. Understanding these patterns provides a mental model for tackling complex problems and for designing systems that are scalable, modular, and adaptable.
Before diving into the patterns, it is important to consider when an agentic approach is actually warranted.
Always favor the simplest solution that works. If you already know the exact steps required to solve a problem, a fixed workflow or even a plain script is often more efficient and reliable than an agent. Agentic systems typically trade higher latency and compute cost for potentially better performance on complex, ambiguous, or dynamic tasks, so always weigh whether that cost is worth it. For well-defined tasks with known steps, workflows give you predictability and consistency; use agents when you need flexibility, adaptability, and model-driven decision making. Simplicity still matters: even when building an agentic system, aim for the simplest effective design, since overly complex agents are hard to debug and manage. Agentic approaches also come with inherent unpredictability and potential for error, so agentic systems must include robust error logging, exception handling, and retry mechanisms that give the system (or the underlying LLM) a chance to self-correct; see the sketch below.
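As a rough illustration of that last point, here is a minimal retry/error-handling sketch, assuming the Gemini client used throughout the examples below; the backoff schedule and `max_retries` value are illustrative choices, not part of the original article.

```python
# Minimal retry/error-handling sketch (illustrative assumptions: the Gemini
# client from the examples below, exponential backoff, three attempts).
import logging
import os
import time
from google import genai

logging.basicConfig(level=logging.INFO)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def generate_with_retry(prompt: str, max_retries: int = 3) -> str:
    for attempt in range(1, max_retries + 1):
        try:
            response = client.models.generate_content(
                model="gemini-2.0-flash",
                contents=prompt
            )
            return response.text
        except Exception as exc:
            # Log the failure so it can be inspected later, then back off
            logging.warning("Attempt %d/%d failed: %s", attempt, max_retries, exc)
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)  # exponential backoff: 2s, 4s, ...
```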
With that in mind, we will look at three common workflow patterns and four agentic patterns. To keep the focus on core concepts, the examples use plain API calls, without relying on frameworks such as LangChain, LangGraph, LlamaIndex, or CrewAI.

The first workflow pattern is prompt chaining: the output of one LLM call feeds into the input of the next. The task is decomposed into a fixed sequence of steps, each handled by an LLM call that processes the previous call's output. It works well for tasks that can be cleanly broken into predictable, sequential subtasks.
Typical use cases:
- Generating structured documents: LLM 1 creates an outline, LLM 2 validates the outline against criteria, LLM 3 writes the content from the validated outline.
- Multi-step data processing: extract information, transform it, then summarize it.
- Newsletter generation: produce a newsletter from curated inputs.

```python
import os
from google import genai

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# --- Step 1: Summarize Text ---
original_text = "Large language models are powerful AI systems trained on vast amounts of text data. They can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way."
prompt1 = f"Summarize the following text in one sentence: {original_text}"

response1 = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=prompt1
)
summary = response1.text.strip()
print(f"Summary: {summary}")

# --- Step 2: Translate the Summary ---
prompt2 = f"Translate the following summary into French, only return the translation, no other text: {summary}"

response2 = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=prompt2
)
translation = response2.text.strip()
print(f"Translation: {translation}")
```

The second workflow pattern is routing: an initial LLM acts as a router, classifying the user input and directing it to the most appropriate specialized task or LLM. This pattern provides separation of concerns and lets you optimize each downstream task independently (with specialized prompts, different models, or dedicated tools). Using smaller models for simple tasks improves efficiency and can reduce cost. Once a task is routed, the selected agent takes over responsibility for completing it.
- Customer support systems: route queries to agents specialized in billing, technical support, or product information.
- Tiered LLM usage: route simple queries to faster, cheaper models (such as Llama 3.1 8B) and complex or unusual questions to more capable models (such as Gemini 1.5 Pro).
- Content generation: route requests for blog posts, social media updates, or ad copy to different specialized prompts/models.

```python
import os
import enum
from google import genai
from pydantic import BaseModel

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Define Routing Schema
class Category(enum.Enum):
    WEATHER = "weather"
    SCIENCE = "science"
    UNKNOWN = "unknown"

class RoutingDecision(BaseModel):
    category: Category
    reasoning: str

# Step 1: Route the Query
user_query = "What's the weather like in Paris?"
# user_query = "Explain quantum physics simply."
# user_query = "What is the capital of France?"

prompt_router = f"""Analyze the user query below and determine its category.
Categories:
- weather: For questions about weather conditions.
- science: For questions about science.
- unknown: If the category is unclear.

Query: {user_query}
"""

# Use client.models.generate_content with config for structured output
response_router = client.models.generate_content(
    model='gemini-2.0-flash-lite',
    contents=prompt_router,
    config={
        'response_mime_type': 'application/json',
        'response_schema': RoutingDecision,
    },
)
print(f"Routing Decision: Category={response_router.parsed.category}, Reasoning={response_router.parsed.reasoning}")

# Step 2: Handoff based on Routing
final_response = ""
if response_router.parsed.category == Category.WEATHER:
    weather_prompt = f"Provide a brief weather forecast for the location mentioned in: '{user_query}'"
    weather_response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=weather_prompt
    )
    final_response = weather_response.text
elif response_router.parsed.category == Category.SCIENCE:
    science_response = client.models.generate_content(
        model="gemini-2.5-flash-preview-04-17",
        contents=user_query
    )
    final_response = science_response.text
else:
    unknown_response = client.models.generate_content(
        model="gemini-2.0-flash-lite",
        contents=f"The user query is: {prompt_router}, but could not be answered. Here is the reasoning: {response_router.parsed.reasoning}. Write a helpful response to the user for him to try again."
    )
    final_response = unknown_response.text

print(f"\nFinal Response: {final_response}")
```

The third workflow pattern is parallelization: a task is broken into independent subtasks that are processed simultaneously by multiple LLMs, and the results are then aggregated. This pattern exploits concurrency. The initial query (or parts of it) is sent to multiple LLMs in parallel, each with its own prompt/goal. Once all branches complete, their individual results are collected and passed to a final aggregator LLM, which synthesizes them into the final response. This can improve latency when subtasks have no dependencies, or improve quality through techniques like majority voting or generating diverse options.
Typical use cases:
- RAG (Retrieval Augmented Generation) with query decomposition: break a complex query into sub-queries, run retrieval for each in parallel, then synthesize the results.
- Analyzing large documents: split the document into sections, summarize each in parallel, then merge the summaries.
- Generating multi-perspective content: ask multiple LLMs the same question with different persona prompts and aggregate their responses.
- Map-reduce style operations on data.

```python
import os
import asyncio
import time
from google import genai

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

async def generate_content(prompt: str) -> str:
    response = await client.aio.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )
    return response.text.strip()

async def parallel_tasks():
    # Define Parallel Tasks
    topic = "a friendly robot exploring a jungle"
    prompts = [
        f"Write a short, adventurous story idea about {topic}.",
        f"Write a short, funny story idea about {topic}.",
        f"Write a short, mysterious story idea about {topic}."
    ]

    # Run tasks concurrently and gather results
    start_time = time.time()
    tasks = [generate_content(prompt) for prompt in prompts]
    results = await asyncio.gather(*tasks)
    end_time = time.time()
    print(f"Time taken: {end_time - start_time} seconds")

    print("\n--- Individual Results ---")
    for i, result in enumerate(results):
        print(f"Result {i+1}: {result}\n")

    # Aggregate results and generate final story
    story_ideas = '\n'.join([f"Idea {i+1}: {result}" for i, result in enumerate(results)])
    aggregation_prompt = f"Combine the following three story ideas into a single, cohesive summary paragraph:\n{story_ideas}"
    aggregation_response = await client.aio.models.generate_content(
        model="gemini-2.5-flash-preview-04-17",
        contents=aggregation_prompt
    )
    return aggregation_response.text

# Run the async pipeline (use `await parallel_tasks()` directly in a notebook)
result = asyncio.run(parallel_tasks())
print(f"\n--- Aggregated Summary ---\n{result}")
```

The first agentic pattern is reflection: an agent evaluates its own output and uses that feedback to improve iteratively. Also known as the Evaluator-Optimizer pattern, it relies on a self-correction loop. An initial LLM generates a response or completes a task. A second LLM step (or even the same LLM with a different prompt) then acts as a reflector or evaluator, critically examining the initial output against the requirements or desired quality. This critique (the feedback) is fed back in, prompting the LLM to produce a refined output. The loop can repeat until the evaluator confirms the requirements are met or a satisfactory output is reached.
Typical use cases:
- Code generation: write code, execute it, and use error messages or test results as feedback to fix bugs.
- Writing and revision: generate a draft, reflect on its clarity and tone, then revise it.
- Complex problem solving: generate a plan, evaluate its feasibility, and refine it based on the evaluation.
- Information retrieval: search for information and use an evaluator LLM to check that all required details were found before presenting the answer.

```python
import os
import enum
from google import genai
from pydantic import BaseModel

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

class EvaluationStatus(enum.Enum):
    PASS = "PASS"
    FAIL = "FAIL"

class Evaluation(BaseModel):
    evaluation: EvaluationStatus
    feedback: str
    reasoning: str

# --- Initial Generation Function ---
def generate_poem(topic: str, feedback: str = None) -> str:
    prompt = f"Write a short, four-line poem about {topic}."
    if feedback:
        prompt += f"\nIncorporate this feedback: {feedback}"
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt
    )
    poem = response.text.strip()
    print(f"Generated Poem:\n{poem}")
    return poem

# --- Evaluation Function ---
def evaluate(poem: str) -> Evaluation:
    print("\n--- Evaluating Poem ---")
    prompt_critique = f"""Critique the following poem. Does it rhyme well? Is it exactly four lines?
Is it creative? Respond with PASS or FAIL and provide feedback.

Poem:
{poem}
"""
    response_critique = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt_critique,
        config={
            'response_mime_type': 'application/json',
            'response_schema': Evaluation,
        },
    )
    critique = response_critique.parsed
    print(f"Evaluation Status: {critique.evaluation}")
    print(f"Evaluation Feedback: {critique.feedback}")
    return critique

# Reflection Loop
max_iterations = 3
current_iteration = 0
topic = "a robot learning to paint"

# Simulated poem which will not pass the evaluation
current_poem = "With circuits humming, cold and bright,\nA metal hand now holds a brush"

# The source snippet is truncated here; the loop body below is reconstructed
# from the described behavior: evaluate, stop on PASS, otherwise regenerate
# with the evaluator's feedback.
while current_iteration < max_iterations:
    current_iteration += 1
    print(f"\n--- Iteration {current_iteration} ---")
    evaluation_result = evaluate(current_poem)
    if evaluation_result.evaluation == EvaluationStatus.PASS:
        print("\nFinal poem passed evaluation:")
        print(current_poem)
        break
    current_poem = generate_poem(topic, feedback=evaluation_result.feedback)
```

The second agentic pattern is tool use: the LLM is given the ability to call external functions or APIs to interact with the outside world, retrieve information, or perform actions. This pattern is commonly referred to as "function calling". The LLM is provided with definitions (name, description, input schema) of the available tools (functions, APIs, databases, and so on). Based on the user query, the LLM can decide to invoke one or more tools by generating structured output (e.g., JSON) that conforms to the required schema. That output is used to execute the actual external tool/function, and the result is returned to the LLM, which then uses it to compose the final response to the user. This dramatically extends LLM capabilities beyond their training data.
Typical use cases:
- Booking appointments with a calendar API.
- Retrieving real-time stock prices through a finance API.
- Searching a vector database for relevant documents (RAG).
- Controlling smart home devices.
- Executing code snippets.

```python
import os
from google import genai
from google.genai import types

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Define the function declaration for the model
weather_function = {
    "name": "get_current_temperature",
    "description": "Gets the current temperature for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. San Francisco",
            },
        },
        "required": ["location"],
    },
}

# Placeholder function to simulate an API call
def get_current_temperature(location: str) -> dict:
    return {"temperature": "15", "unit": "Celsius"}

# Use client.models.generate_content with model, contents, and config
tools = types.Tool(function_declarations=[weather_function])
contents = ["What's the temperature in London right now?"]
response = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=contents,
    config=types.GenerateContentConfig(tools=[tools])
)

# Process the Response (Check for Function Call)
response_part = response.candidates[0].content.parts[0]
if response_part.function_call:
    function_call = response_part.function_call
    print(f"Function to call: {function_call.name}")
    print(f"Arguments: {dict(function_call.args)}")

    # Execute the Function
    if function_call.name == "get_current_temperature":
        # Call the actual function with the model-generated arguments
        api_result = get_current_temperature(**function_call.args)
        # Append the function call and its execution result to the contents
        follow_up_contents = [
            types.Part(function_call=function_call),
            types.Part.from_function_response(
                name="get_current_temperature",
                response=api_result
            )
        ]
        # Generate the final response
        response_final = client.models.generate_content(
            model="gemini-2.0-flash",
            contents=contents + follow_up_contents,
            config=types.GenerateContentConfig(tools=[tools])
        )
        print(response_final.text)
    else:
        print(f"Error: Unknown function call requested: {function_call.name}")
else:
    print("No function call found in the response.")
    print(response.text)
```

The third agentic pattern is planning: a central planner LLM decomposes a complex task into a dynamic list of subtasks, which are then delegated to specialized worker agents (often tool-using) for execution. This pattern tackles complex problems that require multi-step reasoning by dynamically generating an initial plan. Subtasks are dispatched to "worker" agents, potentially in parallel if dependencies allow. An "orchestrator" or "synthesizer" LLM collects the workers' results, reflects on whether the overall goal has been met, and then either synthesizes the final output or initiates a replanning step if necessary. This reduces the cognitive load on any single LLM call, improves reasoning quality, reduces errors, and allows dynamic adjustment of the workflow. The key difference from routing is that the planner generates a multi-step plan rather than selecting a single next step.
Typical use cases:
- Complex software development tasks: break "build a feature" into planning, coding, testing, and documentation subtasks.
- Research and report generation: plan steps such as literature search, data extraction, analysis, and report writing.
- Multimodal tasks: plan steps involving image generation, text analysis, and data integration.
- Fulfilling complex user requests, e.g., "Plan a 3-day trip to Paris and book flights and hotels within my budget."

```python
import os
from google import genai
from pydantic import BaseModel, Field
from typing import List

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Define the Plan Schema
class Task(BaseModel):
    task_id: int
    description: str
    assigned_to: str = Field(description="Which worker type should handle this? E.g., Researcher, Writer, Coder")

class Plan(BaseModel):
    goal: str
    steps: List[Task]

# Step 1: Generate the Plan (Planner LLM)
user_goal = "Write a short blog post about the benefits of AI agents."
prompt_planner = f"""Create a step-by-step plan to achieve the following goal.
Assign each step to a hypothetical worker type (Researcher, Writer).

Goal: {user_goal}
"""

print(f"Goal: {user_goal}")
print("Generating plan...")
# Use a model capable of planning and structured output
response_plan = client.models.generate_content(
    model='gemini-2.5-pro-preview-03-25',
    contents=prompt_planner,
    config={
        'response_mime_type': 'application/json',
        'response_schema': Plan,
    },
)

# Step 2: Execute the Plan (Orchestrator/Workers - omitted for brevity; see sketch below)
for step in response_plan.parsed.steps:
    print(f"Step {step.task_id}: {step.description} (Assignee: {step.assigned_to})")
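The execution step is omitted above. As a rough illustration of how it could look, here is a minimal sketch continuing the snippet above, dispatching each planned step to a worker LLM call and synthesizing the results; the `execute_step` helper and its prompts are hypothetical additions, not part of the original example.

```python
# Hypothetical continuation of the planning example above. `execute_step` and
# its prompts are illustrative assumptions, not the original article's code.
def execute_step(task: Task, context: str) -> str:
    worker_prompt = (
        f"You are a {task.assigned_to}. Complete the following step.\n"
        f"Step: {task.description}\n"
        f"Results from previous steps:\n{context}"
    )
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=worker_prompt
    )
    return response.text

# Run the steps sequentially, feeding earlier results into later steps
context = ""
for step in response_plan.parsed.steps:
    result = execute_step(step, context)
    context += f"\nStep {step.task_id} ({step.assigned_to}): {result}"

# Synthesizer: combine the worker outputs into the final deliverable
synthesis = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=f"Combine the following step results into a final answer for the goal '{user_goal}':\n{context}"
)
print(synthesis.text)
```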
(Figure: coordinator/manager approach)
(Figure: swarm approach)
The fourth agentic pattern is multi-agent collaboration: multiple independent agents with distinct roles, personas, or areas of expertise work together toward a shared goal. This pattern uses autonomous or semi-autonomous agents. Each agent may have a unique role (e.g., project manager, coder, tester, critic), domain expertise, or access to specific tools. They interact and collaborate, often coordinated by a central "coordinator" or "manager" agent (like the project manager in the diagram), or through a handoff flow in which one agent passes control to another.
Typical use cases:
- Simulating debates or brainstorming sessions with different AI personas.
- Complex software creation involving planning, coding, testing, and deployment agents.
- Running virtual experiments or simulations with agents representing different actors.
- Collaborative writing or content creation processes.

Note: the following is a simplified example of the multi-agent pattern using handoffs and structured output. For real applications, take a look at LangGraph Multi-Agent Swarm or CrewAI.
```python
import os
from google import genai
from pydantic import BaseModel, Field

# Configure the client (ensure GEMINI_API_KEY is set in your environment)
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Define Structured Output Schemas
class Response(BaseModel):
    handoff: str = Field(default="", description="The name/role of the agent to hand off to. Available agents: 'Restaurant Agent', 'Hotel Agent'")
    message: str = Field(description="The response message to the user or context for the next agent")

# Agent Function
def run_agent(agent_name: str, system_prompt: str, prompt: str) -> Response:
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt,
        config={
            'system_instruction': f'You are {agent_name}. {system_prompt}',
            'response_mime_type': 'application/json',
            'response_schema': Response,
        }
    )
    return response.parsed

# Define System Prompts for the agents
hotel_system_prompt = "You are a Hotel Booking Agent. You ONLY handle hotel bookings. If the user asks about restaurants, flights, or anything else, respond with a short handoff message containing the original request and set the 'handoff' field to 'Restaurant Agent'. Otherwise, handle the hotel request and leave 'handoff' empty."
restaurant_system_prompt = "You are a Restaurant Booking Agent. You handle restaurant recommendations and bookings based on the user's request provided in the prompt."

# The prompt is about a restaurant, to force a handoff
initial_prompt = "Can you book me a table at an Italian restaurant for 2 people tonight?"
print(f"Initial User Request: {initial_prompt}")

# Run the first agent (Hotel Agent) to trigger the handoff logic
output = run_agent("Hotel Agent", hotel_system_prompt, initial_prompt)

# Simulate a user interaction to change the prompt and hand off
if output.handoff == "Restaurant Agent":
    print("Handoff Triggered: Hotel to Restaurant")
    output = run_agent("Restaurant Agent", restaurant_system_prompt, initial_prompt)
elif output.handoff == "Hotel Agent":
    print("Handoff Triggered: Restaurant to Hotel")
    output = run_agent("Hotel Agent", hotel_system_prompt, initial_prompt)

print(output.message)
```

It is important to remember that these patterns are flexible building blocks, not rigid rules. Real-world agentic systems often combine elements of several patterns: a planning agent may use tools, its workers may apply the reflection pattern, and a multi-agent system may use routing internally to distribute tasks.
For any LLM application, and especially for complex agentic systems, the key to success is empirical evaluation. Define metrics, measure performance, identify bottlenecks and failure points, and iterate on your design. Avoid over-engineering.
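To make that concrete, here is a minimal sketch of what such an evaluation loop could look like, assuming a handful of hand-written test cases and an LLM-as-judge scoring step; `run_system`, the test cases, and the judge prompt are illustrative placeholders, not part of the original article.

```python
# Minimal evaluation-loop sketch (illustrative, not from the original article):
# run the system over a few hand-written cases and score outputs with an LLM judge.
import os
from google import genai
from pydantic import BaseModel

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

class Score(BaseModel):
    passed: bool
    reason: str

# Hypothetical test cases: an input plus the criterion the output must satisfy
test_cases = [
    {"query": "What's the weather like in Paris?", "criterion": "Mentions Paris weather."},
    {"query": "Explain quantum physics simply.", "criterion": "Gives a simple, accurate explanation."},
]

def run_system(query: str) -> str:
    # Stand-in for whichever agentic system you are evaluating
    return client.models.generate_content(model="gemini-2.0-flash", contents=query).text

passed = 0
for case in test_cases:
    output = run_system(case["query"])
    judge_prompt = (
        f"Does the answer satisfy this criterion: {case['criterion']}\n"
        f"Answer:\n{output}"
    )
    verdict = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=judge_prompt,
        config={"response_mime_type": "application/json", "response_schema": Score},
    ).parsed
    passed += verdict.passed
    print(f"{case['query']!r}: {'PASS' if verdict.passed else 'FAIL'} ({verdict.reason})")

print(f"Pass rate: {passed}/{len(test_cases)}")
```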
Source: 架构即人生