Chinese Journal of Nature ›› 2026, Vol. 48 ›› Issue (2): 79-87. doi: 10.3969/j.issn.0253-9608.2026.02.001

• Invited Special Contribution •


From tool to personal assistant: The principles, evolution, and security risks of AI agents

CHENG Pengzhou, ZHANG Xinpeng   

  ① School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China; ② School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  • Received: 2026-03-16 Online: 2026-04-25 Published: 2026-04-16


Key words: artificial intelligence agent (AI Agent), tool calling, personal assistant, OpenClaw, security risks

Abstract: As one of the most transformative technological directions of 2025-2026, the AI agent is reshaping the boundaries of human-computer interaction and driving artificial intelligence's leap from passive response to active service. By building core modules for perception, planning, decision-making, and reflection, combined with tool-calling capabilities and hierarchical memory management, AI agents have acquired multi-step reasoning and environment-interaction abilities, becoming the core application form for deploying technology in the era of large models. A new generation of AI agent frameworks, represented by OpenClaw, has broken through the limitations of traditional intelligent tools with natural-language-driven automation of desktop environments, propelling intelligent systems through a paradigm shift from "tools" to "personal assistants" and showing a trend toward continuous service, personalized adaptation, and gradual evolution into "digital avatars" of their users. However, as AI agents gain greater autonomous decision-making authority and control over wider environments, their security risks have become increasingly prominent, including endogenous cognitive biases such as intent misunderstanding and perception hallucination, as well as external malicious threats such as prompt injection, privacy leakage, and backdoor attacks, making AI agents a new high-risk application form. This paper systematically reviews the development of AI agents from tool calling to intelligent personal assistants, analyzes their key principles and technical evolution, and discusses the security risks in their interaction mechanisms along with future research directions.
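The perception-planning-tool-calling-reflection loop the abstract describes can be illustrated with a minimal sketch. All names here (`plan`, `run_agent`, the toy `calculator` tool) are hypothetical, invented for illustration; they are not the authors' method or OpenClaw's actual API, and a real agent would query a large language model inside `plan` rather than use hard-coded rules.

```python
# Illustrative sketch of an AI-agent control loop (hypothetical names):
# the agent plans, calls a tool, records the step in memory, and reflects.

def calculator(expression: str) -> str:
    """A toy 'tool' the agent can call (restricted eval, no builtins)."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def plan(goal: str, memory: list) -> tuple:
    """Stand-in planner: a real agent would ask an LLM what to do next."""
    if not memory:
        return ("calculator", "2 + 3 * 4")   # decision: issue a tool call
    return ("finish", memory[-1]["result"])  # reflection: goal satisfied

def run_agent(goal: str, max_steps: int = 5):
    memory = []                              # short-term memory of past steps
    for _ in range(max_steps):
        action, arg = plan(goal, memory)     # planning / decision-making
        if action == "finish":
            return arg
        result = TOOLS[action](arg)          # tool calling / environment interaction
        memory.append({"tool": action, "arg": arg, "result": result})
    return None                              # step budget exhausted

print(run_agent("compute 2 + 3 * 4"))  # → 14
```

The step budget (`max_steps`) and the memory of prior tool results stand in, very roughly, for the autonomy limits and hierarchical memory management discussed in the paper; bounding the loop is also a simple guard against the runaway behavior that makes highly autonomous agents a high-risk application form.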