{"componentChunkName":"component---src-templates-post-template-js","path":"/posts/a-survey-of-agents","result":{"data":{"markdownRemark":{"id":"2c43fec5-500e-53d2-95ed-c1fd9798922c","html":"<p>These are my notes on the August 2023 paper <a href=\"https://arxiv.org/abs/2308.11432\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">A Survey on Large Language Model based Autonomous Agents</a>.</p>\n<p>GitHub: <a href=\"https://github.com/Paitesanshi/LLM-Agent-Survey\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">https://github.com/Paitesanshi/LLM-Agent-Survey</a></p>\n<h2 id=\"1-introduction\" style=\"position:relative;\"><a href=\"#1-introduction\" aria-label=\"1 introduction permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>1 Introduction</h2>\n<blockquote>\n<p>Specifically, we organize our survey based on three aspects including the construction, application, and evaluation of LLM-based autonomous agents.</p>\n</blockquote>\n<p>The survey is organized around the following three aspects.</p>\n<ul>\n<li>construction</li>\n<li>application</li>\n<li>evaluation</li>\n</ul>\n<blockquote>\n<p>For the agent construction, we present a unified framework composed of four components, that is, a profile module to represent agent attributes, a memory module to store historical information, a planning module to strategize future actions, and an action module to execute the planned decisions.</p>\n</blockquote>\n<p>For agent construction, the paper proposes a unified framework organized into four components.</p>\n<ul>\n<li>profile</li>\n<li>memory</li>\n<li>planning</li>\n<li>action</li>\n</ul>\n<h2 
id=\"2-llm-based-autonomous-agent-construction\" style=\"position:relative;\"><a href=\"#2-llm-based-autonomous-agent-construction\" aria-label=\"2 llm based autonomous agent construction permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2 LLM-based Autonomous Agent Construction</h2>\n<h3 id=\"21-agent-architecture-design\" style=\"position:relative;\"><a href=\"#21-agent-architecture-design\" aria-label=\"21 agent architecture design permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.1 Agent Architecture Design</h3>\n<h4 id=\"211-profiling-module\" style=\"position:relative;\"><a href=\"#211-profiling-module\" aria-label=\"211 profiling module permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 
13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.1.1 Profiling Module</h4>\n<p>There are three ways to create a profile.</p>\n<ul>\n<li>Handcrafted</li>\n<li>Generated by an LLM</li>\n<li>Based on real-world datasets</li>\n</ul>\n<h4 id=\"212-memory-module\" style=\"position:relative;\"><a href=\"#212-memory-module\" aria-label=\"212 memory module permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.1.2 Memory Module</h4>\n<p>For example, short-term memory corresponds to the context window, and long-term memory to vector storage.</p>\n<h4 id=\"213-planning-module\" style=\"position:relative;\"><a href=\"#213-planning-module\" aria-label=\"213 planning module permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.1.3 Planning Module</h4>\n<p>The paper organizes planning as follows.</p>\n<ul>\n<li>\n<p>Planning without feedback</p>\n<ul>\n<li>Subgoal decomposition</li>\n<li>Multi-path reasoning</li>\n<li>External planner</li>\n</ul>\n</li>\n<li>\n<p>Planning with feedback</p>\n<ul>\n<li>\n<p>Environmental feedback</p>\n<ul>\n<li>ReAct</li>\n</ul>\n</li>\n<li>Human feedback</li>\n<li>Model feedback</li>\n</ul>\n</li>\n</ul>\n<h4 
id=\"214-action-module\" style=\"position:relative;\"><a href=\"#214-action-module\" aria-label=\"214 action module permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>2.1.4 Action Module</h4>\n<p>I skipped this section.</p>\n<h2 id=\"3-llm-based-autonomous-agent-application\" style=\"position:relative;\"><a href=\"#3-llm-based-autonomous-agent-application\" aria-label=\"3 llm based autonomous agent application permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>3 LLM-based Autonomous Agent Application</h2>\n<p>This section summarizes the fields in which agents are applied.</p>\n<p>Broadly, there are three.</p>\n<ul>\n<li>Social science</li>\n<li>Natural science</li>\n<li>Engineering</li>\n</ul>\n<h2 id=\"4-llm-based-autonomous-agent-evaluation\" style=\"position:relative;\"><a href=\"#4-llm-based-autonomous-agent-evaluation\" aria-label=\"4 llm based autonomous agent evaluation permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 
3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>4 LLM-based Autonomous Agent Evaluation</h2>\n<h3 id=\"41-subjective-evaluation\" style=\"position:relative;\"><a href=\"#41-subjective-evaluation\" aria-label=\"41 subjective evaluation permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>4.1 Subjective Evaluation</h3>\n<blockquote>\n<p>LLM-based agents have a wide range of applications. However, in many scenarios, there lacks general metrics to evaluate the performance of agents. Some potential properties, like agent’s intelligence and user-friendliness, cannot be measured by quantitative metrics as well. 
Therefore, subjective evaluation is indispensable for current research.</p>\n</blockquote>\n<p>The paper argues that subjective evaluation is indispensable for current research, because general-purpose metrics are lacking and some properties cannot be measured quantitatively.</p>\n<blockquote>\n<p>Subjective evaluation refers to the testing of the capabilities of LLM-based agents by humans through various means such as interaction, scoring, and so on.</p>\n</blockquote>\n<p>Subjective evaluation means that humans assess the agent, for example by interacting with it or scoring its output.</p>\n<p>The Turing test is another example.</p>\n<p>LLMs can also be used to carry out subjective evaluation.</p>\n<p>Examples include EvaluatorGPT and ChatEval.</p>\n<h3 id=\"42-objective-evaluation\" style=\"position:relative;\"><a href=\"#42-objective-evaluation\" aria-label=\"42 objective evaluation permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>4.2 Objective Evaluation</h3>\n<p>There also appear to be various methods for objective evaluation.</p>\n<h2 id=\"6-challenges\" style=\"position:relative;\"><a href=\"#6-challenges\" aria-label=\"6 challenges permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>6 Challenges</h2>\n<h3 id=\"62-generalized-human-alignment\" style=\"position:relative;\"><a href=\"#62-generalized-human-alignment\" 
aria-label=\"62 generalized human alignment permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>6.2 Generalized Human Alignment</h3>\n<p>LLMs are usually aligned to act on correct human values, but the paper notes that this may be unsuitable for simulation use cases.</p>\n<h3 id=\"65-knowledge-boundary\" style=\"position:relative;\"><a href=\"#65-knowledge-boundary\" aria-label=\"65 knowledge boundary permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>6.5 Knowledge Boundary</h3>\n<p>For simulating humans, LLMs know far too much.</p>\n<p>This becomes a problem when simulating how someone makes decisions without that knowledge.</p>\n<h2 id=\"感想\" style=\"position:relative;\"><a href=\"#%E6%84%9F%E6%83%B3\" aria-label=\"感想 permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 
1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Impressions</h2>\n<p>I found it easy to follow how the paper organizes an agent's components into the following four.</p>\n<ul>\n<li>profile</li>\n<li>memory</li>\n<li>planning</li>\n<li>action</li>\n</ul>\n<p>The explicit statement that subjective evaluation is indispensable is easy to cite as a supporting reference.</p>\n<p>It is interesting that the paper points out, as challenges, that agents may fail to simulate humans realistically because they are too ethically aligned or know too much.</p>","fields":{"slug":"/posts/a-survey-of-agents","tagSlugs":["/tag/llm/","/tag/agent/"],"autoRecommendPosts":["llm-based-agents-survey","generative-agents","llm-patterns","mrkl-systems"]},"frontmatter":{"date":"2024-02-02T09:23:18.135Z","description":"Notes on the August 2023 paper \"A Survey on Large Language Model based Autonomous Agents\".","tags":["llm","agent"],"title":"Notes on \"A Survey on Large Language Model based Autonomous Agents\"","socialImage":null,"recommendPosts":null}}},"pageContext":{"slug":"/posts/a-survey-of-agents"}},"staticQueryHashes":["251939775","3942705351","401334301"]}