[AgentSims] A Chinese-Built Stanford-Style AI Town: Framework Deep Dive

Original

siri

Published on 2023-10-01 23:52:27

Included in the column: siri's Development Journey (siri的开发之路)

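A note before we begin:

AgentSims is an open-source project, jointly released by several Chinese universities, modeled on the Stanford AI town. Readers unfamiliar with it can read the tutorial on the project's official website, or refer to my previous article, "A Chinese Open-Source Interactive AI Town: AgentSims Setup Guide".

This article analyzes the project's features, its architecture, and the flow of its key functions.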

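Setup progress update

Apart from the issues mentioned in my previous article and its comment section, namely the openai SDK failing to reach the network (fixed by switching to the official SDK and overriding the API base or configuring a proxy), some WebGL incompatibilities, and the client being hard-coded to access localhost:8000, I have not run into any other blocking problems. The one remaining oddity is that the page does not display in full, possibly because the client lacks dynamic screen adaptation, as shown below:

[Screenshot: the client page with part of the interface cut off]

Overall this does not affect the experience: Agents can be created and inspected normally, and conversations work as expected.

PS: Do not run experiments against a rate-limited GPT endpoint, as this can cause the program to fail. For example, a failed GPT call can leave character creation incomplete or make the client unresponsive. Resetting the environment as described in the tutorial resolves this.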

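Code framework overview

The project's client is built with Unity WebGL and its source code is not currently open, so this article does not go into the client implementation.

The client/server interaction flow is as follows:

[Figure: client/server interaction flow]

The backend game server keeps a persistent WebSocket connection with the client. After starting the game server, the developer also needs to run the tick.py / mayor.py script separately, which sends requests to the game server to drive the time loop.

Whenever an action takes place, the game server proactively pushes the changed state to the client (similar to state synchronization in a game server). The client only sends requests to the game server when the user operates the UI.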
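
The driving loop described above can be sketched roughly as follows. This is a minimal sketch and not the project's tick.py: the message fields (`uri`, `uid`, `data`), the `command.tick.Tick` value, and the `send` callback are all assumptions made for illustration, and the WebSocket transport is replaced by an injected coroutine so the example runs without a server.

```python
import asyncio
import json
from typing import Awaitable, Callable


def build_tick_message(uid: str) -> str:
    """Serialize one tick request (hypothetical field names, not the real protocol)."""
    return json.dumps({"uri": "command.tick.Tick", "uid": uid, "data": {}})


async def drive_ticks(send: Callable[[str], Awaitable[None]],
                      uid: str, count: int, interval: float = 0.0) -> None:
    """Send `count` tick requests, pausing `interval` seconds between them."""
    for _ in range(count):
        await send(build_tick_message(uid))
        await asyncio.sleep(interval)
```

In a real driver, `send` would write to the open WebSocket connection and `interval` would set how fast game time advances.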

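Concept highlights:

Tick mode and Mayor mode

Tick mode is the ordinary mode, in which each Agent acts freely according to its personal goal. According to the paper, Mayor mode lets one of the Agents (or the player) take the role of a "leader" who assigns tasks to the other Agents, such as recruiting employees or founding a company, to see whether they can cooperate to reach the final goal. The concept is very appealing in itself: it allows a more comprehensive evaluation of LLM capabilities and increases the player's involvement in the town. I am currently experimenting with the feasibility of Mayor mode; this article mainly covers the flow of Tick mode.

Money attribute

The money system is another interesting concept. It allows a better simulation of real-world environments and behavior, and tests how an LLM handles currency. The project assigns a money attribute to every Agent as well as to items and equipment. Agents can make purchases and can also earn income. In Mayor mode the money system should show even greater potential.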

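Architecture highlights:

Asynchronous and non-blocking

The project builds an asynchronous, high-performance Generative Agent framework. For example, it uses tornado, an asynchronous non-blocking Python web server, communicates over the WebSocket protocol, runs tasks concurrently with asyncio, and wraps the openai SDK in an asynchronous interface. Since LLM response latency is hard to control, going asynchronous is indeed a good solution.

PS: This is probably also one reason the project recommends Linux or macOS. tornado reaches its best performance on systems that support I/O multiplexing mechanisms such as epoll and kqueue, and is less well suited to Windows.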
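
To illustrate why the asynchronous approach helps when LLM latency is unpredictable, here is a minimal sketch using only asyncio. `fake_llm_call` is a hypothetical stand-in that sleeps instead of calling a model; it is not the project's wrapper around the openai SDK.

```python
import asyncio
from typing import Dict, List


async def fake_llm_call(agent: str, delay: float) -> str:
    """Stand-in for an async LLM request; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return f"{agent}: plan ready"


async def run_agents(delays: Dict[str, float]) -> List[str]:
    """Start one call per agent and wait for all of them concurrently."""
    calls = [fake_llm_call(name, delay) for name, delay in delays.items()]
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*calls)
```

Because the calls overlap, the total wait is close to the slowest single call rather than the sum of all calls.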

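Good extensibility

Dynamic route management (importlib.import_module is used to map the uri parameter of a request to the method to call), freely composable plan and memory modules, an abstracted LLM interface (making it easy to swap in other LLMs), buildings and items assembled from JSON, and so on. All of this is friendly to both researchers and developers.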
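
The dynamic routing idea can be sketched as follows. This is an illustration under assumptions rather than the project's router: the "module.ClassName" uri format, the `dispatch` function, and the `Echo` handler are hypothetical, and the handler module is registered in memory so the example runs without any files on disk.

```python
import importlib
import sys
import types


class Echo:
    """Toy handler; real handlers would live in files under a command package."""

    def execute(self, params: dict) -> dict:
        return {"echo": params}


# Register a throwaway module so import_module can find it by name.
_demo = types.ModuleType("demo_command")
_demo.Echo = Echo
sys.modules["demo_command"] = _demo


def dispatch(uri: str, params: dict) -> dict:
    """Resolve a 'module.ClassName' uri to a handler class and call execute()."""
    module_name, _, class_name = uri.rpartition(".")
    module = importlib.import_module(module_name)
    handler_cls = getattr(module, class_name)
    return handler_cls().execute(params)
```

With this pattern, adding a new interface only requires adding a new handler class; the dispatch code itself does not change.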

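Key code walkthrough

The command folder contains all externally exposed interfaces. Each interface class implements an execute method that serves as its handler. The key methods and interfaces are described below.

Register: creating an Agent

The system ships with preset Agent samples to choose from, and a default template is selected when the sandbox world is initialized.

The game UI also offers an option to create a custom Agent, where you can choose its appearance, name, LLM, memory system, planning system, ultimate goal, personality traits, and so on.

After you click create, the information is saved to MySQL, so it can be reused later and restored from the database if the system goes down.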

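Tick: the time loop

Tick asynchronously handles the use, chat, move, and init tasks. In flow order:

init task

When the sandbox world is initialized, two NPCs are created automatically by default, and their user_id values (uid for short) are added to a global inited set, marking them as being in the initial state.

Each uid is given its initial information, and the react method of the Actor class is called with the following information as the prompt parameters: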

info = {"source": "inited", "data": {
    "people": sight["people"],          # people within the field of view
    "equipments": sight["equipments"],  # buildings within the field of view
    "cash": entity_model.cash,          # the character's cash
    "game_time": self.app.last_game_time,
}}

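Inside the react method, the inited command triggers the Agent to draw up its current plan.

In the Stanford AI town, the template given to the LLM is roughly "I am xxx, my memory is xxx, please tell me what I should do today". In this project, the LLM actually plays three assistant roles. Role 1 extracts three questions from the Agent's identity, goal, and memory, each about how, or with whom, the Agent should accomplish its ultimate goal. Role 2 answers the questions raised by Role 1 and makes sure the answers include a plan for what to do in the short term. Role 3 extracts, from the questions and answers of Roles 1 and 2, what to do next and the destination. Below is an example of one complete run:

Role 1 (question):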

I will give you the following information: 

The game character's bio : Alan is a genius with outstanding talents and is the inventor of computer. Alan has an introverted personality and is only interested in the research he foucues on.
The game character's ultimate goal : Promoting the Process of Computer Research
The game character's Long-Term Memory:  {"people":{},"building":{}}
Buildings in the small town : ["dessert shop","gym","houseZ","park"]

   You must follow the following criteria: 
 1) You should just tell the questions you want to ask.
 2) no more than 50 words.
 3) Questions need to be explored based on curiosity.
 4) You are encouraged to ask something about buildings in the small town and information in your memory. Parts of the questions should be according to which he/she has done in the memory.
{"response":"1) Have you explored the potential of using computer technology in the dessert shop or the gym in your research?\n\n2) Have you considered the benefits of incorporating computer technology into the design of houses in your research?\n\n3) Have you investigated the potential applications of computer technology in improving processes in the park for your research?"}
You are a helpful assistant that help a game character in a small town to answer 3 questions about what and who he/she should do to realize his/her ultimate goal. 
The answers must be conformed to the long-term memory, the ultimate goal and the bio of the game character. 
Your knowledge level should not exceed that of a normal person with the bio of the character, unless there are relevant memories in his/her Long-Term Memory.

Role 2 (answer):

I will give you the following information: 

The game character's bio : Alan is a genius with outstanding talents and is the inventor of computer. Alan has an introverted personality and is only interested in the research he foucues on.
The game character's ultimate goal : Promoting the Process of Computer Research
The game character's Long-Term Memory:  {"people":{},"building":{}}
Buildings in the small town : ["dessert shop","gym","houseZ","park"]
The questions another assistant generated:
 {"response":"1) Have you explored the potential of using computer technology in the dessert shop or the gym in your research?\n\n2) Have you considered the benefits of incorporating computer technology into the design of houses in your research?\n\n3) Have you investigated the potential applications of computer technology in improving processes in the park for your research?"}

   You must follow the following criteria: 
 1) You should just tell the answers you made.
 2) no more than 50 words.

{"response":"1) No, Alan has not explored the potential of using computer technology in the dessert shop or the gym in his research.\n2) Yes, Alan should consider the benefits of incorporating computer technology into the design of houses in his research.\n3) No, Alan has not investigated the potential applications of computer technology in improving processes in the park for his research."}
You are a helpful assistant that help a game character in a small town to decide what he/she should plan to do and which building in the small town he/she should go to realize his/her ultimate goal based on a question and answer paragraph.

I did not ask you to generate an ultimate plan directly related to the goal in order to directly achieve it. On the contrary, you should only generate simple plans at the beginning. A simple plan means that based on the existing buildings in the town, it can be relatively easy to complete. For example, if there is a restaurant in a small town, eating is easy to complete, but it does not mean becoming a chef is easy to complete. However, if you already have an impression of a certain building, then you can generate more complex plans based on these memories.
Your plan should be based on the current time. You should reason based on the plan you just made.

Your knowledge level should not exceed that of a normal person with the bio of the character, unless there are relevant memories in his/her Long-Term Memory.

Role 3 (summary):

I will give you the following information:

The current time is : day 0 16:57
The plans that the game character has finished: []
The game character's bio : Alan is a genius with outstanding talents and is the inventor of computer. Alan has an introverted personality and is only interested in the research he foucues on.
The game character's ultimate goal : Promoting the Process of Computer Research
The game character's Long-Term Memory:  {"people":{},"building":{}}

Buildings in the small town : ["dessert shop","gym","houseZ","park"]

The questions and answers other assistants made:
{"response":"1) Have you explored the potential of using computer technology in the dessert shop or the gym in your research?\n\n2) Have you considered the benefits of incorporating computer technology into the design of houses in your research?\n\n3) Have you investigated the potential applications of computer technology in improving processes in the park for your research?"}
{"response":"1) No, Alan has not explored the potential of using computer technology in the dessert shop or the gym in his research.\n2) Yes, Alan should consider the benefits of incorporating computer technology into the design of houses in his research.\n3) No, Alan has not investigated the potential applications of computer technology in improving processes in the park for his research."}

You must follow the following criteria:
1) You should just choose a building you want to go to and tell me the purpose.
2) no more than 30 words.
3) the information of purpose should be less than 10 words
4) Tell me with JSON format as follows:
{"building": "...",
"purpose" : "..."
}
{"building":"houseZ","purpose":"Incorporating computer technology into house design in research."}
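
The three-role flow in the example above can be sketched as a chain of three calls, each feeding its output into the next. This is a minimal sketch under assumptions: `make_plan`, the shortened system prompts, and `stub_llm` (which returns canned replies instead of calling a model) are hypothetical and are not taken from the project's code.

```python
import json
from typing import Callable

LLM = Callable[[str, str], str]  # (system_prompt, user_prompt) -> reply text


def make_plan(llm: LLM, profile: str) -> dict:
    """Chain three calls: ask questions, answer them, then pick a building and purpose."""
    questions = llm("Ask 3 questions about how to reach the goal.", profile)
    answers = llm("Answer the 3 questions.", f"{profile}\nQuestions: {questions}")
    summary = llm(
        "Choose a building and a purpose. Reply in JSON.",
        f"{profile}\nQuestions: {questions}\nAnswers: {answers}",
    )
    return json.loads(summary)


def stub_llm(system_prompt: str, user_prompt: str) -> str:
    """Canned replies standing in for a real model (for illustration only)."""
    if system_prompt.startswith("Ask"):
        return "1) Which building helps the research?"
    if system_prompt.startswith("Answer"):
        return "1) houseZ is a quiet place to work."
    return '{"building": "houseZ", "purpose": "research at home"}'
```

Passing the model in as a plain callable keeps the chain independent of any particular LLM backend, in the same spirit as the abstracted LLM interface mentioned earlier.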

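The plan module is crucial to the consistency of an Agent's behavior. Compared with the Stanford AI town's plan module, each approach has its strengths:

The Stanford AI town breaks plans down top-down into long-term plans, daily plans, and hourly plans, which helps keep long-horizon behavior consistent.

AgentSims splits the planning stage across three roles. From a prompt-engineering perspective, this gives the LLM more time to think and decomposes the task into subtasks, which lets the LLM perform better. In addition, compared with the Stanford AI town's plan prompt shown below, the AgentSims prompt above imposes more rules and constraints (for example: do not produce a plan that achieves the ultimate goal directly, but break it into smaller goals; name one building and a purpose).

[Figure: the Stanford AI town plan prompt]

Although AgentSims was inspired by the Stanford AI town paper, the two projects released their open-source code at around the same time, and there is a lot in each codebase that the other could borrow.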

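move

After the plan has been made in the previous step, the Agent enters the moving state (before entering it, a pathfinding algorithm computes the route to the destination building, and the path is saved). On the next tick, a move task is created to handle Agents in this state.

An Agent in the moving state does not handle anything else until it reaches the destination building; only on arrival is the react method of the Actor class called to decide the next action (although within that single tick the Agent moves all the way from one place to the other).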
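
The article does not say which pathfinding algorithm the project uses. As an illustration only, the sketch below uses breadth-first search on a small grid, which finds a shortest path when every step costs the same. The grid encoding (0 walkable, 1 blocked) and the `find_path` name are assumptions.

```python
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def find_path(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Breadth-first search on a grid; 0 is walkable, 1 is blocked."""
    queue = deque([start])
    came_from = {start: None}  # maps each visited cell to the cell it was reached from
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None  # the goal cannot be reached
```

Saving the returned path once, as the text describes, means later ticks only need to consume it rather than search again.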

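Based on the Agent's plan, ultimate goal, surroundings, nearby people, and memory, the react method decides whether the next step is a chat or a use action. This step is also decided by the LLM through a prompt. The text is too long to reproduce here, so only a figure of the overall structure is shown:

[Figure: overall structure of the react prompt]

Continuing the example from the previous section, after the destination and its purpose are chosen, GPT returns the next action based on the prompt above (sit on a bench and think about networking strategies):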

"content": "{\"action\": \"use\",\n\"equipment\": \"bench\",\n\"operation\": \"sit and think about networking strategies\"}"

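A reply like the one above has to be decoded before the server can act on it. The following is a minimal sketch of that step, assuming the field names seen in this article's examples (`equipment` and `operation` for use, `person` and `topic` for chat); `parse_action` is a hypothetical helper, not a function from the project.

```python
import json


def parse_action(content: str) -> dict:
    """Decode the model's reply and check that it names a supported action."""
    action = json.loads(content)
    kind = action.get("action")
    if kind == "use":
        required = ("equipment", "operation")
    elif kind == "chat":
        required = ("person", "topic")
    else:
        raise ValueError(f"unsupported action: {kind!r}")
    missing = [key for key in required if key not in action]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return action
```

Validating the reply up front lets a malformed LLM answer be rejected or retried instead of failing later in the tick.
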
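A use action amounts to interacting with an "item" in the sandbox world. The use concept in the Stanford AI town is simpler, basically one item mapped to one way of using it. In this project, besides whether the item can be used, economic behavior is added as well, namely "purchase" and "income".

This is reflected in the prompt too: the LLM must decide not only whether the current item can be used, but also whether a purchase or an income is involved.

Each Agent keeps a real-time record of the money it holds, which determines whether it has purchasing power.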
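
The purchasing-power check can be sketched as a small bookkeeping class. This is a hypothetical illustration: the `Wallet` class and its method names are not from the project, and integer amounts are assumed.

```python
from dataclasses import dataclass


@dataclass
class Wallet:
    """Tracks an agent's cash (hypothetical class, for illustration only)."""
    cash: int

    def can_afford(self, price: int) -> bool:
        return price <= self.cash

    def buy(self, price: int) -> bool:
        """Deduct the price if affordable; return whether the purchase happened."""
        if not self.can_afford(price):
            return False
        self.cash -= price
        return True

    def earn(self, amount: int) -> None:
        """Add income to the balance."""
        self.cash += amount
```

Returning a boolean from `buy` lets the caller feed a failed purchase back into the next prompt rather than letting the balance go negative.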

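chat

If the Agent does not perform a use action, it chooses to chat. Below is an example prompt for a topic-driven chat. The chat module as a whole is fairly simple and leaves plenty of room for extension.

(The project's chat module currently has some problems. For example, an Agent may start a conversation with an object that is not an Agent, which makes the conversation impossible to carry on.)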

I will give you the following information: 

The game character's bio : pH is a positive, cheerful, optimistic but somewhat crazy girl who dares to try and explore. She loves food, loves life, and hopes to bring happiness to everyone.
The game character's ultimate goal : Taste all the delicious food and become a gourmet or chef.
The game character's Long-Term Memory:  {}
Buildings in the small town : ["dessert shop","gym","houseZ","park"]
The game character's plan : {"building":"dessert shop","purpose":"explore and taste different foods"}
The game character is going to : {"action":"chat","person":"shop owner","topic":"how to start a dessert shop"}
The name of whom the game character is chatting with : shop owner
The topic that the game character wants to talk about : how to start a dessert shop
The last content from the one you are talking to：[]

   You must follow the following criteria: 
 1) You should just tell the sentences you want to speak in the JSON format:
{"content":"pH : xxx"}
 2) If The last content from the one you are talking to is "None" or nothing, you must start a conversation politely about the topic.
 3) If The last content from the one you are talking to is not "None" or nothing, you must respond appropriately to the other person's words.
 4) no more than 30 words.

// Response from gpt3.5
{"content":"pH: Hi, shop owner! I heard you're an expert on desserts. I'm really interested in starting a dessert shop. Can you give me some advice?"}
You are a helpful assistant that help a game character in a small town to decide what to do immediately to finish his/her plan and realize his/her ultimate goal in the future.
You should also decide whether he/she can use the experience in the Long-Term Memory to finish this plan. It can be used only if there is exactly similar plan in the experience. For example, eating something cannot be regarded as similar one of cooking something.
The decision you make must be conformed to the long-term memory, the ultimate goal and the bio of the game character. 
Your knowledge level should not exceed that of a normal person with the bio of the character, unless there are relevant memories in his/her Long-Term Memory.
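
One possible guard against the chat issue noted earlier, where an Agent starts a conversation with something that is not an Agent (such as the "shop owner" in the example), is to check the target against the set of known Agents before the conversation begins. This is only a sketch of the idea: `resolve_chat_target` and the `agent_names` set are hypothetical and do not come from the project.

```python
from typing import Dict, Optional, Set


def resolve_chat_target(action: Dict[str, str], agent_names: Set[str]) -> Optional[str]:
    """Return the chat partner's name, or None when the target is not a known agent."""
    if action.get("action") != "chat":
        return None
    person = action.get("person")
    return person if person in agent_names else None
```

When the function returns None, the caller could fall back to asking the LLM for a different action instead of opening a conversation that can never be answered.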