By Hussameddine al Attar | Staff Writer

Stanford researchers recently introduced generative agents: artificial intelligence agents that can simulate believable human behavior. A generative agent is built from three fundamental components: memory, reflection, and planning. Together, these components let an agent store and retrieve information, interpret past experiences to make higher-level inferences, and generate plans for future action and reaction.
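
At a high level, each agent cycles through those three components as the simulation advances. The sketch below is a rough illustration of that loop; the class and method names are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """One natural-language record of something the agent experienced."""
    description: str   # e.g. "Tom said Sam is running for mayor"
    created_at: float  # simulation time, in hours
    importance: float  # how significant the event is, on a 1-10 scale

@dataclass
class GenerativeAgent:
    name: str
    memories: list[Memory] = field(default_factory=list)

    def step(self, observation: str, now: float) -> str:
        """One tick of the loop: perceive, store, retrieve, reflect, act."""
        self.memories.append(Memory(observation, now, importance=5.0))
        relevant = self.retrieve(observation, now)  # memory
        self.reflect(now)                           # reflection
        return self.plan(relevant, now)             # planning

    # The three components are stubbed here; each is sketched in more
    # detail later in the article.
    def retrieve(self, query: str, now: float) -> list[Memory]:
        return self.memories[-5:]

    def reflect(self, now: float) -> None:
        pass

    def plan(self, context: list[Memory], now: float) -> str:
        return "continue current activity"
```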

To test the capabilities of the generative agents, the researchers placed 25 of them in a virtual world resembling a sandbox video game inspired by The Sims. Each agent was given a unique background and participated in a two-day simulation. In their paper, the researchers explain that in a traditional game environment, developers would have to explicitly and manually script the behavior of every character involved in, say, an in-game party. With generative agents, on the other hand, it is enough to tell one agent that it wants to throw a party. That agent will remember to invite the other agents, the invitees will spread the word, and the guests will remember to attend. The paper even mentions one agent asking another out on a date to the party. All of this followed from a single prompt to the first agent.

The research paper highlights the concept of information diffusion: the spread of information among agents through dialogue. In one case, an agent named Sam tells another agent, Tom, that he is running for mayor. Tom then discusses Sam’s candidacy with a third agent, John, and the news spreads throughout the town, with Sam’s campaign slowly gaining support.

The paper also discusses relationship memory in a particularly interesting scenario. Two agents, Sam and Latoya, who had never interacted before, bump into each other at a park and introduce themselves. Latoya tells Sam about a photography project she is working on. When the two meet again later, Sam remembers that conversation and asks Latoya how the project is going. The experiment showed that generative agents with no prior interaction or knowledge of each other can form relationships over time.

The results of the two-day experiment were remarkable: the agents displayed human-like behaviors such as remembering information, spreading news, and forming relationships over time.

The architecture behind these generative agents couples the GPT-3.5 Turbo large language model with a long-term memory module that records the agent’s experiences in natural language. This memory module is necessary because the context relevant to an agent’s behavior quickly grows too large to fit in a single prompt, and summarizing it would lead to generic, uninformative dialogue; specificity is necessary to accurately mimic human memory and behavior. A retrieval function weighs relevance, recency, and importance to extract the right information from the agent’s memory and direct its behavior in real time.
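
A back-of-the-envelope version of that retrieval scoring might look like the sketch below, which builds on the Memory record from earlier. The last_accessed and embedding fields are assumed additions, and while the exponential recency decay (factor 0.995 per game hour) and the equal weighting of the three factors follow the paper, the simple normalizations here are our own.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Relevance: how similar a memory's embedding is to the current query."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

def retrieval_score(memory, query_embedding: list[float], now: float) -> float:
    """Combine recency, importance, and relevance into one score."""
    recency = 0.995 ** (now - memory.last_accessed)  # exponential decay per hour
    importance = memory.importance / 10.0            # LLM-rated 1-10, scaled to [0, 1]
    relevance = cosine_similarity(memory.embedding, query_embedding)
    return recency + importance + relevance          # equal weights, as in the paper

def retrieve_top(memories, query_embedding, now, k: int = 5):
    """Return the k memories most worth including in the agent's next prompt."""
    ranked = sorted(memories,
                    key=lambda m: retrieval_score(m, query_embedding, now),
                    reverse=True)
    return ranked[:k]
```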

The architecture is further developed by introducing reflection, a combination of memory recall and higher-level reasoning. Relying on memory alone to inform decisions, without the ability to make deductions or inferences, would simply prioritize frequent interactions over meaningful ones. Reflection allows agents to assess situations more accurately and make decisions on a deeper level. This process, described as a “[synthesis] of memories into higher-level inferences over time,” lets the agent analyze its own long-term memories, identify patterns among them, and draw conclusions about itself and the world around it, adjusting its behavior accordingly.
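
In the paper, reflection fires periodically: once the summed importance of recent events crosses a threshold, the agent asks the language model what those events imply and stores the answer back as a new, higher-level memory. Here is a rough sketch, again reusing the Memory record from above; the llm parameter stands in for any call that sends a prompt to a language model and returns its text, and the prompt wording is paraphrased.

```python
REFLECTION_THRESHOLD = 150  # summed importance that triggers a reflection pass

def maybe_reflect(agent, llm, now: float) -> None:
    """Synthesize recent memories into a higher-level inference."""
    recent = agent.memories[-100:]
    if sum(m.importance for m in recent) < REFLECTION_THRESHOLD:
        return  # nothing significant enough has accumulated yet
    observations = "\n".join(m.description for m in recent)
    # First ask what high-level questions the recent record raises...
    questions = llm("Given only the statements below, what are the 3 most "
                    "salient high-level questions we can answer about the "
                    f"agent?\n{observations}")
    # ...then answer them and store the insight as a new, more abstract memory.
    insight = llm(f"{observations}\n\nAnswer these questions with high-level "
                  f"insights about the agent:\n{questions}")
    agent.memories.append(Memory(insight, created_at=now, importance=8.0))
```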

The final component of the architecture is planning, which converts the conclusions the agent has reached, along with the current state of its environment, into long-term action plans. These plans are then broken down into more detailed behaviors that the agent uses to interact with its environment. This ensures that agent behavior is believable in the long run and not only at a specific moment. For example, while having lunch at 12, 12:30, or 1 PM would each be plausible on its own, having lunch at all three times is not. Agents plan their actions ahead of time to prioritize long-term believability over “in-the-moment” believability.
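
Concretely, the paper has each agent first draft a broad outline of its day and then recursively decompose it into finer and finer actions. It might look something like the following sketch, where the llm helper and the prompt wording are again stand-ins.

```python
def plan_day(agent, llm, date: str) -> str:
    """Draft a coarse plan for the whole day, then refine it into small steps.

    Committing to a day-level outline first is what rules out lunch at
    12, 12:30, and 1 PM all appearing in the same day's plan.
    """
    # 1. Broad strokes: a handful of chunks covering the full day.
    outline = llm(f"Today is {date}. In 5-8 broad strokes, outline "
                  f"{agent.name}'s plan for the day, e.g. 'have lunch at noon'.")
    # 2. Recursively decompose each chunk into 5-15 minute actions the
    #    agent can actually carry out in the sandbox world.
    return llm(f"Decompose this plan into 5-15 minute actions:\n{outline}")
```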

The applications of generative agents are vast: they can be used to build video game NPCs, to prototype and simulate social scenarios, and to test social systems and ideas. As the field of natural language processing continues to advance at its current incredible pace, we can expect significant developments in such projects in the coming years.