MUD procedural generation: drawing the walls

This is the second post in a series on procedural generation for a MUD, which is a text-only MMO, or a multiplayer Zork. The eventual goal is to create a large, varied MUD containing multiple cities, towns, and wilderness biomes, and a wide array of NPCs and items, as the basis for more personalized and customized work.

Last time, we spoke about placing towers for the city walls. Let’s create the walls themselves.

MUD procedural generation: outlining city walls

This starts a series on procedural generation for a MUD, which is a text-only MMO, or a multiplayer Zork. The eventual goal is to create a large, varied MUD containing multiple cities, towns, and wilderness biomes, and a wide array of NPCs and items, as the basis for more personalized and customized work.

Today we’re looking at generating the outline of a city: the city walls.

Procedural narratives: working toward an implementation

This is the start of the capstone to our discussion of actor-based procedural narratives. We discussed our goals, we described how to make decisions without referring to other actors, we described some potential improvements to that decision-making algorithm, we incorporated other actors’ actions into our decisions…

and now it’s time to write some code.

Before we do that, let’s look at the code we need to write.

We need:

  • the world
  • any background simulation (accounting for nature and other non-actors)
  • the mutations or possible actions
  • the basic elements of plot
  • a plot generator
  • an explainer
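
To put names on those pieces, here is a minimal sketch of how they might fit together. Every name here is hypothetical and the interesting bodies are stubs; it's one way to carve up the problem, not the implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class World:
    """An immutable snapshot of the world; frozen so a simulation
    can't accidentally clobber the real state."""
    day: int
    state: tuple

# A mutation or possible action maps one world to a successor world.
Action = Callable[[World], World]

def background_step(world: World) -> World:
    """Background simulation: nature and other non-actors advance one day."""
    return World(day=world.day + 1, state=world.state)

@dataclass
class PlotElement:
    """A basic element of plot: who did what, and how much it mattered."""
    actor: str
    action: str
    importance: float

def generate_plot(start: World, actions: List[Action]) -> List[PlotElement]:
    """The plot generator: search over actions, keep the interesting ones."""
    ...

def explain(plot: List[PlotElement]) -> str:
    """The explainer: turn the important plot elements into prose."""
    ...
```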

Now, we’ve implicitly discussed it, but let’s put this to the fore: we’re going to be producing tons and tons of possible worlds. We need a way to make these copies, be certain of not overwriting the actual world state, and process them as appropriate.

Using a functional programming language gives us safety and copy-on-write semantics for free. We’re guaranteed not to overwrite the state of the world, ever. And many runtimes are smart enough to share unchanged state, which is awesome.

Another option is to store the decisions rather than the entire world state. This is a time/memory tradeoff; instead of storing world states, you recalculate them.
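
A minimal sketch of that second option, assuming a pure `apply_decision(world, decision)` step function exists:

```python
def replay(initial_world, decisions, apply_decision):
    """Recompute a world state from the decision path that produced it.

    This is the time/memory tradeoff in action: each search node stores
    only its (short) decision list, and we rebuild the full world state
    on demand instead of keeping a snapshot per node.
    """
    world = initial_world
    for decision in decisions:
        world = apply_decision(world, decision)
    return world
```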

Previously we mentioned a constraint on the amount of prediction each actor does. We mentioned a number of possible worlds to evaluate at first. If we reinterpret that as a number of days in the future instead, we can change to a depth-first search. Now, instead of a queue of states to examine, we have a stack. This is much more memory-efficient: the stack holds only the current path, so you keep logarithmically many states (relative to the number of nodes explored) instead of a linear amount. However, it rather breaks best-first search, which previously let us descend further along paths that seemed more promising. What we can do instead is probabilistically terminate processing on a path based on its depth and expected utility.
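
Here is a minimal sketch of that idea, with hypothetical `successors` and `score` hooks standing in for the real simulation; the continuation probability is one arbitrary way to trade depth against expected utility, not a tuned formula.

```python
import random

def depth_first_plan(world, path, best, max_depth, successors, score):
    """Depth-first planning that probabilistically abandons deep,
    unpromising paths. `successors(world)` yields (decision, next_world)
    pairs; `score(world)` is the crude expected-utility approximation,
    assumed here to be normalized to [0, 1]."""
    if len(path) >= max_depth:
        return best
    for decision, next_world in successors(world):
        utility = score(next_world)
        if best is None or utility > best[1]:
            best = (path + [decision], utility)
        # Continue with a probability that falls with depth and rises
        # with utility -- a cheap stand-in for best-first's bias.
        if random.random() < utility / (1 + len(path)):
            best = depth_first_plan(next_world, path + [decision], best,
                                    max_depth, successors, score)
    return best
```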

In the context of a game, it should be relatively straightforward to identify the actions that can be taken. For a non-interactive work of fiction, it might be slightly harder. In either case, the plot is a heap of work.

Tune in next time and we’ll start the implementation. Maybe.

Procedural narratives: explain yourself!

One of our goals in producing actors was to get them to explain their plans. You get to the climax, the big bad delivers a villain speech explaining everything, and you hear:

It was me all along! I hired Thekal the Orc for my Army of Doom! Then I hired Koho the Salmon-beast for my Army of Doom! Then I purchased a sickle sword for Thekal the Orc! Then I purchased a spear for Koho the Salmon-beast! Then I told Thekal the Orc and Koho the Salmon-beast to patrol the corridors of my Fortress of Doom!

And then, three chapters later, if you pay close attention, you find out that she was actually the one to kill your parents. But that’s only a third of the way through her monologue, because, though it was halfway through her rise to power, she had this geometric growth thing going on. So you’ve got another six chapters of villain monologue because your villain AI didn’t know how to explain itself properly.

I don’t know how to fix this. But I have a stub leading toward a solution.

The basic idea is that your AI should have small goals that lead up to a big goal. For instance, the villain’s ultimate goal was to rule the country. One level down from that, she had two goals: obtain the Talisman of Senatorial Mind Control to take control of the government, and raise an army to crush the remaining resistance. Another level down, and on the talisman side her goals are to obtain the Three Rods of Power and assemble them at Mount Doom, while on the army side her goals are to obtain equipment and recruit soldiers. In order to get the Rods of Power, she can use her army to raid the Temple of Aberzle (where the Rod of Air is stored), use money to purchase the Rod of Water (which a rich person is keeping as a curiosity), and steal the Rod of Fire from Castle Fira.

You end up with a tree, or at least a directed acyclic graph, of plans within plans. It’s pretty easy to generate an explanation from this. You might ask the Dark Lady Valeria, “Why are you doing all this?” And she would answer:

I want to rule the country, so I decided to obtain the Talisman of Senatorial Mind Control and raise a private army. I needed the Three Rods of Power to obtain the talisman, so I looted the Rod of Air from the Temple of Aberzle, bought the Rod of Water from Locke Cole, and stole the Rod of Fire from Castle Fira. I recruited 7,232 Orcs for my army and purchased 6,491 swords, 3,188 spears, and 7,445 shields for them.

It’s…not great, but it does include the important bits, in rough order of importance, and it’s straightforward to include more or less detail. If you’re writing these intermediate goals by hand, you can specify whether the details are typically worth going into.
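
A minimal sketch of that plan tree and the explanation walk. The `detail` flag stands in for the manual "worth going into" annotation, and the tree below is Dark Lady Valeria's plan from above, minus the numbers:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Goal:
    summary: str            # e.g. "obtain the Talisman of Senatorial Mind Control"
    detail: bool = True     # worth expanding in an explanation?
    subgoals: List["Goal"] = field(default_factory=list)

def explain_plan(goal: Goal, max_depth: int = 2, depth: int = 0) -> List[str]:
    """Walk the plan tree top-down, expanding detail-worthy subgoals
    until max_depth; deeper levels are simply omitted."""
    lines = [("  " * depth) + goal.summary]
    if depth < max_depth:
        for sub in goal.subgoals:
            if sub.detail:
                lines.extend(explain_plan(sub, max_depth, depth + 1))
    return lines

rule = Goal("rule the country", subgoals=[
    Goal("obtain the Talisman of Senatorial Mind Control", subgoals=[
        Goal("loot the Rod of Air from the Temple of Aberzle"),
        Goal("buy the Rod of Water from Locke Cole"),
        Goal("steal the Rod of Fire from Castle Fira"),
    ]),
    Goal("raise a private army", subgoals=[
        Goal("recruit soldiers"),
        Goal("purchase equipment"),
    ]),
])

print("\n".join(explain_plan(rule)))
```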

Putting it in context

This is a top-down planning system. Before, we were working with a bottom-up system. How do we reconcile the two? Should we switch to top-down entirely?

Well, maybe. The top-down approach mimics how a human would plan. It’s somewhat backwards temporally, and its success requires accurately predicting the results of all your actions. That can be costly — but on the other hand, failures look somewhat like human failures. (The AI might fail at a prediction no human would, but it’s still producing a plot that’s perhaps reasonable in isolation.) But the huge downside of a pure top-down approach is that it requires a human to manually plug in every single intermediate state. That’s just exhausting.

A hybrid approach lets you guide your AI a lot more. This reduces the amount of innovation it can generate — but let’s face it, I can’t reasonably implement something that can find a novel solution that takes more than about twenty timesteps in-game. If I could make such an AI, I’d probably be able to apply it to real-world problems and make bank.

This approach does let the AI innovate within a much smaller search space when generating plans. But with a smaller search space, it can’t really produce as much novelty. The AI can also innovate between the plan nodes, but there it’s hamstrung by a lack of processing power.

Still, I think this is a balance between writing by hand and automatic generation.

Next time, we’ll consider implementing this beast.

Procedural narratives: considering others’ actions

Last time, we explored a few ways to make planning tractable. But there was a huge caveat there: each actor assumed that no other actor was going to do anything. And that’s a huge issue.

Professor Quirrell had remarked over their lunch that Harry really needed to conceal his state of mind better than putting on a blank face when someone discussed a dangerous topic, and had explained about one-level deceptions, two-level deceptions, and so on. So either Severus was in fact modeling Harry as a one-level player, which made Severus himself two-level, and Harry’s three-level move had been successful; or Severus was a four-level player and wanted Harry to think the deception had been successful. Harry, smiling, had asked Professor Quirrell what level he played at, and Professor Quirrell, also smiling, had responded, One level higher than you.

—Harry Potter and the Methods of Rationality

Right now, sadly, our actors are all playing at level negative one. They don’t even acknowledge that other people can act.

Let’s say you have two actors, Alice and Carol. They know each other very well — they’re married. They both love salmon, and there’s salmon steak in their fridge. Alice gets home from work an hour before Carol, and she always eats as soon as she gets home. Carol might recall, as she is leaving work, that there is a salmon steak at home, and she might count on eating it for dinner. But this would demonstrate a minor lack of foresight. When she gets home, she will be disappointed and hungry.

This is a trivial example, of course, and one that happens all the time in real life. People make this mistake. But when it’s something I care about, something I am devoting my full attention and faculties to, and I know about these other actors, I will attempt to incorporate their probable actions into my plans. At present, our AI system will never do this.

How do we fix this?

Moving from level negative one to level zero requires simulating all the other actors that Alice knows about as level negative one characters. Instead of simulating the changes caused by Alice and nature alone, we simulate other characters using the previous level. At level zero, you acknowledge that others exist and will take some default action at every turn. That’s cheap enough. It’s also dead stupid, but at least it’s slightly less stupid than before.

The next step beyond this, of course, is to model everyone else as a level zero character. (You could use level negative one, but since that’s only slightly less expensive, there’s really no point.) This requires vastly more resources, of course, and that’s true of real life, too — we rarely consider how someone will react to our reactions to their reactions to our plans.

Since going beyond level zero is expensive, we want to do that as little as possible. That means, ideally, coming up with a heuristic to determine when to switch from level zero to level one — and we probably don’t want to bother implementing level two or higher.
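
A sketch of the levels as recursion. `others`, `default_action`, and `best_action_given` are hypothetical hooks into the simulation, and capping `level` at one implements "don't bother with level two or higher":

```python
def act(actor, world, level, others, default_action, best_action_given):
    """Pick an action for `actor`, modeling everyone else one level down.

    Level -1: plan as if nobody else ever acts (predictions are None).
    Level 0:  everyone else takes a cheap default action each turn.
    Level 1:  everyone else is simulated as a level-0 planner -- this is
              where the cost explodes, so use it sparingly.
    """
    if level <= -1:
        predicted = {other: None for other in others(actor)}
    elif level == 0:
        predicted = {other: default_action(other, world)
                     for other in others(actor)}
    else:
        predicted = {other: act(other, world, level - 1, others,
                                default_action, best_action_given)
                     for other in others(actor)}
    return best_action_given(actor, world, predicted)
```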

Next time, we’ll see how we can get actors to explain their motivations.

Procedural narratives: obvious improvements

Last time, we talked about how to have an actor make plans. We provided an expensive algorithm for reaching decisions, a simple breadth-first search through a space of possible worlds. It worked, but it was hideously expensive. Let’s make it a little better.

Best-first search

The most obvious change is to be smarter about which nodes we search and in what order. Breadth-first search will analyze every possibility one step from now, followed by every possibility two steps from now, and so on until your computer starts sobbing. Instead, we can use a sort of best-first search.

In breadth-first search, we maintain a queue of nodes. When we visit a possible world, we put a tuple into that queue consisting of the possible world and the path we used to get there. Also, for convenience, we store the score of that possible world in the same tuple (the crude approximation of its expected utility). We keep popping items from the queue, and each time we insert all the worlds reachable in one decision back into the queue.

In best-first search, we use a priority queue instead, keyed on the world’s score. This lets us take the best world seen so far out of the queue each time and explore possibilities from there.
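
A minimal sketch using Python's `heapq` (which pops the smallest key, hence the negated scores), with the same hypothetical `successors` and `score` hooks as before:

```python
import heapq

def best_first_plan(start, budget, successors, score):
    """Best-first search: always expand the highest-scoring world seen so far."""
    counter = 0  # tie-breaker so heapq never compares worlds directly
    frontier = [(-score(start), counter, start, [])]
    best_path, best_score = [], score(start)
    while frontier and budget > 0:
        neg_score, _, world, path = heapq.heappop(frontier)
        budget -= 1
        if -neg_score > best_score:
            best_score, best_path = -neg_score, path
        for decision, next_world in successors(world):
            counter += 1
            heapq.heappush(frontier, (-score(next_world), counter,
                                      next_world, path + [decision]))
    return best_path, best_score
```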

This works. It’s the standard solution. It has a glaring problem.

Let’s say the actor, Scarlett, wants a television. Secondarily, she wants money, at a rate of $10,000 equivalent to one TV. She has an option of preordering a television for $200, to arrive in thirty days. Or she can just go to work and earn $5. We’ve got enough processing budget to analyze four billion nodes, plenty to breadth-first search far enough to get the television in. (With two options each day, simulating every possible decision sequence out to 31 days costs about four billion nodes.)

But we’re doing best-first search. So we’ll look at the first two options, see that one loses us $200 and the other gains us $5, and we’ll prefer the second, since it puts us $205 ahead of the other. And we’ll keep going on that branch, keep choosing to work, never choosing to buy the television. What can we do about this?

Simulate ahead

Instead of just looking at the state of the world immediately after making the decision, we can look ahead thirty days, assuming we always make some nominal decision (probably the decision to do nothing). This greatly increases the cost of evaluating each world, but it avoids stupidity like this. Each evaluation now costs roughly thirty times as much, so our budget of four billion nodes shrinks to about 140 million. That took us to 31 days before; it takes us through 26 days now. But each node is judged by a world 30 days further on, so we see potential worlds 56 days from now.
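
The change is confined to the scoring function: roll the world forward under the nominal decision before scoring it. A sketch, assuming the same hypothetical hooks plus an `apply(world, decision)` step function:

```python
def lookahead_score(world, days, nominal_decision, apply, score):
    """Score a world by what it looks like `days` from now, assuming the
    actor makes some nominal decision (e.g. "do nothing") every day."""
    for _ in range(days):
        world = apply(world, nominal_decision)
    return score(world)
```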

This obviously isn’t perfect. Let’s say Scarlett, in addition to earning money, has bills to pay. Bills cost $4.50 per day, and she has $200 in savings. If she preorders the TV, then after 26 days of work she’ll have saved up another $13, and the lookahead assumes she does nothing for the 30 days after that, which leaves her $122 in the red. But in Scarlett’s hometown of Southumberland, there’s no such thing as credit; there’s debtor’s prison. If she buys the TV, she’ll end up in prison next month, with no access to television and no money in the bank.

We can change this to make her nominal decision “go to work” instead of “do nothing”. I’m still worried about similar problems that will require more manual intervention to fix.

Echo utility backwards to the point of the decision

We created the world. We created the option to preorder a television — we’ve created delayed gratification. We can have our AI handle this explicitly, as long as the events are structured in a way that allows it. We can, for instance, use a linear scale in utility for delayed gratification, where we achieve X% of the utility of a television per day until it arrives.

Alternatively, we can use the concept of net present value. We might say that a television in a month is worth 5% less than a television today, and we compound that every month. That means we’ll prefer to preorder the television for next month, and we’d even prefer a preorder that takes five years to arrive. But around 75 months, it’s worth more to Scarlett to work than to take the time and money ordering the television. At 150 months’ delay, she wouldn’t even take time off work to get a free television.
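
The discounting itself is one line of code; a sketch using the numbers from above (one TV worth $10,000 equivalent, discounted 5% per month, compounded):

```python
def present_value(future_utility, months_until, monthly_discount=0.05):
    """Net present value: a payoff n months out is compounded down
    by 5% per month, per the figures in the text."""
    return future_utility * (1 - monthly_discount) ** months_until

tv = 10_000                      # one television == $10,000 equivalent
print(present_value(tv, 1))      # ~9500.0: easily worth a $200 preorder
print(present_value(tv, 75))     # ~213.6: barely beats the $205 from working instead
print(present_value(tv, 150))    # ~4.6: not worth even a $5 day off work
```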

This is utterly straightforward in simple cases like this: you have an event, you look up the utility of the result, you calculate a net present value from that. What about more complex cases? You could just assume the result happened now and simulate the world for one or two steps to see what happens. That’s another crude approximation, and it requires you to simulate your entire world, but it only costs you a day or two of simulation each time.

These are all simple, straightforward techniques, but they require you to be able to simulate quickly.

Machine learning

You can create a neural network that attempts to associate decisions with effects. This is a more holistic approach, and once you build the neural network, it’s going to be reasonably fast. However, neural networks require a lot of time and data to train, and they’re often rather large. Since we’re talking about procedurally generated actors, the network has to match the current collection of actors as well as the current state of the world.

Given the amount of data to match and the range of possibilities generally available, you’re going to spend a ton of time running the game with a different AI just to generate training data for a giant neural network.

In my case, I’m talking about a game that might have a large number of players. That would mean paying for a bunch of large AI servers just to host this neural network, and players would have to be online in order to play. If I ever shut down the servers (for instance, because they cost a couple hundred dollars a month to keep running and I’m not seeing any revenue), my players wouldn’t be able to play the game.

Neural networks are not an option for me. But if you’re generating a novel, you might be able to use them.

Just be warned, neural networks aren’t good at explaining their reasoning. It’s usually important for a story that characters be able to explain their reasoning.

Next time, we’ll look at incorporating other actors’ actions into one’s plans.

Procedural narratives: making decisions in isolation

In the previous entry, we said that the first two steps in creating a procedural narrative were to enumerate options that actors can take (including their immediate results) and to implement a naive graph search.

Let’s define this more concretely. The nodes in the graph are possible world states, with the start node being the current state. The edges are decisions. In this simple version, they are all decisions that the current actor could take. You can just breadth-first search your way through the graph, stop when the AI budget for this actor has been reached, and go with the best option you’ve found so far.
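
A minimal sketch of that naive search, assuming hypothetical hooks: `decisions(world)` enumerates the current actor's options, `apply` executes one, and `score` is the goal-matching metric discussed below:

```python
from collections import deque

def naive_plan(start, budget, decisions, apply, score):
    """Breadth-first search over possible worlds, capped by an AI budget;
    returns the decision path to the best world found so far."""
    frontier = deque([(start, [])])
    best_path, best_score = [], score(start)
    while frontier and budget > 0:
        world, path = frontier.popleft()
        budget -= 1
        if score(world) > best_score:
            best_score, best_path = score(world), path
        for decision in decisions(world):
            frontier.append((apply(world, decision), path + [decision]))
    return best_path
```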

That’s straightforward — but how do we define “best”?

The best possible world state is the one that matches the actor’s goals most closely. In order to make this definition tractable, we require our goals to be formatted in a specific way. For instance, one goal might be to have a world under this actor’s control reach a population of fifty million. This is an excellent goal for our system because we can simply find the highest-population planet under the actor’s control, subtract its population from the target, and get a distance metric for that goal.

However, an actor can have multiple goals. To manage tradeoffs between them, we need to normalize these metrics according to their importance to the actor. This isn’t utility, really — it’s quite possible that it’s useless to have a planet of 49 million and worth everything to have a planet of 50 million. It’s not quite expected utility, either, but it’s a crude approximation of expected utility.
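
As a sketch, with a hypothetical population goal: each goal reports a distance from being satisfied, and importance weights fold the distances into one number (negated so that higher is better, matching the search above). The world and actor structures here are assumed, not defined anywhere in the series:

```python
from dataclasses import dataclass

@dataclass
class PopulationGoal:
    target: int          # e.g. fifty million
    importance: float    # weight relative to the actor's other goals

    def distance(self, world, actor):
        """How far the actor's best planet is from the target population."""
        best = max((p.population for p in world.planets
                    if p.controller is actor), default=0)
        return max(0, self.target - best)

def score(world, actor):
    """Crude stand-in for expected utility: the negated, importance-weighted
    sum of goal distances, so that higher means a closer match to goals."""
    return -sum(g.importance * g.distance(world, actor) for g in actor.goals)
```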

Once we have that, we can determine, for a given actor, how good a particular possible world state is. And that lets us find a preferred future.

Now that we’ve got that sorted out, we can take a look at potential refinements.

Procedural narratives: goals and outline

I started working on a space empire management sim. It was simple, it was doable — and then I realized, even before I had anything rendering, that it promised to be boring. I needed something to spice it up, and I quickly decided on factions and duplicitous intrigue. That leads to a problem: play time per unit of development work drops. What can I do about it?

The obvious answer: procedural content generation. That’s one of the tactics employed by games from NetHack to Diablo III. But how do I generate intrigue dynamically? Again, the obvious answer: create characters dynamically and give them plentiful duplicitous tactics to achieve their goals.

This rather requires some artificial intelligence. Now, a general artificial intelligence is so far infeasible, and I’m not going to invent it just for a game. However, I can develop a restricted artificial intelligence that can choose from a menu of options I lay out before it and make sensible choices.

My rough outline for how to do this is:

  1. Create a set of options that an actor can take.
  2. Create a naive search between options and goals. This assumes the actor acts in a void and everyone else is a robot.
  3. Refine the decision making algorithm.
  4. Develop a way for characters to explain their decisions to the player, since it’s guaranteed to come up.

Steps 1 and 2 are pretty easy to implement. I just need my simulation to be fast — or at least, I need a fast approximation. Step 3 is “here there be dragons” territory. I’ve got some ideas, but they’re pretty expensive — recursive modeling of others’ decisions given that I make a particular decision, for a start. Step 4 is a little scary — it hinges on identifying which decisions were important. That’s not going to be easy. But I’ve got half an idea on how to make that easier and possibly ease some of my other processing.

Tune in next time and we’ll go over the easy parts in a little more detail.