Poem Quickstart

Getting started with Poem in Lua

Writing XBTs using the Lua implementation is the simplest way to get started with Poem.

Installation

You need to have a working Lua installation; if you want to run the examples from the XBT repository you should use LuaJIT since some of the libraries exist only for LuaJIT. Clone Xbt.lua from its GitHub repository and ensure that your LUA_PATH environment variable contains the path to the Xbt.lua module. If you use OS X or Linux with the bash shell, your .bashrc should probably include a line of the form

export LUA_PATH="$LUA_PATH;.../xbt/?.lua;.../xbt/?/init.lua"

(where .../xbt is the path to the Xbt.lua/src folder). On Windows you set the corresponding environment variable from the GUI.
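If require("xbt") fails later on, it can help to look at the search path that Lua actually ended up with. The following is a minimal sketch using only the standard package.path variable; the helper name split_path is made up for this illustration and is not part of Xbt.lua.

```lua
-- Split a Lua search path (templates separated by ";") into a list.
-- split_path is a hypothetical helper, not part of Xbt.lua.
local function split_path(path)
  local entries = {}
  for entry in string.gmatch(path, "[^;]+") do
    entries[#entries + 1] = entry
  end
  return entries
end

-- Print every template Lua tries when resolving a require call.
for i, entry in ipairs(split_path(package.path)) do
  print(i, entry)
end
```

The templates for the Xbt.lua/src folder should show up in this list. Note that Lua only keeps its built-in default path when LUA_PATH contains an empty entry (";;"), so whether the defaults appear as well depends on how you set the variable.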

To check whether the XBT framework works with your Lua installation, open a shell, change into the Xbt.lua/src folder, and start Lua (or preferably LuaJIT) with the teacher/student example. You should see something like this (the actual numbers may vary depending on the version of the XBT framework you are using):

$ luajit -l example/teacher_student
XBTs are ready to go.
Default:
Robot rescue scenario (25000 steps)...
Random seed: 	lfib4 {{-1233355477,-743959539,...},55}
Navigation graph has 200 nodes and 3960 edges.
........................................................................
... [several more lines]
....................................................................

---------------------------------------------------------------------------
Episode 1 	    reward:    -2928059 epsilon:     1 
  Teacher 1     samples:   0 	    bad choices: 24423  difference: 1705688
---------------------------------------------------------------------------
... [many more lines]
---------------------------------------------------------------------------
Episode 24751   reward:    9208537  epsilon: 	  0.01 
  Teacher 1     samples:   10514    bad choices:  7645  difference: 391240
--------------------------------------------------------------------------
Total reward = 820990052
Default with damage:
... [more of the same]

If you see this output: congratulations! You have just run your first robot swarm simulation using XBTs.

First Steps

To further explore XBTs we’ll now switch to some simpler examples. The file example/run.lua in the directory Xbt.lua/src contains several simple functions that create and tick XBTs; it includes some nodes from the file example/nodes.lua. An easy way to run the examples is from the Lua command line. For example, if you want to execute the navigate_graph function, you proceed as follows:

$ luajit -e "ex = require('example.run'); ex.navigate_graph()"
Navigating graph...
Diameter:        	623.54149821804
Maxmin distance: 	76.837490849194	for node	21
Nodes:           	100	Edges:	984
1	->	1	[1]
1	->	2	[1->48->2]
1	->	3	[1->46->98->3]
1	->	4	[1->17->4]
1	->	5	[1->54->5]
2	->	2	[2]
2	->	3	[2->45->46->98->3]
2	->	4	[2->4]
2	->	5	[2->64->11->5]
3	->	3	[3]
3	->	4	[3->37->14->61->4]
3	->	5	[3->98->82->23->5]
4	->	4	[4]
4	->	5	[4->17->1->54->5]
5	->	5	[5]
Action table sizes: 	100	100

So, with all the preparation out of the way, let us define our first behavior tree. Obviously this tree should print Hello, world. and then succeed.

The first thing you need to do is to provide access to the XBT module. We therefore suppose that the beginning of your source file contains the line

local xbt = require("xbt")

without mentioning this for each example.

A simple XBT that executes a function and succeeds with the return value of the function as reward can be built using the function xbt.action. In the simplest case, xbt.action takes a function as argument, invokes this function with some arguments that we can ignore for now, and turns the result of the function into a successful XBT result. Let’s write this function first:

local function say_hello()
  print("Hello, world.")
  return 0
end

We then have to package this function into an XBT node and cause the node to be evaluated. In order to run each example in the file individually I tend to package these tasks into a function:

function ex.tick_hello_1 ()
  local node = xbt.action(say_hello)
  xbt.tick(node)
end

As usual you can execute this function like this:

$ luajit -e "ex = require('example.run'); ex.tick_hello_1()"
Hello, world.

Woo Hoo! A real breakthrough in computer science! Or, …, maybe not. Well, anyway, we have successfully executed our first XBT. The call to xbt.action creates an XBT node that is stored in the local variable node; the xbt.tick function triggers the next (and only) evaluation step of the node.
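If you like to think in terms of plain Lua tables, here is a toy model of what these two calls do conceptually. It is emphatically not the Xbt.lua implementation: the table toy, the node layout, and the status string "succeeded" are assumptions made up for this sketch, and unlike the real xbt.action it passes no arguments to the wrapped function.

```lua
-- Toy stand-in for xbt.action and xbt.tick; NOT the real Xbt.lua code.
local toy = {}

-- Wrap a plain function in a node table (hypothetical layout).
function toy.action(fun)
  return {fun = fun}
end

-- Evaluate the node once: call the wrapped function and report its
-- return value as the reward of a successful result.
function toy.tick(node)
  return {status = "succeeded", reward = node.fun()}
end

local node = toy.action(function()
  print("Hello, world.")
  return 0
end)
local res = toy.tick(node)
```

The point of the sketch is only the division of labor: building the node does not run anything, and the function is called when the node is ticked.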

You, dear alert reader, have probably spotted the number 1 that I have fiendishly tacked onto an innocent little function name and are now waiting with bated breath for the next installments of our saga. Fear not! There will be versions 2, 3, 4 and maybe even 5!

One thing you might ask is: What about the return value? Does anybody care that we have succeeded so gracefully? Of course someone cares, and that someone is called tick_hello_2:

function ex.tick_hello_2 ()
  local node = xbt.action(say_hello)
  local res = xbt.tick(node)
  print("xbt.tick() returned " .. res.status .. " with reward " .. res.reward)
end

In tick_hello_2 we store the return value of the xbt.tick function in the local variable res and then print the status and reward fields. There are four possible status values for results: it can have succeeded or failed, it may still be running or it may as yet be inactive. The latter is a bit of an odd duck since it is the result value of tick for a node that has never been ticked, so you might be justified in thinking it’s not particularly useful. Or that the guy who dreamt up XBTs (that would be me) is totally bananas. And you would probably be right with the second part, but the existence of inactive states allows some advanced use cases for XBTs that are rather nice to have. For example a planning node might answer a first tick with an inactive result that returns the estimated cost of a plan, and only start executing the plan after a second tick. A reinforcement learning node sitting above a number of such planning nodes might use these estimates as part of its value function computation allowing the interleaving of planning and learning in an elaborate dance of reasoning engines. But we are getting way ahead of ourselves here.
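To make the planning idea a little more tangible, here is a toy sketch in plain Lua that does not use the xbt module at all. Nodes are modeled as closures, results as tables with the status and reward fields, and the status strings are assumed values. The parent is a simple greedy chooser rather than a reinforcement learner, and make_planner is a name invented for this illustration.

```lua
-- Toy planning node (hypothetical, not part of Xbt.lua): the first tick
-- answers with an inactive result whose reward is the negated cost
-- estimate; the second tick "executes" the plan and succeeds.
local function make_planner(estimated_cost, actual_reward)
  local ticked = false
  return function()
    if not ticked then
      ticked = true
      return {status = "inactive", reward = -estimated_cost}
    end
    return {status = "succeeded", reward = actual_reward}
  end
end

-- Greedy parent: tick every planner once to collect the estimates,
-- then tick only the most promising one a second time.
local planners = {make_planner(10, 3), make_planner(4, 5)}
local best, best_estimate
for _, tick in ipairs(planners) do
  local estimate = tick().reward
  if best == nil or estimate > best_estimate then
    best, best_estimate = tick, estimate
  end
end
local outcome = best()
print(outcome.status, outcome.reward)
```

Only the chosen planner is ever executed; the other one stays at its inactive estimate.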

A number of functions in the xbt module allow you to query results for their status and other properties: is_result returns true when a value is one of these four possible results, and false otherwise; is_inactive, is_running, is_succeeded and is_failed return true if their argument has the corresponding status and false otherwise; is_done and can_continue check whether a node needs to be ticked again (if and only if is_done is false) or can be ticked again to potentially improve a previous result (can_continue). These last two functions are useful if you want to implement just-in-time algorithms with XBTs. An XBT for which both is_done and can_continue are true has completed its task (because is_done is true) but might be able to improve upon its performance (because can_continue is true as well). Imagine, for example, a household robot that can vacuum your apartment and in addition perform other tasks such as cooking meals or ironing clothes. This robot is never really finished with vacuuming your apartment, since it will always be able to clean up just a little more dirt. However, at some point we want the robot to stop vacuuming and start cooking, lest we die of starvation in the cleanest apartment ever (and, to make things worse, in rumpled clothes).
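The vacuuming robot can be sketched as a small loop with a tick budget. The code below is a self-contained toy and does not call Xbt.lua: is_done and can_continue are simplified stand-ins for the xbt functions of the same name, the status strings and the improvable field are assumptions, and make_vacuum and run_with_budget are invented for this illustration.

```lua
-- Simplified stand-ins for xbt.is_done and xbt.can_continue
-- (assumed behavior; the result layout is hypothetical).
local function is_done(res)
  return res.status == "succeeded" or res.status == "failed"
end

local function can_continue(res)
  return res.status == "running" or res.status == "inactive"
    or (res.status == "succeeded" and res.improvable == true)
end

-- Toy vacuuming task: every tick removes half of the remaining dirt.
-- The result is always a success, but it stays improvable while a
-- noticeable amount of dirt is left.
local function make_vacuum(dirt)
  return function()
    dirt = dirt / 2
    return {status = "succeeded", reward = -dirt, improvable = dirt > 0.01}
  end
end

-- Tick at least once, then keep ticking while the task is unfinished
-- or can still improve, but never more than `budget` times.
local function run_with_budget(tick, budget)
  local res = tick()
  local ticks = 1
  while ticks < budget and (not is_done(res) or can_continue(res)) do
    res = tick()
    ticks = ticks + 1
  end
  return res, ticks
end

local final, ticks = run_with_budget(make_vacuum(8), 3)
print(ticks, final.reward)
```

With a budget of three ticks the robot stops while the apartment is still improvable and can move on to cooking; with a generous budget the loop ends on its own as soon as the result reports that no further improvement is possible.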

Let’s try some of the queries on XBT results by writing a (slightly cumbersome) function that prints info about a result:

local function print_result_info (res)
  if xbt.is_result(res) then
    if xbt.is_running(res) then
      print("Running!")
    elseif xbt.is_failed(res) then
      print("Failed!")
    elseif xbt.is_succeeded(res) then
      print("Succeeded!")
    elseif xbt.is_inactive(res) then
      print("Inactive!")
    end
    if xbt.is_done(res) then
      print("Is done.")
    end
    if xbt.can_continue(res) then
      print("Can continue.")
    end
    print("Reward: " .. res.reward)
  else
    print(tostring(res) .. " is not an XBT result!")
  end
end

To test the function, call print_result_info on the various XBT result types. This also shows how you can create different kinds of XBT results should you need them:

function ex.test_print_result_info ()
  local res = xbt.running(1)
  print_result_info(res)
  print()
  res = xbt.failed(-10, "Out of patience.")
  print_result_info(res)
  print()
  res = xbt.succeeded(10)
  print_result_info(res)
  print()
  res = xbt.succeeded(5, true)
  print_result_info(res)
  print()
  res = xbt.inactive(0)
  print_result_info(res)
end

All result types take at least one argument, the reward for obtaining that result (which can be negative to indicate that obtaining this result incurred a net cost). Succeeded and failed results both have an optional second argument, but beware: the meanings of the two second arguments differ. The additional argument of failed results simply adds a field containing a failure reason to the result; this can be helpful to figure out what went wrong when some part of the XBT does something unexpected. The second argument of succeeded is a Boolean that indicates whether the computation can be resumed to improve the result. Its default value is false.

Running this function results in the following output:

$ luajit -e "ex = require('example.run'); ex.test_print_result_info()"
Running!
Can continue.
Reward: 1

Failed!
Is done.
Reward: -10

Succeeded!
Is done.
Reward: 10

Succeeded!
Is done.
Can continue.
Reward: 5

Inactive!
Can continue.
Reward: 0

There is probably little here that is surprising: nodes returning failed or succeeded results are done; nodes returning running or inactive results can continue. A succeeded node may optionally indicate that it can continue if there is a chance that it may improve its result by doing so; this is the only case where simultaneously being done and being able to continue makes sense. Obviously it makes no sense for a failed node to continue; it would have produced a successful result right away had that been possible. Running or inactive nodes can never be done, since, well, they are still running or inactive.