In this series of blog posts I am exploring the creation of a DSL for a mythical Role Playing Game called “Dragons of Kin” at the Puzzle Guzzler software house. The language will allow the game designers to interact with the underlying game engine. All the business logic and heavy lifting is performed by the engine, while the language acts as a conduit into it. Characters, layout and animations will be configured using the language. In this post I will show the first implementation of the DSL.
In the last post Douglas (the development lead) and Jasmine (the level designer) created the following example of the first cut of the language.
Creature Crocodile {
Life := random_select [30..40] LP
Magic := 15 MP
LifeHealingSpeed := 3 LP/sec
MagicHealingSpeed := 1 MP/min
Weapons
{
Bite:
Inflicts:= random_select [5..10] LP
PoisonBite:
Inflicts:= random_select [5..10] LP + 3 LP/sec
Costs:= 5 MP
}
}
Douglas takes the language fragment to Jimmy, a developer on his team.
Jimmy | So this is the infamous BattleLang |
Douglas | Yep. As we discussed in the stand-up, we have three weeks to implement a text mode engine that will allow Christoph to move from the town to the spring. There he will kill the crocodiles and save the town. |
Jimmy | Everything hard coded then. So we will have three Creature types – Christoph, Crocodile and Boss Crocodile. Let’s not do spawning either. The Crocodiles will be created as soon as the engine starts. There really should be an Area type in the language so that Jasmine can assign the Crocs to a location. |
Douglas | Definitely. Let’s put that on the Product Backlog. So can you do the first cut of the parser over the next couple of days? |
Jimmy | I think so. I was looking at F# lex support, ANTLR and Bison, but if I only have two days I don’t want to learn a new library or toolset. I’ll just crank out something quickly and we’ll throw it away a bit further down the track. It looks pretty easy to parse. |
Douglas | Let’s go to the whiteboard and draw out a transition diagram.
Each line represents a token that the scanner is reading from the input. We start with “Creature”, read in the name of the Creature, then we have an open and close bracket, and then we reach the end. In this case we are saying it’s legal to have a Creature with no attributes defined at all. |
Jimmy | And we assume that during each transition all the whitespace is thrown away? It doesn’t make sense for that to be legal, does it? |
Douglas | For now let’s assume that around every token there is whitespace. When the code is rewritten we can throw that assumption away. So the code should actually look like this: Creature Crocodile { Life := random_select [ 30 .. 40 ] LP Magic := 15 MP LifeHealingSpeed := 3 LP / sec MagicHealingSpeed := 1 MP / min Weapons { Bite : Inflicts := random_select [ 5 .. 10 ] LP PoisonBite : Inflicts := random_select [ 5 .. 10 ] LP + 3 LP / sec Costs := 5 MP } }
What I mean by it being legal is that checking whether the Creature is valid is a different step in the process.
Shifting the complex validation out of the Syntax Analyser makes the code a lot simpler. |
Jimmy | OK. Let me see – so to parse the character attributes there is another string literal and then we parse the function itself.
And the function parser is as follows.
|
Douglas | I’ll add in the Weapons |
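The original whiteboard diagrams don’t reproduce well here, so as a rough ASCII sketch (my reconstruction from the conversation above, not the actual drawing), the top-level Creature transition diagram looks something like:

[Start] --"Creature"--> --&lt;name&gt;--> --"{"--> (Body) --"}"--> [End]

(Body) --&lt;attribute definition&gt;--> (Body)
(Body) --"Weapons" "{" ... "}"--> (Body)

Each arrow consumes one token; the Body node loops until the closing bracket is reached.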
Building the Proof of Concept BattleLang DSL
As described above there are three stages to processing a language:
- Lexical Analyser which converts the input stream into Tokens
- Syntax Analyser which implements the transition diagram
- Semantic Analyser validates the business rules
Lexical Analyser
By simplifying the language definition so that all Tokens are separated by whitespace, the scanner is very simple.
public static TokenAndStringIndex GetNextToken(string text, int currentIndex)
{
string token;
if (HaveReachedEndOfInput(currentIndex, text))
return new TokenAndStringIndex("", text.Length);
currentIndex = RemoveWhitespace(currentIndex, text);
currentIndex = GetToken(currentIndex, text, out token);
return new TokenAndStringIndex(token, currentIndex);
}
The TokenAndStringIndex is a tuple that stores the current index into the input string and the Token that was found. Retrieving a list of all the tokens is a simple matter of looping through the input. While we could return an IEnumerable&lt;string&gt;, in a more complete solution the TokenAndStringIndex would be renamed and extended to include information such as the line number within the string the token was extracted from. This information is invaluable when it comes to creating useful error messages when problems are found during the Syntax and Semantic Analysis steps.
public static IEnumerable<TokenAndStringIndex> GetAllTokens(string text)
{
int currentIndex = 0;
while (currentIndex < text.Length)
{
var result = GetNextToken(text, currentIndex);
currentIndex = result.IndexIntoInputString;
if (!string.IsNullOrEmpty(result.Token))
yield return result;
}
}
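The TokenAndStringIndex class itself isn’t shown in the post; a minimal sketch consistent with the calling code above (property names inferred from GetAllTokens – the real class in the download may differ) would be:

```csharp
// Minimal sketch of TokenAndStringIndex, inferred from the calling code.
public class TokenAndStringIndex
{
    public TokenAndStringIndex(string token, int indexIntoInputString)
    {
        Token = token;
        IndexIntoInputString = indexIntoInputString;
    }

    // The token that was read from the input
    public string Token { get; private set; }

    // The index just past the token, i.e. where the next scan starts
    public int IndexIntoInputString { get; private set; }
}
```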
Syntax Analyser
The syntax analyser eats through the stream of tokens, moving through the transition diagram. On nodes such as the Open Bracket it needs to grab the next token and determine which of the three subsequent nodes should be selected (Close Bracket, retrieve Creature attribute, or enter Weapons scope). There are lots of different ways to model the nodes and transitions. One very clean way is to have a class per node, implementing the State pattern: all node classes implement an interface INode with a member INode ProcessToken(Token t).
However, this time I decided to use an even simpler (although far less scalable) approach.
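For comparison, the State pattern alternative mentioned above might be sketched as follows. The INode interface and ProcessToken member come from the description in the post; the concrete node classes are hypothetical, and I’ve used string tokens rather than a Token type to keep the sketch short:

```csharp
using System;

// Hypothetical sketch of the class-per-node State pattern approach.
// ProcessToken consumes one token and returns the next node, so the
// parser simply follows the chain of nodes through the diagram.
public interface INode
{
    INode ProcessToken(string token);
}

public class CreatureKeywordNode : INode
{
    public INode ProcessToken(string token)
    {
        if (token != "Creature")
            throw new ArgumentException("Expected 'Creature' but found: " + token);
        return new CreatureNameNode();
    }
}

public class CreatureNameNode : INode
{
    public INode ProcessToken(string token)
    {
        // token is the Creature's name; a real node would record it
        return new OpenCreatureScopeNode();
    }
}

public class OpenCreatureScopeNode : INode
{
    public INode ProcessToken(string token)
    {
        if (token != "{")
            throw new ArgumentException("Expected '{' but found: " + token);
        // ...a full implementation would return a node that handles
        // attributes, the Weapons scope or the closing "}"
        return this;
    }
}
```

Driving the parser then reduces to a loop over the token stream: `foreach (var t in tokens) node = node.ProcessToken(t);`.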
First, three enums were created:
- KeyWords – includes nodes for Creature, Weapons, Inflicts, Costs, MagicPoints, LifePoints, Seconds, Minutes, RandomSelect and Colon
- Operators – includes nodes for Assign, Add, Subtract, Multiply, Divide and Range
- Scope – includes nodes for OpenCreateScope, CloseCreateScope, OpenWeaponScope, CloseWeaponScope, OpenRangeScope and CloseRangeScope
I then created the TokenAction class, which links the Token string to the enum and to a delegate that manages the transition logic and the retrieval of information about the Creature.
internal class TokenAction<T>
{
private readonly T _tokenType;
private readonly string _token;
private readonly Action _action;
public TokenAction(T tokenType, string token, Action action)
{
_tokenType = tokenType;
_action = action;
_token = token;
}
public Action Action
{
get { return _action; }
}
public string Token
{
get { return _token; }
}
public T TokenType
{
get { return _tokenType; }
}
}
If we are at the “Creature” node, the next token will be the Creature’s name, and the node after that should be a “{”. The “{” node includes its own transition logic that we can access by invoking its Action. Hence:
new TokenAction<KeyWords> (KeyWords.Creature, "Creature", () =>
{
_creature.Name = GetLiteral();
TransitionToScopeToken(new[] { Scope.OpenCreateScope });
}
)
The following code searches for either “MP” or “LP” and sets the _dimension to the required value:
TransitionToKeywordToken(new[] { KeyWords.MagicPoints, KeyWords.LifePoints });
new TokenAction<KeyWords>(KeyWords.MagicPoints, "MP", () => _dimension = DimensionType.Magic)
new TokenAction<KeyWords>(KeyWords.LifePoints, "LP", () => _dimension = DimensionType.Life)
As you can see, the code “collects” values such as the DimensionType as it parses the tree. All of the Tokens and their relevant Actions were placed into three symbol tables using the TokenActions class, which manages the lookup and execution of the Actions.
internal class TokenActions<T>
{
private readonly Dictionary<T, TokenAction<T>> _symbolTable = new Dictionary<T, TokenAction<T>>();
public void Add(TokenAction<T> tokenAction)
{
_symbolTable.Add(tokenAction.TokenType,tokenAction);
}
private TokenAction<T> GetTokenAction(string token)
{
return _symbolTable.Values.First(x => x.Token == token);
}
private TokenAction<T> GetTokenAction(T tokenType)
{
return _symbolTable[tokenType];
}
public void InvokeAction(string token)
{
GetTokenAction(token).Action();
}
public void InvokeAction(T tokenType)
{
GetTokenAction(tokenType).Action();
}
public bool IsTokenFor(string s, T tokenType)
{
return GetTokenAction(tokenType).Token == s;
}
public bool TokenSymbolExists(string token)
{
return _symbolTable.Values.Any(x => x.Token == token);
}
}
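As a usage sketch (hypothetical wiring, not the actual parser code), the “MP”/“LP” actions from the earlier snippet can be registered in a TokenActions table and dispatched on the raw token string. A local variable stands in for the parser’s _dimension field:

```csharp
// Hypothetical usage of the TokenActions symbol table with the
// "MP"/"LP" keyword actions shown earlier.
var dimension = DimensionType.Life;

var keywords = new TokenActions<KeyWords>();
keywords.Add(new TokenAction<KeyWords>(KeyWords.MagicPoints, "MP",
    () => dimension = DimensionType.Magic));
keywords.Add(new TokenAction<KeyWords>(KeyWords.LifePoints, "LP",
    () => dimension = DimensionType.Life));

// Dispatch on a token read from the input stream
if (keywords.TokenSymbolExists("MP"))
    keywords.InvokeAction("MP"); // runs the action, setting dimension to Magic
```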
The parser populates the following objects:
- Unit which models a value such as “10 MP / sec”
- UnitsType which is a collection of Unit objects and automatically reduces “10 MP / sec + 11 MP / sec” to “21 MP / sec”
- Weapon which includes the damage that is inflicted and the cost to the user
- Creature which will be passed to the engine – currently implemented as a fake shell to demonstrate the parser
using System.Collections.Generic;
using System.Linq;
namespace BattleLangParser
{
public enum DimensionType
{
Magic,
Life
}
public enum TimespanType
{
Now,
Second,
Minute
}
public class Unit
{
public float Amount { get; set; }
public DimensionType Dimension { get; set; }
public TimespanType Timespan { get; set; }
}
public class UnitsType
{
private readonly List<Unit> _units = new List<Unit>();
public IList<Unit> Units { get { return _units; } }
public void Add(Unit u, Operator @operator)
{
if(_units.Any(x => x.Dimension == u.Dimension && x.Timespan == u.Timespan))
{
var result = _units.First(x => x.Dimension == u.Dimension && x.Timespan == u.Timespan);
switch(@operator)
{
case Operator.Add:
result.Amount += u.Amount;
break;
case Operator.Subtract:
result.Amount -= u.Amount;
break;
}
}
else
{
_units.Add(u);
}
}
}
public class Weapon
{
public Weapon()
{
Inflicts = new UnitsType();
Costs = new UnitsType();
}
public UnitsType Inflicts { get; private set; }
public UnitsType Costs { get; private set; }
}
public class Creature
{
public Creature()
{
Stats = new Dictionary<string, UnitsType>();
Weapons = new Dictionary<string, Weapon>();
}
public string Name { get; set; }
public Dictionary<string, UnitsType> Stats { get; private set; }
public Dictionary<string, Weapon> Weapons { get; private set; }
}
}
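To illustrate the reduction behaviour of UnitsType.Add described above (this assumes the Operator enum with Add and Subtract members used in the switch statement, which isn’t shown in the listing):

```csharp
// Two units with the same dimension and timespan collapse into one.
var units = new UnitsType();
units.Add(new Unit { Amount = 10, Dimension = DimensionType.Magic, Timespan = TimespanType.Second },
          Operator.Add);
units.Add(new Unit { Amount = 11, Dimension = DimensionType.Magic, Timespan = TimespanType.Second },
          Operator.Add);
// units.Units now contains a single Unit with Amount == 21
```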
Semantic Analyser
This version of the software does not include a Semantic Analyser.
Unit Tests
An example of one of the Unit tests:
[Test]
public void ParseCreature_MultipleDimensions_EachDifferent()
{
const string emptyCreature = @"
Creature TestMonster
{
RingOfhealing := 3 MP / min + 8 LP / min
}";
var b = new Parser(emptyCreature);
Creature result = b.Parse();
Assert.AreEqual("TestMonster", result.Name);
Assert.IsNotNull(result.Stats["RingOfhealing"]);
Assert.AreEqual(3.0, result.Stats["RingOfhealing"].Units[0].Amount);
Assert.AreEqual(DimensionType.Magic, result.Stats["RingOfhealing"].Units[0].Dimension);
Assert.AreEqual(TimespanType.Minute, result.Stats["RingOfhealing"].Units[0].Timespan);
Assert.AreEqual(8.0, result.Stats["RingOfhealing"].Units[1].Amount);
Assert.AreEqual(DimensionType.Life, result.Stats["RingOfhealing"].Units[1].Dimension);
Assert.AreEqual(TimespanType.Minute, result.Stats["RingOfhealing"].Units[1].Timespan);
}
Verifying the Creature has been created correctly involves:
- Checking that expected attributes have been loaded
- Checking that attributes have the correct Units of Measure
- Providing the user with appropriate error messages
Summary
This post has shown how easy it is to write a simple DSL (less than 500 lines of code in total) that can be used to populate a relatively complex object.
There are problems with the approach I’ve taken:
- The Lexical Analyser was written by hand – There are some great libraries and frameworks that can do the heavy lifting for us on this one
- Logic for populating the Creature was intermingled with logic for verifying the grammar – an Abstract Syntax Tree and the Visitor pattern are the answer for decoupling those two concerns
- Error messages for badly formed user code are currently very poor
Source code
BattleLang.SimpleImplementation.7z (91.06 kb)