Ivan Inozemtsev Blog

Exploring Fantom DSLs

Brief intro to DSLs

Fandoc says:

DSLs

DSLs or Domain Specific Languages allow you to embed other languages into your Fantom source code. The syntax for a DSL is:

AnchorType <|...|>

Everything between the <| and |> tokens is considered source code of the DSL itself. The anchor type defines how to compile the DSL. DslPlugins are registered on the anchor type, and called by the Fantom compiler to translate them into a Fantom expression.

For built-in DSLs, the expression type matches the anchor type:

  str := Str<|hello, world!|>  // str has type Str
  regex := Regex<|\d+|> // regex has type Regex

However, this is not required. A DSL expression can be compiled to an expression of any type, so you can write a DSL plugin whose result type depends on the DSL source, like this:

foo := MyDsl<|1|> //resolves to Int at compile time
bar := MyDsl<|str|> //resolves to Str at compile time

I don't know whether this is a bug or a feature.
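
To make the idea concrete, here is a minimal sketch of such a plugin, assuming an anchor type MyDsl in a hypothetical pod called mypod. The compiler calls (LiteralExpr, ExprId.intLiteral, ns.intType and friends) are written from my memory of what the built-in Str plugin looks like, and the index key in the comment is my understanding of how plugins get registered, so treat all of them as assumptions and check them against the compiler pod docs:

```fantom
using compiler

**
** Hypothetical plugin for an anchor type mypod::MyDsl.
** Assumed registration via the index prop in build.fan:
**   index = ["compiler.dsl.mypod::MyDsl": "mypod::MyDslPlugin"]
**
class MyDslPlugin : DslPlugin
{
  new make(Compiler c) : super(c) {}

  override Expr compile(DslExpr dsl)
  {
    src := dsl.src.trim

    // if the DSL source parses as an integer, emit an Int literal
    i := src.toInt(10, false)
    if (i != null)
      return LiteralExpr(dsl.loc, ExprId.intLiteral, ns.intType, i)

    // otherwise fall back to a Str literal
    return LiteralExpr(dsl.loc, ExprId.strLiteral, ns.strType, src)
  }
}
```

The plugin has to live in a pod which is compiled before the code that uses MyDsl<|...|>, since the compiler looks the plugin up while compiling the DSL expression.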

The most obvious way to use DSL expressions is to instantiate complex objects. For example, imagine we are writing a library for graph manipulation, so we define classes like Graph, Node, and Edge (see Graphs.fan).
And imagine that we need to instantiate some graphs for our tests, so we write code like this:

 a := Node("a")
 b := Node("b")
 c := Node("c")
 d := Node("d")
 ab := Edge(a, b)
 bc := Edge(b, c)
 bd := Edge(b, d)
 graph := Graph([a,b,c,d], [ab, bc, bd])

What a lot of code! Let's use some DSL magic:

  graph := Graph<|a -> b
                  b -> c
                  b -> d|>

The DSL plugin for Graph just parses the code between <| and |> (which is accessible via compiler::DslExpr.src) and generates the appropriate object creation code.

However, even such a simple example is quite tricky - to allow our DSL to be used as a field initializer or a default parameter value, we need to convert our DSL code to a single expression.

So, before implementing the DSL plugin itself, we need to understand how we can replace our code with a single expression. In this particular example, assuming that Node and Edge classes override equals and hash correctly, it can be done like this:

Graph(
  ["a","b","c","d"].map { 
    Node(it) 
  }, 
  [
    ["a","b"], 
    ["b", "c"], 
    ["b", "d"]].map { 
      Edge(
        Node(it.first), 
        Node(it.last)
      ) 
    }
  )

The source code of the DSL plugin can be found here.

Uh, after looking at the source of GraphDsl, the question is: why would we want to write DSL plugins? The same task can be implemented fairly easily in a simple static method like Graph.fromStr. Why would anyone want to use the heavy, low-level Compiler API? Benefits like compile-time validation and a compile error reported on the bad line seem too small, almost negligible.
That's what I thought when I saw Fantom DSLs for the first time, and then I forgot about them for almost a year.
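
For comparison, here is roughly what the plain Graph.fromStr alternative could look like. This is only a sketch: the Node, Edge and Graph classes below are minimal stand-ins I made up for the ones in Graphs.fan, and the only input format handled is one "a -> b" pair per line:

```fantom
class Graph
{
  new make(Node[] nodes, Edge[] edges) { this.nodes = nodes; this.edges = edges }
  Node[] nodes
  Edge[] edges

  ** Parse lines of the form "a -> b"; blank lines are skipped
  static Graph fromStr(Str src)
  {
    nodeList := Node[,]
    edgeList := Edge[,]
    src.splitLines.each |line|
    {
      if (line.trim.isEmpty) return
      i := line.index("->") ?: throw ParseErr("Expected 'a -> b', got '$line'")
      from := Node(line[0..<i].trim)
      to   := Node(line[i+2..-1].trim)
      // Node overrides equals, so contains() de-duplicates by name
      if (!nodeList.contains(from)) nodeList.add(from)
      if (!nodeList.contains(to)) nodeList.add(to)
      edgeList.add(Edge(from, to))
    }
    return Graph(nodeList, edgeList)
  }

  static Void main()
  {
    g := Graph.fromStr("a -> b\nb -> c\nb -> d")
    echo(g.nodes)
    echo(g.edges)
  }
}

class Node
{
  new make(Str name) { this.name = name }
  const Str name
  override Bool equals(Obj? that) { (that as Node)?.name == name }
  override Int hash() { name.hash }
  override Str toStr() { name }
}

class Edge
{
  new make(Node from, Node to) { this.from = from; this.to = to }
  Node from
  Node to
  override Str toStr() { "$from -> $to" }
}
```

The difference from the DSL version is that a malformed line only surfaces as a ParseErr at runtime, while the plugin can report it as a compile error.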

Beyond the DSL src

Last week, tired of code like this:

    ...
    if(node is ListType) return selectType(node->valType)
    else if(node is TypeDef) return selectTypeDef(node)
    else if(node is CType) return selectType(node)
    else if(node is SlotRef) return selectSlotRef(node)
    else if(node is SlotDef) return selectSlotDef(path)
    else if(node is MethodVarRef) return selectMethodVarRef(node)
    ...

I thought it'd be cool to have multiple dispatch in Fantom. The initial implementation was supposed to be quite simple - we have a Dispatcher class which looks like this:

const class Dispatcher
{
  new make(Func[] funcs := Func[,]) { this.funcs = funcs }

  ** Candidate functions, tried in the order they were added
  const Func[] funcs

  public Dispatcher add(Func f) { Dispatcher(funcs.dup.add(f)) }

  **
  ** Dispatch can accept more args than needed for a func,
  ** so funcs in this container can have different arity
  **
  Obj? call(Obj?[] args) {
    //naive approach for now -
    //find the first function which takes no more params than
    //there are args, with all arg types fitting the param types
    (funcs.exclude |f|
    {
      f.params.size > args.size ||
      args.any |arg, i|
      {
        f.params.size > i &&
        !(arg?.typeof ?: Obj?#).fits(f.params[i].type)
      }
    }.first ?: throw ArgErr("No matching functions for args $args")).callList(args)
  }
}
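
The matching rule can also be tried in isolation. The sketch below is a standalone illustration with made-up names (DispatchDemo, onInt, onStr, onObj), not the code from the post: it picks the first function whose parameter types accept the runtime types of the arguments, so the order of the functions matters and the most specific ones should come first:

```fantom
class DispatchDemo
{
  static Str onInt(Int x) { "int: $x" }
  static Str onStr(Str x) { "str: $x" }
  static Str onObj(Obj x) { "obj: $x" }

  ** Call the first func whose params all accept the runtime arg types
  static Obj? dispatch(Func[] funcs, Obj?[] args)
  {
    match := funcs.find |Func f->Bool|
    {
      // a func may take fewer params than we have args, never more
      if (f.params.size > args.size) return false
      return f.params.all |p, i|
      {
        arg := args[i]
        // for simplicity this sketch never matches a null argument
        return arg != null && arg.typeof.fits(p.type)
      }
    }
    if (match == null) throw ArgErr("No matching function for args $args")
    return match.callList(args)
  }

  static Void main()
  {
    // most specific candidates first, the Obj catch-all last
    funcs := [#onInt.func, #onStr.func, #onObj.func]
    echo(dispatch(funcs, [42]))
    echo(dispatch(funcs, ["hi"]))
    echo(dispatch(funcs, [3.5f]))
  }
}
```

With this ordering the three calls should end up in onInt, onStr and onObj respectively. If onObj came first, it would swallow every call.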

Using the Dispatcher, the code above can be written like this:

    d := Dispatcher([
      #selectType.func,
      #selectTypeDef.func,
      #selectSlotRef.func,
      ... ])
    d.call([this, node])

Slightly better, but still a lot of boilerplate code. What if we used DSLs + Symbols here? Then we could write code like this:

class SelectionEngine
{
  @Dispatch Void selectTypeDef(TypeDef t) 
  @Dispatch Void selectType(CType t)
  @Dispatch Void selectSlotRef(SlotRef s)
  ...
  Void select(Node node) { Dispatcher<|select|>.call([this, node]) }
}
...
  //everything dispatched according to node type, 
  //unsupported node types immediately give us exceptions
  engine.select(node) 
...

So, all we need to do is write a simple marker facet:

facet class Dispatch {}

And a DSL plugin which does everything for us: it finds all methods annotated with @Dispatch whose names start with the given prefix, takes their functions, and passes the list of functions to the Dispatcher constructor. Sounds fairly easy, but when I started implementing it, I hit the first problem: inside DslPlugin.compile we don't know where we are. I mean, we know nothing about the enclosing type or method.

Luckily for me, DslPlugin extends CompilerSupport, which means we have full access to all compilation units and their type definitions! So we can take the location of our DSL expression and, by iterating through all compilation units and comparing locations, find our compilation unit. In the same way, we can find the enclosing type definition.
The rest is simple: iterate through all slots of our type, find all methods with the given prefix and facet, and construct the expression which creates the Dispatcher.

Nice! But wait a second - this means that a new Dispatcher instance will be created on each method invocation. That's not exactly what we want. What if we could inject a private static const Dispatcher selectDispatcher field? And yes, we can! So, right inside our DSL plugin, we can write something like this:

    fieldName := "${name}Dispatcher"
    field := FieldDef.make(loc, parent, fieldName, 
      FConst.Const + FConst.Private + FConst.Storage + FConst.Static)
    field.fieldType = ns.resolveType("mdispatch::Dispatcher")
    field.init = //Expression to create dispatcher object
    type.addSlot(field)

However, there's one more thing we have to do - modify the static initializer of our type. The static initializer is a special method generated by the compiler, and it contains all the assignments made in static field definitions:

class Foo
{
  static const Str str := "a"
  static const Int int := 45
  
  //this method is generated by compiler
  private static synthetic Void static$init()
  {
    str = "a"
    int = 45
  }
}

At the stage when DSL plugins are called, this static method has already been generated (if there are other static fields in the class), so we need to find this method manually and modify its code (if there are no static fields, we also need to create the method ourselves):

    staticInit := type.methodDef("static\$init")
    //insert(-1, ...) puts the statement just before the last one in the list
    //(which I assume is the trailing return)
    staticInit.code.stmts.insert(-1, BinaryExpr.makeAssign(fieldRef(dispField, loc), initExpr).toStmt)

And voila! Everything compiles and runs smoothly now. Here's the code. However, it already smells like black magic. Let's go further!

Static imports

Not so long ago there was a discussion about static imports on Fantom's forum. Without arguing whether static imports are good or bad, let's see what we can do using DSLs to make code like this possible:

class Sample
{
  StaticImports imports := 
    StaticImports<|
                    sys::Str.spaces
                    sys::File.os
                  |>
  Void main()
  {
    file := os("/")
    prefix := spaces(4)    
  }
}

So, what we want here is this: the DSL source just defines a list of fully qualified static methods, and the DSL plugin creates local static methods which route to the corresponding 'foreign' static methods:

class Sample
{
  private static File os(Str str) { File.os(str) }
  ...
}

Creating the routing methods and their bodies looks fairly simple, but there's a problem - again, DSL plugins are called a bit too late, and the compiler has already generated a lot of compiler errors. Fail? No! We can do another black magic trick - go through all compiler errors and remove those caused by (yet) undefined methods:

//build qnames of the new local methods
//(only the simple method name after the last dot is appended)
newQnames := qnames.map { "${type.qname}.${it.split('.').last}" }
//build compiler error message for further filtering
errMsgs := newQnames.map { "Unknown method '$it'" }
compiler.errs = compiler.errs.exclude { errMsgs.contains(it.msg) }

However, we also need to go through all expressions manually and adjust CallExpr instances so that they reference our newly created methods. After that everything compiles and runs as expected, though the console output still shows the compilation errors during the build. The dirty code of StaticImportDsl can be found here.

In fact, the example above has nothing in common with the initial purpose of DSLs. It is a complete compiler plugin which significantly changes the meaning of the source code.

Inheritance

So, using DSL plugins it is possible to:

  • Create objects
  • Modify types by adding new methods
  • Modify method bodies
  • Remove compilation errors

But we can also extend existing types! Inside a DSL plugin we can create a new type which extends the anchor type, and the instance of our new type is then simply upcast to the base type. One example where this can be useful is the following:

parser := Parser<| Value = [0-9]+ | '(' Expr ')'
                   Product = Value (('*' / '/') Value)*
                   Sum = Product (('+' / '-') Product)*
                   Expr = Sum |>

Here the parser grammar defines a set of non-terminal symbols which can refer to other symbols. Right at compile time, we can generate a class extending 'Parser' with a set of methods corresponding to the symbols, something like this:

class MyParser : Parser
{
  override Bool expr(InStream in, Ast ast) { sum(in, ast) }

  Bool sum(InStream in, Ast ast) 
  {
    if(!product(in, ast)) return false
    while(peek == '+' || peek == '-') 
    {
      consume
      if(!product(in, ast)) return true
    } 
    return true
  }

  Bool product(InStream in, Ast ast)
  ...
}

And since our parser extends Parser, code which uses it through the Parser type will work correctly and give us great performance.
I'm not a parser guy and don't have a complete working example at hand, but I verified that extending types is possible with a simple synthetic example:

class Foo
{
  virtual Str str() { return "a" }
}

class Sample
{
  Void main()
  {
    echo(Foo<|bar|>.str)  //prints 'bar'
  }
}

The Foo DSL plugin creates class like this:

class Foo1290429812650070000 : Foo
{
  override Str str() { synth }
  protected Str synth() { "bar" }
}

Magic facets

DSL plugins are a backdoor which gives full access to the compiler API and allows us to do almost anything. However, in such use cases we don't need the return value of the DSL expression. One way of 'organizing the chaos' could be something like the following.
Let's define a facet called Magic:

facet class Magic
{
  const Obj? kind
}

It has a single purpose - we can put it on a type or a slot. The field kind has a single purpose too - we can assign a DSL expression to it.
Also, we can define a base class for our 'magic' plugins to simplify their creation. Here's the basic idea:

abstract class MagicPlugin : DslPlugin
{
  new make(Compiler c) : super(c) {}

  override Expr compile(DslExpr expr)
  {
    loc := expr.loc
    doMagic(type(loc, expr), slot(loc, expr), expr.src)
    return LiteralExpr.makeNull(expr.loc, ns)
  }

  **
  ** type - type definition of the enclosing type
  ** slot - slot definition. If slot is null,
  ** then magic is applied to the type, otherwise - to the slot
  ** 
  abstract Void doMagic(TypeDef type, SlotDef? slot, Str src := "")

  ... ton of helper methods for simplifying AST manipulation ...
}

The implementation of type and slot methods searching for appropriate definitions can be found here.

Later, we can write plugins as simple as this:

class EchoMagic : MagicPlugin
{
  new make(Compiler c) : super(c) {}

  override Void doMagic(TypeDef t, SlotDef? s, Str src := "")
  {
    echo("type - $t")
    echo("slot - $s")
    echo("src - $src")
  }
}

And use them like this:

@Magic { kind = Echo<||> }
class Sample
{
  @Magic { kind = Echo<||> }
  Void method() {}
  
  @Magic { kind = Echo<||> }
  Int slot := 1
  
  Void main() { echo("hello, world!") }
}

Though this is not really interesting. Let's do something cool, how about...

Actor locals

I've noticed that almost all the time when I use Actors, I write the code like this:

private Str mutableStr
{
  get { Actor.locals.getOrAdd("mutableStr") |->Str| { "default value" } }
  set { Actor.locals["mutableStr"] = it }
}

It would be great to have an actorlocal keyword, so I could just write:

private actorlocal Str mutableStr := "default value"

But the same effect can be achieved via @Magic! We can write an ActorLocal DSL plugin and then use it like this:

@Magic { kind = ActorLocal<||> }
private Str mutableStr := "default value"

Probably a bit more typing than with a keyword, but definitely better than the current variant.
So, the ActorLocal DSL does the following:

  1. If the magic facet is applied to a type or a method, reports a compile error
  2. Removes the Storage flag from the field (to indicate that we provide the getter and setter explicitly)
  3. Removes the Synthetic flag from the getter and setter
  4. If the field has a default value, locates the initializer (static or instance, depending on the field flags) and removes the field assignment from it (since the field has no storage and we will use the initializer expression in our getter)
  5. Changes getter and setter to AST equivalents of this code:
    get 
    { 
      Actor.locals.containsKey("fieldName") ? 
        Actor.locals["fieldName"] : 
        (initExpr ?: fieldType.defVal) 
    }
    set { Actor.locals["fieldName"] = it }
    

Voila!

There's one problem with this code. Imagine a case like this:

const class Foo
{
  @Magic { kind = ActorLocal<||> }
  Str s
}
const class MyActor : Actor
{
  const Foo foo1
  const Foo foo2
  ...
  override Obj? receive(Obj? msg)
  {
    foo1.s = "str" //now foo2.s equals "str" too
  } 
}
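
The effect is easy to reproduce without any compiler magic. The sketch below hand-writes the kind of accessors ActorLocal would generate (Holder and SharedKeyDemo are made-up names, and I use a plain non-const class to keep it simple). Both instances read and write the same Actor.locals slot, because the key is derived from the field name only:

```fantom
using concurrent

class SharedKeyDemo
{
  static Void main()
  {
    h1 := Holder()
    h2 := Holder()
    h1.s = "str"
    // h2 was never assigned, but it reads the same "s" key
    echo(h2.s)
  }
}

class Holder
{
  // hand-written equivalent of an actor-local field:
  // no storage of its own, the value lives in Actor.locals under "s"
  Str s
  {
    get { Actor.locals.get("s", "default") }
    set { Actor.locals["s"] = it }
  }
}
```

Running it should print str, even though only h1 was assigned.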

And I don't really know whether this can be fixed correctly. In theory, we could store a map [Foo:FieldType] in Actor.locals, but there are two obstacles:

  1. We won't know when to remove values from this map, probably weak keys could help
  2. Foo must be const - only const objects can be used as map keys

However, I don't think it is a big deal, since typically such fields are declared in the actor itself and used only from the receive method, so they'll act as normal instance fields.

Conclusion

DSL plugins seem to be a powerful and dangerous, though quite complex, tool which allows you to completely change the meaning of the source code. They rely heavily on the Compiler API, which is quite big and probably subject to change. Also, it is not clear at which stage DSL plugins are invoked, so it may sometimes be hard to predict all the consequences of AST modifications. Therefore it might be too risky to write something heavy using DSLs.

On the other hand, in some cases they can provide pretty elegant and robust solutions which replace a ton of boilerplate code while fully supporting compile-time validation.