CakeParse

CakeParse is a lexer and recursive descent parser combinator library for Kotlin. It can parse any LL(*) grammer, and contains line/column number error reporting. A calculator expression parser example is included.

Usage

Step 0: Setup

Check the version badge for the latest version number:

Gradle

repositories {
  jcenter()
}

dependencies {
  compile "me.sargunvohra.lib:CakeParse:<version>"
}

Kobalt

dependencies {
  compile("me.sargunvohra.lib.CakeParse:<version>")
}

Step 1: Tokens

Define your tokens by name and regex. These make up the building blocks of your parser rules. Whitespace can be ignored by defining a token with ignore = true.

import me.sargunvohra.lib.cakeparse.api.*

val number = token("number", "[0-9]+")
val plus = token("plus", "\\+")
val space = token("space", "[ \\t\\r]+", ignore = true)
// ...

Step 2: Productions

Define your productions using the the combinators and other methods.

val parenExpr = lPar then expr before rPar
val primExpr = number map { it.raw.toInt() } or parenExpr
val goal = oneOrMore(number)
// ...

The available combinators are:

a or b: Parse a if possible, otherwise parse b. Return the result of the successful parse.
a and b: Parse a and then b. Return a Pair<K,V> of both results.
a then b: Just like and, but return the result of b.
a before b: Just like and, but return the result of a.
a map { /* transformer */ }: Just like a, but map the result to a new result using the transformer.
repeat(n, a): Match a in a sequence exactly n times, and return a result containing a list of all the matches.
atLeast(n, a): Just like repeat, but match at least n times

There are also a number of convenience parsers:

empty<T>(): Match an empty input and return a null result of type T.
optional(a): Match a if possible, otherwise return null.
zeroOrMore(a): Just like atLeast(0, a)
oneOrMore(a): Just like atLeast(1, a)

Step 3: Parse

Pass your input through a lexer, and then a parser.

val tokens = setOf(token1, token2, token3)
try {
  val input = System.in
  val result = tokens.lexer().lex(input).parseToEnd(goal).value
  println("Result: $result")
} catch (e: LexerException) {
  System.err.println("Lexing error: ${e.message}")
} catch (e: ParserException) {
  System.err.println("Parsing error: ${e.message}")
}

Note: Recursion

If you have a rule or system of rules that involves recursion, then you'll have to use the ref function to make it work. Be sure to explicitly specify the type of a ref. This is needed because the compiler can't infer the type of a recursive reference.

// put all tokens first
val lParen = token("lParen", "\\(")
val rParen = token("rParen", "\\)")

// then put all refs
val parenRef: Parser<Token> = ref({ paren })

// then put all rules
val paren = (lParen then parenRef before rParen) or empty()

The best way to do this is to put your goal rule at the bottom, all rules that depend on it above, all rules that depend on those above those, etc. Anywhere a rule refers to itself or the rule below, use a ref.

CakeParse

Simple parser combinator library for Kotlin

CakeParse

Usage

Step 0: Setup

Gradle

Kobalt

Step 1: Tokens

Step 2: Productions

Step 3: Parse

Note: Recursion