CakeParse
CakeParse is a lexer and recursive descent parser combinator library for Kotlin. It can parse any LL(*) grammer, and contains line/column number error reporting. A calculator expression parser example is included.
Usage
Step 0: Setup
Check the version badge for the latest version number:
Gradle
repositories {
jcenter()
}
dependencies {
compile "me.sargunvohra.lib:CakeParse:<version>"
}
Kobalt
dependencies {
compile("me.sargunvohra.lib.CakeParse:<version>")
}
Step 1: Tokens
Define your tokens by name and regex. These make up the building blocks of your parser rules. Whitespace can be ignored by defining a token with ignore = true
.
import me.sargunvohra.lib.cakeparse.api.*
val number = token("number", "[0-9]+")
val plus = token("plus", "\\+")
val space = token("space", "[ \\t\\r]+", ignore = true)
// ...
Step 2: Productions
Define your productions using the the combinators and other methods.
val parenExpr = lPar then expr before rPar
val primExpr = number map { it.raw.toInt() } or parenExpr
val goal = oneOrMore(number)
// ...
The available combinators are:
-
a or b
: Parsea
if possible, otherwise parseb
. Return the result of the successful parse. -
a and b
: Parsea
and thenb
. Return aPair<K,V>
of both results. -
a then b
: Just likeand
, but return the result ofb
. -
a before b
: Just likeand
, but return the result ofa
. -
a map { /* transformer */ }
: Just likea
, but map the result to a new result using the transformer. -
repeat(n, a)
: Matcha
in a sequence exactlyn
times, and return a result containing a list of all the matches. -
atLeast(n, a)
: Just likerepeat
, but match at leastn
times
There are also a number of convenience parsers:
-
empty<T>()
: Match an empty input and return anull
result of typeT
. -
optional(a)
: Matcha
if possible, otherwise returnnull
. -
zeroOrMore(a)
: Just likeatLeast(0, a)
-
oneOrMore(a)
: Just likeatLeast(1, a)
Step 3: Parse
Pass your input through a lexer, and then a parser.
val tokens = setOf(token1, token2, token3)
try {
val input = System.in
val result = tokens.lexer().lex(input).parseToEnd(goal).value
println("Result: $result")
} catch (e: LexerException) {
System.err.println("Lexing error: ${e.message}")
} catch (e: ParserException) {
System.err.println("Parsing error: ${e.message}")
}
Note: Recursion
If you have a rule or system of rules that involves recursion, then you'll have to use the ref
function to make it work. Be sure to explicitly specify the type of a ref. This is needed because the compiler can't infer the type of a recursive reference.
// put all tokens first
val lParen = token("lParen", "\\(")
val rParen = token("rParen", "\\)")
// then put all refs
val parenRef: Parser<Token> = ref({ paren })
// then put all rules
val paren = (lParen then parenRef before rParen) or empty()
The best way to do this is to put your goal rule at the bottom, all rules that depend on it above, all rules that depend on those above those, etc. Anywhere a rule refers to itself or the rule below, use a ref
.