# Wordy

Posted on September 30, 2020
ScalaProgrammingLiterate

As mentioned in my previous post, I’ve recently been solving problems from Exercism with a bunch of my colleagues.

In this post, I’m discussing my solution to the Wordy exercise.

It’s embarrassing, but this post completely ignores the advice in the last one.

The last post talked about the importance of introducing concepts incrementally. In contrast, here I go all out, and implement a Parser Combinator Framework from scratch.

I’m really only publishing this because I’d already gone to the effort of adding fairly detailed “literate” comments.

The aim of the game in “Wordy” is to parse expressions that look like the following:

What is 2 plus 8 multiplied by 3?

You’re supposed to ignore BIDMAS, and evaluate the expressions from left to right. So in that case, the correct result would be `30`.

I decided to slightly overreach the problem description, and also support parsing some “postfix” operations too, such as:

What is 2 squared?

I’m also going to be fairly religious about only accepting solutions that are well formed.

## Parser Combinators

As mentioned above, this solution to “Wordy” starts by building a ((relatively) simple) “Parser Combinator” framework.

To quote a paper on Monadic Parser Combinators, by Graham Hutton and Erik Meijer:

In functional programming, a popular approach to building recursive descent parsers is to model parsers as functions, and to define higher-order functions (or combinators) that implement grammar constructions such as sequencing, choice, and repetition.

In a nutshell, we’re aiming to write a framework that consists of:

• Very simple primitive parsers
• Functions for composing those parsers together

Once we’ve got those two things, we will be able to compose our simple parsers together to form complicated ones, eventually solving the kata.

This pattern, starting with very simple primitives that are easy to compose, is very common in Functional Programming.

I’d go as far as saying that this principle is central to good Functional Programming Design.

## The Framework

### `Parser[A]`

A `Parser[A]` is a “Single Abstract Method” class

“Single Abstract Method” classes are isomorphic to a function. In this case, the function is `parse` with type `String => Option[(String, A)]`

You can think of a parser as taking a String, and either:

• consuming part of that string, and returning a value
• failing, and returning no value

The `+A` in the type signature means that our parser is “covariant” If we have a `Parser[A]` and A extends B, we can use our `Parser[A]` as if it were a `Parser[B]`. This doesn’t come up very much, so you can mostly ignore it.

``trait Parser[+A] { self =>``

### `parse`

the “Single Abstract Method”.

Every “concrete” Parser must implement `parse`.

If we were doing this “in production”, we’d probably want to do something more efficient than creating a new string in every single `Parser`. For example, we could model this by updating an “index” into the string. Fortunately, in our case, none of the strings are going to be particularly large or complex, so we can get away with it.

``  def parse(str: String): Option[(String, A)]``

### `flatMap`

`flatMap` takes a function from `A => Parser[B]` and produces a `Parser[B]` which:

• runs `this` Parser, to produce an A, and consume part of the string
• calls the function `f(a)` to produce a `Parser[B]`
• runs that parser on the remainder of the string

Implementing `flatMap` on a trait implements the “Monad” pattern.

Amongst other things this lets us write “for comprehensions”, these “de-sugar” into chains of `flatMap`s

``````  def flatMap[B](f: A => Parser[B]): Parser[B] =
new Parser[B] {
override def parse(str: String): Option[(String, B)] =
for {
(s1, a) <- self.parse(str)
(s2, b) <- f(a).parse(s1)
} yield (s2, b)
}``````

### `orElse`

this mimics Scala’s `Option.orElse`.

It returns the result of “this parser”, if there is one, otherwise it returns the result of the parser in the argument.

``````  def orElse[AA >: A](p2: Parser[AA]): Parser[AA] =
new Parser[AA] {
override def parse(str: String): Option[(String, AA)] =
self.parse(str).orElse(p2.parse(str))
}``````

### `map`

The famous “map” function.

This transforms our `Parser[A]` into a `Parser[B]` using a function `A => B`.

We can write it, in terms of `flatMap` seen above, and `Parser.pure` (which we’ll get to later).

`Parser.pure` always returns a constant value, `and doesn't` consume any part of the string.

``  def map[B](f: A => B): Parser[B] = self.flatMap(a => Parser.pure(f(a)))``

### `replace`

This is really just a special case of map, we’re passing a constant value, rather than a function.

This doesn’t let us do anything that we couldn’t do with `parser.map(_ => ...`, except that it makes it slightly clearer that we’re discarding the parsed value.

``  def replace[B](b: B): Parser[B] = self.map(_ => b)``

### `emap`

`emap` is like map, except that our function `f` can fail, hence it returns an `Option`.

Much like `map` we can implement this in terms of `flatMap` and `pure`.

We also need `Parser.failure`, a parser which always fails to parse it’s argument.

``````  def emap[B](f: A => Option[B]): Parser[B] =
self.flatMap(f(_) match {
case Some(value) => Parser.pure(value)
case None        => Parser.failure
})``````

### `many`

`many` returns the result of running this parser over and over again until it eventually fails.

Because the parser may be run multiple times, this produces a `Parser[List[A]]`.

In “strict FP” terms, `many` comes from the `Alternative` typeclass.

``````  def many: Parser[List[A]] =
(for {
h <- self
t <- many
} yield (h :: t))
.orElse(Parser.pure(List.empty))
}``````

### `object Parser`

At this point, we’ve finished with the `Parser[+A]` trait, and we’re now working on it’s companion object.

We’ve now defined all of the methods that are available on a `Parser`: our “combinators”.

Next we want to build some simple, concrete, parsers.

``````object Parser {
import scala.util.matching.Regex
import scala.util.Try``````

### `Parser.pure`

As mentioned above, `Parser.pure` always returns a constant value, and doesn’t consume any part of the string.

In this particular case, we only use this in order to implement `map` in terms of `flatMap`.

But in general, `Monad`s should have an implementation of both `flatMap` and `pure`.

``````  def pure[A](a: A): Parser[A] =
new Parser[A] {
override def parse(str: String): Option[(String, A)] = Some((str, a))
}``````

### `Parser.failure`

A parser that always fails, and never returns anything.

We only really use this in order to implement `emap` in terms of flatMap, so you can ignore it from here on out.

In Scala, `Nothing` is a subtype of every type, so we can use a `Parser[Nothing]` in place of any other Parser, because our Parser is covariant, as mentioned above`.

``````  def failure: Parser[Nothing] =
new Parser[Nothing] {
override def parse(str: String): Option[(String, Nothing)] = None
}``````

### `Parser.exactly`

A parser that succeeds only if the string starts with a specific sequence.

It always returns that sequence, consuming it from the input.

This may seem trivial, but it can be composed with other Parsers to solve difficult problems.

``````  def exactly(start: String): Parser[String] =
new Parser[String] {
override def parse(str: String): Option[(String, String)] =
if (str.startsWith(start)) {
Some((str.drop(start.length()), start))
} else {
None
}
}``````

### `Parser.matches`

This is the same as `exactly` except it takes a `Regex` instead of a string.

I could probably have written `exactly` in terms of `matches`, but this felt simpler, and more efficient.

Even when doing FP, we can still find a use for `Regex`.

``````  def matches(regexp: Regex): Parser[String] =
new Parser[String] {
override def parse(str: String): Option[(String, String)] =
regexp
.findPrefixMatchOf(str)
.map(aMatch => (aMatch.after.toString(), aMatch.matched))
}``````

### `Parser.int`

A Parser that will parse and return an integer.

This will fail if our input doesn’t match the regexp or if the conversion to an int fails.

Finally this is looking like it’s approaching a solution to our problem and not just “miscellaneous category theory”

``````  def int: Parser[Int] =
matches("-?[0-9]+".r).emap(s => Try { s.toInt }.toOption)``````

### `Parser.spaces`

A parser that matches whitespace.

We’ll generally be discarding the result of this, but it’s useful to indicate that the content of 2 parsers may be separated by whitespace.

``  def spaces: Parser[String] = matches("\\s?".r)``

### `Parser.end`

Parse a string, and fail if the string isn’t empty.

Returns `Unit`; we can’t get any information from an empty string.

This is useful for running as the last step in a parser. It guarantees that we don’t return success if passed a valid string, followed by garbage.

``````  def end: Parser[Unit] =
new Parser[Unit] {
override def parse(str: String): Option[(String, Unit)] =
str match {
case "" => Some(("", ()))
case _  => None
}
}
}``````

## Solving the Actual Problem

From here on out, the code is less “frameworky” and more “targeted to the problem”.

Hence I’m moving from the `Parser` trait, and companion object and starting to write code in a `Wordy` object.

There are a number of good Scala Parser Combinator frameworks, and in the wild, I probably wouldn’t have written any of the code up until this point.

``object Wordy {``

### `Wordy.infixOperatorParser`

This is a parser that takes any of the operators, “multiplied by”, “divided by”, “plus” or “minus”.

And parses it into the corresponding “binary” function, in this case a function from `Int => Int => Int` (note that this is written in “curried form” (you can probably ignore what that means)).

``````  lazy val infixOperatorParser: Parser[Int => Int => Int] = {
val multiplied =
Parser
.exactly("multiplied by")
.replace[Int => Int => Int](a => b => a * b)
val divided =
Parser.exactly("divided by").replace[Int => Int => Int](a => b => a / b)
val plus =
Parser.exactly("plus").replace[Int => Int => Int](a => b => a + b)
val minus =
Parser.exactly("minus").replace[Int => Int => Int](a => b => a - b)
List(multiplied, divided, plus, minus).reduce(_ orElse _)
}``````

### `Wordy.infixFnParser`

First runs `infixOperatorParser` to get a binary function.

Then it parses the following int, and returns a function `Int => Int` obtained by running operator with the int as the second argument.

For example “multiplied by 4” would be parsed into a function `_ * 4`. and “plus 3” would be parsed into a function `_ + 3`.

``````  lazy val infixFnParser: Parser[Int => Int] = for {
operator <- infixOperatorParser
_ <- Parser.spaces
value <- Parser.int
} yield (a => operator(a)(value))``````

### `Wordy.postfixFnParser`

This is very similar to `infixOperatorParser`. Except that it implements postfix functions, such as “3 cubed”

``````  lazy val postfixFnParser: Parser[Int => Int] = List(
Parser.exactly("squared").replace[Int => Int](a => a * a),
Parser.exactly("cubed").replace[Int => Int](a => a * a * a)
).reduce(_ orElse _)``````

### `Wordy.fnParser`

This matches either “partially applied” infix operators, such as “plus 3”, or postfix operators, such as “squared”. It also consumes any leading spaces

``````  lazy val fnParser: Parser[Int => Int] =
Parser.spaces.flatMap(_ => infixFnParser.orElse(postfixFnParser))``````

### `eqnParser`

Parse an initial Int, followed by a sequence of Operators and Ints as Functions `Int => Int`.

Then apply the functions to the initial value from Left to Right.

It should be noted that if there are no functions, this returns the initial value.

This parses the “-12 divided by 2 divided by -3” part of the problem

``````  lazy val eqnParser: Parser[Int] = for {
first <- Parser.int
functions <- fnParser.many
} yield functions.foldLeft(first) { (acc, f) => f(acc) }``````

### `Wordy.wordyParser`

Wrap `eqnParser`, first applying a parser that looks for the “What is” then apply `eqnParser`. And finally, consumes the trailing “?”, before validating that the entire string has been matched.

``````  val wordyParser: Parser[Int] = for {
_ <- Parser.exactly("What is ")
res <- eqnParser
_ <- Parser.exactly("?")
_ <- Parser.end
} yield res``````

### `Wordy.answer`

Finally, our implementation of `answer`, just call `wordyParser.parse`, and get the result.

``````  def answer(str: String): Option[Int] = wordyParser.parse(str).map(_._2)
}``````

This blog post is Literate Scala, view the raw source-code here.