Wordy

Posted on September 30, 2020
ScalaProgrammingLiterate

As mentioned in my previous post, I’ve recently been solving problems from Exercism with a bunch of my colleagues.

In this post, I’m discussing my solution to the Wordy exercise.

It’s embarrassing, but this post completely ignores the advice in the last one.

The last post talked about the importance of introducing concepts incrementally. In contrast, here I go all out, and implement a Parser Combinator Framework from scratch.

I’m really only publishing this because I’d already gone to the effort of adding fairly detailed “literate” comments.

The aim of the game in “Wordy” is to parse expressions that look like the following:

What is 2 plus 8 multiplied by 3?

You’re supposed to ignore BIDMAS, and evaluate the expressions from left to right. So in that case, the correct result would be 30.

I decided to slightly overreach the problem description, and also support parsing some “postfix” operations too, such as:

What is 2 squared?

I’m also going to be fairly religious about only accepting solutions that are well formed.

Parser Combinators

As mentioned above, this solution to “Wordy” starts by building a ((relatively) simple) “Parser Combinator” framework.

To quote a paper on Monadic Parser Combinators, by Graham Hutton and Erik Meijer:

In functional programming, a popular approach to building recursive descent parsers is to model parsers as functions, and to define higher-order functions (or combinators) that implement grammar constructions such as sequencing, choice, and repetition.

In a nutshell, we’re aiming to write a framework that consists of:

  • Very simple primitive parsers
  • Functions for composing those parsers together

Once we’ve got those two things, we will be able to compose our simple parsers together to form complicated ones, eventually solving the kata.

This pattern, starting with very simple primitives that are easy to compose, is very common in Functional Programming.

I’d go as far as saying that this principle is central to good Functional Programming Design.

The Framework

Parser[A]

A Parser[A] is a “Single Abstract Method” class

“Single Abstract Method” classes are isomorphic to a function. In this case, the function is parse with type String => Option[(String, A)]

You can think of a parser as taking a String, and either:

  • consuming part of that string, and returning a value
  • failing, and returning no value

The +A in the type signature means that our parser is “covariant” If we have a Parser[A] and A extends B, we can use our Parser[A] as if it were a Parser[B]. This doesn’t come up very much, so you can mostly ignore it.

parse

the “Single Abstract Method”.

Every “concrete” Parser must implement parse.

If we were doing this “in production”, we’d probably want to do something more efficient than creating a new string in every single Parser. For example, we could model this by updating an “index” into the string. Fortunately, in our case, none of the strings are going to be particularly large or complex, so we can get away with it.

flatMap

flatMap takes a function from A => Parser[B] and produces a Parser[B] which:

  • runs this Parser, to produce an A, and consume part of the string
  • calls the function f(a) to produce a Parser[B]
  • runs that parser on the remainder of the string

Implementing flatMap on a trait implements the “Monad” pattern.

Amongst other things this lets us write “for comprehensions”, these “de-sugar” into chains of flatMaps

orElse

this mimics Scala’s Option.orElse.

It returns the result of “this parser”, if there is one, otherwise it returns the result of the parser in the argument.

map

The famous “map” function.

This transforms our Parser[A] into a Parser[B] using a function A => B.

We can write it, in terms of flatMap seen above, and Parser.pure (which we’ll get to later).

Parser.pure always returns a constant value, and doesn't consume any part of the string.

replace

This is really just a special case of map, we’re passing a constant value, rather than a function.

This doesn’t let us do anything that we couldn’t do with parser.map(_ => ..., except that it makes it slightly clearer that we’re discarding the parsed value.

emap

emap is like map, except that our function f can fail, hence it returns an Option.

Much like map we can implement this in terms of flatMap and pure.

We also need Parser.failure, a parser which always fails to parse it’s argument.

many

many returns the result of running this parser over and over again until it eventually fails.

Because the parser may be run multiple times, this produces a Parser[List[A]].

In “strict FP” terms, many comes from the Alternative typeclass.

object Parser

At this point, we’ve finished with the Parser[+A] trait, and we’re now working on it’s companion object.

We’ve now defined all of the methods that are available on a Parser: our “combinators”.

Next we want to build some simple, concrete, parsers.

Parser.pure

As mentioned above, Parser.pure always returns a constant value, and doesn’t consume any part of the string.

In this particular case, we only use this in order to implement map in terms of flatMap.

But in general, Monads should have an implementation of both flatMap and pure.

Parser.failure

A parser that always fails, and never returns anything.

We only really use this in order to implement emap in terms of flatMap, so you can ignore it from here on out.

In Scala, Nothing is a subtype of every type, so we can use a Parser[Nothing] in place of any other Parser, because our Parser is covariant, as mentioned above`.

Parser.exactly

A parser that succeeds only if the string starts with a specific sequence.

It always returns that sequence, consuming it from the input.

This may seem trivial, but it can be composed with other Parsers to solve difficult problems.

Parser.matches

This is the same as exactly except it takes a Regex instead of a string.

I could probably have written exactly in terms of matches, but this felt simpler, and more efficient.

Even when doing FP, we can still find a use for Regex.

Parser.int

A Parser that will parse and return an integer.

This will fail if our input doesn’t match the regexp or if the conversion to an int fails.

Finally this is looking like it’s approaching a solution to our problem and not just “miscellaneous category theory”

Parser.spaces

A parser that matches whitespace.

We’ll generally be discarding the result of this, but it’s useful to indicate that the content of 2 parsers may be separated by whitespace.

Parser.end

Parse a string, and fail if the string isn’t empty.

Returns Unit; we can’t get any information from an empty string.

This is useful for running as the last step in a parser. It guarantees that we don’t return success if passed a valid string, followed by garbage.

Solving the Actual Problem

From here on out, the code is less “frameworky” and more “targeted to the problem”.

Hence I’m moving from the Parser trait, and companion object and starting to write code in a Wordy object.

There are a number of good Scala Parser Combinator frameworks, and in the wild, I probably wouldn’t have written any of the code up until this point.

Wordy.infixOperatorParser

This is a parser that takes any of the operators, “multiplied by”, “divided by”, “plus” or “minus”.

And parses it into the corresponding “binary” function, in this case a function from Int => Int => Int (note that this is written in “curried form” (you can probably ignore what that means)).

Wordy.infixFnParser

First runs infixOperatorParser to get a binary function.

Then it parses the following int, and returns a function Int => Int obtained by running operator with the int as the second argument.

For example “multiplied by 4” would be parsed into a function _ * 4. and “plus 3” would be parsed into a function _ + 3.

Wordy.postfixFnParser

This is very similar to infixOperatorParser. Except that it implements postfix functions, such as “3 cubed”

Wordy.fnParser

This matches either “partially applied” infix operators, such as “plus 3”, or postfix operators, such as “squared”. It also consumes any leading spaces

eqnParser

Parse an initial Int, followed by a sequence of Operators and Ints as Functions Int => Int.

Then apply the functions to the initial value from Left to Right.

It should be noted that if there are no functions, this returns the initial value.

This parses the “-12 divided by 2 divided by -3” part of the problem

Wordy.wordyParser

Wrap eqnParser, first applying a parser that looks for the “What is” then apply eqnParser. And finally, consumes the trailing “?”, before validating that the entire string has been matched.

Wordy.answer

Finally, our implementation of answer, just call wordyParser.parse, and get the result.

This blog post is Literate Scala, view the raw source-code here.