I was asked once to write some code that would parse the following string and evaluate it based on a few simple rules.
( + ( + 4 ( * 7 8 ) ) ( * 10 31 ) )
The grammar for this string can be described like so:
expression: "( operator operand operand )"
operator: "*" | "+"
operand: expression | integer
An expression
is comprised of an open paren (
, a space, an operator
then an operand
followed by another operand
. Finally a space and a closing paren )
.
An operator
can be either an asterix *
or a plus sign +
.
An operand
can be either another expression
or an integer.
Here are a two simple examples:
( + 2 5) = 7
( * ( + 9 3 ) ( + 4 6 ) ) = 120
Let’s also make a few assumptions:
Given the assumptions above we have a pretty convenient way in Swift of breaking a String into an array of Strings:
let expression = "( + 2 5)"
var components = expression.components(separatedBy: " ")
// ["(", "+", "2", "5)"]
Whenever you see a pattern composed of itself you might think of how it can be broken down recursively. In this case an expression
can be comprised of expression
s.
Let’s write a function that will take our [String]
and start parsing it. Eventually we will return an Int
representing the result of the evaluated expression
:
func eval(components: inout [String]) -> Int {
let first = components.removeFirst()
if first != "(" {
return Int(first)!
}
// ...
}
Here we pop the first String off the Array and if it is not an open paren we cast to an integer and return. What we want to identify is our integer base case. A base case is simply a code-path in our recursive function that does not recurse and produces a result, trivially. It can sometimes be thought of as the “terminating case”. Here we should think of it as having reached a leaf node in our expression tree. Once it is found, simply return the parsed Int
value.
If first
is not an open paren, the only thing it could be is an operator
. So let’s remove that next.
func eval(components: inout [String]) -> Int {
let first = components.removeFirst()
if first != "(" {
return Int(first)!
}
let op = components.removeFirst()
// ...
}
Now that we know we’ve popped the operator
the next two things could only be operand
s. Since an operand
can be either an expression
or an Int
we can start recursing on the components
array that is now a bit shorter.
Remember, for the Int
operand
case, we’ve dealt with that first in our function so recursing here makes sense.
func eval(components: inout [String]) -> Int {
let first = components.removeFirst()
if first != "(" {
return Int(first)!
}
let op = components.removeFirst()
let left = eval(components: &components)
let right = eval(components: &components)
// ...
}
What we’re doing here is recursing down the “left” side of the state of the components
String, which will eventually return an Int
value.
In the most trivial example ( + 2 5)
left will return 2
.
Immediately we will do the same thing for the right side operand
and set that value to a variable called right
.
In the most trivial example ( + 2 5)
right will return 5
.
So far so good?
As a little clean up, after recursing left and right, we’ll need to remember our expression syntax and remove the trailing paren:
func eval(components: inout [String]) -> Int {
let first = components.removeFirst()
if first != "(" {
return Int(first)!
}
let op = components.removeFirst()
let left = eval(components: &components)
let right = eval(components: &components)
components.removeFirst() // Remove that last paren!
// ...
}
Let’s set a breakpoint and think for a minute. If you’re keeping track of the variables in scope for the trivial example you should have the following:
op = "+"
left = 2
right = 5
We got here by:
"("
"+"
2
5
")"
Does that make sense? If you stop for a second and look at the steps in this list you can see how it can be applied to any expression tree of this type, of any length.
Now that have an operator
and two Int
s we can evaluate the expression and return its result. For that write a new function to apply the operator
the operands
s:
func apply(_ op: String, _ lhs: Int, _ rhs: Int) -> Int {
if op == "+" {
return lhs + rhs
}
return lhs * rhs
}
Now, use it in the recursive function and return the result of our applied operand
at the tail end of the eval
function.
func eval(components: inout [String]) -> Int {
let first = components.removeFirst()
if first != "(" {
return Int(first)!
}
let op = components.removeFirst()
let left = eval(components: &components)
let right = eval(components: &components)
components.removeFirst()
// Do the maths
return apply(op, left, right)
}
To make this a usable function we’ll need to call eval
and pass to it the [String]
and return the result:
func evaluate(expression: String) -> Int {
var components = expression.components(separatedBy: " ")
return eval(components: &components)
}
Note, we’re mutating the array of String
’s and so need to pass it by reference. It is not treated as a value type by our function.
Here are a few simple tests:
evaluate(expression: "( + 2 5 )")
// 7
evaluate(expression: "( * ( + 9 3 ) ( + 4 6 ) )")
// 120
evaluate(expression: "( + ( + 4 ( * 7 8 ) ) ( * 10 31 ) )")
// 370
There are probably more Swift-like ways of parsing strings expessions like this and there’s probably some overhead with using String
instead of Character
but in-all this is an interesting exploration into string parsing.
In Part 2, we’ll look at using Swift’s Enumeration type to make our expression grammar more robust