Programming a simple expression interpreter using Stacks

Ikey Benzaken
5 min readMay 4, 2019

The Goal

I want to write a function that takes a mathematical expression (ie “1+2*3/4+5”) and computes its result using the order of operations known as PEMDAS: Parenthesis, Exponents, Multiplication, Division, Addition, and Subtraction. For simplicity, I’m going to leave parenthesis out of this solution, that will be for a more ambicious day. Today I hope to get my function to compute exponents, then multiplication or division, then addition or subtraction, of any valid input string.

The Strategy

I’m going to use the Python programming language to write out this solution. Python already has a built-in eval() function that does exactly what I’m trying to do — but I want to know how it does it! I will use eval() against my solution to see if it gets the same results.

This problem could be broken down into three functions. One function that will split a string based on certain characters. One function that will compute the mathematical result of two numbers given any operation. And the most important function will recursively simplify the expression until we’re left with a result.

The three functions we’re going to implement

For example: Given input string “1+2*3–4/5”

Our simplify function is going to have to keep track of three things: 1) Which operations we should simplify this run through 2) The rest of the operations for further simplification 3) The stack of substrings that either need to be calculated or further simplified.

It will run the given string through our split_by function, passing the current operations we’re simplifying (ie [“+”, “-”]) as symbols to split by. Then we’d end up with something like [“1”, “+”, “2*3”, “-”, “4/5”]. We’ll throw these values into a stack.

While the stack has more than one element in it, we’ll iteratively: 1) Pop the last three elements from it (we can assume the order of elements will be numOne, operator, numTwo), 2) Further simplify the popped numbers (recursively), 3) Calculate their result, 4) Push the result back onto the stack

Helper Functions

The two helper functions for our simplify function will be split_by() and calculate(). Lets psuedo code those.

split_by() will take two arguments: the string, and the symbols we want the string to be split by. The function will loop over each character in string, and check if the character is in the list of symbols we should split by. If the character is not a split symbol, then we’ll add the character to a variable called curr (for the current substring). If the character is a split symbol, we’ll append the curr_str, and then the symbol, to a list containing all the substrings that we’ll later return.

calculate() is way more self explainatory. It just checks which operation it should perform on num1 and num2 then returns the result.

Ok, nice. We have our helper functions. Now onto the main attraction, simplify().

The Meat and Potatoes

I tried to pseudo code the simplify() function for the purposes of this blog post, but I felt that the pseudo code was more confusing than helpful. Alternatively, here is a diagram of how the stack of characters should look as it’s being processed by the simplify function:

The most important aspect of this algorithm is how the stack processes each mini expression. We pop the top three elements off the stack, the first and third elements (num1 and num2) get recursively simplified, and after we calculate the result, we push it back onto the stack!

Our simplify() function should look similar to this:

A few things to note:

  1. We store variables for current operations and the rest of operations. We had to make a shallow copy of the given operations list because if we modified it directly, it would become empty after just two recursive calls (Since there are only two levels of operations that we simplify by), leaving a lot of the string unsimplified and eventually breaking our algorithm.
  2. When we create the stack, we reverse the list of substrings returned by split_by(). This is because we want the begining of our list to be at the top of our stack. So we feed the strings into the stack in reverse order.
  3. while len(symbols) > 1. Why 1? Because when there is one symbol left in our stack we know that it must be the answer, since we always push the end result back onto it.
  4. The big O notation of this is probably so bad that I don’t want to calculate and admit it. So I won’t ¯\_(ツ)_/¯

Are We Done Yet?

Well, yes and no. For the purposes of this blog post, I achieved what I set out to achieve: A simple math expression interpreter that conforms to the PEMDAS order of operations. However, it is by no means an efficient solution to the problem. We iterate over the strings characters a wasteful amount of times and there are probably many ways I could improve this solution.

In a future attempt at this, I’d like to have it simplify expressions that have parenthesis and exponents too. And when I have an O(n) working version of that, I’ll consider this problem solved.

--

--