Dynamic evaluation of an expression

In this post, I'll tackle Step#2 of evaluating an expression like 2 * (3 + 5) and return 16.
  1. Convert expression from infix to postfix (see the last 2 posts)
  2. Evaluate postfix expression to determine result
As always, write a new test for ExpressionConverter


[Test]
public void TestEvaluate()
{
ExpressionConverter converter = new ExpressionConverter();
Assert.AreEqual(10, converter.EvaluateInfixExpression("((1+2) + 3) * 8/4"));
}


The implementation is very simple now that we have the ability to 'convert to postfix'.

public object EvaluateInfixExpression(string sInfixExpression)
{
List<object> postfixExpression = this.ConvertInfixToPostfix(sInfixExpression);
Stack<object> stack = new Stack<object();
foreach (object term in postfixExpression)
{
if (term is int)
{
stack.Push(term);
}
else
{
Operator operatorTerm = term as Operator;
int iValue2 = (int) stack.Pop();
int iValue1 = (int) stack.Pop();
stack.Push( operatorTerm.Evaluate(iValue1, iValue2) );
}
}
return stack.Peek();
}

Except that we don't have Operator#Evaluate()... Lets get over that hurdle. Mark this test with a temp Ignore tag. Write a new test in the TestOperator fixture

[Test]
public void TestPlusOperator_Evaluate()
{
Assert.AreEqual(30, Operator.Plus.Evaluate(10, 20));
}

If each operator could have a associated evaluate code block that evaluated 2 parameters and returned the result, we'd be done. A new member variable, a new parameterized ctor and a new code block later.. Here are the changes

public class Operator
{
public static readonly Operator Plus = new Operator("+", 1, (x, y) => x + y );
private Func<int,int,int> m_evaluateDelegate;

private Operator(string sSymbol, int iPrecedenceLevel, Func<int,int,int> evaluateDelegate):this(sSymbol, iPrecedenceLevel)
{
m_evaluateDelegate = evaluateDelegate;
}

public int Evaluate(int iValue1, int iValue2)
{
return m_evaluateDelegate(iValue1,iValue2);
}


Green. Ditto for the other operators -, *, /
All done. Finally a console app to do this


static void Main(string[] args)
{
ExpressionConverter expr = new ExpressionConverter();

Console.WriteLine("Postfix version is ");
foreach (object term in expr.ConvertInfixToPostfix(args[0]))
{ Console.Write(term + " "); }
Console.WriteLine();

Console.WriteLine("Expression evaluated to " + expr.EvaluateInfixExpression(args[0]) );
}


This is how it works.
Release>DynCalc.exe "((1+2) + 3) * 2 - 8/4"
Postfix version is
1 2 + 3 + 2 * 8 4 / -
Expression evaluated to 10


Now ExpressionConverter is no longer a converter.. we should probably rename it to Expression. Also it has no instance state. All methods can therefore be converted to static methods - no need to create an instance to invoke members. Done.
Other than non-happy paths like malformed expressions.. (can be done) it's all good. End of series. I had fun..

Test driving an infix to postfix expression converter (2 of 3)

So at the end of Post 1 of this series. we've a converter that doesn't yet understand operator precedence.
Let me refresh my memory before proceeding..
An operator token belongs to a class/family of 'Operators', has a string representation like *, +, /, - and finally has a relative precedence associated with it.
1 for +, -
2 for *, /
3 for (, )

Seems like we need an Operator class instead of just a string. I'll mark the failing test with [Ignore("Detour")] tag while we go make some operators.
Create a new test fixture, imagine a magical Operator class.. something like this


TestOperator.cs

[Test]
public void TestParse_PlusOperator()
{
Operator plusOp = Operator.Parse("+");
Assert.IsNotNull(plusOp);
Assert.AreEqual("+", plusOp.ToString());
Assert.AreEqual(1, plusOp.PrecedenceLevel);
}


I turn my ideas about Operators into a Type/Class.

public class Operator
{
private string m_OperatorSymbol;
private int m_iPrecedenceLevel;
private static readonly Operator plusOperator = new Operator("+", 1);

private Operator(string sSymbol, int iPrecedenceLevel)
{
m_OperatorSymbol = sSymbol;
m_iPrecedenceLevel = iPrecedenceLevel;
}

public int PrecedenceLevel
{
get { return m_iPrecedenceLevel; }
}
public override string ToString()
{ return m_OperatorSymbol; }


public static Operator Parse(string stringToParse)
{ return Operator.plusOperator; }
}



Green! I took the quick way out with the Parse implementation.. I'll let the coming tests force me to distill that bit of code. I add similar tests for Minus (and then Multiply and Divide). I end up with a need for a 'switch case' in Parse().
Also Parse should probably throw if its passed a token that it doesnt recognize. I'll note that down as a test too.


[Test]
[ExpectedException(ExceptionType=typeof(ArgumentException), ExpectedMessage="Unknown Operator! Cannot parse Gishu")]
public void TestParse_InvalidInput()
{
Operator noOp = Operator.Parse("Gishu");
Assert.Fail("Operator learnt some AI to parse Gishu. This should have blown up by now!");
}


[Furious typing] Alright now Operator.Parse() looks like this.. all its tests pass.

public static Operator Parse(string stringToParse)
{
switch (stringToParse)
{
case "+":
return Operator.plusOperator;
case "-":
return Operator.minusOperator;
case "*":
return Operator.multiplyOperator;
case "/":
return Operator.divisionOperator;

default:
throw new ArgumentException("Unknown Operator! Cannot parse " + stringToParse);
}
}


Now lets update our tokenizer to use this new type. This looks like it might break a few tests.. but with baby steps and a vigilant eye, it'll all be better by the time I'm done with it. I'll start by modifying one of my initial tokenizer tests...

[RowTest]
[Row("+")]
[Row(" + ")]
public void ReadPlusOperator(string sInput)
{
Tokenizer t = new Tokenizer(sInput);

//Assert.AreEqual("+", t.GetNextToken());
Assert.AreEqual(Operator.Plus, t.GetNextToken());
}


This would require us to make the static plusOperator member public. The other option is to check each member of the operator object… which would be duplication of the testfixture for Operator. So lets share our operators with the world.. make em public and rename.

public class Operator
{
public static readonly Operator Plus = new Operator("+", 1);
//...

And we're red...
TestDynCalc.TestTokenizer.ReadPlusOperator( + ):
Expected: <+>
But was: "+"


Instead of making a big bang change.. let’s make an ugly hack inside Tokenizer#GetNextToken() to make just this test pass..

Regex regexOperator = new Regex(@"^\s*([-+*/()])");
obResults = regexOperator.Match(m_sInputData);
if (obResults.Success)
{
Group obGroup = obResults.Groups[1];
RemoveReadCharacters(obGroup);

//return obGroup.Value;
if (obGroup.Value != "+")
return obGroup.Value;
else
return Operator.Plus;

}

I broke 2 tests - that tested the tokenizer with complete expressions containining +. Turns out its easy to fix them.

[Test]
public void ReadMultipleTokens()
{
Tokenizer t = new Tokenizer(" 2 + 3 - 34 * 2 / 1 ");
//object[] arrExpectedTokens = new object[] { 2, "+", 3, "-", 34, "*", 2, "/", 1 };
object[] arrExpectedTokens = new object[] { 2, Operator.Plus, 3, "-", 34, "*", 2, "/", 1 };
foreach (object token in arrExpectedTokens)
{
Assert.AreEqual(token, t.GetNextToken());
}
}

All green. Excuse me while I repeat the procedure for all the other operators. The hack in Tokenizer#GetNextToken() can now be replaced by ...

return Operator.Parse(obGroup.Value);


However we forgot the parantheses bros. while creating our operators. We need more operators! Mark Tokenizer with Ignore(“2 TDDed operators coming right up…”).. [more furious typing] we have new operators. Take off the ignore tag. Similar changes… and Tokenizer is now Operator-aware.
Time to take off the ignore tag on ExpressionConverter tests and end our little detour.
So we're back to red.. but this time, we've operators with precedence. [after some mental acrobatics with a pencil and an eraser on paper]
The key to getting this is to check if the operator on top of stack is not a '(' and has higher precedence than the incoming/current token. If yes, pop the stack, write popped operator to output and retry. If no, push the incoming token onto the stack.
So + should not be pushed if the top of stack[TOS] contains *; however * should be if TOS is +.
So we add some more code to the final else block in ExpressionConverter#ConvertToPostfix..


if (obStack.Count() > 0)
{
Operator termOnTopOfStack = obStack.Peek() as Operator;
if ((termOnTopOfStack != Operator.OpenParantheses)
&& (termOnTopOfStack.PrecedenceLevel > (term as Operator).PrecedenceLevel))
{
postfixExpr.Add(obStack.Pop());
continue;
}
}

obStack.Push(term);


All Green! Woohoo! However the method seems to have grown to an uncomfortable size for me... maybe I can refactor.

  • Rename method to ExpressionConverter#ConvertInfixToPostfix()

  • Stack always contains Operators.. So we can do away the casting...Stack stackOfOperators

  • Replace ‘term.ToString() == ")" ‘ with type-based checks

  • Each of ( and ) is a parenthesis together they make a set of parentheses. http://en.wikipedia.org/wiki/Parenthesis... well you live, you read wikis, you learn. Correct typos

  • Extract some methods to clarify intent.
    Had to promote the stack and postfixExpr variables to class members. (All the extracted methods required the stack and List variables to be passed in)



Looking good. I'll post the refactored version shortly. Let's try another expression/test.


[Row("0-12000-(16*4)-20", "0 12000 - 16 4 * - 20 -")]


Red again! [A few brain cycles later]
This breaks because the converter doesn’t recognize Left to Right precedence for operators at the same level. (In 0 - 5 + 3, - should be performed first to get the result of -2 even though precedence-wise + and - are equivalent.)
But this can be easily solved by replacing > with >= where we compare operator precedence. Here’s the final version of the class.


public class ExpressionConverter
{
List<object> m_postfixExpr = new List<object>();
Stack<Operator> m_stackOfOperators = new Stack<Operator>();
public List<Object> ConvertInfixToPostfix(string sInfixExpr)
{
m_postfixExpr.Clear(); m_stackOfOperators.Clear();

Tokenizer parser = new Tokenizer(sInfixExpr);
object term = parser.GetNextToken();
while (term != null)
{
if (term is int)
{
m_postfixExpr.Add(term);
}
else if (term == Operator.CloseParenthesis)
{
MoveOperatorsOnStackToPostfixExprTillMatchingOpenParenthesisIsFound();
}
else
{
Operator operatorTerm = term as Operator;
if (OperatorOnTopOfStackHasHigherOrSamePrecedenceAs(operatorTerm))
{
m_postfixExpr.Add(m_stackOfOperators.Pop());
continue;
}

m_stackOfOperators.Push(operatorTerm);
}

term = parser.GetNextToken();
}

PopAllRemainingOperatorsOnStackToPostfixExpr();
return m_postfixExpr;
}

private bool OperatorOnTopOfStackHasHigherOrSamePrecedenceAs(Operator term)
{
if (m_stackOfOperators.Count() == 0)
return false;

Operator termOnTopOfStack = m_stackOfOperators.Peek();
return ((termOnTopOfStack != Operator.OpenParenthesis)
&& (termOnTopOfStack.PrecedenceLevel >= term.PrecedenceLevel));
}

private void MoveOperatorsOnStackToPostfixExprTillMatchingOpenParenthesisIsFound()
{
Operator term = m_stackOfOperators.Pop();
while (term != Operator.OpenParenthesis)
{
m_postfixExpr.Add(term);
term = m_stackOfOperators.Pop();
}
}

private void PopAllRemainingOperatorsOnStackToPostfixExpr()
{
while (m_stackOfOperators.Count > 0)
m_postfixExpr.Add(m_stackOfOperators.Pop());
}
}


Code is good. End of post. In the next post, I'll write a console 'main' app to exercise the converter and evaluate the expression.

Test driving an infix to postfix expression converter (1 of 2)

I've been wanting to understand infix to postfix conversion - strangely intriguing to me. Here's what I'm about to ramble on for a while today...
Given an expression in infix form (operators appear inline in the expression) like
3 * (2 + 5)
Transform it to its postfix equivalent
3 2 5 + *


This form can be easily evaluated by a computer. You keep pushing terms onto a stack.. everytime you hit a operator, you pop the last 2 values and perform the operation, push the result back on the stack. In time, you will run out of terms and have the result at the top of the stack.
e.g. Stack builds from left to right (TOS is extreme right)
Read 3 Stack [3]
Read 2 Stack [3 2]
Read 5 Stack [3 2 5]
Read + Stack [3 7] #Pop 5,2 Perform 2+5, Store 7 back on stack
Read * Stack [21] # Pop 7,3 Perform 3*7, Store 21 back on stack.


Time and again I've procrastinated.. Then I hit Bart De Smet's blog post dated Oct 2006 - It's time! :). You may want to read up on the algorithm to parse an infix expression.

Just to make things interesting I'm going to do it TDD style.. red-green-refactor to glory. I'm trying to shorten the post to the essence.

I've TDDed myself a Tokenizer, a class that parses a string, uses a couple of regular expressions and returns the next number/operator with each call to GetNextToken().

Tokenizer.cs


public class Tokenizer
{
string m_sInputData;
public Tokenizer(string sInput)
{ m_sInputData = sInput; }

public object GetNextToken()
{
Regex regexNumber = new Regex(@"^\s*(\d+)");
Match obResults = regexNumber.Match(m_sInputData);
if (obResults.Success)
{
Group obGroup = obResults.Groups[1];
RemoveReadCharacters(obGroup);
return Int32.Parse(obGroup.Value);
}

Regex regexOperator = new Regex(@"^\s*([-+*/()])");
obResults = regexOperator.Match(m_sInputData);
if (obResults.Success)
{
Group obGroup = obResults.Groups[1];
RemoveReadCharacters(obGroup);
returnobGroup.Value;
}

return null;
}

private void RemoveReadCharacters(Group obMatchedGroup)
{
m_sInputData = m_sInputData.Remove(0, obMatchedGroup.Index + obMatchedGroup.Length);
}
}


The tokenizer has around 20 tests. I'll post one of the latter tests to illustrate usage

TestTokenizer.cs

[Test]
public void ReadMultipleTokensWithParantheses()
{
Tokenizer t = new Tokenizer(" ( 1 + 2 ) * (3 - (8/2))");
object[] arrExpectedTokens = new object[] { "(", 1, "+", 2, ")", "*", "(", 3, "-", "(", 8, "/", 2, ")", ")" };
foreach (object token in arrExpectedTokens)
{
Assert.AreEqual(token, t.GetNextToken());
}
}


All set to start with the meat of the problem: ExpressionConverter. So we'll come up with out first test employing some wishful thinking. The output is going to be reordered sequence of numbers and operators.. hence List<objects>

TestExpressionConverter.cs

[Test]
public void TestConvert_Simple()
{
ExpressionConverter converter = new ExpressionConverter();
List<object> postfixExpr = converter.ConvertToPostfix(" 2 + 3 ");
StringBuilder sb = new StringBuilder();
foreach(object term in postfixExpr)
{
sb.AppendFormat("{0} ", term.ToString());
}
Assert.AreEqual("2 3 +", sb.ToString());
}


To make this compile, I need to define the ExpressionConverter class and the ConvertToPostfix method within it. Something like this

public List<Object> ConvertToPostfix(string sInfixExpr)
{
List<object> postfixExpr = new List<object>();
Stack<object> obStack = new Stack<object>();

Tokenizer parser = new Tokenizer(sInfixExpr);
object term = parser.GetNextToken();
while (term != null)
{
if (term is int)
{ postfixExpr.Add(term); }
else
{ obStack.Push(term); }

term = parser.GetNextToken();
}

while (obStack.Count > 0)
postfixExpr.Add(obStack.Pop());

return postfixExpr;
}


The simplest thing that works. Collect all numeric terms in the result List<>, push operators onto the stack. At the end, pop contents of the stack into the List. Green!
Let's turn on the heat, see how it deals with grouping. Since the test is basically the same, albeit with diff input and hence output, I found an unorthodox way to use the RowTest NUnit extension I blogged about earlier. I make a couple of changes to the test like this... The last parameter in the list is the ExpectedValue for the test. So now I can run the same test with different input and expected values by adding Row Attributes. Might be a bit hard to read for the new guy.. but not too bad.



[RowTest]
[Row(" 2 + 3 ", "2 3 +")]
[Row(" 2 + (30 + 50 ) ", "2 30 50 + +")]
public void TestConvertInfixToPostfix(string sInfixExpr, string sExpectedPostfixExpr)
{
ExpressionConverter converter = new ExpressionConverter();
List<object> postfixExpr = converter.ConvertInfixToPostfix(sInfixExpr);

StringBuilder sb = new StringBuilder();
foreach(object term in postfixExpr)
{
sb.AppendFormat("{0} ", term.ToString());
}
Assert.AreEqual(sExpectedPostfixExpr, sb.ToString().Trim());
}


This fails as expected. Our code doesn't do parentheses yet. When we read a ), we need to pop out all operators till the matching (. Both ( ) are discarded and never part of the postfix expression. We can make this change in ConvertToPostfix()

if (term is int)
{
postfixExpr.Add(term);
}

else if (term.ToString() == ")")
{
object poppedTerm = obStack.Pop();
while (poppedTerm.ToString() != "(")
{
postfixExpr.Add(poppedTerm);
poppedTerm = obStack.Pop();
}
}

else
{
obStack.Push(term);
}


and back to Green. Lets take it up a notch. Add a new Row attribute / new test with the following expression

[Row(" ( (10+20) + 30 ) * 20-8/4 ", "10 20 + 30 + 20 * 8 4 / -")]


The test fails. NUnit reports:
TestDynCalc.TestExpressionConverter.TestConvertToPostfix( ( (10+20) + 30 ) * 20-8/4 , 10 20 + 30 + 20 * 8 4 / -):
String lengths are both 25. Strings differ at index 16.
Expected: "10 20 + 30 + 20 * 8 4 / -"
But was: "10 20 + 30 + 20 8 4 / - *"
---------------------------^


Bitten by operator precedence.. * has precedence over -. Is this the end for our converter? Not quite.. Onto Part 2


Agile gone wild (due apologies to Skid Row)

Been watching from the sidelines as to how Agile evolves into mainstream software development...
Its funny how everything is changing (w.r.t the latest fad) and simultaneously how the factors behind the success of the initial methodology starting teams stay the same.. good people with skill & discipline, who communicate and relish-and-respond to feedback combined with an environment that supports their cause.

Collecting a set of links that chronicles the current state of "Agile" - whatever little of the original intent is left of it post the rampant SemanticDiffusion
  1. *** James Shore - The decline and fall of Agile
  2. ** Martin Fowler - Flaccid Scrums
  3. *** Ron Jeffries - Context my Foot!
  4. *** Ron Jeffries - Discovering Essential Technical Practices