Sources

The project can be cloned from the GitHub repository.
The revision described in this post is 7c8e6ea934b6d3172e3c142c658e9c4287287de3.

Grammar changes

Implementing conditional statements resulted in two grammar changes:

  • introducing a new rule, ifStatement.
  • adding conditionalExpression alternatives to the expression rule.
ifStatement :  'if'  '('? expression ')'? trueStatement=statement ('else' falseStatement=statement)?;
expression : varReference #VARREFERENCE
           | value        #VALUE
           //other expression alternatives
           | expression cmp='>' expression #conditionalExpression
           | expression cmp='<' expression #conditionalExpression
           | expression cmp='==' expression #conditionalExpression
           | expression cmp='!=' expression #conditionalExpression
           | expression cmp='>=' expression #conditionalExpression
           | expression cmp='<=' expression #conditionalExpression
           ;

ifStatement rule definition basically means:

  • expression is a condition to be tested.
  • the condition does not have to be placed in parentheses: in '('? ')'? the question marks mean “optional”.
  • trueStatement is meant to be evaluated when the condition is true.
  • the if can be followed by an else.
  • falseStatement is meant to be evaluated when the condition is false.
  • ifStatement is a statement too so it can be used in trueStatement or falseStatement (if … else if … else ).

The new expression alternatives are pretty much self-explanatory. Each one compares two expressions and yields another expression (a boolean value).

To better understand how ‘if’ and ‘else’ can be combined into ‘else if’, take a look at the following snippet:

    if(0) {
        
    } else if(1) {
        
    }

The code is parsed into the following parse tree:

Parse Tree

As you can see, the second if is actually a child of the else. They are on different levels of the hierarchy. There is no need to specify ‘else if’ in the rule explicitly. An ifStatement is itself a statement, so other ifStatements can be used inside an ifStatement. This provides a way to chain them easily.
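The nesting can be mirrored in plain Java. The sketch below is only an illustration: Node, BlockNode, IfNode and ElseIfDemo are hypothetical stand-ins, not the classes ANTLR generates for Enkel. It builds the tree for the snippet above and prints it with indentation, so the inner if shows up one level below the else.

```java
// Hypothetical stand-ins for parse tree nodes; these are not the ANTLR-generated classes.
interface Node {
    String describe(int depth);
}

class BlockNode implements Node {
    public String describe(int depth) {
        return "  ".repeat(depth) + "block\n";
    }
}

class IfNode implements Node {
    final String condition;
    final Node trueStatement;
    final Node falseStatement; // null when there is no else

    IfNode(String condition, Node trueStatement, Node falseStatement) {
        this.condition = condition;
        this.trueStatement = trueStatement;
        this.falseStatement = falseStatement;
    }

    public String describe(int depth) {
        String pad = "  ".repeat(depth);
        StringBuilder sb = new StringBuilder();
        sb.append(pad).append("if(").append(condition).append(")\n");
        sb.append(trueStatement.describe(depth + 1));
        if (falseStatement != null) {
            sb.append(pad).append("else\n");
            // An "else if" is just an IfNode used as the false statement.
            sb.append(falseStatement.describe(depth + 1));
        }
        return sb.toString();
    }
}

class ElseIfDemo {
    // Tree for: if(0) { } else if(1) { }
    static Node sample() {
        Node inner = new IfNode("1", new BlockNode(), null);
        return new IfNode("0", new BlockNode(), inner);
    }

    public static void main(String[] args) {
        System.out.print(sample().describe(0));
    }
}
```

Longer chains (if … else if … else if … else) come out of the same structure: each additional if simply becomes the false statement of the previous one.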

Mapping antlr context objects

The IfStatementContext objects autogenerated by ANTLR are converted into POJO IfStatement objects:

public class StatementVisitor extends EnkelBaseVisitor<Statement> {
    //other stuff
    @Override
    public Statement visitIfStatement(@NotNull EnkelParser.IfStatementContext ctx) {
        ExpressionContext conditionalExpressionContext = ctx.expression();
        Expression condition = conditionalExpressionContext.accept(expressionVisitor); //Map conditional expression
        Statement trueStatement = ctx.trueStatement.accept(this); //Map trueStatement antlr object
        Statement falseStatement = ctx.falseStatement != null ? ctx.falseStatement.accept(this) : null; //Map falseStatement antlr object (it is absent when there is no 'else')

        return new IfStatement(condition, trueStatement, falseStatement);
    } 
}

Conditional Expressions on the other hand are mapped like this:

public class ExpressionVisitor extends EnkelBaseVisitor<Expression> {
    @Override
    public ConditionalExpression visitConditionalExpression(@NotNull EnkelParser.ConditionalExpressionContext ctx) {
        EnkelParser.ExpressionContext leftExpressionCtx = ctx.expression(0); //get left side expression ( ex. 1 < 5  -> it would mean get "1")
        EnkelParser.ExpressionContext rightExpressionCtx = ctx.expression(1); //get right side expression
        Expression leftExpression = leftExpressionCtx.accept(this); //get mapped (to POJO) left expression using this visitor
        //rightExpression might be null! Example: 'if (x)' checks x for nullity. The solution for this case is to assign integer 0 to the rightExpr 
        Expression rightExpression = rightExpressionCtx != null ? rightExpressionCtx.accept(this) : new Value(BultInType.INT,"0"); 
        CompareSign cmpSign = ctx.cmp != null ? CompareSign.fromString(ctx.cmp.getText()) : CompareSign.NOT_EQUAL; //if there is no cmp sign use '!=0' by default
        return new ConditionalExpression(leftExpression, rightExpression, cmpSign);
    }
}

CompareSign is an object representing a comparison sign (‘==’, ‘<’ etc.). It also stores the appropriate bytecode instruction for the comparison (IF_ICMPEQ, IF_ICMPLE etc.).

Generating bytecode

The JVM has a few groups of instructions for conditional branching:

  • if<eq,ne,lt,le,gt,ge> - pops one int value from the stack and compares it to 0.
  • if_icmp<eq,ne,lt,le,gt,ge> - pops two int values from the stack and compares them to each other.
  • comparisons for other primitive types (lcmp - long, fcmpg - float etc.) - these do not branch themselves; they push an int result that one of the if<cond> instructions can then test.
  • if[non]null - checks a reference for null.

For now we’re going to use the second group to compare values (plus ifne from the first group to test the result). These instructions take an operand which is a branch offset (the instruction to proceed to if the condition is met).

Generating ConditionalExpression

The first place where if_icmpne (compares two values for a ‘not equal’ test) and its sibling instructions are going to be used is generating bytecode for a ConditionalExpression:

public void generate(ConditionalExpression conditionalExpression) {
    Expression leftExpression = conditionalExpression.getLeftExpression();
    Expression rightExpression = conditionalExpression.getRightExpression();
    Type type = leftExpression.getType();
    if(type != rightExpression.getType()) {
        throw new ComparisonBetweenDiferentTypesException(leftExpression, rightExpression); //not yet supported
    }
    leftExpression.accept(this);
    rightExpression.accept(this);
    CompareSign compareSign = conditionalExpression.getCompareSign();
    Label trueLabel = new Label(); //represents an address in code (to which to jump if the condition is met)
    Label endLabel = new Label();
    methodVisitor.visitJumpInsn(compareSign.getOpcode(),trueLabel);
    methodVisitor.visitInsn(Opcodes.ICONST_0);
    methodVisitor.visitJumpInsn(Opcodes.GOTO, endLabel);
    methodVisitor.visitLabel(trueLabel);
    methodVisitor.visitInsn(Opcodes.ICONST_1);
    methodVisitor.visitLabel(endLabel);
}

compareSign.getOpcode() returns the instruction for the condition:

public enum CompareSign {
    EQUAL("==", Opcodes.IF_ICMPEQ),
    NOT_EQUAL("!=", Opcodes.IF_ICMPNE),
    LESS("<",Opcodes.IF_ICMPLT),
    GREATER(">",Opcodes.IF_ICMPGT),
    LESS_OR_EQUAL("<=",Opcodes.IF_ICMPLE),
    GRATER_OR_EQAL(">=",Opcodes.IF_ICMPGE);
    //getters
}
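The mapper shown earlier calls CompareSign.fromString, which is hidden behind the //getters comment. One possible shape of such a lookup is sketched below. It is an assumption, not the project's code: the enum is named CompareSignSketch to make that clear, and it uses raw JVM opcode numbers (the values ASM exposes as Opcodes.IF_ICMPEQ and friends) so that it compiles without ASM on the classpath.

```java
import java.util.Arrays;

// Hypothetical sketch of a sign-to-opcode lookup; not the project's CompareSign.
enum CompareSignSketch {
    EQUAL("==", 159),            // if_icmpeq
    NOT_EQUAL("!=", 160),        // if_icmpne
    LESS("<", 161),              // if_icmplt
    GREATER(">", 163),           // if_icmpgt
    LESS_OR_EQUAL("<=", 164),    // if_icmple
    GREATER_OR_EQUAL(">=", 162); // if_icmpge

    private final String sign;
    private final int opcode;

    CompareSignSketch(String sign, int opcode) {
        this.sign = sign;
        this.opcode = opcode;
    }

    String getSign() {
        return sign;
    }

    int getOpcode() {
        return opcode;
    }

    // Finds the constant whose textual sign matches the token text from the parser.
    static CompareSignSketch fromString(String sign) {
        return Arrays.stream(values())
                .filter(candidate -> candidate.sign.equals(sign))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Sign not implemented: " + sign));
    }
}
```

Failing loudly on an unknown sign is a design choice of this sketch; it keeps a grammar change that adds a new operator from silently producing a wrong comparison.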

Conditional instructions take an operand which is a branch offset (label). The two values currently sitting at the top of the stack are popped and compared using compareSign.getOpcode().

If the comparison is positive, a jump to trueLabel is performed. The trueLabel section consists of methodVisitor.visitInsn(Opcodes.ICONST_1);. This means pushing int 1 onto the stack.

If the comparison is negative, no jump is performed. Instead the next instruction is executed (ICONST_0 - push 0 onto the stack). Afterwards the GOTO (unconditional branching instruction) jumps to endLabel. That way the code responsible for the positive comparison is bypassed.

Performing comparison in the manner described above guarantees that the result would be 1 or 0 (int value pushed onto the stack).
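To see why the result is always 0 or 1, the instruction pattern can be replayed in a toy interpreter. The sketch below is not the JVM and not part of Enkel: CompareSimulation, Op and Insn are made-up names, PUSH stands in for the instructions that load the operands, and jump targets are array indexes instead of byte offsets.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy interpreter that only mimics the instruction pattern described above.
class CompareSimulation {
    enum Op { PUSH, IF_ICMPEQ, IF_ICMPNE, IF_ICMPLT, IF_ICMPGE, IF_ICMPGT, IF_ICMPLE, GOTO }

    static final class Insn {
        final Op op;
        final int arg; // constant for PUSH, target index for jumps

        Insn(Op op, int arg) {
            this.op = op;
            this.arg = arg;
        }
    }

    // Same layout as the generator: operands, conditional jump, 0, goto end, trueLabel: 1, end.
    static Insn[] comparison(int left, int right, Op compare) {
        return new Insn[] {
            new Insn(Op.PUSH, left),  // 0: left expression
            new Insn(Op.PUSH, right), // 1: right expression
            new Insn(compare, 5),     // 2: jump to trueLabel (index 5) if the condition holds
            new Insn(Op.PUSH, 0),     // 3: ICONST_0
            new Insn(Op.GOTO, 6),     // 4: skip the true section (index 6 is endLabel)
            new Insn(Op.PUSH, 1)      // 5: trueLabel: ICONST_1
        };
    }

    static int run(Insn[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < program.length) {
            Insn insn = program[pc];
            switch (insn.op) {
                case PUSH:
                    stack.push(insn.arg);
                    pc++;
                    break;
                case GOTO:
                    pc = insn.arg;
                    break;
                default: {
                    int right = stack.pop(); // pushed last, so popped first
                    int left = stack.pop();
                    pc = holds(insn.op, left, right) ? insn.arg : pc + 1;
                }
            }
        }
        return stack.pop(); // exactly one value is left: 0 or 1
    }

    static boolean holds(Op op, int left, int right) {
        switch (op) {
            case IF_ICMPEQ: return left == right;
            case IF_ICMPNE: return left != right;
            case IF_ICMPLT: return left < right;
            case IF_ICMPGE: return left >= right;
            case IF_ICMPGT: return left > right;
            case IF_ICMPLE: return left <= right;
            default: throw new IllegalArgumentException("not a comparison: " + op);
        }
    }
}
```

Whichever path is taken, exactly one of the two constant pushes runs, so the stack ends up holding a single int.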

That way the conditionalExpression can be used as an expression - it can be assigned to a variable, passed as an argument to a function, printed or even returned.

Generating IfStatement

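public void generate(IfStatement ifStatement) {
    Expression condition = ifStatement.getCondition();
    condition.accept(expressionGenrator); //leaves the condition's int value on the stack
    Label trueLabel = new Label();
    Label endLabel = new Label();
    methodVisitor.visitJumpInsn(Opcodes.IFNE,trueLabel); //jump to the true section if the value != 0
    Statement falseStatement = ifStatement.getFalseStatement();
    if(falseStatement != null) { //assumption: null means the if has no 'else'
        falseStatement.accept(this);
    }
    methodVisitor.visitJumpInsn(Opcodes.GOTO,endLabel); //skip the true section
    methodVisitor.visitLabel(trueLabel);
    ifStatement.getTrueStatement().accept(this);
    methodVisitor.visitLabel(endLabel);
}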

The IfStatement relies on the guarantee provided by ConditionalExpression: generating it always leaves 0 or 1 on the stack.

It simply evaluates the expression (condition.accept(expressionGenrator);) and checks whether the value it pushed onto the stack is != 0 (methodVisitor.visitJumpInsn(Opcodes.IFNE,trueLabel);). If it is != 0, it jumps to trueLabel, where the trueStatement is generated (ifStatement.getTrueStatement().accept(this);). Otherwise it continues executing the following instructions, which are the generated falseStatement followed by a jump (GOTO) to the endLabel.
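One detail that is easy to miss is that the false branch is emitted before the true branch. The sketch below makes the ordering visible by assembling a textual listing instead of real bytecode. IfLayoutSketch and its layout method are hypothetical helpers written for this illustration, and the instruction strings are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lays out an if statement as a list of textual instructions.
class IfLayoutSketch {
    static List<String> layout(List<String> conditionCode, List<String> trueCode, List<String> falseCode) {
        List<String> out = new ArrayList<>(conditionCode); // leaves the condition value on the stack
        out.add("ifne trueLabel");  // value != 0: jump over the false section
        out.addAll(falseCode);      // fall-through path: the else branch
        out.add("goto endLabel");   // skip the true section
        out.add("trueLabel:");
        out.addAll(trueCode);
        out.add("endLabel:");
        return out;
    }

    public static void main(String[] args) {
        List<String> listing = layout(
                List.of("<condition>"),
                List.of("<print test passed>"),
                List.of("<print test failed>"));
        listing.forEach(System.out::println);
    }
}
```

The same layout could be produced with the branches swapped by jumping on ifeq instead; the order used here simply follows the generator above.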

Example

The following Enkel class:

SumCalculator {

    main(string[] args) {
        var expected = 8
        var actual = sum(3,5)

        if( actual == expected ) {
            print "test passed"
        } else {
            print "test failed"
        }
    }

    int sum (int x ,int y) {
        x+y
    }
    
}

gets compiled into the following bytecode:

kuba@kuba-laptop:~/repos/Enkel-JVM-language$ javap -c  SumCalculator
public class SumCalculator {
  public static void main(java.lang.String[]);
    Code:
       0: bipush        8
       2: istore_1          //store 8 in local variable 1 (expected)
       3: bipush        3   //push 3 
       5: bipush        5   //push 5
       7: invokestatic  #10 //Call method sum (3,5)
      10: istore_2          //store the result in variable 2 (actual)
      11: iload_2           //push the value from variable 2 (actual=8) onto the stack
      12: iload_1           //push the value from variable 1 (expected=8) onto the stack
      13: if_icmpeq     20  //compare two top values from stack (8 == 8) if true jump to label 20
      16: iconst_0          //push 0 onto the stack
      17: goto          21  //go to label 21 (skip true section)
      20: iconst_1          //label 20 (true section) -> push 1 onto the stack
      21: ifne          35  //if the value on the stack (result of comparison 8==8) != 0 jump to label 35
      24: getstatic     #16  // get static Field java/lang/System.out:Ljava/io/PrintStream;
      27: ldc           #18  // push String test failed
      29: invokevirtual #23  // call print Method "Ljava/io/PrintStream;".println:(Ljava/lang/String;)V
      32: goto          43   //jump to end (skip true section)
      35: getstatic     #16                 
      38: ldc           #25  // String test passed
      40: invokevirtual #23                 
      43: return

  public static int sum(int, int);
    Code:
       0: iload_0
       1: iload_1
       2: iadd
       3: ireturn
}
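Read back as Java, the main method in the listing behaves roughly like the sketch below. This is a hand-written approximation for illustration, not decompiler output; SumCalculatorEquivalent and its run() helper are made-up names, and run() returns the message instead of printing it so the result is easy to check.

```java
// Hand-written approximation of the bytecode listing; not generated by any tool.
class SumCalculatorEquivalent {
    static int sum(int x, int y) {
        return x + y; // iload_0, iload_1, iadd, ireturn
    }

    static String run() {
        int expected = 8;        // offsets 0-2
        int actual = sum(3, 5);  // offsets 3-10

        // Offsets 11-20: the comparison is first turned into an int, 1 or 0.
        int comparisonResult = (actual == expected) ? 1 : 0;

        // Offset 21 (ifne): the int is then tested against 0.
        if (comparisonResult != 0) {
            return "test passed"; // offsets 35-40
        } else {
            return "test failed"; // offsets 24-32
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The intermediate int is what makes the two-step shape of the bytecode visible: the if_icmpeq block produces a value, and the ifne block consumes it.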

Jakub Dziworski

JVM Dev Blog