Sources

The project source code can be cloned from https://github.com/JakubDziworski/AntlrListenerVisitorComparison. It contains full examples of listener-oriented and visitor-oriented parser implementations.

SomeLanguage

Let’s say we want to parse “SomeLanguage”, defined by the following grammar:

grammar SomeLanguage ;

classDeclaration : 'class' className '{' (method)* '}';
className : ID ;
method : methodName '{' (instruction)+ '}' ;
methodName : ID ;
instruction : ID ;

ID : [a-zA-Z0-9]+ ;
WS: [ \t\n\r]+ -> skip ;

Sample valid “SomeLanguage” code:

class SomeClass {
    fun1 {
        instruction11
        instruction12
    }
    fun2 {
        instruction21
        instruction22
    }
}

A class consists of zero or more methods, and each method consists of one or more instructions. That’s all.

We’d like to parse the code into a “Class” object:

public class Class {
    private String name;
    private Collection<Method> methods;
}

public class Method {
    private String name;
    private Collection<Instruction> instructions;
}

public class Instruction {
    private String name;
}
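
The parsers later in this post call constructors such as new Class(className, methods), which the field-only listings above do not show. A minimal sketch of what the model might look like with constructors and getters follows. The wrapper class ModelSketch, the getters, and the defensive copies are assumptions made for illustration; the actual classes in the repository may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical wrapper so the sketch compiles as a single file.
public class ModelSketch {

    public static final class Instruction {
        private final String name;

        public Instruction(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    public static final class Method {
        private final String name;
        private final Collection<Instruction> instructions;

        public Method(String name, Collection<Instruction> instructions) {
            this.name = name;
            // Copy the input so later changes by the caller do not affect the model.
            this.instructions = Collections.unmodifiableList(new ArrayList<>(instructions));
        }

        public String getName() {
            return name;
        }

        public Collection<Instruction> getInstructions() {
            return instructions;
        }
    }

    // Named Class to match the post; within this file it shadows java.lang.Class.
    public static final class Class {
        private final String name;
        private final Collection<Method> methods;

        public Class(String name, Collection<Method> methods) {
            this.name = name;
            this.methods = Collections.unmodifiableList(new ArrayList<>(methods));
        }

        public String getName() {
            return name;
        }

        public Collection<Method> getMethods() {
            return methods;
        }
    }

    public static void main(String[] args) {
        Method fun1 = new Method("fun1", Arrays.asList(
                new Instruction("instruction11"),
                new Instruction("instruction12")));
        Class someClass = new Class("SomeClass", Arrays.asList(fun1));
        System.out.println(someClass.getName() + " has "
                + someClass.getMethods().size() + " method(s)");
    }
}
```

Keeping the model immutable is a design choice of this sketch, not something the parsers require.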

Listener vs Visitor

To do that, ANTLR 4 provides two ways of traversing the syntax tree:

  • Listener (default)
  • Visitor

To generate visitor classes from the grammar file, you have to add the -visitor option to the command line. I, however, use the ANTLR Maven plugin (see the full code on GitHub).

Parsing using Listener

To parse the code into a Class object, we could create one big listener and register it with the parser (parser.addParseListener()). That would, however, produce one huge, messy class. Instead, it is a good idea to use a separate listener for each rule:

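import java.util.ArrayList;
import java.util.Collection;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.misc.NotNull;

// Parser here is the project's own interface, not org.antlr.v4.runtime.Parser.
public class ListenerOrientedParser implements Parser {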

    @Override
    public Class parse(String code) {
        CharStream charStream = new ANTLRInputStream(code);
        SomeLanguageLexer lexer = new SomeLanguageLexer(charStream);
        TokenStream tokens = new CommonTokenStream(lexer);
        SomeLanguageParser parser = new SomeLanguageParser(tokens);

        ClassListener classListener = new ClassListener();
        parser.classDeclaration().enterRule(classListener);
        return classListener.getParsedClass();
    }

    class ClassListener extends SomeLanguageBaseListener {

        private Class parsedClass;

        @Override
        public void enterClassDeclaration(@NotNull SomeLanguageParser.ClassDeclarationContext ctx) {
            String className = ctx.className().getText();
            MethodListener methodListener = new MethodListener();
            ctx.method().forEach(method -> method.enterRule(methodListener));
            Collection<Method> methods = methodListener.getMethods();
            parsedClass = new Class(className,methods);
        }

        public Class getParsedClass() {
            return parsedClass;
        }
    }

    class MethodListener extends SomeLanguageBaseListener {

        private Collection<Method> methods;

        public MethodListener() {
            methods = new ArrayList<>();
        }

        @Override
        public void enterMethod(@NotNull SomeLanguageParser.MethodContext ctx) {
            String methodName = ctx.methodName().getText();
            InstructionListener instructionListener = new InstructionListener();
            ctx.instruction().forEach(instruction -> instruction.enterRule(instructionListener));
            Collection<Instruction> instructions = instructionListener.getInstructions();
            methods.add(new Method(methodName, instructions));
        }

        public Collection<Method> getMethods() {
            return methods;
        }
    }

    class InstructionListener extends SomeLanguageBaseListener {

        private Collection<Instruction> instructions;

        public InstructionListener() {
            instructions = new ArrayList<>();
        }

        @Override
        public void enterInstruction(@NotNull SomeLanguageParser.InstructionContext ctx) {
            String instructionName = ctx.getText();
            instructions.add(new Instruction(instructionName));
        }

        public Collection<Instruction> getInstructions() {
            return instructions;
        }
    }
}

Parsing using Visitor

The visitor implementation is very similar but has one advantage: visitor methods return a value, so there is no need to store intermediate results in fields.

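import static java.util.stream.Collectors.toList;

import java.util.List;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.misc.NotNull;

// Parser here is the project's own interface, not org.antlr.v4.runtime.Parser.
public class VisitorOrientedParser implements Parser {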

    public Class parse(String someLangSourceCode) {
        CharStream charStream = new ANTLRInputStream(someLangSourceCode);
        SomeLanguageLexer lexer = new SomeLanguageLexer(charStream);
        TokenStream tokens = new CommonTokenStream(lexer);
        SomeLanguageParser parser = new SomeLanguageParser(tokens);

        ClassVisitor classVisitor = new ClassVisitor();
        Class traverseResult = classVisitor.visit(parser.classDeclaration());
        return traverseResult;
    }

    private static class ClassVisitor extends SomeLanguageBaseVisitor<Class> {
        @Override
        public Class visitClassDeclaration(@NotNull SomeLanguageParser.ClassDeclarationContext ctx) {
            String className = ctx.className().getText();
            MethodVisitor methodVisitor = new MethodVisitor();
            List<Method> methods = ctx.method()
                    .stream()
                    .map(method -> method.accept(methodVisitor))
                    .collect(toList());
            return new Class(className, methods);
        }
    }

    private static class MethodVisitor extends SomeLanguageBaseVisitor<Method> {
        @Override
        public Method visitMethod(@NotNull SomeLanguageParser.MethodContext ctx) {
            String methodName = ctx.methodName().getText();
            InstructionVisitor instructionVisitor = new InstructionVisitor();
            List<Instruction> instructions = ctx.instruction()
                    .stream()
                    .map(instruction -> instruction.accept(instructionVisitor))
                    .collect(toList());
            return new Method(methodName, instructions);
        }
    }

    private static class InstructionVisitor extends SomeLanguageBaseVisitor<Instruction> {

        @Override
        public Instruction visitInstruction(@NotNull SomeLanguageParser.InstructionContext ctx) {
            String instructionName = ctx.getText();
            return new Instruction(instructionName);
        }
    }
}
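
The difference between the two styles can be isolated without ANTLR at all. The sketch below uses a hypothetical Node class as a stand-in for a parse-tree node, plus hand-written Listener and Visitor interfaces; none of these names come from ANTLR or from the project. It only illustrates that listener callbacks return void, so results end up in a field, while visitor methods return their result directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class ListenerVsVisitorSketch {

    // Hypothetical stand-in for a parse-tree node: some text plus child nodes.
    public static final class Node {
        final String text;
        final List<Node> children;

        public Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Listener style: the callback returns void.
    public interface Listener {
        void enterLeaf(Node leaf);
    }

    public static final class LeafCollectingListener implements Listener {
        // The result has to be accumulated in a field and read back later.
        private final List<String> leaves = new ArrayList<>();

        @Override
        public void enterLeaf(Node leaf) {
            leaves.add(leaf.text);
        }

        public List<String> getLeaves() {
            return leaves;
        }
    }

    // Walks the tree depth-first and notifies the listener about every leaf.
    public static void walk(Node node, Listener listener) {
        if (node.children.isEmpty()) {
            listener.enterLeaf(node);
            return;
        }
        for (Node child : node.children) {
            walk(child, listener);
        }
    }

    // Visitor style: every visit returns a value.
    public interface Visitor<T> {
        T visit(Node node);
    }

    public static final class LeafCollectingVisitor implements Visitor<List<String>> {
        @Override
        public List<String> visit(Node node) {
            if (node.children.isEmpty()) {
                return Collections.singletonList(node.text);
            }
            // Results of the child visits are combined and returned; no field needed.
            return node.children.stream()
                    .flatMap(child -> visit(child).stream())
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("fun1", new Node("instruction11"), new Node("instruction12"));

        LeafCollectingListener listener = new LeafCollectingListener();
        walk(tree, listener);
        System.out.println(listener.getLeaves());

        System.out.println(new LeafCollectingVisitor().visit(tree));
    }
}
```

Both println calls print the same list of leaf names. The listener needs an extra getter and mutable state to get there, which mirrors the getParsedClass(), getMethods() and getInstructions() accessors in the listener-oriented parser.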

Results

Both implementations produce the same result. I personally prefer the visitor, since it requires less code and there is no need to store values in fields.

With either parser implementation, the sample “SomeLanguage” code is parsed into a Class object:

{
    "name": "SomeClass",
    "methods": [
        {
            "name": "fun1",
            "instructions": [
                {
                    "name": "instruction11"
                },
                {
                    "name": "instruction12"
                }
            ]
        },
        {
            "name": "fun2",
            "instructions": [
                {
                    "name": "instruction21"
                },
                {
                    "name": "instruction22"
                }
            ]
        }
    ]
}

For the full code, visit https://github.com/JakubDziworski/AntlrListenerVisitorComparison.


Jakub Dziworski

JVM Dev Blog