The following is a summary of CongoCC's streamlined syntax from the perspective of somebody migrating from legacy JavaCC. Note that, unlike its predecessor, JavaCC 21, CongoCC does not support the legacy syntax, so any existing grammar file must be converted. There is a utility available that automatically converts the legacy syntax to the streamlined syntax. You can pick up the latest build of JavaCC 21 and invoke the converter via:
java -jar javacc-full.jar convert MyGrammarfile
(N.B. The converter is somewhat imperfect, and you may well need to hand-edit the results. Even so, it is bound to be a timesaver.)
There is no need to write empty parentheses after a nonterminal for a production that takes no parameters. Thus:
Foo() Bar() Baz()
can now be written as:
Foo Bar Baz
There is no need to write void in front of productions with no return value.
There is no need to write {} as the first thing in your production's definition. (I mean, assuming that you don't actually need to put some code at the top of your production.)
The above points (along with the earlier point about no-args nonterminals not needing parentheses) combine such that, where you would previously write:
void Foobar() : {} { Foo() Bar() }
you would now write:
Foobar : Foo Bar;
Token productions are likewise written with no opening { and the list is ended with a semicolon. Thus, instead of writing:
TOKEN #Delimiter : { <LPAREN : "("> | <RPAREN : ")"> | <LBRACE : "{"> | <RBRACE : "}"> }
the newer syntax is:
TOKEN #Delimiter : <LPAREN: "(" > | <RPAREN: ")" > | <LBRACE: "{" > | <RBRACE: "}" > ;
This was deemed preferable, not because it saves much space (it doesn't!) but because, in the newer syntax, the {...} braces are reserved for elements that really are embedded Java actions. As you can see, in the newer syntax for BNF productions, the only use of {...} is for actual Java code.
Since the options, like TREE_BUILDING_ENABLED=false, can only occur at the very top of a file anyway, there is no need for them to be in some special construct options {...}. Thus, where you would previously have:
options { BASE_SRC_DIR=".."; PARSER_PACKAGE="com.acme.foolang"; }
You now simply put:
BASE_SRC_DIR=".."; PARSER_PACKAGE="com.acme.foolang";
at the top of your file.
N.B. This, of course, is not a change from legacy JavaCC, since legacy JavaCC never had an INJECT statement!
You can (optionally) dispense with the parentheses in: INJECT(ClassDeclaration) :
The first block after the colon does not need braces around it. Either part of the injection can be left out. Thus, if the only point is to indicate that a Node extends a class (or implements an interface or you want to use an Annotation), where you previously had to write:
INJECT(MyNode) : { extends AbstractBaseNode } {}
(Actually, the final empty block {} has been optional for some time in these spots, but I don't believe I ever documented that! But now it is much more streamlined.)
You can now write:
INJECT MyNode : extends AbstractBaseNode
A more complex INJECT that does insert some code might now look something like:
INJECT MyNode :
   import java.util.List;
   extends AbstractBaseNode
   implements Nullable
{
   private List<Foo> foos;
   public List<Foo> getFoos() {return foos;}
   public void setFoos(List<Foo> foos) {this.foos = foos;}
}
Note that, of the statements immediately following the colon (and immediately preceding the opening brace), the first ends with a semicolon and the other two do not. Well, the extends and implements elements in Java do not end with a semicolon, while an import statement does. However, if the above looks funny to you, you can (optionally) end the other two lines with a semicolon and the parser will not complain!
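As a sketch of that last point, the same injection with the optional trailing semicolons might look as follows. It reuses the names from the example above (MyNode, AbstractBaseNode, Nullable, Foo), which are purely illustrative. Only the two added semicolons differ.

```
INJECT MyNode :
   import java.util.List;
   extends AbstractBaseNode;   // optional semicolon, accepted by the parser
   implements Nullable;        // optional semicolon, accepted by the parser
{
   private List<Foo> foos;
   public List<Foo> getFoos() {return foos;}
   public void setFoos(List<Foo> foos) {this.foos = foos;}
}
```

Whether to add the semicolons is a matter of taste; both spellings should be treated the same way.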
The new SCAN instruction supersedes the legacy LOOKAHEAD. See here for more information.
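To give a rough idea of the correspondence, here is a minimal sketch of a numerical lookahead of two tokens. The production and nonterminal names (Statement, Assignment, MethodCall) are hypothetical, and the linked page remains the authoritative description of everything SCAN accepts.

```
// Legacy form (no longer accepted by CongoCC):
//   void Statement() : {} { LOOKAHEAD(2) Assignment() | MethodCall() }

// Streamlined form: scan ahead two tokens before committing to Assignment
Statement :
   SCAN 2 => Assignment
   |
   MethodCall
;
```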
The up-to-here syntax provides a much cleaner, more intuitive way to specify lookahead. See here for more information.
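As a sketch, assuming hypothetical nonterminals Foo, Bar, Baz and Other, the up-to-here marker =>|| lets you avoid repeating the start of an expansion inside an explicit lookahead:

```
// Explicit form: the lookahead expansion repeats the start of what is parsed
//   SCAN Foo Bar => Foo Bar Baz

// Up-to-here form: scan ahead as far as the marker, then parse the whole expansion
Statement :
   Foo Bar =>|| Baz
   |
   Other
;
```

The two forms are meant to express the same decision. The second simply states it once, at the point in the expansion where the choice is settled.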