Obsolete Settings from Legacy JavaCC (and JJTree)
As a result of quite a bit of forward evolution, some of the settings from legacy JavaCC (and JJTree) are obsolete in JavaCC21. We don't anticipate that any of them will be missed.
- STATIC: JavaCC21 does not support static parsers. Or, in other words, this is always set to false, (and thus, ignored.)
- LOOKAHEAD: Legacy JavaCC allowed you to specify a default numerical lookahead other than 1 token. In JavaCC21, this setting is gone and is always effectively equal to 1. Of course, you can still specify numerical lookahead other than 1 at choice points as needed.
- CHOICE_AMBIGUITY_CHECK: This was a parameter in legacy JavaCC that allowed you to specify how far to scan ahead when checking for “ambiguities” in the grammar. For now, the whole concept has been removed from JavaCC 21. Most of the conditions reported as “choice ambiguities” were not really ambiguities in the grammar anyway. The logic of JavaCC is that if more than one choice matches, the first one wins. At some point, we may put in some code to check for unreachable code (at least the simple cases that can be statically proven) but it is not a high priority since the whole thing is of very marginal value.
- OTHER_AMBIGUITY_CHECK: The same comments basically apply here as to CHOICE_AMBIGUITY_CHECK. The code for these so-called “ambiguity checks” has been ripped out. In any case, in real-world practice, nobody was ever using these settings anyway.
- FORCE_LA_CHECK: Frankly, we are unsure what this setting ever did. At least in this case, ignorance is bliss. So the setting is gone. Besides, lookahead was always fundamentally broken in legacy JavaCC, so all of these sophisticated checks were surely for nothing anyway!
- UNICODE_INPUT: Effectively, this is now always set to true, so it is superfluous, (and thus, ignored.)
- USER_CHAR_STREAM: This was a setting that allowed you to define your own implementation of the CharStream interface. In default usage of JavaCC 21, this whole concept is irrelevant, since by default, the generated parser just slurps the whole file into memory at once anyway. See The Gigabyte is the new Megabyte.
- BUILD_LEXER: It is rather hard to fathom what the point of this setting ever was. On modern hardware, a full rebuild of both the parser and lexer is not very expensive. This kind of thing does not seem to have any value and is really just confusing.
- BUILD_PARSER: Another bizarrely pointless setting really. If all you want to do is build a lexer, and not a parser, then just don't define any grammatical productions in your grammar and all we build is a lexer!
- DEBUG_LEXER: This setting, along with DEBUG_PARSER, was removed as of mid-December 2021. It is very hard to imagine current-day developers using this sort of approach, as opposed to using an actual debugger!
- DEBUG_PARSER: This is now gone. It is actually not so hard to debug the generated parser since the code is much more readable than before and contains location info to trace back where in the grammar file the generated code originated.
- KEEP_LINE_COL: JavaCC21 always puts location information in Tokens and Node objects. (Really, why would you ever want to throw away location info?) For more thoughts on this issue, see The Gigabyte is the new Megabyte.
- ERROR_REPORTING: This was an option that was true by default, but you could turn it off in order to generate a somewhat smaller .class file, except that error messages would be much less informative because of information being thrown away. I did some experimenting and found that the generated XXXParser.class was typically about 10% smaller with ERROR_REPORTING off. The tradeoff looks terrible and, as with KEEP_LINE_COL, it looks utterly foolish to ever turn this off. So, the setting is now gone and the option is always effectively on. (Further note. All the legacy error reporting code is practically rewritten anyway. The prior comment applies in any case. There is no reason for any sane person to want to turn it off.)
- SANITY_CHECK: By default, the parser generator does some various sanity checks before generating the various files. This setting in the legacy JavaCC tool allowed you to turn this off. (Why would anybody turn this off?) This setting is gone and is now effectively always true.
- CACHE_TOKENS: I never even understood what the point of this setting was. It must have been some kind of speculative peephole optimization, except I don't think it was even correct. There would be problems with switches of lexical state in some cases. Also, I doubt it offered any noticeable performance gain. The setting is now gone and is always effectively false. (Which was the default before, which everybody was using anyway.)
- TOKEN_FACTORY : This setting has been removed (as of 11/11/2021) since it is really not very useful now that we have INJECT. I doubt it was really very widely used (if at all).
- TRACK_TOKENS : There is no real reason for this setting to exist any more, since, by default, Tokens are added to the AST and they have their line/column information. In fact, all Node objects have line/column information.
- USER_DEFINED_TOKEN_MANAGER : This setting was removed in October 2021.
- COMMON_TOKEN_ACTION : This feature is still supported but the configuration setting is no longer necessary, since JavaCC21 deduces it from the presence (or absence) of the appropriately named method in your generated lexer class. If you have a method with the signature void CommonTokenAction(Token t) it will be called at the appropriate point. However, you would be better off using the newer alternative, which you use by creating a method with the signature Token TOKEN_HOOK(Token t). It is more flexible because, for one thing, it allows you to define multiple token hook routines. See here for more information. Also, since this method has a return value, it allows you to instantiate a new Token object (of whatever subclass) and return it. In any case, there is no need for the configuration setting, since these methods are used if present and if not, not. (Duh!)
- NODE_SCOPE_HOOK : As with COMMON_TOKEN_ACTION, the feature is still supported but the configuration option is no longer necessary, since JavaCC21 deduces it from the presence or absence of the appropriately named method or methods in your generated parser class. See Node Life Cycle Hooks for more information.
- NODE_EXTENDS : Since JavaCC21 has INJECT, there is no need for this configuration option to exist. If you want to specify that your BaseNode class extends some specific class, simply use code injection to specify this. Something like:
INJECT BaseNode : extends SomeClass
In general, code injection can be used to specify that any generated class should extend a given class or implement whatever interface(s). There is no need for a plethora of configuration settings for this.
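To make the difference between the two token hook styles described under COMMON_TOKEN_ACTION concrete, here is a minimal plain-Java sketch. It is not generated code: the Token and KeywordToken classes and the enclosing TokenHookSketch class are hypothetical stand-ins invented for illustration, and the bodies of the two methods are arbitrary examples. Only the two method signatures come from the text above.

```java
import java.util.Locale;

// Hypothetical stand-in for a generated lexer; nothing here is real JavaCC21 output.
class TokenHookSketch {

    // Minimal stand-in for the generated Token class (assumption).
    static class Token {
        String image;
        Token(String image) { this.image = image; }
    }

    // Hypothetical Token subclass that a hook might want to substitute.
    static class KeywordToken extends Token {
        KeywordToken(String image) { super(image); }
    }

    // Legacy-style hook: the return type is void, so it can only
    // inspect or mutate the token object it is handed.
    static void CommonTokenAction(Token t) {
        t.image = t.image.toLowerCase(Locale.ROOT);
    }

    // Newer-style hook: the return value lets the hook hand back
    // a different Token object, for example an instance of a subclass.
    static Token TOKEN_HOOK(Token t) {
        if (t.image.equals("if") || t.image.equals("while")) {
            return new KeywordToken(t.image);
        }
        return t; // unchanged tokens are passed through as-is
    }

    public static void main(String[] args) {
        Token t = new Token("IF");
        CommonTokenAction(t);          // mutates t in place
        Token replaced = TOKEN_HOOK(t); // may return a new object
        System.out.println(replaced.getClass().getSimpleName() + " " + replaced.image);
    }
}
```

The point of the sketch is only the shape of the two signatures: a void hook is limited to side effects on the existing token, while a hook that returns a Token can replace it outright.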
The following configuration option is still supported but is deprecated in JavaCC21:
- NODE_PREFIX: Use of this is not encouraged in JavaCC21. By default, it is simply the empty string. (In JavaCC (or JJTree to be precise) it was “AST” by default.) I guess that prefixing all the Node classes with “AST” is a (crude) way of defining a Namespace. However, one would think these people noticed that Java has this thing called packages.
The use of both PARSER_BEGIN….PARSER_END and TOKEN_MGR_DECLS is deprecated in favor of the new code injection feature. Injecting code into the generated parser and lexer is simply a specific case of code injection, so there is no need for these separate constructs. However, they will continue to work for the foreseeable future.
To specify the parser and lexer class names, you may use the PARSER_CLASS and LEXER_CLASS configuration options. However, it is not mandatory, since a Foo.javacc file will automatically generate a parser class called FooParser and a lexer class called FooLexer. There will rarely be any practical value in overriding that.
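The default naming rule just described can be sketched in a few lines of Java. This only models the simple case stated above (Foo.javacc yields FooParser and FooLexer); the class DefaultClassNames and its methods are hypothetical helpers written for illustration, not part of the tool, and how the tool treats unusual file names is not covered here.

```java
// Hypothetical helper illustrating the default naming convention described above.
class DefaultClassNames {

    // Strips any leading directory and the file extension from a grammar file name.
    static String baseName(String grammarFile) {
        String name = grammarFile.substring(grammarFile.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? name : name.substring(0, dot);
    }

    // Default parser class name for the simple case: base name plus "Parser".
    static String parserClass(String grammarFile) {
        return baseName(grammarFile) + "Parser";
    }

    // Default lexer class name for the simple case: base name plus "Lexer".
    static String lexerClass(String grammarFile) {
        return baseName(grammarFile) + "Lexer";
    }

    public static void main(String[] args) {
        System.out.println(parserClass("Foo.javacc"));
        System.out.println(lexerClass("Foo.javacc"));
    }
}
```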
There are a host of settings that were added after the FreeCC fork, which was in mid-2008. See ancient history for more information on all this. No settings added to legacy JavaCC after about 2008 are currently supported in JavaCC 21. Most of them are of very marginal value. Moreover, it is safe to say that nobody uses them because they are not documented anywhere that I can find! Just for example, the GRAMMAR_ENCODING option was added at some point after 2008 (I don't know when exactly) to specify what encoding your grammar file is in. I am certain that nobody uses this. (Or just about nobody surely.) Everybody stores their grammar files in the system default encoding which is UTF-8 on any remotely modern system that any serious developer would be working on. Adding these kinds of options that nobody uses is actually very typical of a nothingburger project. (Adding all these options and not even documenting them is nothingburger-ism squared!)
See new settings in JavaCC 21 for information on settings introduced in JavaCC21 that were not present in legacy JavaCC.