Differences
This shows you the differences between two versions of the page.
include [2020/02/14 00:24] – revusky | include [2021/04/01 13:00] – revusky | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | < |
+ | |||
+ | # The INCLUDE Statement | ||
JavaCC 21's **INCLUDE** statement allows you to break up your grammar file into multiple physical files. It typically looks like this: | JavaCC 21's **INCLUDE** statement allows you to break up your grammar file into multiple physical files. It typically looks like this: | ||
- | INCLUDE(" | + | |
- | //This feature is not present in legacy JavaCC.// | + | *This feature is not present in legacy JavaCC.* |
The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | ||
Line 11: | Line 13: | ||
Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | ||
- | ===== The DEFAULT_LEXICAL_STATE setting | + | ## The DEFAULT_LEXICAL_STATE setting |
- | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " | + | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " |
Thus, JavaCC 21 introduces a setting called **DEFAULT_LEXICAL_STATE**. That means that any lexical specifications where the lexical state is unspecified are in that state. Thus, a JSON grammar would likely have something like this at the top: | Thus, JavaCC 21 introduces a setting called **DEFAULT_LEXICAL_STATE**. That means that any lexical specifications where the lexical state is unspecified are in that state. Thus, a JSON grammar would likely have something like this at the top: | ||
- | options { | + | |
- | | + | DEFAULT_LEXICAL_STATE=" |
- | } | + | |
| | ||
In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | ||
Line 25: | Line 27: | ||
Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | ||
- | ===== Wrinkles with Code Injection | + | ## Wrinkles with Code Injection |
JavaCC still supports the legacy JavaCC constructs of **PARSER_BEGIN...PARSER_END** and **TOKEN_MGR_DECLS**. (For how much longer, I am not making any promises...). However, those constructs are ignored within an **INCLUDE**d grammar. | JavaCC still supports the legacy JavaCC constructs of **PARSER_BEGIN...PARSER_END** and **TOKEN_MGR_DECLS**. (For how much longer, I am not making any promises...). However, those constructs are ignored within an **INCLUDE**d grammar. | ||
- | You can still //inject// code into the generated parser or lexer class, from within an included grammar, but you need to write something like: | + | You can still *inject* code into the generated parser or lexer class, from within an included grammar, but you need to write something like: |
- | INJECT(PARSER_CLASS) : | + | INJECT PARSER_CLASS : |
- | { | + | |
- | ... | + | |
- | } | + | |
{ | { | ||
... | ... | ||
Line 41: | Line 40: | ||
or: | or: | ||
- | INJECT(LEXER_CLASS) : | + | INJECT LEXER_CLASS : |
- | { | + | |
- | ... | + | |
- | } | + | |
{ | { | ||
... | ... | ||
Line 51: | Line 47: | ||
JavaCC 21 will replace the **PARSER_CLASS** and **LEXER_CLASS** aliases with the appropriate names -- i.e. the actual class names of the XXXParser or XXXLexer being generated. So, if you have a Foo language in which you want to embed JSON expressions, | JavaCC 21 will replace the **PARSER_CLASS** and **LEXER_CLASS** aliases with the appropriate names -- i.e. the actual class names of the XXXParser or XXXLexer being generated. So, if you have a Foo language in which you want to embed JSON expressions, | ||
- | INJECT(JSONParser) : | + | INJECT JSONParser : |
{ | { | ||
... | ... | ||
} | } | ||
- | { | ||
- | ... | ||
- | } | ||
- | because the parser class we are generating is not '' | + | because the parser class we are generating is not JSONParser, it is FOOParser! However, the person writing |
- | So, do not be surprised when the code within PARSER_BEGIN...PARSER_END is ignored if it is within an INCLUDEd grammar. You need to write '' | + | So, do not be surprised when the code within PARSER_BEGIN...PARSER_END is ignored if it is within an INCLUDEd grammar. You need to write INJECT(PARSER_CLASS) to achieve the desired result. |
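The alias substitution described above can be pictured with a toy Java sketch. This is only an illustration of the idea, not JavaCC 21's actual implementation: the class `AliasSketch`, the method `resolve`, and the lexer name `FOOLexer` are hypothetical names chosen for the example.

```java
import java.util.Map;

public class AliasSketch {

    // Hypothetical helper: maps the PARSER_CLASS / LEXER_CLASS aliases to the
    // names of the classes actually being generated. Any other name is
    // returned unchanged, so an explicit class name stays exactly as written.
    static String resolve(String name, String parserClass, String lexerClass) {
        Map<String, String> aliases = Map.of(
            "PARSER_CLASS", parserClass,
            "LEXER_CLASS", lexerClass);
        return aliases.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        // Assumed scenario: a Foo grammar that INCLUDEs a JSON grammar.
        // An injection aimed at the alias lands in the class being generated...
        System.out.println(resolve("PARSER_CLASS", "FOOParser", "FOOLexer"));
        System.out.println(resolve("LEXER_CLASS", "FOOParser", "FOOLexer"));
        // ...whereas a hard-coded name is left alone and so names a class
        // that this build is not generating.
        System.out.println(resolve("JSONParser", "FOOParser", "FOOLexer"));
    }
}
```

The point of the sketch is the last call: a name written out explicitly in the included grammar is not rewritten, which is why the alias is the safer choice inside a grammar that may be included elsewhere.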
- | In fact, the aliases **PARSER_CLASS**, | + | In fact, the aliases **PARSER_CLASS**, |
To see a concrete example of **INCLUDE** in use, you can take a look at https:// | To see a concrete example of **INCLUDE** in use, you can take a look at https:// | ||
+ | |||
+ | </ | ||
===== INCLUDE with Java Source files ===== | ===== INCLUDE with Java Source files ===== | ||
Line 72: | Line 67: | ||
to only contain Java source code. Thus, writing: | to only contain Java source code. Thus, writing: | ||
- | | + | |
is exactly the same as if you wrote: | is exactly the same as if you wrote: |