Differences
This shows you the differences between two versions of the page.
| include [2020/02/14 00:24] – revusky | include [2023/03/03 16:20] (current) – revusky | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== The INCLUDE Statement ====== | + | < |
| - | JavaCC 21's **INCLUDE** statement allows you to break up your grammar file into multiple physical files. It would look like this typically: | + | # The INCLUDE |
| - | INCLUDE(" | + | Congo' |
| - | //This feature is not present in legacy JavaCC.// | + | INCLUDE " |
| + | |||
| + | *This feature is not present in legacy JavaCC.* | ||
| The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | ||
| Line 11: | Line 13: | ||
| Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | ||
| - | ===== The DEFAULT_LEXICAL_STATE setting | + | ## The DEFAULT_LEXICAL_STATE setting |
| - | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " | + | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " |
| - | Thus, JavaCC 21 introduces | + | Thus, CongoCC has a setting called **DEFAULT_LEXICAL_STATE**. That means that any lexical specifications where the lexical state is unspecified are in that state. Thus, a JSON grammar would likely have something like this at the top: |
| + | |||
| + | |||
| + | DEFAULT_LEXICAL_STATE=JSON; | ||
| - | options { | ||
| - | | ||
| - | } | ||
| | | ||
| In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | ||
| Line 25: | Line 27: | ||
| Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | ||
| - | ===== Wrinkles with Code Injection | + | ## Wrinkles with Code Injection |
| - | JavaCC still supports the legacy JavaCC constructs of **PARSER_BEGIN...PARSER_END** and **TOKEN_MGR_DECLS**. (For how much longer, I am not making any promises...). However, those constructs are ignored | + | You can |
| - | You can still //inject// code into the generated parser or lexer class, from within an included grammar, but you need to write something like: | + | |
| - | + | ||
| - | | + | |
| - | { | + | |
| - | ... | + | |
| - | } | + | |
| { | { | ||
| ... | ... | ||
| Line 41: | Line 38: | ||
| or: | or: | ||
| - | INJECT(LEXER_CLASS) : | + | INJECT LEXER_CLASS : |
| - | { | + | |
| - | ... | + | |
| - | } | + | |
| { | { | ||
| ... | ... | ||
| } | } | ||
| - | JavaCC | + | CongoCC |
| - | INJECT(JSONParser) : | + | INJECT JSONParser : |
| { | { | ||
| ... | ... | ||
| } | } | ||
| - | { | ||
| - | ... | ||
| - | } | ||
| - | because the parser class we are generating is not '' | + | because the parser class we are generating is not JSONParser, it is FooParser! However, the person writing |
| - | So, do not be surprised when the code within PARSER_BEGIN...PARSER_END is ignored if it is within | + | In fact, the aliases **PARSER_CLASS**, |
| - | In fact, the aliases | + | To see a concrete example of **INCLUDE** in use, you can take a look at https:// |
| - | To see a concrete example of **INCLUDE** in use, you can take a look at https:// | + | </markdown> |
| ===== INCLUDE with Java Source files ===== | ===== INCLUDE with Java Source files ===== | ||
| Line 72: | Line 63: | ||
| to only contain Java source code. Thus, writing: | to only contain Java source code. Thus, writing: | ||
| - | | + | |
| is exactly the same as if you wrote: | is exactly the same as if you wrote: | ||