Differences

This shows you the differences between two versions of the page.

new_settings_in_javacc_21 [2020/09/27 00:20] revusky
new_settings_in_javacc_21 [2021/02/09 11:16] revusky
Line 2: Line 2:
  
   * **BASE_SRC_DIR** This supersedes the older OUTPUT_DIRECTORY setting. Files are generated //relative// to the BASE_SRC_DIR, i.e. taking into account the package naming. If this is unset, BASE_SRC_DIR is assumed to be the directory where the grammar is.
 +  * **ENSURE_FINAL_EOL** With this setting turned on (it is off by default) the generated parser makes sure 
 +that the input ends with a newline character. Some grammars are actually quite hard to write if you can't be 
 +sure that every line (including the last one) terminates with a newline!
   * **FAULT_TOLERANT** This turns on the experimental support for building a [[fault tolerant]] parser. It is off by default.
-  * **HUGE_FILE_SUPPORT** Since we believe that the normal usage of the tool is simply to build a tree, it makes little sense to have any qualms about reading in the entire input into memory. So this is the default. This option allows you to turn on the legacy behavior of only maintaining a (fairly small) buffer in memory of the input file. See [[https://javacc.com/2020/05/05/gigabyte-is-the-new-megabyte/|The Gigabyte is the new Megabyte]] for more information on the reasoning behind all this. Note that the experimental fault-tolerant parsing features only work with HUGE_FILED_SUPPORT off. Also, having TREE_BUILDING_ENABLED set to true (which is the default) means that HUGE_FILE_SUPPORT is automatically turned off.
+  * **HUGE_FILE_SUPPORT** Since we believe that the normal usage of the tool is simply to build a tree, it makes little sense to have any qualms about reading in the entire input into memory. So this is the default. This option allows you to turn on the legacy behavior of only maintaining a (fairly small) buffer in memory of the input file. See [[https://javacc.com/2020/05/05/gigabyte-is-the-new-megabyte/|The Gigabyte is the new Megabyte]] for more information on the reasoning behind all this. Note that the experimental fault-tolerant parsing features only work with HUGE_FILE_SUPPORT off. Also, having TREE_BUILDING_ENABLED set to true (which is the default) means that HUGE_FILE_SUPPORT is automatically turned off.
   * **LEGACY_API** If you turn on this setting, the tool generates code that is more compatible with legacy JavaCC. One example is that JavaCC 21 removes publicly visible fields like Token.kind and Token.image and replaces them with getter/setter methods. If you have LEGACY_API set, it leaves these fields as publicly visible. Also, it generates static final int constants for your various token types, as a convenience to keep older code working. (JavaCC 21 uses type-safe enums in these cases.) Note, however, that the existence of this setting is not guaranteed to keep all older code working. It simply makes it less work to get legacy JavaCC code working with JavaCC 21. Projects that migrate to JavaCC 21 should, as soon as they reasonably can, refactor their code so that the LEGACY_API setting can be turned off. In other words, it is just meant to provide a temporary stopgap, not any sort of permanent solution for people migrating their projects.
   * **PRESERVE_LINE_ENDINGS** This is true by default (though this could change in the future based on user feedback). If you turn this setting off, all Windows/DOS style line endings (\r\n) are converted to UNIX/MacOS style (\n) internally when the file is read in. Note, by the way, that one advantage of this and the TABS_TO_SPACES option is that if you convert tabs to spaces and line endings to \n, then your grammar's lexical specification can be a bit simpler, as can your own code that runs over Tokens and Nodes. Your code can just assume that any line endings are a simple \n and that your horizontal whitespace is just spaces, not a mix of tabs and spaces, independently of what platform the generated code is running on.
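
The input normalization described for ENSURE_FINAL_EOL (when turned on) and PRESERVE_LINE_ENDINGS (when turned off) can be pictured with a small standalone sketch. This is not JavaCC 21's own code: the class name ''InputNormalizer'' and both method names are hypothetical, and the handling of empty input is an assumption, since the page does not say what happens in that case.

```java
// Illustrative sketch only; not taken from the JavaCC 21 code base.
// InputNormalizer, normalizeLineEndings and ensureFinalEol are hypothetical names.
final class InputNormalizer {

    private InputNormalizer() {
    }

    // Mirrors the effect described for PRESERVE_LINE_ENDINGS being off:
    // every Windows/DOS line ending (CR LF) becomes a single LF.
    static String normalizeLineEndings(String input) {
        return input.replace("\r\n", "\n");
    }

    // Mirrors the effect described for ENSURE_FINAL_EOL being on:
    // the returned text always ends with a newline character.
    // Assumption: empty input also receives a newline.
    static String ensureFinalEol(String input) {
        if (input.isEmpty() || input.charAt(input.length() - 1) != '\n') {
            return input + "\n";
        }
        return input;
    }

    public static void main(String[] args) {
        String raw = "first line\r\nsecond line";
        String normalized = ensureFinalEol(normalizeLineEndings(raw));
        // After both steps, code that scans the text only has to look for '\n'.
        System.out.println(normalized.split("\n").length + " lines");
    }
}
```

With both steps applied up front, a lexical specification only needs to recognize \n as a line terminator and can rely on the last line being terminated, which is the simplification the settings above are aiming for.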