meta data for this page
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
choice_points [2020/09/26 22:17] – [More than one way to skin a cat...] revusky | choice_points [2021/02/08 18:09] (current) – ↷ Links adapted because of a move operation revusky | ||
---|---|---|---|
Line 7: | Line 7: | ||
* The expansions within a //choice construct//, | * The expansions within a //choice construct//, | ||
- | (Perhaps needless to say, all of the above constructs | + | (Perhaps needless to say, all of these constructs arbitrarily |
- | I think the simplest way to think about the above cases (for most people, anyway) is just in terms of their analogues in a procedural programming language. The first three cases are effectively binary choices, a choice between entering the expansion inside the parentheses and jumping directly to what follows it. In all of these cases, the expansion within the parentheses is a choice point. The last case is a choice between n options and each of the expansions is a choice point. | + | I think the simplest way to think about the above cases (for most people, anyway) is just in terms of their analogues in a procedural programming language. The first three cases are effectively binary choices, a choice between entering the expansion inside the parentheses and jumping directly to what follows it. In all of these cases, the expansion within the parentheses is a choice point. The last case is a choice between n options and each of those n sub-expansions is a choice point in the grammar. |
- | A //zero-or-more// is a single (non-looping) choice. If the enclosed expansion matches, we enter it, and if not, we jump directly to whatever follows it. So, if we write: | + | ==== Zero Or One ==== |
+ | |||
+ | A //zero-or-one// is a single (non-looping) choice. If the enclosed expansion matches, we enter it, and if not, we jump directly to whatever follows it. So, if we write: | ||
< | < | ||
Line 24: | Line 26: | ||
</ | </ | ||
- | That is a //zero or one// choice. | + | ==== Zero Or More ==== |
+ | |||
+ | A //zero or more// is basically the same as a //zero or one// **except** that it is not a one-time choice, but a loop. So, the following: | ||
< | < | ||
Line 36: | Line 40: | ||
following_expansion(); | following_expansion(); | ||
</ | </ | ||
+ | |||
+ | ==== One or More ==== | ||
A //one or more// is like a //zero or more// loop except that the first iteration is obligatory. Thus: | A //one or more// is like a //zero or more// loop except that the first iteration is obligatory. Thus: | ||
Line 66: | Line 72: | ||
So, again, we see that the first iteration is actually executed // | So, again, we see that the first iteration is actually executed // | ||
- | A choice, such as '' | + | ==== The Choice Construct ==== |
+ | |||
+ | This construct, something like '' | ||
< | < | ||
Line 75: | Line 83: | ||
.... | .... | ||
else | else | ||
- | | + | |
</ | </ | ||
Line 84: | Line 92: | ||
In the default case, this has a very simple answer: | In the default case, this has a very simple answer: | ||
- | ==== The Default Choice Resolution Algorithm ==== | + | ===== The Default Choice Resolution Algorithm |
The //default// choice resolution algorithm for whether we enter an expansion at a choice point is just to check ahead //exactly one token// and if that token // | The //default// choice resolution algorithm for whether we enter an expansion at a choice point is just to check ahead //exactly one token// and if that token // | ||
Line 96: | Line 104: | ||
Now, to be clear, looking ahead one token might be sufficient or not in any given spot, but if we don't specify any extra information, | Now, to be clear, looking ahead one token might be sufficient or not in any given spot, but if we don't specify any extra information, | ||
- | //We enter an expansion at a choice point if the next token is in that expansion' | + | //We enter an expansion at a choice point if the next token is in that expansion' |
- | Again, no need to be intimidated by the lingo. An expansion' | + | Again, no need to be intimidated by the lingo. An expansion' |
- | //If the next token is **not** in an expansion' | + | //If the next token is **not** in an expansion' |
Now, regardless, it may be the case that more than one of the expansions at a given choice point matches this condition. Well, in that case, we have a secondary rule: //the first one gets it//. And this is really no different, by the way, from how if-elseif-else in a procedural programming language works. | Now, regardless, it may be the case that more than one of the expansions at a given choice point matches this condition. Well, in that case, we have a secondary rule: //the first one gets it//. And this is really no different, by the way, from how if-elseif-else in a procedural programming language works. | ||
Line 133: | Line 141: | ||
The answer is that we do this by defining some sort of // | The answer is that we do this by defining some sort of // | ||
- | ==== Definite | + | ===== Numerical Lookahead |
In the case given above, where we have that unreachable second expansion, the simplest solution would simply be to specify that we should scan ahead two tokens on the first expansion. So we could write the above choice as: | In the case given above, where we have that unreachable second expansion, the simplest solution would simply be to specify that we should scan ahead two tokens on the first expansion. So we could write the above choice as: | ||
Line 157: | Line 165: | ||
(//Huh?//) | (//Huh?//) | ||
- | It's really quite simple, you see. The second token, " | + | It's really quite simple, you see. The second token, " |
- | But that is all right and proper. Look at it this way... the above grammar snippet says that we look ahead 2 tokens to decide whether to enter the first expansion. Since we didn't specify anything for the second expansion, it now just looks ahead one token, and if an error occurs after that, then so be it. The parser is doing what it is supposed to do, right? You see, in this case, none of the three given choices can match the " | + | But that is all right and proper. Look at it this way... the above grammar snippet says that we look ahead 2 tokens to decide whether to enter the first expansion. Since we didn't specify anything for the second expansion, it now just looks ahead one token, and if an error occurs after that, then so be it. The parser is doing what it is supposed to do, right? You see, in this case, none of the three given choices can match the " |
Well, if you don't currently understand my point there, then all I can say is just think about it. It will eventually become clear... | Well, if you don't currently understand my point there, then all I can say is just think about it. It will eventually become clear... | ||
- | ==== More than one way to skin a cat... ==== | + | ===== More than one way to skin a cat... |
- | The //definite numerical lookahead// of two tokens worked okay in the above example, but generally speaking, it is a rather crude disposition. The legacy JavaCC tool provides two other ways to specify how we resolve a choice -- when the default resolution is not good enough. In the original, somewhat inaccurate terminology, | + | The //definite numerical lookahead// of two tokens worked okay in the above example, but generally speaking, it is a rather crude disposition. The legacy JavaCC tool provides two other ways to specify how we resolve a choice -- when the default resolution is not good enough. In the original, somewhat inaccurate terminology, |
==== Syntactic Lookahead ==== | ==== Syntactic Lookahead ==== | ||
Line 199: | Line 207: | ||
</ | </ | ||
- | The code in the braces is just any Java expression, a black box from the point of view of our parser generator. | + | The code in the braces is just any Java expression |
Line 225: | Line 233: | ||
In very many common usage cases, the [[up to here]] syntax removes the need to write separate // | In very many common usage cases, the [[up to here]] syntax removes the need to write separate // | ||
- | By the same token, //semantic lookahead//, | + | By the same token, //semantic lookahead//, |
- | A general rule of thumb would be to use [[up to here]] and [[lookbehind]] constructs whenever possible instead of // | + | A general rule of thumb would be to use [[up to here]] and [[contextual_predicates]] constructs whenever possible instead of // |