uho 2022-01-20
Usage: (* Start a comment. Spans multiple lines. ends at *)
Standard Forth provides words to add comments to programs:

- `(` skips text until the next closing parenthesis `)`. With `(` from the File wordset such a comment may span several lines, but many Forth systems support `(` comments only within a single line.
- `\` skips all remaining text on the current line.

As Forth is extensible, it is possible to define additional comment words. Many flavours are possible. Here we focus on Pascal-like comments using `(*` to start and `*)` to end a comment.
So we seek to define a new word `(*` that skips the source text until it finds the end of the comment. Because the Forth input stream might consist of only the current line, `(*`, in its search for `*)`, probably has to read additional source code lines until it finds `*)` and the comment ends.
Sometimes you might want to comment out parts of the source text that contain comments themselves. If comments nest, i.e. you can have comments inside comments, then this is possible without further issues. If, however, only a naive search for the next `*)` is done to identify the end of the comment, then the end of the inner comment would also be considered the end of the outer comment:

    (* outer comment
       (* inner comment *)
       still comment if comments can nest, but not a comment if they don't nest.
    *)  probably error when not nesting as *) is typically not a defined word.

Function: (* ( -- )

- Parse the rest of the input stream searching for *)
- If *) is found then end searching.
- If *) is not found within the available source text, load more source text and continue the search.

Using this algorithm, skipping text stops at the first occurrence of `*)`.
Function: (* ( -- )

- Parse the rest of the input stream searching for *)
- If the current *) is found then end this search and resume any delayed searches.
- If another (* is encountered then delay the search for the current *) and start the search for another *).
- If *) is not found within the available source text, load more source text and continue the search.

Using this algorithm, the number of nested comments is counted by the number of delayed searches.
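
The nesting algorithm can also be sketched with an explicit counter standing in for the delayed searches. This is only an illustration under assumptions: the word name `(*N` is hypothetical (chosen so that it does not clash with `(*`), and the depth is kept on the data stack. As it scans whitespace-delimited tokens, `(*` and `*)` must be separated from surrounding text by whitespace.

```forth
\ Sketch only: nesting comment with an explicit depth counter.
\ The name (*N is hypothetical and not part of the original program.
: (*N ( -- )
   1                                      \ depth: one comment is open
   BEGIN
     BEGIN  bl word count dup             \ next token available?
     WHILE ( depth c-addr u )
        2dup s" *)" compare 0= IF         \ end of a comment
           2drop 1-                       \ one search fewer is pending
           dup 0= IF drop EXIT THEN       \ outermost comment closed
        ELSE
           s" (*" compare 0= IF 1+ THEN   \ start of a nested comment
        THEN
     REPEAT 2drop
     refill 0=                            \ read more source code
   UNTIL drop ; immediate                 \ end of source code
```

With this sketch loaded, `1 (*N (* inner *) 2 *) 3` should leave only 1 and 3 on the stack, because the 2 lies inside the outer comment.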
A naive non-nesting Forth-94 implementation may look like this:

    : (* ( -- )
       BEGIN
         BEGIN  cr bl word count dup          \ next token available?
         WHILE ( c-addr u )
            s" *)" compare 0= IF EXIT THEN    \ stop if end of comment found
         REPEAT 2drop
         refill 0=                            \ read more source code
       UNTIL ; immediate                      \ end of source code

A version that allows for nesting comments:

    : (* ( -- )
       BEGIN
         BEGIN  cr bl word count dup               \ next token available?
         WHILE ( c-addr u )
            2dup s" *)" compare 0= IF 2drop EXIT THEN  \ stop if end of comment found
            s" (*" compare 0= IF recurse THEN      \ start of nested comment
         REPEAT 2drop
         refill 0=                                 \ read more source code
       UNTIL ; immediate                           \ end of source code

Both implementations extract whitespace-delimited tokens from the input stream. Thus both `(*` and `*)` must be separated by whitespace and must not be attached to printable characters for them to be considered the start or end of a comment. Thus:
(* this does not end the comment*) but this does *)
In practice that does not impose serious limitations.
This is an ANS Forth Program with an environmental dependency: it uses lower case for standard definition names. After loading this program, a Standard System still exists.
The test for handling nested comments:

    (* .( Start of outer comment )
       (* .( inner comment ) *)
       .( This should not print if comments nest! )
    *)

should not print anything.
If you face a resource-constrained system, you might want to further simplify the definition of `(*` in order to impose fewer requirements on the supporting system. The above definitions require two words that such a system might not support: `REFILL` and `COMPARE`.
`REFILL` (or one of its terminal/command-line counterparts `EXPECT`, `QUERY` or `ACCEPT`) is probably always necessary to keep reading input lines in case `*)` has not yet been found.
The use of `COMPARE`, on the other hand, can be eliminated as we only want to test for `*)` (and for `(*` in the nesting case).
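
A minimal sketch of that idea, assuming tokens are obtained with `bl word count` as in the definitions above: a hypothetical helper `*)?` (the name is not part of the original program) checks the token length and then the two characters one at a time, so `COMPARE` is not needed.

```forth
\ Sketch only: test a token for *) without using COMPARE.
\ The name *)? is hypothetical.
: *)? ( c-addr u -- flag )
   2 = IF                        \ exactly two characters?
      count [char] * = IF        \ first character is * ?
         c@ [char] ) = EXIT      \ flag: second character is ) ?
      THEN
   THEN
   drop 0 ;                      \ anything else is not *)
```

In the non-nesting definition, `*)?` could take the place of `s" *)" compare 0=`. A second helper with the two characters swapped would cover `(*` for the nesting case.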
Here is Albert Nijhof's approach:

    \ Tool - Multi-line comment - an 17jan2022
    \ (* starts a multi-line comment. Not nestable.
    \ The delimiter *) must be the first word on a line.
    : (* ( -- )
        0 \ dummy
        begin
            begin
                begin drop cr refill 0= if exit then
                    bl word count 2 =
                until count [char] * =
            until count [char] ) =
        until drop ; immediate


    (*
    1. This simple code was intended for small systems. That's why I avoided the word COMPARE. Unfortunately, “REFILL” was unavoidable.
    2. The delimiter *) must be the first word on a line. This is not to keep the code simple or make it faster. I purposely chose this because it is better, it provides a clearly readable layout.
    *)

Compare this with:

    (*
    1. This simple code was intended for small systems. That's why I avoided the word COMPARE. Unfortunately, “REFILL” was unavoidable.
    2. This is not to keep the code simple or make it faster. I purposely chose this because it is better, it provides a clearly readable layout.
    *)

This code has some interesting properties:
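
- It compares individual characters to detect `*)` instead of using a string-based `COMPARE`.
- The address left on the stack is dropped only once `*)` is found. This is done right after the loop.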