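[Set] PARSER parser fileid
Use the SET PARSER command to define a syntax coloring parser named parser, based on the KEDIT Language Definition (KLD) file fileid.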
For example, if you were working with a hypothetical language called LANG and you had described the language in a KEDIT Language Definition file called LANGDEF.KLD, you could define a parser called LANG with the command
SET PARSER LANG LANGDEF.KLD
After issuing the SET PARSER command, you could then issue the command
SET COLORING ON LANG
to use this parser to control syntax coloring for the current file.
If files in your language always had an extension of, for example, .LNG, you could use the SET AUTOCOLOR command to tell KEDIT to always use the LANG parser for .LNG files:
SET AUTOCOLOR .LNG LANG
SET PARSER commands are typically executed from your KEDIT profile when KEDIT is initially loaded. For example:
* if first profile execution in a session,
* setup the LANG parser and then
* cause all .LNG files to be colored using the LANG parser
if initial() then do
   'set parser lang langdef.kld'
   'set autocolor .lng lang'
end
Several language definitions are built into KEDIT, and when KEDIT is loaded it automatically issues SET PARSER commands that use these language definitions to set up its default parsers. See the description of the SET PARSER command for a complete list of built-in parsers. To distinguish these internal language definition files from actual disk files, KEDIT uses an asterisk as the first character of their names. For example, the command
SET PARSER C *C.KLD
tells KEDIT to use *C.KLD as the Language Definition File associated with the C parser. The asterisk in the name tells KEDIT to use the special file *C.KLD, which is built into KEDIT, and not to look for the file on disk.
Copies of all of the KLD files built into KEDIT are included in the SAMPLES subdirectory of the main KEDITW directory. For example, there is a C.KLD file that is an exact copy of the *C.KLD file that is built into KEDIT. If you modify one of these copies you should save it in a different location (normally the USER subdirectory of the main KEDITW directory) and load it by issuing a SET PARSER command referring to the modified file.
Note that whenever you issue the SET PARSER command, the KLD file that you specify is loaded into memory, even if an identical SET PARSER command has previously been issued. This makes it easy to develop and test modifications to KLD files, because if you make changes to a KLD file you can simply reissue the appropriate SET PARSER command and KEDIT will load the updated version of the file. Any files whose syntax coloring is controlled by your parser will automatically be re-colored, so you can easily see the effect of the changes you have made to the KLD file.
The rules given here for KLD files are flexible enough to describe a number of popular programming languages, to handle varying syntax conventions for comments, strings, numbers, etc., and to have user-configurable lists of keywords. The goal is to handle many common language variants with a relatively small number of parameters.
KLD files are divided into sections. Each section begins with a section header, consisting of a colon in column one followed immediately by the section name. Following each section header line are one or more lines of parameter information.
To improve readability, you can insert blank lines at any point in a KLD file. Additionally, any line whose first nonblank character is an asterisk (``*'') is considered a comment line and is ignored by KEDIT. For example:
* Sample KLD contents
:case
   ignore
:identifier
   [a-z] [a-z0-9]
:keyword
   if
   then
   else
The above example starts with a comment line, followed by a :CASE section with one parameter line, an :IDENTIFIER section with one parameter line, and a :KEYWORD section with three parameter lines. Parameter information is usually indented from column one, as in this example, but it does not have to be.
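The layout rules above (section headers starting with a colon in column one, blank lines and asterisk comment lines ignored) can be illustrated with a small Python sketch. This is not how KEDIT itself reads KLD files; the function name split_kld_sections is hypothetical, and folding section names to upper case is an assumption made only so that :case and :CASE land under the same key.

```python
def split_kld_sections(text):
    """Group KLD parameter lines under their section names.

    Assumption: section names are folded to upper case for lookup.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        stripped = raw.strip()
        # Blank lines and lines whose first nonblank character is "*" are skipped.
        if not stripped or stripped.startswith("*"):
            continue
        # A colon in column one starts a new section.
        if raw.startswith(":"):
            current = raw[1:].strip().upper()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(stripped)
    return sections


sample = """\
* Sample KLD contents
:case
   ignore
:identifier
   [a-z] [a-z0-9]
:keyword
   if
   then
   else
"""

sections = split_kld_sections(sample)
for name, params in sections.items():
    print(name, params)
```

The sketch treats any line that does not start with a colon in column one as parameter information, so an unindented parameter line that itself begins with a colon would be misread as a header.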
Here are descriptions of each kind of KLD file section:
An example:
:CASE
   respect
If the :CASE section is omitted, KEDIT assumes case insensitivity. If present, the :CASE section must precede the :IDENTIFIER section.
PREPROCESSOR indicates that the language supports a C-like preprocessor mechanism, and that preprocessor keywords are preceded by the specified character. For example:
:OPTION
   preprocessor #
REXX indicates that the REXX language is being described. In REXX, certain identifiers are sometimes considered keywords and are sometimes considered variables, depending on the context in which they are used, and the REXX option tells KEDIT to do the special processing that this requires.
If the :OPTION section is omitted, KEDIT does not do special handling of preprocessor keywords or of REXX keywords.
:IDENTIFIER
   [a-zA-Z]
specifies that any set of alphabetic characters is a valid identifier.
In many languages, there are different rules for what is valid as the first character of an identifier and for what is valid in additional characters in an identifier. To handle this situation, you can include two identifier specifications: first specify what is valid as the first identifier character and then specify what is valid in the remaining characters. For example, in C programs the first character of an identifier can be any alphabetic character or can be an underscore, while the remaining characters of an identifier can be alphabetic or can be underscores, but can also be numeric digits:
:IDENTIFIER
   [a-zA-Z_] [a-zA-Z0-9_]
In some cases (BASIC programs are the main example), the last character of an identifier can be a special character that is not valid elsewhere in an identifier. For example, in BASIC, ABC@ is a valid identifier. To handle this, you can include a third item specifying the special characters acceptable only at the end of an identifier. For example:
:IDENTIFIER
   [a-zA-Z] [a-zA-Z0-9_] [%&!#@$]
The :IDENTIFIER section is required if you will be using the :KEYWORD section to give a list of the keywords in your language. The :IDENTIFIER section must appear before the :KEYWORD section.
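To make the one-, two-, and three-item forms of the :IDENTIFIER line concrete, the following Python sketch combines the character classes into a single regular expression. It assumes that the character classes shown in this section behave the same way in Python's re module, which may not hold for every class KEDIT accepts; the helper name identifier_pattern is hypothetical.

```python
import re


def identifier_pattern(first, rest=None, last=None):
    """Build a regex from up to three :IDENTIFIER character classes.

    first: class for the first character
    rest:  class for the remaining characters (defaults to `first`)
    last:  optional class allowed only as the final character
    """
    rest = rest or first
    pattern = first + rest + "*"
    if last:
        pattern += last + "?"
    return re.compile(pattern)


# The C-style and BASIC-style examples from the text.
c_ident = identifier_pattern("[a-zA-Z_]", "[a-zA-Z0-9_]")
basic_ident = identifier_pattern("[a-zA-Z]", "[a-zA-Z0-9_]", "[%&!#@$]")

for candidate in ("_tmp1", "1abc", "ABC@", "A@B"):
    print(candidate,
          bool(c_ident.fullmatch(candidate)),
          bool(basic_ident.fullmatch(candidate)))
```

With these patterns, ABC@ is accepted only by the BASIC-style pattern, and A@B is rejected by both because the special character is allowed only in the final position.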
Some languages have single-line comments, which are introduced by some type of comment delimiter and cannot continue for multiple lines. Some languages have comments with both a starting and an ending delimiter. This kind of comment can usually continue for multiple lines, but in some languages may be restricted to a single line.
For example, C++ allows comments that are introduced by a pair of slashes (``//'') and continue until the end of the line. C++ also allows comments that can continue for multiple lines, introduced by a slash-asterisk pair (``/*'') and terminated by an asterisk-slash pair (``*/''). The corresponding :COMMENT section would be:
:COMMENT
   line // any
   paired /* */ nonest
Line comments are described using the format
LINE delim ANY|FIRSTNONBLANK|COLUMN n
where
ANY
indicates that appearance of the comment delimiter anywhere on a line (except within a quoted string) starts a comment.
FIRSTNONBLANK
indicates that the comment delimiter starts a comment only if it is the first nonblank item on a line.
COLUMN n
indicates that the comment delimiter starts a comment only if it appears in column n.
Comments with both starting and ending delimiters are described using the format
PAIRED delim1 delim2 [NEST|NONEST] [MULTIPLE|SINGLE]
where
NEST|NONEST
NEST indicates that multi-line comments can be nested inside multi-line comments, with the comments ending only when as many comment end delimiters as comment start delimiters have been encountered. NONEST is the default and indicates that comments cannot be nested, and that a comment ends as soon as the next comment end delimiter has been encountered. For example, consider
/* /* here is a comment */ x = 17 */
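In the REXX language, which allows nested comments, ``x = 17'' is part of the comment, because the comment ends only at the second ``*/''. In a language that does not allow nested comments, the comment ends at the first ``*/'' and ``x = 17'' is not part of the comment.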
MULTIPLE
indicates that the comments can continue for multiple lines; this is the default and need not be specified.
SINGLE
indicates that, even though paired delimiters are being used, the comments must begin and end on a single line.
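The difference between NEST and NONEST described above can be sketched in Python. This is only an illustration of the counting rule, not KEDIT's implementation: the function name comment_end is hypothetical, it works on a single line, and it ignores quoted strings.

```python
def comment_end(text, start_delim, end_delim, nest=False):
    """Return the index just past the end of the comment that opens text.

    Returns None if text does not begin with start_delim or the
    comment is never closed.
    """
    if not text.startswith(start_delim):
        return None
    depth = 1
    i = len(start_delim)
    while i < len(text):
        if text.startswith(end_delim, i):
            depth -= 1
            i += len(end_delim)
            if depth == 0:
                return i
        elif nest and text.startswith(start_delim, i):
            # Only NEST counts additional start delimiters.
            depth += 1
            i += len(start_delim)
        else:
            i += 1
    return None


line = "/* /* here is a comment */ x = 17 */"

nonest_end = comment_end(line, "/*", "*/", nest=False)
nest_end = comment_end(line, "/*", "*/", nest=True)
print(repr(line[nonest_end:]))  # text left over after a NONEST comment
print(repr(line[nest_end:]))    # text left over after a NEST comment
```

With NONEST the comment closes at the first end delimiter, leaving the assignment outside the comment; with NEST the whole line is comment text.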
Header lines are specified in the same way as single-line comments:
LINE delim ANY|FIRSTNONBLANK|COLUMN n
As far as KEDIT's syntax coloring is concerned, the only difference between single-line comments and headers is that comments are displayed using ECOLOR A and headers are displayed using ECOLOR G. An example of a :HEADER section that describes .KLD file section headers:
:HEADER
   line : column 1
SINGLE
This means that your language uses strings enclosed in single quotes.
DOUBLE
This means that your language uses strings enclosed in double quotes.
Use this to specify that the character
SINGLE, DOUBLE, and DELIMITER
If the :STRING section is omitted, KEDIT's syntax coloring does not recognize any strings in your files.
DELIMITER delim FIRSTNONBLANK|ANY|COLUMN n
Instead of a DELIMITER line, you can specify
COLUMN n
to indicate that any non-keyword identifier beginning in the specified column should be treated as a label, with no need for a delimiter following the label.
KEDIT's syntax coloring facility uses the information in the :MATCH section for two purposes:
First, items at different nesting levels are colored differently, so you can easily see which items match. For example, in the line
if (f(x + y + z) = 17)
KEDIT can display the inner parentheses and the outer parentheses in different colors.
Second, when you use the CMATCH command (assigned by default to Shift+F3) to find the matching item for the text at the cursor position, KEDIT can properly match any items described in the :MATCH section. With the cursor on the first DO in the following example, Shift+F3 can move the cursor to the second END in the example:
if a = 5 then do
   j = 17
   do i = 1 to 10
      say i*j
   end
end
Each line of the :MATCH section has either two or three items. The first item specifies the identifiers or character sequences that introduce a matchable construct. The second item specifies the identifiers or character sequences that end a matchable construct. The third item is optional, and is used to specify items that always appear inside of a matchable construct.
For example,
:MATCH
   ( )
   { }
   #if #endif #else
Here, three matchable constructs are specified: parentheses, braces, and #if ... #endif pairs, with #else allowed inside an #if ... #endif pair. A fuller specification covers the other preprocessor forms as well:
:MATCH
   ( )
   { }
   #ifdef,#if,#ifndef #endif #else,#elif,#elseif
Here, any of #ifdef, #if, and #ifndef can match up with #endif, with any of #else, #elif, and #elseif allowed between them. As in this example, you can specify multiple equivalent items in a :MATCH section, separated by commas.
Some notes on using the :MATCH section:
When two starting items share the same ending item, you should not use
:MATCH
   DO END
   BEGIN END
but should instead use
:MATCH
   DO,BEGIN END
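The way a :MATCH line pairs starting and ending items can be sketched in Python. The sketch below is an illustration under stated assumptions, not KEDIT's CMATCH logic: the input is a pre-split list of tokens, items are compared without regard to case, and the names parse_match_line and find_matching_end are hypothetical.

```python
def parse_match_line(line):
    """Split one :MATCH parameter line into start, end, and inner item sets.

    Comma-separated items within a field are treated as equivalent.
    """
    parts = [{item.upper() for item in field.split(",")}
             for field in line.split()]
    starts, ends = parts[0], parts[1]
    inner = parts[2] if len(parts) > 2 else set()
    return starts, ends, inner


def find_matching_end(tokens, index, match_line):
    """Return the index of the end item matching the start item at index."""
    starts, ends, _inner = parse_match_line(match_line)
    if tokens[index].upper() not in starts:
        return None
    depth = 0
    for pos in range(index, len(tokens)):
        word = tokens[pos].upper()
        if word in starts:
            depth += 1          # one level deeper
        elif word in ends:
            depth -= 1          # one level closed
            if depth == 0:
                return pos
    return None


tokens = "if a = 5 then do j = 17 do i = 1 to 10 say i*j end end".split()
outer = find_matching_end(tokens, tokens.index("do"), "DO,BEGIN END")
print(outer, tokens[outer])
```

Starting from the first DO, the depth counter passes over the inner DO and its END, so the result is the position of the second END, as in the Shift+F3 example above. The same depth value could be used to choose a different color for each nesting level.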
keyword [ALTERNATE n] [TYPE m]
Keywords are normally colored according to the current ECOLOR D setting, and preprocessor keywords according to the current ECOLOR F setting. It is sometimes useful to specify different types of keywords that will be colored differently. To do this, you can specify
ALTERNATE n
following a keyword.
TYPE m
is used only when REXX has been specified in the :OPTION section, and determines what to treat as a REXX keyword, subkeyword, etc.
A sample :KEYWORD section:
:KEYWORD
   if
   then
   else
   do
   end
   switch
   for
   procedure alternate 1
If the :KEYWORD section is omitted, KEDIT's syntax coloring facility does not recognize any keywords. If the :KEYWORD section is specified, it must be preceded by the :IDENTIFIER section.
Use the TAG line to specify the character string that initiates a markup tag and the character string that terminates a markup tag.
In an HTML file, where a typical line of text might be:
<H1>Level 1 header</H1>
``<'' initiates a tag, and ``>'' terminates it. This would be specified in the :MARKUP section as
:MARKUP
   TAG < >
Use the REFERENCE line to specify the character string that initiates a character or entity reference and the character string that terminates it.
HTML lets you use entity references like ``&lt;'' or character references like ``&#60;'' to refer to special characters. These references begin with an ampersand (``&'') and end with a semicolon (``;''). This would be specified in the :MARKUP section as:
:MARKUP
   TAG < >
   REFERENCE & ;
The following special rules apply if your KLD file contains a :MARKUP section:
For example, in the line
<P>This is a new paragraph.
``<P>'' would be highlighted.
In the tag
<A HREF="film_clip.jpg">
the quoted string is displayed using ECOLOR B, while the rest of the tag is displayed using ECOLOR T.
:COLUMN
   EXCLUDE 1 6
   EXCLUDE 73 *
Each line of the :COLUMN section has the word EXCLUDE followed by the starting and ending column of a range of columns that the parser is to ignore. The ending column can be given as an asterisk to indicate that all columns through the end of the line are to be ignored.
When the syntax coloring parser processes a line of your file, it will treat the excluded columns as if they were entirely blank. By default, the excluded columns will be displayed with no special highlighting, but you can specify that any of the 9 ALTERNATE colors be used. For example,
:COLUMN
   EXCLUDE 1 10 ALTERNATE 2
would display columns 1 through 10 of your file using ECOLOR 2.
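The effect of EXCLUDE on what the parser sees can be sketched in Python: excluded columns are treated as if they were blank before any other analysis. The function name blank_excluded_columns and the (start, end) tuple representation are hypothetical; columns are 1-based and inclusive, as in the :COLUMN section, and the optional ALTERNATE color is not modeled.

```python
def blank_excluded_columns(line, ranges):
    """Return line with each excluded column range replaced by blanks.

    ranges is a list of (start, end) pairs; end may be "*" to mean
    through the end of the line.
    """
    chars = list(line)
    for start, end in ranges:
        last = len(chars) if end == "*" else min(int(end), len(chars))
        for col in range(int(start), last + 1):
            chars[col - 1] = " "
    return "".join(chars)


# EXCLUDE 1 6 and EXCLUDE 73 *, as in the example above.
excludes = [(1, 6), (73, "*")]

short_line = "000100 MOVE A TO B."
long_line = "X" * 80

print(repr(blank_excluded_columns(short_line, excludes)))
print(repr(blank_excluded_columns(long_line, excludes)))
```

A range that starts beyond the end of a short line simply has no effect on that line.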
The :POSTCOMPARE section can contain CLASS lines and TEXT lines.
CLASS lines specify a set of characters that you want to have colored, using the same regular expression character class notation that is used in the :IDENTIFIER section. For example,
CLASS [+-=/]
means that ``+'', ``-'', ``='', and ``/'' characters are to be colored. KEDIT uses ECOLOR I by default, but you can instead specify any of the four alternate keyword colors. For example:
CLASS [+-=/] ALTERNATE 2
TEXT lines specify a string of nonblank characters that is to be colored. For example,
TEXT .T.
would color the character sequence ``.T.''. KEDIT uses ECOLOR D by default, but you can specify an alternate keyword color. For example:
TEXT .T. ALTERNATE 3
You can specify any number of CLASS or TEXT lines in a :POSTCOMPARE section. When applying syntax coloring to your file, the :POSTCOMPARE section is processed last. That is, KEDIT first checks for identifiers, numbers, comments, tags, etc., and checks the items in the :POSTCOMPARE section only if none of these are found.
Note that it is not useful to include valid identifiers in the :POSTCOMPARE section, since the parser checks for identifiers before :POSTCOMPARE is processed, so identifiers, even identifiers that are not listed in the :KEYWORD section, will never be matched by :POSTCOMPARE. For this reason, any identifiers that you want to color should be included in the :KEYWORD section.
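The ordering described above, where identifiers are checked before :POSTCOMPARE items, can be sketched in Python. This is a simplified model rather than KEDIT's parser: it handles only identifiers, keywords, TEXT items, and CLASS characters, it assumes TEXT items are tried before CLASS characters, and the function name classify and its parameters are hypothetical.

```python
import re


def classify(line, ident_re, keywords, post_texts, post_chars):
    """Return (text, kind) pairs for the items found on one line."""
    items = []
    i = 0
    while i < len(line):
        # Identifiers are checked first, so they never reach POSTCOMPARE.
        m = ident_re.match(line, i)
        if m:
            word = m.group()
            kind = "keyword" if word.lower() in keywords else "identifier"
            items.append((word, kind))
            i = m.end()
            continue
        # Only when no identifier starts here are POSTCOMPARE items tried.
        text = next((t for t in post_texts if line.startswith(t, i)), None)
        if text:
            items.append((text, "postcompare text"))
            i += len(text)
        elif line[i] in post_chars:
            items.append((line[i], "postcompare class"))
            i += 1
        else:
            i += 1
    return items


ident_re = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")
result = classify("done = .T. + end", ident_re,
                  keywords={"end"},
                  post_texts=[".T.", "done"],   # "done" is a valid identifier
                  post_chars="+-=/")
for item in result:
    print(item)
```

Even though ``done'' is listed as a TEXT item in this sketch, it is classified as an ordinary identifier, which is why identifiers that should be colored belong in the :KEYWORD section.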