parse.e

The parse.e file contains a function to parse a file, converting it into a tree. The exact format of the tree is probably going to change soon, so I won't go into it here.

Parsing files

The only function at the moment is:

node = parse(filename)

This will tokenize the file specified and parse it, returning the root node of a tree (see tree.e) for the parsed program.

There is also a variable, do_includes, which controls whether include files are loaded. It defaults to 1, so include files are loaded; set it to 0 to skip them.
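As a rough sketch of how the two fit together, the snippet below turns off include loading and then parses a file. The file name "myprog.ex" is only a placeholder, and the sketch assumes do_includes is visible to code that includes parse.e.

```euphoria
-- Sketch: parse a single file without following its include statements.
-- "myprog.ex" is a placeholder file name.
include parse.e

object root

do_includes = 0              -- skip include files (the default is 1)
root = parse("myprog.ex")    -- root node of the tree for the parsed program
```

The result is held in an object here because the tree format may still change; see tree.e for the routines that work on nodes.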

Extending the parser

As with tokens.e, I've tried to make parse.e easy to extend. Right now, there are two routines for extending the parser: add_namedref and add_syntax.

Named references

add_namedref lets you add named references, that is, things like variables and calls. Named references can be read-only (e.g. calls) or read-write (e.g. variables). Variables and calls are built in, but you could add things like variables in structures. Since the rest of the parser handles every named reference through these registered routines, anything added here will work in assignments (if writable), expressions, and so on.

add_namedref(id, writable, priority)

id is the routine_id of your function. It should accept three arguments. For example, here's the routine for variables.

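function do_varref(tree_node node, sequence tokens, integer token)
    tree_node n
    -- not a variable reference: no tokens left, or euid() rejects this token
    if token > length(tokens) or not euid(tokens[token][1]) then
        return 0
    end if
    -- add a VARREF node under node, holding the variable's name
    n = alloc_node(node)
    node_set(n,{VARREF,tokens[token][1]})
    -- hand any subscripts to do_subscripts, which returns the next token index
    return do_subscripts(n,tokens,token+1)
end function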

node is the node to add new nodes to. tokens is a sequence of tokens (see tokens.e), and token is an index into those tokens. Your function should return 0 if it does not apply to the tokens at that position. If it does apply, it should return the index of the first token after the ones it processed.

If the named reference handled by your function is writable (e.g. variables), the writable flag should be 1. If not (e.g. functions), make it 0. Finally, the priority is used to sort the functions, with lower priorities tested first. For example, routines would be mistaken for variables if variables were tested first. Therefore, the priority of do_call is 1 and the priority of do_varref is 2. Priorities are atoms, so you can use a fractional value (such as 1.5) to position your reference exactly.
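To make the handler contract and the priority ordering concrete, here is a minimal sketch of a custom named reference. Everything specific to it is an assumption for illustration: the handler do_selfref and the "self" keyword are hypothetical, it assumes tokens[token][1] holds the token text (as do_varref suggests), it assumes add_namedref is called as a procedure, and it reuses the VARREF node type because the tree format is not described here.

```euphoria
-- Sketch only: a hypothetical read-only reference named "self".
include tree.e
include parse.e

function do_selfref(tree_node node, sequence tokens, integer token)
    tree_node n
    -- return 0 when this handler does not apply at this position
    if token > length(tokens) or not equal(tokens[token][1], "self") then
        return 0
    end if
    n = alloc_node(node)
    node_set(n, {VARREF, "self"})   -- reusing VARREF is an assumption
    return token + 1                -- index of the token after "self"
end function

-- writable = 0 (read-only); priority 1.5 sorts it after do_call (1)
-- and before do_varref (2)
add_namedref(routine_id("do_selfref"), 0, 1.5)
```

With a priority of 1.5, do_selfref would be tried after calls but before ordinary variables, so "self" would never reach do_varref.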

Syntax elements

Loops, definitions, includes, and statements are controlled through the add_syntax procedure:

add_syntax(id,globalLevel,fileLevel,routineLevel,blockLevel)

id is the routine_id of your function (same format as the named reference functions). globalLevel, fileLevel, routineLevel, and blockLevel are each 0 or 1, depending on whether your syntax element is allowed at that level. For example:

add_syntax(routine_id("do_variable"),1,1,1,0) -- variable definition
add_syntax(routine_id("do_return"),0,0,1,1) -- return statement
add_syntax(routine_id("do_constant"),1,1,0,0) -- constant
add_syntax(routine_id("do_include"),0,1,0,0) -- include file
add_syntax(routine_id("do_exit"),0,0,0,1) -- exit statement
add_syntax(routine_id("do_global"),0,1,0,0) -- global modifier
add_syntax(routine_id("do_for"),0,1,1,1) -- for loop
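Adding your own syntax element follows the same pattern. The sketch below registers a hypothetical "breakpoint" statement that is allowed only inside routines and blocks, using the same level flags as do_return above. The handler name, the keyword, the BREAKPOINT node type, and the contents passed to node_set are all made up for illustration, and it assumes tokens[token][1] holds the token text.

```euphoria
-- Sketch only: a hypothetical one-word "breakpoint" statement.
include tree.e
include parse.e

constant BREAKPOINT = 1000   -- hypothetical node type, not defined by tree.e

function do_breakpoint(tree_node node, sequence tokens, integer token)
    tree_node n
    -- return 0 when the current token is not the "breakpoint" keyword
    if token > length(tokens) or not equal(tokens[token][1], "breakpoint") then
        return 0
    end if
    n = alloc_node(node)
    node_set(n, {BREAKPOINT})   -- node contents are an assumption
    return token + 1            -- index of the token after the keyword
end function

-- globalLevel = 0, fileLevel = 0, routineLevel = 1, blockLevel = 1
add_syntax(routine_id("do_breakpoint"), 0, 0, 1, 1)
```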
