Draft specification for Antlr Tree Generation

By Matthew Ford, Revision 1, 25th Sept 2001

Control of automatic tree generation in the Parser

By default the Parser automatically generates ASTs. This generation can be disabled globally by setting buildAST=false.

When buildAST=false, ALL code related to AST building is removed, and the only ways to build your own tree are:

  1. to update a global tree or
  2. use Antlr's return syntax to pass your own tree back.

(Note: in this mode you should not need to link or load any AST code unless you reference it yourself from an action, etc.)

With buildAST=true (the default) you can selectively disable tree generation by using the ! syntax. This can be used on either a rule or a token basis. An example of rule-based use of ! to disable tree generation:

addition!
    :   INT PLUS i:INT
    ;

In this case no tree generation code is generated for this rule. If you want to create a tree by hand for this rule, you need to return it as shown below:

addition returns [AST return_tree]!
   :   INT PLUS i:INT  { .. code to generate return_tree }
   ;

So I suggest this be relaxed a little: no tree generation code is output, except that labels in the rule are initialized with the appropriate minimal tree. For example,

drop_table_statement!
: "drop" "table" t3:table_name t4:drop_behavior
;

results in #t3 containing the tree resulting from the rule table_name, and

statement!
: INT PLUS i:INT
;

would set up a tree for label i consisting of a single root node containing the INT token. This allows the user to control what tree code is added to their code if tree generation is turned off for a rule. If there are no labels, then no code is generated.

To suppress a single token, use ! after the token. It will not be added to the tree, e.g.

statement
:  lhsVar EQUALS rhs SEMI!  // SEMI is not added to the tree.
;
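
The effect of ! on a single token can be modelled outside ANTLR. The following is a minimal Python sketch, not ANTLR-generated code: build_flat_tree and the (type, text, suppressed) tuples are hypothetical names used only for illustration, lhsVar and rhs are simplified to single tokens, and the tree is represented as a plain list of token texts.

```python
# Hypothetical model: each matched element is (token_type, text, suppressed).
# 'suppressed' stands for a trailing '!' on that element in the grammar.
def build_flat_tree(matched):
    """Return the texts of the matched elements that are kept in the tree."""
    return [text for (_type, text, suppressed) in matched if not suppressed]


# statement : lhsVar EQUALS rhs SEMI! ;
matched = [
    ("ID", "x", False),
    ("EQUALS", "=", False),
    ("INT", "5", False),
    ("SEMI", ";", True),  # SEMI! is still matched, but not added to the tree
]
tree = build_flat_tree(matched)
print(tree)
```

The suppressed token is still consumed from the input; only its contribution to the tree is dropped.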

Note that, as far as the rule statement is concerned,

statement
:  lhsVar EQUALS addition!  // suppress addition of tree returned from addition
;
addition
    :   INT PLUS i:INT
;

is the same as

statement
:   lhsVar EQUALS addition
;
addition!   // suppress tree generation
    :   INT PLUS i:INT
;

But in the second case no rule in the parser can get a tree from the addition rule. And

statement
 :  lhsVar EQUALS addition!
;
addition!
    :   INT PLUS i:INT
;

is redundant but legal.

You would probably actually use something like

statement
{AST addTree;}
:   lhsVar EQUALS^ addTree=addition!
   { ## =   build tree here using ## and addTree }
;

addition returns [AST returnTree]
    :   INT PLUS^ i:INT
{ returnTree = ##; }  // pick up the autogenerated tree
;

Note: It makes no sense in this system to allow ! to be applied to individual alternatives of a rule. That is,

statement
{AST addTree;}
:   lhsVar EQUALS^ addTree=addition!
    { ## =   build tree here using ## and addTree }
|!  printstatement
;

is now illegal.

In all other cases (that is, when buildAST is true and ! is not used), the return tree is always generated and assigned to the global AST_return to be picked up by the parent rule. This AST_return can be modified/overwritten using the syntax discussed below.
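
How a parent rule picks up a child's result through a shared return slot can be sketched with a hand-written recursive-descent parser. This is an illustrative Python model under stated assumptions, not what ANTLR generates: the Parser class, its ast_return attribute (standing in for the global AST_return), and the tuple representation of trees are all hypothetical.

```python
class Parser:
    """Hypothetical hand-written parser; trees are (root, child, ...) tuples."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0
        self.ast_return = None  # stands in for the global AST_return

    def match(self):
        """Consume and return the next token."""
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def addition(self):
        # addition : INT PLUS^ INT ;
        left = self.match()
        op = self.match()
        right = self.match()
        self.ast_return = (op, left, right)  # result left in the shared slot

    def statement(self):
        # statement : lhsVar EQUALS^ addition ;
        lhs = self.match()
        eq = self.match()
        self.addition()
        child = self.ast_return  # the parent picks up the child's result
        self.ast_return = (eq, lhs, child)


parser = Parser(["x", "=", "1", "+", "2"])
parser.statement()
print(parser.ast_return)
```

Because the slot is shared, the parent has to read it straight after the child rule returns and before it stores its own result.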

Syntax for manual modification of trees in the Parser

Note that this is for modification of trees that have been automatically created. If you set buildAST=false or use ! on a rule, you are on your own, as no tree code is generated for you.

Tree nodes are created using

#[TOKEN_TYPE] or #[TOKEN_TYPE,"text"]

Trees are created using

#(root, c1, ..., cn)

where

  • root must be a node
  • c1 to cn are the 1st to nth children, which may be either nodes or other trees.
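
The two constructors above can be modelled in a few lines. The following Python sketch is an illustration only: Node, make_node and make_tree are hypothetical names, a node without explicit text is assumed to use its token type as text, and "root must be a node" is interpreted here as "root has no children yet".

```python
class Node:
    """Hypothetical AST node: a token type, a text, and a list of children."""

    def __init__(self, token_type, text=None):
        self.token_type = token_type
        # Assumption: with no explicit text, fall back to the token type name.
        self.text = text if text is not None else token_type
        self.children = []

    def __str__(self):
        """Render in LISP-like form, e.g. (+ 1 2)."""
        if not self.children:
            return self.text
        parts = [self.text] + [str(c) for c in self.children]
        return "(" + " ".join(parts) + ")"


def make_node(token_type, text=None):
    """Models #[TOKEN_TYPE] and #[TOKEN_TYPE,"text"]."""
    return Node(token_type, text)


def make_tree(root, *children):
    """Models #(root, c1, ..., cn); children may be nodes or whole trees."""
    if root.children:
        raise ValueError("root must be a node, not a tree")
    root.children.extend(children)
    return root


inner = make_tree(make_node("PLUS", "+"), make_node("INT", "1"), make_node("INT", "2"))
outer = make_tree(make_node("EQUALS", "="), make_node("ID", "x"), inner)
print(outer)  # (= x (+ 1 2))
```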

Elements of the current rule can be addressed using the following

  • ## is a shortcut for AST_return, the current result tree.
  • #id is a shortcut for the current tree rooted at the location originally occupied by the node labelled by id.
  • @id is a shortcut for the root node of the tree rooted at the location originally occupied by the node labelled by id.

When these occur on the rhs of = they are replaced by clones of their respective nodes or trees. This prevents deadly loops.

As an optimisation

## =#(#[token],##)

could be done without cloning ##.
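
Why cloning on the rhs matters can be shown with a small model. This Python sketch is an assumption-laden illustration, not ANTLR code: Node, tree, clone_node, clone_tree and contains_cycle are hypothetical helpers, and the example expression #a = #(@a, #a) is chosen here only to demonstrate the loop.

```python
import copy


class Node:
    """Hypothetical AST node with a text and a list of children."""

    def __init__(self, text):
        self.text = text
        self.children = []


def tree(root, *children):
    """Models #(root, c1, ..., cn) with no cloning of its arguments."""
    root.children = list(children)
    return root


def clone_node(node):
    """Copy of a single node without its children (models @id on the rhs)."""
    return Node(node.text)


def clone_tree(node):
    """Deep copy of a whole subtree (models #id or ## on the rhs)."""
    return copy.deepcopy(node)


def contains_cycle(node, ancestors=()):
    """True if some node appears among its own descendants."""
    if any(node is anc for anc in ancestors):
        return True
    return any(contains_cycle(c, ancestors + (node,)) for c in node.children)


a = tree(Node("+"), Node("1"), Node("2"))

# With cloning: #a = #(@a, #a) builds a fresh root over a copy of a.
safe = tree(clone_node(a), clone_tree(a))

# A fresh root over an existing tree needs no clone (the optimisation case).
wrapped = tree(Node("STATEMENT"), safe)

# Without cloning, the same expression makes a its own child.
unsafe = tree(a, a)

print(contains_cycle(safe), contains_cycle(wrapped), contains_cycle(unsafe))
```

Any traversal of the uncloned result would recurse forever, which is the "deadly loop" the cloning rule avoids.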

When these occur on the lhs of = they refer to that location in the tree. This allows subtree replacements, e.g.

statement
:   lhsVar e:EQUALS^ a:addition
  {
    #a = #(@a,#[INT,"5"],#[INT,"6"]);
    // the children of the addition subtree in the result (##) have been
    // replaced with 5,6
    // a: now refers to the new subtree; the original subtree has been
    // replaced by it.
    @a = #[MINUS];
    // the root of the new subtree a is now MINUS
    ## = #(#[STATEMENT],##);
    // add a node to the top of the result tree.  a: and e: still point to
    // the same subtrees.
    ## = #[DIV];
    // where do a: and e: point now?  They still point to their subtrees,
    // which are not released until the rule returns,
    // so the following is valid
    ## = #(@#,#a,#[INT,"3"],#a);
    // @# is the root node of ##, which is now just #[DIV]
    // note it is valid to use #a twice as it is cloned.
  }
;
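
The first three assignments in the action above can be traced with a small model of in-place subtree replacement. This is a Python sketch under stated assumptions, not ANTLR output: Node and replace_subtree are hypothetical, the input is assumed to be x = 1 + 2, and a node created from a bare token type (such as MINUS) is assumed to use the type name as its text.

```python
class Node:
    """Hypothetical AST node; renders in LISP-like form."""

    def __init__(self, text, *children):
        self.text = text
        self.children = list(children)

    def __str__(self):
        if not self.children:
            return self.text
        parts = [self.text] + [str(c) for c in self.children]
        return "(" + " ".join(parts) + ")"


def replace_subtree(root, old, new):
    """Replace subtree 'old' (found by identity) under 'root'; return the root."""
    if root is old:
        return new
    root.children = [replace_subtree(c, old, new) for c in root.children]
    return root


# Automatically built tree for: statement : lhsVar e:EQUALS^ a:addition ;
a = Node("+", Node("1"), Node("2"))
result = Node("=", Node("x"), a)  # plays the role of ##

# #a = #(@a,#[INT,"5"],#[INT,"6"]);
new_a = Node(a.text, Node("5"), Node("6"))
result = replace_subtree(result, a, new_a)
a = new_a
step1 = str(result)  # (= x (+ 5 6))

# @a = #[MINUS];  only the root node changes, the children are kept
minus = Node("MINUS", *a.children)
result = replace_subtree(result, a, minus)
a = minus
step2 = str(result)  # (= x (MINUS 5 6))

# ## = #(#[STATEMENT],##);  a still points into the result tree
result = Node("STATEMENT", result)
step3 = str(result)  # (STATEMENT (= x (MINUS 5 6)))

print(step1, step2, step3, sep="\n")
```

Each assignment to #a or @a rewrites the tree at the labelled location and rebinds the label, so later steps keep addressing the same position in the result.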