Arithmetic Expressions

LA home
Computing
Prog-Langs
 glossary
 Grammar
  Arith-Exp
  Abstract-syn
  Top-Down
  Bottom-Up

Also see:
 Parser.c
 *.java
 exp.sml
<Exp> ::= <Exp> + <Term> |
<Exp> - <Term> |
<Term>

<Term> ::= <Term> * <Factor> |
<Term> / <Factor> |
<Factor>

<Factor> ::= x | y | ... |
( <Exp> ) |
- <Factor>

A grammar for the concrete syntax of simple arithmetic expressions

Non-terminal symbols:
<Exp>, <Term>, <Factor>
Terminal symbols:
+, -, *, /, (, ), x, y, z, ...
Start symbol:
<Exp>
Production rules as above.
 
Note the following are meta symbols:
::= (read it as "can be replaced by")
-> is alternative notation for ::=,
|   (read it as "or by")

e.g. An <Exp> can be replaced by   <Exp> + <Term>   or by <Exp> - <Term>   or by   <Term>.

Note that the production (replacement) rules for <Exp>, <Term> and <Factor> are recursive. The rule for <Exp> states that an Expression is a list of Terms separated by `+' or `-' symbols. A Term is a list of Factors separated by `*' or `/' symbols. A Factor is a variable (x, y, ...), or a parenthesized sub-expression, or a unary `-' followed by a Factor.

The operators `*' and '/' bind more tightly to their operands than `+' and binary `-' because `*' and '/' appear in the rule for <Term>, closer to the operands (<Factor>). Unary `-' binds most tightly of all operators.

The rules for <Exp> and <Term> are left recursive. This is because `+', `-', `*' and `/' are left associative: `a-b-c' gives the same parse tree as `(a-b)-c', not `a-(b-c)'.

A grammar like this can be turned into a "recursive descent parser" (a program) by writing a routine for each non-terminal in the grammar. The routines call each other as specified by the replacement rules. Left recursion, that is a replacement that begins with a recursive non-terminal, must be removed so that the parser does not loop. A simple parser of this type can be found in [Parser.c (click)] or [*.java] or [exp.sml].

1. x+y*z:

<Exp> => <Exp> + <Term> => <Term> + <Term> => <Factor> + <Term> => x + <Term> => x + <Term> * <Factor> => x + <Factor> * <Factor> => x + y * <Factor> => x + y * z
tree of nonterminals and terminals
x+y*z: Full derivation-tree of non-terminals and replacements.

The result of parsing is often returned as a parse-tree based on a simpler abstract syntax. The abstract syntax corresponds to the datatype of parse trees returned by the parser.

parse tree
x+y*z: Parse tree.

2. (x+y)*z:

Parentheses '(' and ')' can be used to force a different parsing:

tree of nonterminals and terminals
(x+y)*z: Full derivation-tree of non-terminals and replacements.

parse tree
(x+y)*z: Parse tree.

Note that parentheses do not appear in any parse tree. They directed the parser to build a certain tree and after that they are unnecessary. The structure of the tree shows the binding of operators to operands without the need for (). Parentheses are only needed in strings.

-- L.A.
www #ad:

key
::= "can be replaced by"
->
| "or by"
=> derivation
lm left most
rm right most

↑ © L. Allison, www.allisons.org/ll/   (or as otherwise indicated).
Created with "vi (Linux)",  charset=iso-8859-1,   fetched Thursday, 25-Apr-2024 16:28:13 UTC.

Free: Linux, Ubuntu operating-sys, OpenOffice office-suite, The GIMP ~photoshop, Firefox web-browser, FlashBlock flash on/off.