Syntax, Grammar
- A Context-Free Grammar G = <N, T, S, P> where
- T is a finite set of terminal symbols,
- N is a finite set of non-terminal symbols,
- S is the start symbol, S:N,
- P is a finite set of production rules of the form,
- X ::= α where X:N and α:(N+T)*
- (X -> α is alternative notation)
- rules with a common LHS, X ::= α, X ::= β, X ::= ... , are often shown thus:
- X ::= α | β | ...
- (X -> α is alternative notation)
- e.g., see the grammar of [arithmetic expressions].
- N is a finite set of non-terminal symbols,
- Definitions =>, =>*, =>+ :
- α => β if β can be derived from α using a single production rule,
- α =>* β if β can be derived from α using zero or more applications of production rules,
- α =>+ β if β can be derived from α using one or more applications of production rules,
- α =>* β if β can be derived from α using zero or more applications of production rules,
- where α, β :(N+T)*
- Definition:
- γ is a sentential form if
γ:(N+T)* and S =>* γ.
- Definition:
- γ is a sentence in the language defined by G if
γ:T* and
S =>+ γ.
- (In some sources word is used in place of sentence.)
- (In some sources word is used in place of sentence.)
- Definition:
- The language
L(G) = {γ:T* s.t. S =>+ γ},
- that is, L(G) is the set of sentences that can be derived from S.
- that is, L(G) is the set of sentences that can be derived from S.
- A derivation can be represented by a (derivation-) parse tree. (However, a parse tree is often based on simpler abstract syntax rather than concrete syntax.)
- Definitions lm=>, lm=>*, lm=>+, rm=>, rm=>*, rm=>+ :
- α lm=> β if β can be derived from α
by applying a single production rule to the left-most non-terminal
symbol within α.
- α lm=>* β if β can be derived from α using zero or more left-most applications of production rules.
- α lm=>+ β if β can be derived from α using one or more left-most applications of production rules.
- (Similarly rm=>, rm=>*, rm=>+, and right-most.)
- α lm=>* β if β can be derived from α using zero or more left-most applications of production rules.
- Definitions:
- γ is a left sentential form if γ:(N+T)* and S lm=>* γ.
- γ is a right sentential form if γ:(N+T)* and S rm=>* γ.
- γ is a right sentential form if γ:(N+T)* and S rm=>* γ.
- A [top-down] left-to-right parser performs left-most derivations.
- A [bottom-up] left-to-right parser performs right-most derivations in reverse!
Sources
- The dragon book, Ch.4.
- Search for [parse] and/or [grammar].
- The [Algol-60 Revised Report],
- Search for [parse] and/or [grammar].
- which includes the first(?) formal definition of the syntax of a significant programming language.