Chapter Ten:
Grammars
Grammar is another of those common words for which
the study of formal language introduces a precise
technical definition. For us, a grammar is a certain
kind of collection of rules for building strings. Like
DFAs, NFAs, and regular expressions, grammars are
mechanisms for defining languages rigorously.
A simple restriction on the form of these grammars
yields the special class of right-linear grammars. The
languages that can be defined by right-linear grammars
are exactly the regular languages. There it is again!
A Little English
- An article can be the word a or the:
A → a   A → the
- A noun can be the word dog, cat or rat:
N → dog   N → cat   N → rat
- A noun phrase is an article followed by a noun:
P → AN
A Little English
- A verb can be the word loves, hates or eats:
V → loves   V → hates   V → eats
- A sentence can be a noun phrase, followed by a verb, followed by another noun phrase:
S → PVP
- Start from S and follow the productions of G1
- This can derive a variety of (unpunctuated) English sentences:
S ⇒ PVP ⇒ ANVP ⇒ theNVP ⇒ thecatVP ⇒ thecateatsP ⇒ thecateatsAN ⇒ thecateatsaN ⇒ thecateatsarat
S ⇒ PVP ⇒ ANVP ⇒ aNVP ⇒ adogVP ⇒ adoglovesP ⇒ adoglovesAN ⇒ adoglovestheN ⇒ adoglovesthecat
S ⇒ PVP ⇒ ANVP ⇒ theNVP ⇒ thecatVP ⇒ thecathatesP ⇒ thecathatesAN ⇒ thecathatestheN ⇒ thecathatesthedog
S → PVP     A → a
P → AN      A → the
V → loves   N → dog
V → hates   N → cat
V → eats    N → rat
- Often there is more than one place in a string where a production could be applied
- For example, PlovesP :
- PlovesP ⇒ ANlovesP
- PlovesP ⇒ PlovesAN
- The derivations on the previous slide chose the leftmost substitution at every step, but that is not a requirement
- The language defined by a grammar is the set of lowercase strings that have at least one derivation from the start symbol S
S → PVP     A → a
P → AN      A → the
V → loves   N → dog
V → hates   N → cat
V → eats    N → rat
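The definition of the language can be checked mechanically for this small grammar. The following is a minimal sketch, not part of the original slides: the names `PRODUCTIONS`, `one_step` and `language` are illustrative, and the search relies on the assumption that this particular grammar generates only finitely many strings. It applies productions at every possible position, not just the leftmost one, and collects every all-lowercase string it reaches.

```python
from collections import deque

# Productions of the little English grammar as (left, right) pairs.
# Uppercase letters are the symbols that still have to be replaced.
PRODUCTIONS = [
    ("S", "PVP"), ("P", "AN"),
    ("A", "a"), ("A", "the"),
    ("N", "dog"), ("N", "cat"), ("N", "rat"),
    ("V", "loves"), ("V", "hates"), ("V", "eats"),
]

def one_step(s):
    """Yield every string obtained from s by applying one production once,
    at any position where its left-hand side occurs."""
    for left, right in PRODUCTIONS:
        start = s.find(left)
        while start != -1:
            yield s[:start] + right + s[start + len(left):]
            start = s.find(left, start + 1)

def language(start="S"):
    """Collect all lowercase strings derivable from start.
    Terminates only because this grammar's language is finite."""
    seen = {start}
    queue = deque([start])
    generated = set()
    while queue:
        s = queue.popleft()
        if s.islower():          # no uppercase letters left: a generated string
            generated.add(s)
            continue
        for t in one_step(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return generated

sentences = language()
print(len(sentences))                 # 108 = 6 noun phrases x 3 verbs x 6 noun phrases
print("thecateatsarat" in sentences)  # True
```

Because `one_step` tries every position, `one_step("PlovesP")` yields both ANlovesP and PlovesAN, matching the two choices shown above. The order of substitution changes the derivation but not the set of strings collected.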
Informal Definition
- Productions define permissible string substitutions
- When a sequence of permissible substitutions starting from S ends in a string that is all lowercase, we say the grammar generates that string
- L ( G ) is the set of all strings generated by grammar G
A grammar is a set of productions of the form x → y. The strings x and y can contain both lowercase and uppercase letters; x cannot be empty, but y can be ε. One uppercase letter is designated as the start symbol (conventionally, it is the letter S).
- Consider the grammar with these four productions:
S → aS
S → X
X → bX
X → ε
- The final production for X says that X may be replaced by the empty string, so that, for example, abbX ⇒ abb
- Written in the more compact way, this grammar is:
S → aS | X
X → bX | ε
- For this grammar, all derivations of lowercase strings follow this simple pattern:
- First use S → aS zero or more times
- Then use S → X once
- Then use X → bX zero or more times
- Then use X → ε once
- So the generated string always consists of zero or more a's followed by zero or more b's
S → aS | X
X → bX | ε
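The claim about the shape of the generated strings can be spot-checked up to a length bound. This is a rough sketch under stated assumptions, not a proof: `generated_up_to` and `a_then_b` are hypothetical helper names, and the search uses the fact that every intermediate string of this grammar contains at most one uppercase letter, so only the first occurrence of a left-hand side needs to be tried.

```python
# The four productions S → aS, S → X, X → bX, X → ε as (left, right) pairs.
PRODUCTIONS = [("S", "aS"), ("S", "X"), ("X", "bX"), ("X", "")]

def generated_up_to(max_len):
    """All lowercase strings of length <= max_len derivable from S."""
    results = set()
    seen = {"S"}
    frontier = ["S"]
    while frontier:
        s = frontier.pop()
        if not any(c.isupper() for c in s):
            results.add(s)       # all lowercase (possibly empty): generated
            continue
        for left, right in PRODUCTIONS:
            i = s.find(left)
            if i == -1:
                continue
            t = s[:i] + right + s[i + len(left):]
            # No production removes lowercase letters, so strings that
            # already exceed the bound can be pruned safely.
            if sum(c.islower() for c in t) <= max_len and t not in seen:
                seen.add(t)
                frontier.append(t)
    return results

def a_then_b(max_len):
    """Zero or more a's followed by zero or more b's, total length <= max_len."""
    return {"a" * i + "b" * j
            for i in range(max_len + 1)
            for j in range(max_len + 1 - i)}

print(generated_up_to(4) == a_then_b(4))  # True
```

The empty string is included on both sides: it comes from S ⇒ X ⇒ ε, using the first and third productions zero times.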
Untapped Power
- All our examples have used productions with a single uppercase letter on the left-hand side
- Grammars can have any non-empty string on the left-hand side
- The mechanism of substitution is the same
- Sb → bS says that bS can be substituted for Sb
- Such productions can be very powerful, but we won't need that power yet
- We'll concentrate on grammars with one uppercase letter on the left-hand side of every production
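To illustrate that the substitution mechanism is unchanged when the left-hand side is longer than one symbol, here is a small sketch. The function name `one_step` and the sample string aSbSb are made up for illustration; the only production used is Sb → bS from the bullet above.

```python
def one_step(s, productions):
    """Return every string obtained from s by one substitution.
    Each production is a (left, right) pair; left may be any non-empty string."""
    results = []
    for left, right in productions:
        i = s.find(left)
        while i != -1:
            results.append(s[:i] + right + s[i + len(left):])
            i = s.find(left, i + 1)
    return results

# Sb → bS: wherever Sb occurs, bS may be substituted for it.
print(one_step("aSbSb", [("Sb", "bS")]))  # ['abSSb', 'aSbbS']
```

The code only ever looks for occurrences of the left-hand side as a substring, so it makes no difference whether that side is a single uppercase letter or a longer string.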
Formalizing Grammars
- Our informal definition relied on the difference between lowercase and uppercase
- The formal definition will use two separate alphabets:
- The terminal symbols (typically lowercase)
- The nonterminal symbols (typically uppercase)
- So a formal grammar has four parts…
4-Tuple Definition
- A grammar G is a 4-tuple G = (V, Σ, S, P), where:
- V is an alphabet, the nonterminal alphabet
- Σ is another alphabet, the terminal alphabet, disjoint from V
- S ∈ V is the start symbol
- P is a finite set of productions, each of the form x → y, where x and y are strings over Σ ∪ V and x ≠ ε
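The four parts and their side conditions translate directly into a small data structure. The sketch below is one possible encoding, not a standard API: the class name `Grammar` and its field names are hypothetical, every symbol is assumed to be a single character, and productions are stored as (x, y) pairs of Python strings with "" standing for ε.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset   # V, the nonterminal alphabet
    terminals: frozenset      # Σ, the terminal alphabet
    start: str                # S, the start symbol
    productions: frozenset    # P, a finite set of (x, y) pairs meaning x → y

    def __post_init__(self):
        # Σ must be disjoint from V.
        if self.nonterminals & self.terminals:
            raise ValueError("V and Σ must be disjoint")
        # S ∈ V.
        if self.start not in self.nonterminals:
            raise ValueError("the start symbol must be in V")
        symbols = self.nonterminals | self.terminals
        for x, y in self.productions:
            # x ≠ ε.
            if x == "":
                raise ValueError("the left-hand side must not be empty")
            # x and y are strings over Σ ∪ V.
            if not set(x + y) <= symbols:
                raise ValueError(f"production {x!r} → {y!r} uses unknown symbols")

# The grammar S → aS | X, X → bX | ε as a 4-tuple.
G = Grammar(
    nonterminals=frozenset("SX"),
    terminals=frozenset("ab"),
    start="S",
    productions=frozenset({("S", "aS"), ("S", "X"), ("X", "bX"), ("X", "")}),
)
print(len(G.productions))  # 4
```

Using a Python set for P makes finiteness automatic; the checks in `__post_init__` cover the remaining conditions from the definition.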
Outline
- 10.1 A Grammar Example for English
- 10.2 The 4-Tuple
- 10.3 The Language Generated by a Grammar
- 10.4 Every Regular Language Has a Grammar
- 10.5 Right-Linear Grammars
- 10.6 Every Right-Linear Grammar Generates a Regular Language
The Program
- For DFAs, we derived a zero-or-more-step δ* function from the one-step δ
- For NFAs, we derived a one-step relation on IDs, then extended it to a zero-or-more-step relation
- We'll do the same kind of thing for grammars…