The
Chomsky Normal Form is a
notation of
context-free grammars. Notation may sound a bit strange, especially if you know, what the CNF is (like
rp). Indeed there isn't
the CNF of a grammar, but of a language. But as you often have a grammar and want to transform it into a CNF, thinking of the CNF as a different notation may be easier. Normally the only restriction for a
context-free grammar G = {N, T, P, o} (short description of what the symbols mean: N is the set of
non-terminals, T the set of
terminals, P is the set of
derivation rules and o is the start symbol) is that for a derivation rule a→b a ∈(e.o.) N . The same
grammar in CNF (the C is for Chomsky, there is also a
CNF in boolean logic, where the C stands for
conjunctive) has some more rules: b has to be formed by two non-terminal or one terminal symbol.
Formally this looks like:
A,B,C ∈(e.o.) N
a ∈(e.o.) T
A →(->) BC
A →(->) a
Every derivation rule in P has to look like this. Two additional facts about CNFs:
Every
context-free grammar has one CNF. But the CNF is not
unequivocal. IF you have for example a
regular grammar with A →(->) abc ∈(->) P, this could be reformed into A →(->) Bc with B →(->) ab or into A →(->) aB with B →(->) bc (thanks to
ariels for this explanation)
And the
derivation tree (you form a tree from the derivation rules, where the right parts of the rules are the leafs of the left parts) of a CNF is a
binary tree.