An affix grammar is a context-free grammar in which the names of nonterminals have structure: they consist of a stem and 0 or more affixes. The affixes are variables: they can assume values.

The same affix can be used multiple times within the same rule: in each use the value must be identical. If the value domains for affixes are all finite, they are just a shorthand notation for context-free grammars. If not, they become more powerful.

An example: suppose we have the context-free grammar


  file content:
    line;
    line, file content.

  line:
    words, newline.

  words:
    whitespace;
    word, whitespace, words.

  words:
    letter;
    letter, word.

  whitespace:
    whitechar;
    whitechar, whitespace.

  whitechar:
    space;
    tab;
    linefeed;
    carriage return.
  
Now what if we have different types of file content, described by subtly different grammars? For example,

  script file content:
    shebang, file content.

  gif file content:
    gif magic sequence, file content.

  ps file content:
    ps magic sequence, file content.

  [and many more rules like this]

  shebang:
    "#!".

  gif magic sequence:
    "GIF87";
    "GIF89".

  ps magic sequence:
    "%!".

  [equally many more rules]

With affixes we can summarize the first set of rules:


  full file content + type:
    magic sequence + type, file content.

  magic sequence + "script":
    "#!";

  magic sequence + "gif":
    "GIF87";
    "GIF89";

  magic sequence + "ps":
    "%!";

  [...]

The type is an affix, a parameter. In parsing file content, it will assume the type of file we're dealing with. In more complex grammars, affix values can be used to express agreement of type between different parts of the input; for example,

  sentence: noun + number, verb + number, noun + number2.

  noun + "singular":
    "John"; "Mary"; "the dog"; "the system";
  noun + "plural":
    "the Joneses"; "the police"; "noders".
  verb + "singular":
    "kicks"; "knows"; "falls in love with";
  verb + "plural":
    "kick"; "know"; "fall in love with".

This describes example sentences such as "the police kick noders" and "the dog falls in love with the dog", while excluding monstrosities such as "the Joneses knows Mary" and "Mary know John".


There are various types of affix grammars. The ranges of possible affix values can itself be described with content-free grammars: the resulting formalism is known as Van Wijngaarden two-level grammars. It is very powerful - Turing complete, in fact - and its main claim to fame is the fact that the syntax, including all typing and other context dependent issues, of Algol 68 was specified entirely in this formalism.

in the 80s, it was extended with a few shorthand notations and other conveniences to form EAG; a working EAG compiler, written in C, is freely available from ftp.cs.kun.nl.

The affix grammar formalism doesn't support computation on affixes. EAGs address this by providing a very limited set of special rules to express elementary computations on numbers. A different approach is to be open, and allow the specification of arbitrary computations of affix values of arbitrarily complex data types. This specification must be done in some external language. This leads to attribute grammars and CDL2.

Log in or register to write something here or to contact authors.