CDL2 (version 2 of CDL
, a Compiler Description Language
) is a programming language
of the type known as compiler compiler
s. More specifically, CDL2 is a type of affix grammar
. It was developed in the late 70s
by a team at universities of Berlin
, and it was used to write compilers, mostly in education.
CDL2 programs consist of modules that define sets of procedures, in the same way as Modula 2 modules or
.c/.h pairs of files in C.
Procedure definitions consist of a header, specifying the formal arguments and possibly some local parameter declarations; and a body, which consists of statements connected by the , connective (conditional composition), or the ; connective (alternation).
Every statement either succeeds or fails. The , connective is the && of Unix shells: it sequentially executes until a statement fails, in which case it fails, and succeeds otherwise. The ; connective is its ||, acting as else if.
Identifiers, that is, names for procedures and parameters, consist of lowercase characters and blanks. The + sign separates identifiers in procedure headers and calls. In the headers, the > sign is used to mark each parameter as input, output, or both, as an aid to the compiler. Output parameters are read-only: their value cannot be modified within the body. The value of input parameters must be used somewhere within the body. Marking this allows the compiler to check it recursively across procedure calls. Compare the const keyword in ANSI C and C++.
A macro is a string that represents a piece of code in the target lanuage, the language the CDL 2 compiler compiles to; usually, the target language is a particular assembly language. Occurrences of parameters within the macro are translated to memory locations or registers.
A CDL 2 statement is either a special one-character statement +, -, or *,
or a call, which consists of the name of a procedure plus zero or more arguments.
A procedure can be defined either as a macro, which is a piece of code in the target language, i.e. the language we want to translate the CDL2 program into (e.g. assembly language), or a body consisting of statements, combined with the , and ; operators. Eventually, a procedure always expands to code in the target language, due to the macros it contains, directly or indirectly.
The target language is just arbitrary text, as far as CDL2 is concerned; it would be obviously useful to generate Java, C, LaTeX, HTML, or something similar, but you can also generate PNM, CDL2, or lines of Shakespeare, if you like.
CDL2 was written for programs that parse an input language. In such cases,
a library of macros is used that parse input, and the procedures in the program can be divided into two kinds: the ones that, when executed, cause some of the input to be parsed, and correspond to certain constructs or components within the input; and the auxiliary ones, of which the execution does not cause any parsing.
In particular, CLD2 was designed to facilitate the writing of programs in which the input is the specification of intended program behavior, while the result (the CDL2 program translated into he target language) is a program that implements that behavior. Such programs are called compilers; hence CDL2's name.
Every procedure definition must be marked as either predicate, test, action, or function. These four labels encode two binary properties:
whether the procedure can fail, and whether it can have have side effects.
- a PREDICATE can have ("side") effect; can fail
- an ACTION can have ("side") effect; never fails
- a TEST never has a side effect; can fail
- a FUNCTION never has a side effect; never fails
For a macro, the CDL2 compiler cannot help you assign these labels, since it does not know anything about the target language, but for a procedure definition consisting of statements, it can derive the correct label by itself and hence warn you if you apply the wrong one.
The + statement always succeeds: it is the /bin/true of CDL2, and actually, superfluous wherever it appears.
The - statement always fails.
The * statement means "repeat": it is a shorthand for a recursive procedure call with the original parameters.
There are two well-formedness rules for procedure bodies in CDL 2: code with effects must not fail and left recursion is forbidden.
The first rule (known as no defects) says that once we have actually done something, we can't just fail later on and pretend nothing happened. An example of a defect:
is aggravating + p,
fetch cake + c,
land cake into face of + c + p,
becomes angry + p;
shake hands with + p.
is aggravating, we fetch a cake, land it into
's face, and see if
becomes angry, until any of this should fail, in which case we shake hands with
The steps of fetching cake and throwing it clearly have effects, but they can also fail, so they will both be marked as predicates. Assuming that
does not have effects (i.e. is a test), it is not a defect for
to fail; but it is a defect for
land cake into face of
, since the previous statement has an effect.
The CDL2 compiler would notice the defect and issue a warning.
A defect represents loss of information: after executing this code fragment, we don't know what happened to the cake. Programming without defects, we know that we never have unfinished work that should really be cleaned up. It has the advantages of functional programming without forbidding side effects altogether.
The no defects rule allows CDL2 to be implemented with an execution stack: a procedure call pushes a frame on the stack to hold its local parameter values, and upon failure the frame can simply be popped off, leaving a clean state to continue from; we never need to undo an effect. This makes execution deterministic.
The second rule says that all statements that call, directly or indirectly, the procedure being defined, must always call something else first. In addition, the call must not use the same unmodified parameters. Actually, I don't remember the exact restriction - please /msg me if you do. This allows a more efficient implementation of recursion.
Considering the syntactic limitations, particularly on recursion, it's surprising how natural CDL 2 programming turns out to be in practice. The conditional combination programming style exists in other languages: Prolog uses it (but in a nondeterministic way); the conditional and and sequential or constructs in Unix shells and Perl allow this style, but without preventing failure after (side) effects have occurred.
The compiler can statically check this for all code except macro definitions: it does not know anything about the target language, so it accepts the programmer's classication of macros at face value.
CDL2 also has blocks (substructures within procedure bodies) and global variables, but they can be considered syntactic sugar. There are no data types: all real data manipulation must be provided by macros.
Here's one way to define multiplication in CDL2.
Note the 'diagonal' indentation style; it was the one most commonly used. All whitespace is optional, actually, a feature inherited from Algol.
ACTION mul +>x +>y +z> -xmin -zmin:
iszero + x,
setzero + z;
iszero + y,
setzero + z;
decr + x + xmin,
mul + xmin + y + zmin,
add + x + zmin + z.
CDL2 lives on in its successor, CDL3