Semantics is a subfield of computer science that studies ways of fixing a precise mathematical meaning to programming languages. This has obvious applications in software engineering, compiler construction, and programming language design.

While the syntax of mainstream programming languages has been described formally ever since the Algol 60 report in the early 1960s (by grammars in so-called Backus-Naur form), their meaning is usually given as a detailed description in English. This is unsatisfactory because natural language seldom has enough precision. For example, the following clause from the C standard about sequence points is a classic:

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

It is easy to identify various problems with this. It is very difficult to understand what it is getting at unless you already understand the restriction it is trying to impose. The terms in it are also quite vague: is an object "modified" if it is assigned the value it already has, and how can you tell why its prior value was accessed? This is a problem not only when trying to apply formal methods to prove things about programs or languages, but also if you are just trying to understand the language in order to use or implement it.

Formal semantics tries to instead specify the meaning of programs using mathematical constructs. Three main applications for this are commonly stated:

  • The very process of formulating a formal specification forces a more precise understanding of the intended meaning and reveals subtle ambiguities that were not previously noticed. Standards written in English are notorious for containing such unexpected ambiguities, and standards committees routinely field "requests for interpretation" long after the standard's text has been finalised. By contrast, the authors of the ML standard later reported that the requirement to fix a precise mathematical meaning to each language construct often revealed disagreements even among the authors themselves.
  • By having a precise description of a language, it becomes possible to formalise and prove statements about it. The most typical result one might want to prove is type safety: if a program is well typed (which can be checked automatically by the compiler), then it will not exhibit certain bad kinds of behaviour, such as misinterpreting a byte array as a floating-point number or writing to unallocated memory. Other kinds of properties are also becoming the subject of much research, for example proving that the security features a language provides actually prevent malicious programs from doing bad things.
  • Finally, a formal specification of the language is of course necessary to prove that a given program satisfies some specification. This is the most ambitious application. When working with realistically large programs, it is impractical to derive results from the 'first principles' semantics of the programming language. Rather, the formal verification effort requires constructing a hierarchy of layers of abstraction that meet in the middle. From the top, one tries to convert the informal requirements into a mathematical specification in a language like Z. At the same time, working bottom up, one tries to derive more tractable proof rules, for example derivation rules for Floyd-Hoare triples in procedural languages, or fusion laws in functional languages.
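The type-safety property mentioned above can be illustrated on a toy language. The sketch below is purely illustrative (the language, names, and rules are invented, not taken from any real standard): a typechecker approves expressions, and the claim is that approved expressions never get "stuck" during evaluation.

```rust
// Toy language of integers and booleans (invented for illustration).
#[derive(PartialEq, Debug, Clone, Copy)]
enum Ty { Int, Bool }

enum Exp {
    Num(i64),
    Flag(bool),
    Add(Box<Exp>, Box<Exp>),
    If(Box<Exp>, Box<Exp>, Box<Exp>),
}
use Exp::*;

// Type checking: returns Some(t) iff the expression is well typed.
fn typecheck(e: &Exp) -> Option<Ty> {
    match e {
        Num(_) => Some(Ty::Int),
        Flag(_) => Some(Ty::Bool),
        Add(a, b) => match (typecheck(a)?, typecheck(b)?) {
            (Ty::Int, Ty::Int) => Some(Ty::Int),
            _ => None,
        },
        If(c, t, f) => {
            let (tc, tt, tf) = (typecheck(c)?, typecheck(t)?, typecheck(f)?);
            if tc == Ty::Bool && tt == tf { Some(tt) } else { None }
        }
    }
}

#[derive(PartialEq, Debug)]
enum Val { VInt(i64), VBool(bool) }
use Val::*;

// Evaluation returns None when it gets "stuck" (e.g. adding a bool);
// type safety says this never happens for well-typed expressions.
fn eval(e: &Exp) -> Option<Val> {
    match e {
        Num(n) => Some(VInt(*n)),
        Flag(b) => Some(VBool(*b)),
        Add(a, b) => match (eval(a)?, eval(b)?) {
            (VInt(x), VInt(y)) => Some(VInt(x + y)),
            _ => None, // stuck
        },
        If(c, t, f) => match eval(c)? {
            VBool(true) => eval(t),
            VBool(false) => eval(f),
            _ => None, // stuck
        },
    }
}

fn main() {
    let good = If(Box::new(Flag(true)),
                  Box::new(Add(Box::new(Num(2)), Box::new(Num(2)))),
                  Box::new(Num(0)));
    let bad = Add(Box::new(Num(1)), Box::new(Flag(true)));
    assert_eq!(typecheck(&good), Some(Ty::Int));
    assert_eq!(eval(&good), Some(VInt(4)));
    assert_eq!(typecheck(&bad), None); // rejected before it can go wrong
}
```

The actual theorem ("progress and preservation") quantifies over all well-typed expressions and is proved by induction over the typing rules; the code only exhibits the two judgements involved.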

The question of exactly how one should approach the problem of assigning meaning to programs is far from settled. Programs are slippery things, and reasoning about them has an unfortunate tendency to degenerate into an amorphous pile of case-by-case checks. (It is not a coincidence that so much research into automated theorem proving targets software verification!) So far, a few different styles have emerged, each with its strengths and weaknesses, but there still seems to be ample scope for new ideas in the field. The main strands of research so far are:

  • Operational semantics. The name derives from the idea that the meaning of a program is described by the operation of an abstract machine that executes it. This is done by specifying a reduction relation that maps one state of the machine to the next, and then considering the transitive closure of that relation, in the same way that classical automata theory treats finite state machines and Turing machines. The tendency in semantics, however, is to avoid machine-inspired constructions and instead work directly on fragments of the abstract syntax tree. One familiar example from classical computer science is the semantics of the lambda calculus, which is defined in terms of the beta-reduction relation.

    Operational semantics is probably the most widely used approach at the moment. It is very "user-friendly" - if a programming language can be implemented on a real machine, chances are that you can also write down a specification in terms of an abstract machine. Various real-world languages (like Java and even C) have been given post hoc formal semantics in this way. It is also easy to write down rules describing things like parallel computation and nondeterminism. Operational semantics is useful for compiler implementors - it is easy to understand what the rules require, and one can "work down" through a progression of more and more concrete abstract machines to move from specification to implementation. Operational semantics is also popular in type theory, which tends to use purely syntactic constructions and proof techniques anyway.
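    The two ingredients - a one-step reduction relation on syntax trees, and its transitive closure - can be sketched directly. The toy language and rule names below are invented for illustration; a real small-step semantics would be written as inference rules on paper.

    ```rust
    // Small-step operational semantics for a toy arithmetic language
    // (invented for illustration): `step` is the reduction relation,
    // `eval` is its transitive closure.

    #[derive(Clone, Debug, PartialEq)]
    enum Exp {
        Num(i64),
        Add(Box<Exp>, Box<Exp>),
    }
    use Exp::*;

    // One step of reduction; None means the expression is already
    // a value (a normal form).
    fn step(e: &Exp) -> Option<Exp> {
        match e {
            Num(_) => None,
            Add(a, b) => match (a.as_ref(), b.as_ref()) {
                // Both operands are values: apply the addition rule.
                (Num(x), Num(y)) => Some(Num(x + y)),
                // Left operand is a value: reduce inside the right one.
                (Num(_), _) => step(b).map(|b2| Add(a.clone(), Box::new(b2))),
                // Otherwise: reduce inside the left operand.
                _ => step(a).map(|a2| Add(Box::new(a2), b.clone())),
            },
        }
    }

    // Transitive (reflexive) closure of `step`: reduce to normal form.
    fn eval(mut e: Exp) -> Exp {
        while let Some(next) = step(&e) {
            e = next;
        }
        e
    }

    fn main() {
        // 2+(1+1)  -->  2+2  -->  4
        let e = Add(Box::new(Num(2)),
                    Box::new(Add(Box::new(Num(1)), Box::new(Num(1)))));
        assert_eq!(eval(e), Num(4));
    }
    ```

    The second and third match arms play the role of the "structural" or congruence rules that fix the evaluation order; the first arm is the only rule that does actual computation.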

  • Denotational semantics is the main rival. Rather than constructing a relation between fragments of syntax trees, denotational semantics maps each fragment to some mathematical object which describes its meaning - the denotation. For example, the ML function definition fn (x:int) => x*x is naturally mapped to the function f : ℤ → ℤ defined by f(x) = x². The denotation of a fragment typically does not change during evaluation: 2+(1+1), 2+2 and 4 all denote the integer 4. For this reason, it is sometimes said that denotational semantics models the invariants of evaluation.

    The crux of denotational semantics is finding a sufficiently rich class of denotations to model all programming language features. Elementary set theory is enough to model a simple language where the only datatype is integers and the only interesting control structure is while loops. Since the 1970s, it has been known how to use the theory of complete partial orders (CPOs) to model programs containing recursive functions and recursive datatypes, e.g. ML-style languages. In the 1980s a formalism known as event structures was developed to deal with concurrency and nondeterminism, while probabilistic algorithms can be handled by having the denotations of programs be probability distributions. The CPO approach is not considered completely satisfactory because it is a bit too rich: there are elements in the semantic domain which cannot arise as denotations of programs, and these get in the way. Other approaches, like game semantics, try to avoid this and achieve "full abstraction".

    Denotational semantics gives powerful tools for reasoning about the behaviour of programs. In particular, since it talks about what a fragment means rather than how it is evaluated, it makes it easy to show that two fragments are contextually equivalent, i.e. can be used interchangeably. This is of interest, for example, when proving the correctness of the transformations an optimising compiler carries out. Denotational semantics also gives a nice basis for deriving proof rules: for example, so-called Scott induction is a trivial result given how CPO-style denotational semantics is set up, and it is also very amenable to implementation in an automatic theorem prover. In general, denotational semantics gives a more abstract view of programs than operational semantics does. Typically, one defines an operational semantics as well, and proves the equivalence of the two.
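    The contrast with the operational style can be made concrete. In the sketch below (same invented toy language as above), the denotation function maps each syntax fragment compositionally to a mathematical object - here simply an integer - with no notion of evaluation steps at all, so "invariance under evaluation" is immediate.

    ```rust
    // Denotational semantics for a toy arithmetic language (invented
    // for illustration): each fragment is mapped to its denotation,
    // an ordinary integer, compositionally.

    enum Exp {
        Num(i64),
        Add(Box<Exp>, Box<Exp>),
    }
    use Exp::*;

    fn denote(e: &Exp) -> i64 {
        match e {
            Num(n) => *n,
            // Compositionality: [[a + b]] = [[a]] + [[b]].
            Add(a, b) => denote(a) + denote(b),
        }
    }

    fn main() {
        // 2+(1+1), 2+2 and 4 are different syntax trees with the
        // same denotation: the integer 4.
        let e1 = Add(Box::new(Num(2)),
                     Box::new(Add(Box::new(Num(1)), Box::new(Num(1)))));
        let e2 = Add(Box::new(Num(2)), Box::new(Num(2)));
        let e3 = Num(4);
        assert_eq!(denote(&e1), 4);
        assert_eq!(denote(&e1), denote(&e2));
        assert_eq!(denote(&e2), denote(&e3));
    }
    ```

    For a language this trivial, the denotations can live in plain set theory; the CPO machinery mentioned above only becomes necessary once recursion and nontermination enter the picture.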

  • Axiomatic semantics takes yet another approach. Rather than first specifying the meaning of programs and then deriving proof rules for reasoning about them, here one specifies a set of proof rules in (typically) Floyd-Hoare logic, and then simply requires that an implementation of the language make these axioms valid. This was considered a promising strategy in the 1980s, and e.g. the Pascal-like language Euclid was specified this way. Clearly, setting things up this way round makes for less work in establishing proof rules. Less obviously, this style of specification is claimed to shift the focus of the development process towards a more formal approach: features that are difficult to prove theorems about, like complicated control structures, simply have no defined meaning at all, so one is in effect forced to program in a tractable style.

    However, nowadays proof rules are not considered a practical approach to semantics (even though they are of course crucial during program development), for a mixture of conceptual and practical reasons. Conceptually, taking provability as the basic notion of meaning meets the same problems of undecidability that logicians discovered in the early 20th century when trying to equate provability with mathematical truth. As soon as the language grows even moderately complex, it becomes possible, for any given proof system, to construct statements about programs (Hoare triples) that are true under the operational or denotational semantics but cannot be proven in that system. Since the operational semantics formalises our intuitive notions about how programs behave, it seems very unsatisfactory to declare that the "official" semantics of the language is given by some particular proof system. On a practical level, proof rules for language features like for-loops or procedures quickly become extremely hairy and difficult to understand or use, and a set of rules covering all the dark corners of a language would be very complex indeed. It is much nicer to keep a separate specification of the language, and use it to derive specialised proof rules as the need arises.
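    The flavour of Hoare-style reasoning can be sketched semantically. Below, in an invented miniature setting where the state is just the value of one variable x, assertions are predicates on states and `wp` computes the weakest precondition of an assignment x := f(x) - the semantic reading of the Hoare assignment axiom {Q[e/x]} x := e {Q}. All names are illustrative, and checking sample states is of course only a sanity check, not a proof.

    ```rust
    // Semantic sketch of Floyd-Hoare triples for a single assignment
    // (names invented for illustration).

    type State = i64;                    // the value of the variable x
    type Assertion = fn(State) -> bool;  // predicates on states

    // Weakest precondition of x := f(x) w.r.t. postcondition q:
    // wp(x := f(x), Q) = Q ∘ f, so the triple {wp} x := f(x) {Q} holds.
    fn wp(f: fn(State) -> State, q: Assertion) -> impl Fn(State) -> bool {
        move |s| q(f(s))
    }

    fn main() {
        fn inc(s: State) -> State { s + 1 }    // the command x := x + 1
        fn post(s: State) -> bool { s <= 10 }  // postcondition: x <= 10
        let pre = wp(inc, post);               // weakest precondition: x <= 9

        // Sanity-check the triple {x <= 9} x := x + 1 {x <= 10} on a
        // range of states (a real proof quantifies over all states).
        for s in -5..=20 {
            assert_eq!(pre(s), s <= 9);
            if pre(s) {
                assert!(post(inc(s)));
            }
        }
    }
    ```

    The practical difficulties described above show up as soon as one moves beyond assignment: the rules for loops need invariants, and those for procedures and aliasing are where things become hairy.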

For a more in-depth treatment of the different styles, one standard reference is Glynn Winskel's textbook [1], which is a completely self-contained introduction to the field.

Doing formal semantics has so far been very much an academic enterprise; language standards that are used in industry tend to be written in natural language. Partly, this is probably because the conceptual tools have been lacking -- the publication of The Definition of Standard ML in 1990 was ground-breaking in that, for the first time, an entire real-world programming language had been formally specified, and even so, ML is a very "pure and clean" language. More recently, several real-world systems have been dealt with (see e.g. [2], which reports on operational semantics for C and UDP sockets), so from a purely technical viewpoint, things seem to be looking up for formal semantics. However, there is a certain barrier to entry in the mathematical background that must be picked up before one can read and develop formal specifications. Perhaps the big break will come when the current generation of computer science undergraduates, who were taught formal semantics at university, has permeated into industry...


  1. Glynn Winskel. The Formal Semantics of Programming Languages: An Introduction. MIT Press, 1993.
  2. Michael Norrish, Peter Sewell, and Keith Wansbrough. Rigour is good for you, and feasible: reflections on formal treatments of C and UDP sockets. Proceedings of the 10th ACM SIGOPS European Workshop (Saint-Emilion), 2002, pp. 49-53.
Also based on lecture courses given by Peter Sewell, Glynn Winskel, and Mike Gordon.