In the theory of generative grammar, the surface structure of a sentence is a mental representation that contains all the elements of the spoken sentence in the order in which they are spoken.

In the version of transformational grammar that Noam Chomsky set out in Aspects of the Theory of Syntax (1965), a sentence has two internal levels of representation: deep structure and surface structure. The deep structure represents the semantics and draws words from the lexicon; a series of transformations, or rules of syntax, then converts it into the surface structure. It is the surface structure that the phonological rules of the language convert directly into phonetic form.

For example, the semantic fact of a dog biting a man might be represented at a deep level by three elements, DOG, BITE, and MAN, together with a logical nesting showing which is the verb and which are the subject and object. At the spoken level the two nouns need a determiner such as 'the', and the verb has to get a tense ('bites' or 'bit', or perhaps a compound like 'will have bitten'). To deny the fact, you add 'not', which calls for inserting the auxiliary verb 'do'; the tense marking then goes on 'do' rather than on 'bite' ('did not bite', not 'do not bit'). To question the fact, you move something to the front of the sentence: either 'Did' or a word forming a wh-question. The end result is a sentence like "Who didn't the dog bite?", which can be translated into phonetic values and articulated.
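
The derivation just described can be sketched as a toy program. This is only an illustration under heavy assumptions, not a formalization of any published grammar: the names DeepStructure, PAST, sentence, and derive are hypothetical, the lexicon holds only the two verbs used in the example, tense is fixed to the past, every noun takes 'the', and a questioned object is marked with the placeholder "WH".

```python
from dataclasses import dataclass


@dataclass
class DeepStructure:
    """Toy deep structure: a verb plus its subject and object (hypothetical)."""
    verb: str
    subject: str
    obj: str  # use "WH" to mark a questioned object


# Tiny illustrative lexicon of past-tense forms (an assumption, not a real resource).
PAST = {"bite": "bit", "do": "did"}


def sentence(words, end):
    """Join words, capitalize the first letter, and add final punctuation."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + end


def derive(deep, negated=False, question=False):
    """Apply toy transformations to reach a past-tense surface string."""
    subject = ["the", deep.subject.lower()]  # nouns get a determiner
    verb = deep.verb.lower()
    wh = deep.obj == "WH"
    obj = ["who"] if wh else ["the", deep.obj.lower()]
    question = question or wh  # a questioned object forces a question

    if not (negated or question):
        # Plain declarative: tense lands on the main verb.
        return sentence(subject + [PAST[verb]] + obj, ".")

    if question:
        # Do-support plus fronting of the tensed auxiliary; the verb stays bare.
        aux = [PAST["do"] + "n't"] if negated else [PAST["do"]]
        words = aux + subject + [verb]
        # A wh-word moves to the very front; otherwise the object stays in place.
        words = obj + words if wh else words + obj
        return sentence(words, "?")

    # Negated declarative: 'do' carries the tense, 'not' follows it.
    return sentence(subject + [PAST["do"], "not", verb] + obj, ".")


print(derive(DeepStructure("BITE", "DOG", "MAN")))
print(derive(DeepStructure("BITE", "DOG", "MAN"), negated=True))
print(derive(DeepStructure("BITE", "DOG", "WH"), negated=True))
```

The three calls print "The dog bit the man.", "The dog did not bite the man.", and "Who didn't the dog bite?". The point of the sketch is only that the same three deep elements yield different surface strings depending on which transformations apply.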

As his theories developed in the 1970s, Chomsky dropped the condition that meaning was represented exclusively in deep structure, and he also became unhappy with the unwanted connotations of the words 'deep' and 'surface'. As the theory became more general, he adopted a more abstract notation, calling the levels D-Structure and S-Structure.

In the Minimalist Program, which he has pursued since the 1990s, Chomsky finds no need for an explicit D-Structure or S-Structure. A selection of words from the lexicon undergoes whatever operations are necessary to satisfy minimal constraints. For example, the word 'bite' has two θ-roles that must be filled: the subject must be animate, and the object must at least be concrete. Money can't bite anything, and dogs can't bite happiness. The lexical choice drives the minimal operations needed to satisfy these constraints. The operations must converge on a Phonetic Form (PF), a structure in which every element is pronounceable, and also on a Logical Form (LF), the interface with non-linguistic mental operations such as hopes, fears, decisions, understanding, deduction, and will.
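
The constraint check on 'bite' can be pictured with a small sketch. It follows only the informal description above: the feature sets, the THETA_GRID table, and the function satisfies_theta_grid are hypothetical names chosen for illustration, and treating money as concrete but not animate is an assumption.

```python
# Hypothetical semantic features for a handful of nouns (illustrative only).
FEATURES = {
    "dog": {"animate", "concrete"},
    "man": {"animate", "concrete"},
    "money": {"concrete"},   # assumed concrete but not animate
    "happiness": set(),      # abstract: neither animate nor concrete
}

# Hypothetical table: the feature each role of a verb requires.
THETA_GRID = {
    "bite": {"subject": "animate", "object": "concrete"},
}


def satisfies_theta_grid(verb, subject, obj):
    """Return True if both roles of the verb are filled by suitable nouns."""
    grid = THETA_GRID[verb]
    return (
        grid["subject"] in FEATURES[subject]
        and grid["object"] in FEATURES[obj]
    )


print(satisfies_theta_grid("bite", "dog", "man"))        # dog bites man
print(satisfies_theta_grid("bite", "money", "man"))      # money cannot bite
print(satisfies_theta_grid("bite", "dog", "happiness"))  # happiness cannot be bitten
```

Only the first combination passes; the other two fail on the subject and the object respectively, mirroring the two counterexamples in the text.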

The point at which the two derivations part company, with phonological features preserved on the way to PF and semantic properties preserved on the way to LF, corresponds to the level of description formerly regarded as S-Structure. But in a logically minimalist sense, while there must be a PF and an LF, and there must be some elementary operations such as Move and Merge, there is no requirement for a single, identifiable point at which the derivations split. The splitting point for one sentence might be determined by something like a principle of least action operating on its components, and need not bear any particular relation to how the components of a different sentence organize themselves. Chomsky now calls this splitting point Spellout: it is where phonological information is stripped out of the branch of the derivation heading towards LF.

