1. A company that makes nifty audio effects, like the Vortex.

2. Also, in the context of knowledge bases, a lexicon is a component that maps between words and their underlying meanings.

For example, if a knowledge base with a lexicon contained the concepts STAGE-MAGIC (meaning magic tricks performed for entertainment) and SPIRITUAL-MAGIC (meaning acts in which a practitioner actually (ahem) alters reality by force of will), the lexicon would associate the word "magic" with both of those concepts.
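As a rough sketch of the example above, a lexicon of this kind can be modelled as a two-way mapping between words and concept identifiers. The class name `Lexicon`, its method names, and the extra word "conjuring" are assumptions made for illustration; nothing here is taken from a particular knowledge-base system.

```python
from collections import defaultdict

# Concept identifiers from the example above.
STAGE_MAGIC = "STAGE-MAGIC"
SPIRITUAL_MAGIC = "SPIRITUAL-MAGIC"


class Lexicon:
    """Hypothetical word <-> concept mapping for a knowledge base."""

    def __init__(self):
        self._word_to_concepts = defaultdict(set)
        self._concept_to_words = defaultdict(set)

    def add(self, word, concept):
        # Words are stored lower-cased so lookup is case-insensitive.
        key = word.lower()
        self._word_to_concepts[key].add(concept)
        self._concept_to_words[concept].add(key)

    def concepts_for(self, word):
        # Return a copy so callers cannot mutate the internal sets.
        return set(self._word_to_concepts.get(word.lower(), set()))

    def words_for(self, concept):
        return set(self._concept_to_words.get(concept, set()))


lexicon = Lexicon()
lexicon.add("magic", STAGE_MAGIC)
lexicon.add("magic", SPIRITUAL_MAGIC)
lexicon.add("conjuring", STAGE_MAGIC)  # illustrative extra entry

print(sorted(lexicon.concepts_for("magic")))
# ['SPIRITUAL-MAGIC', 'STAGE-MAGIC']
```

The ambiguous word "magic" comes back with both concepts, and it is then up to some other component to decide which one is meant in context.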

In linguistics the lexicon is the mental repository of what may informally be called words: the arbitrary links between sound and meaning. These are different for each language, and each speaker of a language internalizes their own list of words they know and how they understand their meanings and use.

The lexicon is part of what is stored in long-term memory. Some linguistic properties are predictable, and need not be stored individually: for example, regular plurals and past tenses can be generated when needed from other elements. The elements and at least some of the rules differ from language to language, so these need to be stored too. The lexicon normally refers to stored static elements, not to rules of derivation.

It also refers to the linguistic aspect of knowledge, not to a more general encyclopaedic knowledge. We know a vast number of things about cats, but those pertinent linguistically are that the word cat is pronounced [kæt], that it is a count noun, that it has a regular plural, and a small amount of semantics: that a cat is animate but non-human implies the grammatical fact that we can refer to one with the pronoun it, or optionally she or he. Sentences are produced and understood using only this lexical information. A great many more inferences are possible at the level of pragmatics, outside the strictly linguistic domain, using our general knowledge of cats.


So instead of the informal 'words' we need to specify more precisely what kinds of things appear in the lexicon. There are several abstract terms for members of the lexicon: lexical item, or lexeme, or listeme, possibly with differences between them that don't concern me at this level of discussion. Then we need to say what properties are stored at the linguistic level and how they are connected.

First there need to be elements smaller than what are normally thought of as words. Morphemes build up into words. Derivational morphemes are those like un- and -able in undecipherable, and inflectional morphemes are the grammatically active ones like those in walks and walked. Derivational morphology is only partly predictable, and we get patterns like impress ~ impression shared by a large number of words, sometimes in more altered forms like describe ~ description, but where the final result is still a word that needs to be remembered in the lexicon. In contrast, inflectional morphology is usually freely predictable, so the lexicon (like a printed dictionary) doesn't need to store all the regular forms, though it does need to have entries for irregular forms like sang.
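The division of labour between stored irregular forms and rule-generated regular forms can be sketched in a few lines. This is only a toy working on spelling, not a claim about how the mental lexicon is implemented; the table `IRREGULAR_PAST` and the function `past_tense` are hypothetical names, and spelling adjustments such as consonant doubling are deliberately left out.

```python
# Only the unpredictable forms are stored, as in a printed dictionary.
IRREGULAR_PAST = {
    "sing": "sang",
    "go": "went",
}


def past_tense(verb):
    """Return a stored irregular form if there is one, else apply the rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Regular rule, simplified: add -d after a final e, otherwise -ed.
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


print(past_tense("walk"))      # walked
print(past_tense("sing"))      # sang
print(past_tense("describe"))  # described
```

The stored table stays small because walked and described never need an entry of their own; only sang and went do.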

On the other hand, some entities larger than a traditional word also need to be stored. While compounding is sometimes transparent, so that we know what a wild horse is just from the two components, there are huge numbers of expressions that we do need to remember, such as gift horse and wild rice. Ray Jackendoff has proposed theories of semantics and the lexicon in which the lexicon is supposed to include all linguistic expressions held in long-term memory, regardless of size. This would include not just idioms like get over it but those that are of or approach full sentence size, such as get (your) knickers in a twist and the shit hits the fan, which behave in some ways like fully inflected grammatical constructions, but tend to have strange syntactic properties such as not permitting the usual range of grammatical variation: you can't normally say *the fan was hit by the shit.

Jackendoff would also include memorized quotations in the lexicon. This is getting perilously far away from the instantaneous-access component of the modularity hypothesis for language, and is straying into slow-access encyclopaedic information. It might depend on the quotation. Intuitively there seems to me to be a gap between the linguistic fragments 'to the manner born' and 'more honoured in the breach than the observance', which I can access and use pretty much instantly, and the matrix quotation that includes them both. Even though I know it well, I can't recall or use it in ordinary speech, or make puns or other variations on it, anywhere near as easily as its two component phrases. So I think there might be a limit to how fully any textual memory is integrated into the memory part of the language module.


A lexical item has a sound and a meaning. These are present at the interface to the phonetic and logical apparatus. Between them mediates the syntax, converting phonetic forms to logical forms and conversely by constructing syntactic representations that differ from language to language. Items must have some syntactic information in them: the part of speech, subcategorization properties (such as think requiring an animate subject and a propositional object), and irregular or suppletive allomorphy, at least.

The traditional picture of how these get into sentences has been called 'syntactocentric': a deep structure is composed of choices from the lexicon, then operations such as transformation or movement relate this structure to logical and phonetic structures, but without (during the syntactic phase) using any logical or phonetic features. This has been compared to each word carrying two locked suitcases through its syntactic processing.

Jackendoff's alternative is that each of semantics, phonology, and syntax is a generative system, and they work in parallel, connected across the lexicon. A lexical item is a correspondence between elements of the three systems.
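One way to picture a lexical item as a correspondence between the three systems is a record with one slot per system. The sketch below is an illustration under stated assumptions, not Jackendoff's formalism: the class names, the feature labels, the capitalised meaning labels, and the use of `None` for an absent component are all invented for the example, and the transcription of hello is only approximate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Syntax:
    part_of_speech: str
    features: frozenset = frozenset()


@dataclass(frozen=True)
class LexicalItem:
    """A link between a sound, a syntactic category, and a meaning."""
    phonology: Optional[str]   # phonological form
    syntax: Optional[Syntax]   # part of speech and syntactic features
    semantics: Optional[str]   # stand-in label for the meaning

    def missing(self):
        # Names of the systems this item has no component in.
        names = ("phonology", "syntax", "semantics")
        return [n for n in names if getattr(self, n) is None]


cat = LexicalItem(
    phonology="kæt",
    syntax=Syntax("noun", frozenset({"count", "animate", "non-human"})),
    semantics="CAT",
)

# An item with sound and meaning but no syntax (hypothetical entry).
hello = LexicalItem(phonology="həˈloʊ", syntax=None, semantics="GREETING")

print(cat.missing())    # []
print(hello.missing())  # ['syntax']
```

Keeping the three slots separate mirrors the idea that each system has its own structure, with the lexical item doing no more than linking them.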

The word /kæt/ in English will adhere to the language's constraints on nuclei and codas of the syllable, will receive stress and aspiration, and will engage in voice harmony with the plural morpheme, in one system, that of phonology. This is almost completely independent of the syntactic fact that it is an animate non-human count noun, and of whatever genuinely semantic information needs to be associated with it within the language module. However, larger-scale phonological processes such as what intonation to give it in a question do seem to depend on knowledge of the syntactic structure.

A morpheme like the English regular plural /z/ ~ /s/ ~ /əz/ belongs in the lexicon. It is common to treat the plural cats as lexical, that is one of the elements that is inserted into the beginning of the syntactic derivation, but another possibility is to treat this part of morphology as a part of the syntax, the morphosyntax, created by a kind of generative process. On the other hand some would say that morphosyntax is 'in the lexicon', and that therefore the lexicon is itself generative. Related but different considerations apply to common and perhaps lexicalized formations like catlike, to parallel nonce-formations like caterpillar-like, and to semi-productive patterns of formation like impress ~ impression ~ impressionable, where some of the semantics needs to be remembered.
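The choice between the three shapes of the regular plural is predictable from the last sound of the stem, which is why only the morpheme itself needs storing. A minimal sketch follows, assuming words are given as lists of phoneme symbols; the function names and the segment inventories are illustrative and not exhaustive.

```python
# Stems ending in a sibilant take /əz/; other voiceless endings take /s/;
# everything else (voiced consonants and vowels) takes /z/.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_segment):
    if final_segment in SIBILANTS:
        return "əz"
    if final_segment in VOICELESS:
        return "s"
    return "z"


def pluralize(segments):
    """Append the appropriate plural allomorph to a list of segments."""
    return segments + [plural_allomorph(segments[-1])]


print("".join(pluralize(["k", "æ", "t"])))   # kæts
print("".join(pluralize(["d", "ɒ", "g"])))   # dɒgz
print("".join(pluralize(["h", "ɔː", "s"])))  # hɔːsəz
```

The sibilant check has to come first, since /s/ and /ʃ/ are themselves voiceless and would otherwise wrongly receive /s/.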

Anomalous contents

If the lexicon is a kind of memory, anything produced on the spot, like caterpillars or caterpillar-like, should not be counted as being in it. Chomsky's theoretical position is that nothing redundant should be in it: only what cannot be predicted. But in fact it is quite possible that early acquisition of regular forms could be a kind of memorization, so that lexicalized hands or walked are not displaced by subsequent inference of the morphemes generated from them and the newly-internalized rules for creating them.

There could be lexical items that are defective in some one of the three kinds of feature: hello and gosh having no syntax, and purely syntactic agreement and case marking having no semantics: indeed, theoretical syntacticians do posit a feature [-interpretable] on some grammatical markers, requiring them to be checked off and stripped out at some point in the syntax, after they've given rise to their phonetic form but before they reach interpretation of their logical form. They also posit elements having no phonetics, so-called empty categories, to give more consistent syntactic structures where there is no overt word. These too would have distinct properties noted in the lexicon.

Lexicon is a writing exercise for several writers. The goal of the exercise is to produce a hypertext in the form of an encyclopedic dictionary. The rules of this exercise were formulated by Neel Krishnaswami and published in The 20’ by 20’ Room.1 He cites The Dictionary of the Khazars as a literary model, although it is the work of a single author. The Encyclopedia of Arda provides a second model, even though it purports to be created from the study of a large body of extant text. And the Encyclopedia Frobozzica may be taken as a comic extreme. Naturally, the shade of Borges and his short story Tlön, Uqbar, Orbis Tertius gazes benevolently over such an exercise.

Essentially, the rules exist to aid the collaborative production of the hypertext. Each submission must include some number of citations of other articles. Since some of these citations might not (yet) exist as articles in the lexicon, this allows each participant to propose further directions of inquiry into the topic of the exercise. The actual production of submissions is aided by establishing deadlines; one submission from each participant every 3-7 days seems to be a decent pace. An alphabetical progression is applied so that the entries will be distributed evenly throughout the alphabet (although one can imagine cases in which this is not necessarily desired).
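A moderator who wants to see which cited entries are still waiting to be written could track them with something as small as the sketch below. It assumes the game is recorded as a mapping from each written article's title to the titles it cites; the function name and the sample titles are invented for illustration and are not part of the published rules.

```python
def unwritten_entries(articles):
    """Return the cited titles that no participant has written yet.

    `articles` maps each written title to the list of titles it cites.
    """
    cited = {title for cites in articles.values() for title in cites}
    return sorted(cited - set(articles))


# Hypothetical state of a game after the first two rounds.
articles = {
    "Amber Road": ["Bell of Kesh", "Cartographers' Guild"],
    "Bell of Kesh": ["Amber Road", "Drowned Library"],
}

print(unwritten_entries(articles))
# ["Cartographers' Guild", 'Drowned Library']
```

The list that comes back is effectively the pool of directions other participants have proposed but nobody has yet taken up.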

Neel Krishnaswami states that he originally formed this exercise to sketch out the background of an RPG he was running. Lexicon has also been used to develop illustration text for the new (forthcoming) edition of the Paranoia RPG.2

1. The 20’ by 20’ Room is a blog about the activity of role-playing gaming. The article that discusses Lexicon appeared there in November 2003. Neel Krishnaswami uses a wiki to moderate a game of the Nobilis RPG titled “Lower than Angels”. <www.20by20room.com/2003/11/lexicon_an_rpg.html>
2. The Toothpaste Disaster was a Lexicon game moderated by Allen Varney, one of the authors working on Mongoose Publishing’s forthcoming edition of the Paranoia RPG, titled Paranoia XP. <paranoia.allenvarney.com/index.cgi/ParanoiaLexicon>

Lex"i*con, n. [Gr. λεξικόν (sc. βιβλίον), neut. of λεξικός of or belonging to words, fr. λέξις a speaking, speech, a way of speaking, a single word or phrase, fr. λέγειν to say, to speak. See Legend.]

A vocabulary, or book containing an alphabetical arrangement of the words in a language or of a considerable number of them, with the definition of each; a dictionary; especially, a dictionary of the Greek, Hebrew, or Latin language.

<-- also, a dictionary for use in computational linguistics -->


© Webster 1913.
