"There must surely be a further, different connexion between my talk and N, for otherwise I should still not have meant HIM."

Certainly such a connexion exists. Only not as you imagine it: namely by means of a mental mechanism.

- Ludwig Wittgenstein, Philosophical Investigations §689




Language of thought?
Noam Chomsky once proposed in linguistics that some universal grammatical structures might be echoed in human neural or genetic structures, and in a way the Language of Thought Hypothesis is an attempt to work out in philosophy some of the consequences of this idea.

Introduced by Jerry Fodor in his 1975 book The Language of Thought, the Language of Thought Hypothesis (LoTH) may be stated roughly as the claim that there are representational structures, somewhere in our brains, which realise or instantiate a symbolic language underlying thought (sometimes termed mentalese). This is intended to explain the nice regularities and causal consequences of thoughts, and our expressions of thoughts in language and actions. Very loosely speaking, it's a kind of type-type identity theory for the intentional parts of mind - the mental feature of 'aboutness' - though (in its orthodox form) it has nothing to say about the so-called "hard problem" of consciousness itself.

What's the problem?
The philosophical wrangles in which LoTH is involved are many and complex: representational theories of the mind, physicalism and eliminative materialism, theories of truth and meaning, reductionism, essentialism, connectionism ...

The hypothesis, as originally stated by Fodor, is at root intended to vindicate our everyday understanding of meaning and communication, by showing it's a good approximation to a description of real (physical) states of affairs in our brains.

Of course we take for granted that when Jean-Paul says "Le chat est mort" and when Bertrand says "the cat is dead", they mean the same thing. This idea of "meaning the same thing" is difficult to pin down consistently, however, and sometimes philosophers call such views (held without regard to their resistance to philosophical interrogation) a 'folk theory' of meaning and communication.

If we are to take it literally that they both mean the same thing, we could ask: "so where is this thing that they mean?" Ok, we can point to the cat: "there, that's what they mean." But suppose they had said "the cat's not here"? We can hardly point to the absence of the cat in the same straightforward way.

When we believe, hope, desire or fear that something is true it seems that we are in some state which relates particularly to a given proposition. This sort of state is sometimes characterised as a propositional attitude. Our willingness to accept as true certain kinds of explanations suggests that we believe in a complicated set of causal relations between propositions, propositional attitudes and human behaviour. The types of explanations concerned are all familiar enough:

"why did Johnny cool my writeup on frist node?"
"because he believed I would give him a million dollars if he did"

"why did johnny softlink the puppy node?"
"because he thought it was cool"

"why did X do Y?"
"because I told him that Z."

It's notoriously difficult to fully explicate the set of rules we do in fact use in evaluating explanations of this kind - though this doesn't bother us when we're actually dealing with them - but the set of rules may be called a 'folk psychology' (though we have other folk psychologies or areas of folk psychology dealing with different aspects of the mental, such as consciousness and volition).

The problem philosophers have with these folk psychologies (apart from normal philosophical arguing about what they are, whether they exist or not, etc.) is in assessing their status with regard to the more scientific brand of psychology practiced in academia. On the one hand, folk theories seem to have explanatory and predictive power, in some considerable measure. But on the other hand, they seem to lack any rigorous scientific grounding. Suppose we say that folk psychology is justifiable on the basis that it's the result of a long-term empirical investigation of humans by humans. This is probably the nearest we can come to calling it scientific. Even in this case, the theories of folk psychology are isolated within the scientific discourse. There's nothing connecting them to other areas of scientific investigation in the same way that chemistry is connected to physics and biology to chemistry.

One approach is simply to conclude that these folk theories are just bad science - unrigorous, unsystematic and so on - and that when the academic theories provide better explanations, we'll simply discard the folk theories, and so much the better. This kind of eliminativism is all very well, but if we take it to its logical conclusion, we might end up with a scientific psychology which bears no relation to our everyday understanding of human behaviour, and we would be forced to accept that all talk of belief, hope, desire and so on is simply nonsense. In that case, how to explain the predictive power of these notions?

So what use is LoTH with this problem?
Fodor, in arguing for a language of thought, is arguing for 'intentional realism': the view that propositional attitudes are as real as any other property that we could claim is real (like causal properties of physical objects, for example), and that words like believe, know, etc. are not just spurious terms in some misguided 'folk theory'. Philosophical behaviourism (such as that of Gilbert Ryle) reached the same conclusion, through different means - the beliefs and so on were claimed real in virtue of the 'dispositions' of the holders, to be inferred from their behaviour. In LoTH, the folk psychology is instead to be vindicated by the existence of a representational level of structure in the brain whose elements directly instantiate the causal properties we attribute to the terms of the propositions in our propositional attitudes.

We know that natural language follows rules (at least some of the time) so it has a certain systematicity (if we didn't have such rules, it is difficult to imagine that we would know how to deal with newly invented words - but in fact we find this very easy, because their use follows the rules with which we are already familiar: once we know what part of speech they are, we can immediately create sentences using them.)
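As a toy illustration of that systematicity (a sketch only - nothing here is meant as a serious model of grammar), notice how a newly invented word can be slotted into sentences the moment we're told its part of speech:

```python
# A toy illustration of systematicity, not a model of grammar: once a made-up
# word is tagged with a part of speech, the rules immediately let it appear in
# any sentence position that part of speech can occupy.

import itertools

lexicon = {
    "Noun": ["cat", "philosopher"],
    "Verb": ["sees", "ignores"],
}

def sentences(lexicon):
    """Every 'The Noun Verb the Noun' sentence the lexicon allows."""
    for subj, verb, obj in itertools.product(
            lexicon["Noun"], lexicon["Verb"], lexicon["Noun"]):
        yield f"The {subj} {verb} the {obj}."

# A newly invented word: all the system is told is that it's a noun.
lexicon["Noun"].append("wug")

for s in sentences(lexicon):
    print(s)    # ... "The wug sees the cat.", "The cat ignores the wug.", ...
```

The word itself contributes nothing; the rules do all the work, which is just the point about dealing so easily with new words.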

The LoTH supposes that this systematicity in language is realised by a corresponding systematicity in neural states - that the causal relations obtaining between the various neural states immediately reproduce the required syntactic relations between the corresponding mentalese terms. In the extreme form of LoTH given by Fodor, the process of learning a natural language is the process of learning to translate one's 'native' mentalese terms into that language.

A computing analogy

Imagine we have a computer that can pass the Turing test, in several languages, and achieves this with a strategy of manipulating some symbolic structures according to a set of rules, and then translating to a human language for output.

In this case, we could offer a few candidates for "the Language of Thought": LISP, C, or whatever programming language we have chosen; a specification language, like Z; machine code; and so on. How do we decide which is the right language? They all fit the bill (computer programs can do little except manipulate symbolic structures according to rules) and they all completely specify the result!

Perhaps we might choose the machine code, on the grounds that it's more fundamental. It has a direct and transparent mapping onto the physical systematicity in the CPU, because each 1 or 0 can be directly correlated with a discrete physical state of affairs therein.

But no: we'd likely dismiss all these, claiming our program implemented a language of thought - that's what we designed and specified in the specification language - and the details of the LoT we designed are available by inspection of our design documents. Languages like C and machine code, though they happen to be used in our implementation, are not themselves sufficient for the language of thought.

Ok, brains and minds are not computers. But still, we can use the analogy to draw out some points: systematic descriptions are available at various levels of structure; these descriptions take the form of specifications of causal relations (a computer program, of course, is just a causal specifier: if a then b - you can't get more causal than that!) A language at a lower level can 'causally embed' a language at a higher level. A high level language can be 'realised' in many different ways in a language at a lower level.

Machine code might be thought analogous to atomic physics, say, and a higher level language to cell biology or neurobiology.

Corresponding to the LoT specified in our design documents, we could imagine a level of structure that is 'just below' our expression of thoughts in languages.

This LoT would be causally embedded, and multiply realizable, in neural structures, and its objects would stand to objects of discourse in the same way in which we take the word "cat" to stand to the cat. The universality we need to back up our assertion that "the cat" and "le chat" mean the same is then provided by the homogeneity of the descriptions of the LoT structures in Bertrand's brain and in Jean-Paul's, in much the same way that "s/blib/blob/" means the same thing in perls compiled on different architectures.

(I wouldn't want to take this analogy too far: the idea of a different CPU design really has no analogue, except maybe alien brains, radically different from ours, or perhaps universes where the laws of physics are different. Perhaps on the LoTH view, the individuation of thoughts into productions in some specific language could be conceived as different drivers, so speaking French would be more like having a different operating system running on the same type of CPU. Oops! I took the analogy too far!)

The 'higher up' we can take this structure, the more closely it will correspond to the idea of a language of thought - because obviously we have a lot of structural commonalities: atomic physics is just as true about me as it is about a French speaker; but no LoTH-er is going to claim that atomic physics is the language of thought, any more than they would claim it for machine code.

Deeper trouble
Now it's all very well to talk with abandon about neural structures, but it's a bit harder to pin down serviceable conditions for what should count as a 'structure' in such cases. At one extreme we have obvious structures such as the physical connectivity between the neurons - we can see this structure quite easily by examination of the brain. However, given that the operations of thought can proceed with proverbial swiftness, we're likely to be more interested in some 'second order' structure apparent in the patterns of firing in already connected neurons. In determining when such 'structures' should be taken as legitimate objects in a causal system we shouldn't unduly reify something like the gross national product of the neurons - a set of statistics which has only the most tenuous claim to causal efficacy.

Perhaps we could look at it something like the following (I'm not proposing a model of thought, here, just a hypothetical mechanism, in order to bring out an idea). Suppose that the mentalese terms are realised by different frequencies of firing in a group of neurons. Let's say that "raining" has a frequency of 716 (is represented by that frequency) and that "cloudy" has a frequency of 358. Now further suppose that we can identify a property of neural 'resonance' so that, as 358 is half of 716, the 'cloudy' frequency will be induced by the 'rainy' frequency in virtue of this resonance. In this sort of case, we're justified in attributing a causal role to the frequency - the representation itself - as opposed to the representational medium. If, however, we were to find that there was no 'resonance' theory which would give the nice causal relations we expect between the terms - if we were to discover that the frequencies of 'unmarried' and 'bachelor' were not related in the way required for resonance - we would in the end be forced to concede that the representations - the frequencies themselves - were not causally effective, but merely a manifestation of the underlying causality in the neural process, which happens to exhibit some regularity, as an epiphenomenon.
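To make that hypothetical mechanism concrete, here's a minimal sketch. The frequencies are the ones from the paragraph above, and the halving rule standing in for 'resonance' is pure invention - the point is only to show what it would look like for the representation itself, rather than the medium, to carry the causal load:

```python
# The hypothetical 'resonance' mechanism from the paragraph above, as code.
# The firing frequencies and the halving rule are invented for the example.

FREQUENCY = {"raining": 716, "cloudy": 358}    # firing rates standing for terms

def resonates(f_source, f_target):
    """Invented law: a frequency induces any frequency that is exactly half of it."""
    return 2 * f_target == f_source

def induced_terms(active_term):
    """Which other mentalese terms get switched on by the active one?"""
    f_active = FREQUENCY[active_term]
    return [term for term, f in FREQUENCY.items()
            if term != active_term and resonates(f_active, f)]

print(induced_terms("raining"))    # ['cloudy'] - thinking 'raining' drags 'cloudy' along
print(induced_terms("cloudy"))     # []         - but not the other way round
```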

So the LoTH, in arguing for the existence of some properties of our neural structures which fulfil the role I've given to frequencies and resonance, above, is also arguing that these properties must be statable as a physical or physiological theory with explanatory and predictive power at its own level of description. This is to endorse the view that a top-down approach, like cognitive psychology (which on Fodor's view may be seen as a quest to describe the syntax of the LoT), and the 'bottom-up' field of neuroscience, investigating the physical properties of the neurons and their behaviour, may ultimately be joined into a single homogeneous theory (or at least exhibit the same kind of close relationship seen between physics and chemistry).

Digression: the 'immediate' language of thought
Perhaps there's a difference between a Language of Thought, in the sense of a set of structures underlying our thoughts, and a language for thinking in. One (disreputable) way of considering this is by trying to discern language use in your own thoughts. You can also consider the claim that given enough exposure to any language, a person could begin to think in it. I'm intending to include constructed languages, such as mathematical notation and programming languages, when used in considering problems in their appropriate domains.

The difficulty in the hypothesis that one can think in a mathematical formalism, say, is that all the usual conventions for verifying it are absent - other forms of language use all seem to have public (or potentially public) components: speech, writing, etc. But we can't exhibit our thinking except by producing a public form of the language: a mathematical proof, say, or a piece of code.

We have the same problem for natural languages: what evidence, other than speech, writing, and so on, can we produce for the supposition that we think in them?

But still, the idea of a language without any public productions is itself a little odd.

If we were to claim to be thinking in English or machine code, at least we have the languages (and other language users) at hand to verify this claim. Is it possible that when a cognitive neurobiologist presents a paper in Nature detailing the LoT, we will all slap our foreheads, realising that this was the language we had been thinking in all this time?

If we were to accept the ability to make public productions as a criterion for languages we may be said to think in, then it's quite plain there isn't a 'language of thought', over and above the usual suspects - the languages that we use publicly already.

We might develop our sense of an 'immediate' language of thought by saying that in our experience of thinking, we are directly given the operation of rules on objects of thought (symbols, concepts, whatever), and we sometimes have the experience - we would say we are thinking in a particular language - of a very direct mapping of the rules and symbols in our thoughts onto those of the language concerned. So it's not so much the ability to produce the forms of the language to order, as the direct correspondence of those forms with what we take to be going on in our heads as we do it, that will convince us we can 'think in' a given language.

After all this, I'm not sure how valid this type of objection is to the LoTH - LoTH-ers can simply say "Well, we're not discussing your subjective impressions or otherwise under any compulsion to meet your criteria for what a language is. Whether it's a language in the usual sense of the word isn't the issue; it's whether there's an identifiable layer of processing going on in your brain that renders 'public' what was going on in your head when you made the deranged claims about the cat. If you object to our calling this thing a language (though it seems a perfectly reasonable way of putting it) we can just call it something else!"

Or indeed, a wily LoTH supporter might simply claim that the LoT is just what is mapped onto the 'external forms' of the languages. And to 'exhibit' the language, we need only call in a few brain surgeons and neuroscientists, suitably equipped. Shortly afterwards, attending carefully to the neurosemiologist's explanations, we would presumably slap our ghostly foreheads, exclaiming: "Yes! That's just what I was thinking!" (and I think it is at least conceivable that, if LoTH is true, and we understood the explanation, we would do this.)

A language of movement?
As well as being able to add numbers, we can perform some quite astonishing calculations without even being aware of it, as for example when someone catches a thrown ball. Most of us have no idea of the mathematics and physics involved in predicting the path of the ball, and in any case people have been catching things since before these were even invented. Dogs can catch things! Indeed, computation is needed just for simple movement and navigation.

We could ask an analogous question: must there be, then, a 'language of movement' having a relation to these computations which is similar to the relation between the LoT and the computations involved in making linguistic productions?

One approach to building a walking android would be to use a general-purpose computer, connected to motors and sensors, which ran a program whose inputs come from the sensors and whose outputs control the motors. Since theoretically we can represent any possible physical arrangement having the same possibilities of movement as a model within the program, we can use this arrangement as a model for all such androids, and consider only the programming strategy. Which kinds of programs would we say implemented a 'language of movement', and which not?

Suppose we require our android specifically to be able to catch a thrown ball. I'd hazard that a 'language of movement' is implemented in those programs that have a specific representational structure (a variable, even!) for something like the position of the ball (or at least holding the optimal vector towards the ball.)
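Here's a rough sketch of what such a program might look like. Every name in it is hypothetical (no real robot API is assumed) and the physics is idealised; the point is only that there is a named structure standing for the ball, and the control rule is stated over it:

```python
# A sketch of the 'explicit representation' style of controller. All names are
# hypothetical and the world is idealised ballistics - the point is just that
# there is a structure standing for the ball, and the rule is stated over it.

from dataclasses import dataclass

GRAVITY = 9.8

@dataclass
class BallState:          # the program's token for 'the ball'
    x: float              # horizontal position (m)
    y: float              # height (m)
    vx: float             # horizontal velocity (m/s)
    vy: float             # vertical velocity (m/s, upwards positive)

def predicted_landing_x(ball: BallState) -> float:
    """Ballistic prediction of where the ball comes down (y = 0)."""
    t = (ball.vy + (ball.vy ** 2 + 2 * GRAVITY * ball.y) ** 0.5) / GRAVITY
    return ball.x + ball.vx * t

def motor_command(hand_x: float, ball: BallState) -> float:
    """A signed 'move this far' signal, towards the predicted landing point."""
    return predicted_landing_x(ball) - hand_x

print(motor_command(0.0, BallState(x=2.0, y=3.0, vx=1.5, vy=0.0)))   # about 3.17
```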

Alternatively we could imagine the program was constructed from a neural network-like assembly of identical components (with differing states) none of which have any way (individually) of representing a position (or a vector of movement) but whose combined behaviour meets our ball-catching requirements.

A determined LoTH-er might, after careful study of the behaviour of our components, produce a map of this behaviour (possibly statistical in nature) onto the nice clean explicit computational model of movement in the preceding example. Must we therefore grant that this model is 'the language of movement'?

We could imagine specifying 'the position of the ball' by means of a further computation, taking as its input the relations between the internal states of variously defined sets of our identical components. But what would happen if one of our components was randomly 'killed'? Would our mapping predict the result?
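As a sketch of what such a 'further computation' might look like - everything here is invented for the purpose - a value that none of the components holds individually can be read out of the relations between their states, and we can then 'kill' one at random and see what becomes of the readout:

```python
# A sketch of a readout that recovers 'the position of the ball' from the
# relations between many identical components, none of which holds it alone.
# All invented; the 'kill' at the end is the random loss of one component.

import random

random.seed(0)
true_position = 3.17

# Each component's state is the position plus its own idiosyncratic offset;
# only a relation across the whole population (here, the average) tracks it.
states = [true_position + random.uniform(-1.0, 1.0) for _ in range(100)]

def read_position(states):
    return sum(states) / len(states)          # the distributed 'representation'

print(read_position(states))                  # close to 3.17

survivors = list(states)
survivors.pop(random.randrange(len(survivors)))   # randomly 'kill' one component
print(read_position(survivors))               # still close - but the map itself
                                              # doesn't tell us how close, or why
```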

There's also the problem of ad-hoc neural implementations. It's possible, given sufficiently complex components, that due to the random variations in training, two identical systems might result in program instances implementing the required rule-following via entirely distinct (orthogonal) aspects of the components. In this case we would have to create a distinct map to the 'language of movement' for each of these instances. Our claims for the universality of these maps look particularly grim, here.

A connectionist strategy
A similar connectionist argument, denying the Language of Thought Hypothesis, might be that we have a bunch of simple neural components which produce, en masse, very general rule following and pattern matching behaviour - in effect, they allow us to navigate in rule-and-symbol-space.

Take an analogy with genetics: it's obviously false that a fertilised human egg has within it all the information necessary for the construction of a human being. The mother's womb is necessary as an environment, and it's an information-rich environment, introducing many regularities (in the form of diffusion gradients, cyclic behaviours, etc.) to the developing foetus.

The human linguistic environment in which individuals learn language is in an analogous position to the mother's womb during ontogeny. The regularities in others' language use we experience while learning language are what train us to reproduce these regularities in our own language use ... and so on, unto the nth generation.

The LoTH boils down to the assertion that there are structural regularities in our neurons which are specifically to do with a particular subset of regularities in our language use (the ones to do with propositions and 'propositional attitudes', say).

I think it's fair to say that at the very least, we'd expect some practical difficulties in distinguishing this particular neural systematicity from the other systematicities induced by our experience of the diversely systematic behaviour of the world in general.

How, then, can we explain our ability to perform calculations? How do we use language consistently and correctly?

The only kind of explanation we know for computation is that physical circumstances must be set up so that the laws of nature - causal specifiers - governing the behaviour of the components produce the results required.

The kind of apparatus we normally deal with must have an intelligible structure at the level of description used for the input and output of the computation: it's because I can describe a lever as a metal bar of such-and-such length with a fulcrum at point p that I can talk in my model of placing a weight w at one end and make a measurement of force at the other end, which I can then regard as the result of a multiplication.
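A minimal sketch of that multiplication-by-leverage idea (assuming an ideal, rigid lever): the law of the lever does the arithmetic, and the 'result' is just the force we measure at the output end.

```python
# Multiplication by leverage, assuming an ideal, rigid lever. The balance
# condition  weight * input_arm == force * output_arm  means the measured
# output force just is the weight multiplied by the ratio of the arms.

def lever_output_force(weight, input_arm, output_arm):
    """Force at the output end of an ideal lever in balance."""
    return weight * input_arm / output_arm

# 'Multiply 4 by 3': hang a 4 kg weight with the fulcrum set for a 3:1 ratio.
print(lever_output_force(weight=4, input_arm=3, output_arm=1))    # 12.0
```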

If I were to describe the affair at the level of particle events, there is no description of the resultant force. It's because I can describe 'placing the weight' and 'measuring the force' as events that I'm entitled to call it a computation. At the particle level, there is no 'measurement', no 'position of the fulcrum', just a bunch of particles, some in the lever, some in the fulcrum, wobbling about and interacting.

We require our apparatus to be consistent (within limits). When it's not, we will regard it as broken and of no further use for computation. In our case, this consistency is guaranteed by the rigidity of the lever and the law of nature governing the relation of the weight and force to the position of the fulcrum.

This is the 'classical' model of computation which fits with the mathematical notion of 'Turing computability'.

So LoTH-ers will say that the fact that we can perform discrete computations consistently (within certain limits) can be taken as a strong indication that in principle there's a level of description available of features in the brain, which instantiate the terms of the computation in the same way that a 4kg weight instantiates the number four in our multiplication-by-leverage apparatus.

A robustly non-classical connectionist must provide a model which shows how we can escape from this rigid classicist picture.

But the computations performed by neural networks that already exist (in computers), it might be argued, do, in fact, exhibit the features required. We have an array of identical software components, each with their own discrete state, all following the same rules for combining an input with their state to produce an output and a new state. With suitable 'training', we can induce various computational abilities in the array. The computations performed count as computations, because (at the classical level of description) there are discrete outputs exhibiting the required rule-following relationship to discrete inputs, as the resultant force does to the weight. But if we examine the machinery at the same level of description, we will find no equivalent of the length of the lever, the position of the fulcrum, and so on. We find only a chaotic jumble of numbers representing the states of our components.

The reply might be that nonetheless, for the computation to be performed at all, there must be some consistent set of relations between sets of those state-representing numbers which will map onto the requirements of the computation ... et voila: the lever and the fulcrum. But against this, we can point out that these relations are purely contingent. With even very slightly different training, we'd have to start from scratch, because the butterfly effect is certainly operational in neural networks, and we'd come up with a very different set of relations to explain the same computation by the same apparatus. Following Dennett, we'd say that these relations are not an 'engineering reality', merely observations of contingent, ephemeral properties of our apparatus - epiphenomena.
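A toy demonstration of that contingency (far simpler than any real neural network, and only a sketch): two runs of the same learning rule, differing only in their random starting weights, settle into the same discrete input/output behaviour while disagreeing about the internal numbers that realise it.

```python
# Two runs of the classic perceptron rule, differing only in the random seed
# for the initial weights, converge on the same discrete function (logical OR)
# but realise it with different internal numbers - a toy version of the point
# that the internal relations are contingent on the details of training.

import random

def train_perceptron(seed, examples, epochs=50, lr=0.1):
    """Perceptron learning rule on a linearly separable problem."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    for _ in range(epochs):
        for (x1, x2), target in examples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

for seed in (1, 2):
    w, b = train_perceptron(seed, OR)
    outputs = [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for (x1, x2), _ in OR]
    print("seed", seed, "outputs", outputs,
          "weights", [round(v, 3) for v in w + [b]])
# Same outputs [0, 1, 1, 1] both times; different weights realising them.
```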

If we must find the lever and fulcrum somewhere in our apparatus - whatever it is that produces the systematic correctness of the results - I think we should more properly locate these in the practice of training the network. If we naively view a neural net as a black box which is equipped to produce a certain range of computational behaviour on receiving training that meets some conditions, it seems we have to look to the conditions on the training for the real source of the regularity which produces the specific computational abilities. The rigidity of the lever is analogous to the consistent rule-absorbing properties of the software components, and the position of the fulcrum to the details of the training. The natural law that relates the two would be verified by the systematic observations of programmers and information scientists confirming that this type of neural net, given this type of training, will produce this kind of computational ability.

I'd argue that our connectionist model maps much more effectively onto what we currently know of the brain, and human behaviour. It matches what we know of the structure of the brain - an assemblage of similar components with the ability en masse to acquire rule-following characteristics. It lacks the tricky problem of causally separating certain (propositional) thoughts from the mental background in the physical goings on in the brain - indeed, it is friendly to the unity and homogeneous diversity of our mental life: it opens the possibility that our basic intuitions about reality, objecthood, extension, and so on, are well founded in our experience, and provides an easy route for them to mix with thoughts we class as 'propositional'. It gives proper weight to the importance of our 'training' - our experiences, personal history and the continuing regularities in our linguistic and physical environments - in our learning how to speak and think. It has no need for a Language of Thought, being content for thought to take place 'in' languages learned by the normal means - which is to say that our mental apparatus may successfully acquire the rules and practices exhibited by our fellow language users without aid from some idealised internal calculus.

Whither folk psychology?
None of this is to deny the validity of cognitive psychology, as the attempt to produce better and better maps of our computational characteristics, but it is to view the meeting ground of cognitive psychology and neuroscience as a sort of design meeting - the cogno's have a set of specifications, and the neuro's have to implement something that will meet the requirements (given a certain type of training, to produce certain kinds of rule-following, within tolerable limits.) In contrast to the LoTH view of this meeting, where the cogno's would specify the very data structures to be used in the implementation, there will be no implication for the design strategy in the requirements themselves - the neuroscientists get a free hand, and can stick to their own data structures.

On this view, that Bertrand and Jean-Paul mean the same thing by their statements about the cat is better demonstrated by observing similarities in their rules for using French and English than it is by investigating similarities in their neural structures. And of course with due regard to the similarities in dead cats in both countries - a view which should go down well with those less terracentric types who believe it might be possible to communicate with aliens (robots?) despite their radically differing brain structures.

Whatever order is implicit in our neural structures, enabling us to follow these rules, is likely explicit only in the human (and natural) behaviours that constitute the performance and training which shapes our abilities, so cognitive psychology, as the attempt to map our computational characteristics, must then be seen as the attempt to formalise the results of that training.

If this is the correct view, it would seem that 'folk psychology' may well expect to be vindicated by its cognitivist cousin, because of course the expression, explanation, rejection, negotiation, prediction, calculation and evaluation of hopes, fears, desires, beliefs, and so on, which underlie our claims for the existence of a folk psychology, themselves form a large chunk of the training whose results must be formalised.

Needless to say, the ontological status of those shadowy entities, the propositional attitudes, is not rendered any clearer by this approach - if anything they appear as viral memes, as emergent patterns in a societal / linguistic substrate, as much as a neural one, given that the causal locus, the engineering reality, is the training and interaction, linguistic and social. Though this falls short of a causal intentional realism, in Fodor's sense - the causality involved is emergent, distributed and circular - perhaps it's enough to justify our continued faith in talk of propositional attitudes, without recourse to a "Language of Thought".
