Psychologists study psychological capacities – what we call “the mind.” One of the distinctive psychological capacities of human beings is the ability to explain and predict the behaviour and mental states of other humans. Psychologists call this ability “Theory of Mind”. We all have “Theory of Mind” – but how does it work? That is, by what method or mechanism do we explain and predict other people’s behaviour?
People are very good at predicting and explaining each other’s behaviour. We are so good at it, that often we do not realise we are doing it. And it is very unclear how we do it. In this post, I will briefly introduce some ideas in psychology about how we do it.
“Theory of Mind” is the label for how we predict and explain the behaviour of others. It was originally called that because the first idea was that we have a theory of other people. On this account, we learn this theory as children, or it is innate — meaning we are born with it. It ought to be something like a theory in that it has some kind of rules in a system. They would say things like “everyone who wants some ice cream will go where they think the ice cream is.”
Subsequently, there was a debate as to whether this was really the right explanation for our Theory of Mind. Alternative accounts emerged. This means that some new terminology was required. The account I have already outlined above, where people use a theory to predict and explain others, became known as Theory Theory. It was, if you like, the theory that using a theory is how we do Theory of Mind! We use rules to predict and explain the actions of others.
The challenger account was called Simulation Theory. This says that people predict and explain others by simulating them. In other words, I predict what you will do in a situation by imagining that I am in that situation and then deciding what I would do. I might think (implicitly probably) “I want some ice cream, where would I go?”
We can see that both methods produce results that look plausible, to begin with. Both of them would account for the way that if I say to you “why did Jimmy go to the ice cream van?,” you don’t have any difficulty coming up with what looks like a good answer. What we don’t know is whether you came up with that answer by using a rule (Theory Theory) or put yourself in Jimmy’s place (Simulation Theory).
The debate continues as to whether Theory Theory or Simulation Theory is correct. The major objection to Simulation Theory was that it could not explain cases of systematic Theory of Mind error. In the Stanford Prison experiment, for example, the participants acted much more harshly than anyone outside the situation predicted. Those objecting to Simulation Theory said that if it was the correct account of Theory of Mind, then we would be able to get the right answer. We would be able to correctly predict the harshness of the participants by imagining that we were there.
I have provided what I think is the only response to this objection. I call it the bias mismatch defence. In it, I suggest that if there is a systematic error in Theory of Mind, like the one in the prison experiment, it is because the people in the experiment are acting under a common cognitive bias, and the people outside it are not. They do not simulate the bias, in other words. There could be several reasons why they do not simulate it. They might, for example, have no particular emotional involvement in the situation. After all, being outside prison is much less intense than being in prison!
In this particular case of the prison experiment, I think the bias in question is Conformity Bias. This is the way we all tend to do what we are told, to some extent. But I could use this bias mismatch approach much more widely. It could be used to explain any cases where people systematically fail to predict how experimental participants will react, if those participants can be seen to be exhibiting any cognitive bias. We know about more than 150 of those so far, so there is plenty of opportunity for bias mismatch to arise. This bias mismatch happens a lot I think, and it is why so many results in social psychology are interesting and surprising — and also why so often, we fail to understand others.
I will argue that Proust has an interesting and modern perspective on the role and function of memory, based on an early perspective — by which I mean just the first two books.
Midway in book two, the “narrator” is surprised and delighted to receive a letter from Gilberte. (I place the word “narrator” in scare quotes because it is already clear to the reader that the person writing is doing so with a much more sophisticated perspective than would be available to a child or adolescent.) This occurs just before the introduction of the name Albertine; a name one is already certain will be of the highest significance.
One immediate observation is that both the names Gilberte and Albertine appear to an English speaker to be feminised versions of male names (but this may just be an artefact of time in that those were common at the time). More importantly, the name of Albertine is suggested in the way that Gilberte’s signature apparently begins with a G which looks like an A and the i is undotted; together with the way the final e is obscured in a “flourish” such that we could imagine it to be “–ine.”
This cannot be understood by the reader on first pass at least since the name Albertine has not yet at that stage been introduced — though it is about to be — which makes it seem to be something like a “shadow of the future…”
This all seems to tie in with the way “narrator” writes in an impossibly sophisticated way for a child, and with how all of the relations towards women seem to be similarly obsessive. It looks now as though either Swann relates to Odette in the same way as “narrator” does to Gilberte (and one awaits the arrival of Albertine with interest) or, more plausibly, “narrator” is mapping his later, more subtle appreciations on to others. That leads finally (and we are presumably supposed to be talking about memory in Remembrance of Things Past) to the idea that memory does not record events but is a later fabrication of them, with heavy embeddings of later intellect, desire and perceptions, which would be a very modern approach.
It is still common to see memory on a photography model: a more-or-less faithful record of actual events. Modern psychology sees it much more along the lines of a later heavily-biased reconstruction. Philosophers have taken varied views; Nietzsche has a particularly modern approach. I discuss this and outline some of the contrasting philosophical views in my thesis: http://discovery.ucl.ac.uk/1421265/
Returning to my “shadow of the future” remark above, we could say — if Albertine turns out to be yet another obsessive love, and perhaps the paradigm case — that “narrator” has painted aspects of that episode on to all previous memories of love affairs including the initial one he apparently had with Gilberte, and even his perceptions/recollections of the one that Swann “must have had” with Odette. That would explain Swann’s readiness to destroy his own social position and consort with the most mediocre people in order to be with her.
Naturally, all of this remains subject to revision depending on subsequent developments, but I think it is already clear that Proust is working with extremely subtle and sophisticated conceptions of memory and identity.
I will argue that Proust’s picture of how we get into the minds of others is simulationist, thus following the account that I favour rather than the mainstream one.
The term in psychology for the way in which we predict and explain the behaviour of others is “Theory of Mind.” This is, I suggest, something of a placeholder, because it is in fact deeply unclear how we do this. Or even if we get it right. It certainly looks like we do, but that’s just because we confirm our results using the same method. (This is sometimes known as the “dipstick problem” in philosophy. I can’t tell whether my fuel gauge is accurate if I only look at the fuel gauge.)
There are two accounts of Theory of Mind in academic psychology. One is called Theory Theory. This is the claim that we have a theory of other people that we learn when young. This is the mainstream account. The other account, which I support, is called Simulation Theory:
Simulation Theory suggests that instead of using a theory of others, what we do when we predict and explain their behaviour is to simulate them. Metaphorically, we place ourselves in what we think is their position with the information and desires we think they have, and then work out what we would do.
Anyone who has read Proust knows that he has an exceptionally deep and unusual set of insights into our psychology. His insights are not paralleled elsewhere in my view, with the possible exception of Henry James. For this reason, it is unsurprising to me that he also favours Simulation Theory. Moreover, Proust even seems to suggest the defence of Simulation Theory using cognitive biases which I have proposed.*
There are two key quotations I will use to back up this claim.** The character Swann is discussing “fellow-feeling,” and remarks to himself as below:
“he could not, in the last resort, answer for any but men whose natures were analogous to his own, as was, so far as the heart went, that of M. de Charlus. The mere thought of causing Swann so much distress would have been revolting to him. But with a man who was insensible, of another order of humanity, as was the Prince des Laumes, how was one to foresee the actions to which he might be led by the promptings of a different nature?”
This tells us that Swann has observed that it is easier for him to predict or explain the behaviour of others when those others are similar to him. In this particular case, Swann is wondering which of his friends might have sent him a distressing anonymous letter. Swann believes that Charlus is similar to Swann himself, that Swann himself would not have sent such a letter, and therefore Swann concludes that Charlus did not send the letter.
On the other hand, Swann believes that des Laumes is a very different individual, who is “insensible.” (I suspect that a more modern translation would use “insensitive” here.) Note that Swann, in a very simulationist vein, does not say “des Laumes is insensitive, so he might have sent the letter.” Instead, he says “des Laumes is insensitive, so I cannot tell what he would do.”
This is a very simulationist line. It says, in effect, that Swann is unable, he believes, to simulate des Laumes, because des Laumes is very different to Swann. Note this is not consistent with the mainstream Theory Theory view. There is no reason why Swann, an intelligent and perceptive man, could not have a good theory of insensitive behaviour. There is by contrast every reason why Swann could struggle to simulate insensitive behaviour, lacking as he does the experience “from the inside” of such behaviour.
A further simulationist view is suggested later; someone might be a genius:
“or, although a brilliant psychologist, [not believe] in the infidelity of a mistress or of a friend whose treachery persons far less gifted would have foreseen.”
This is a claim that people may be extremely intelligent and even especially gifted in academic psychology but still make Theory of Mind errors in relation to other people not so gifted. Note how uncongenial this is to Theory Theory. Intelligent people who are brilliant psychologists should have an excellent theory of others and so be able to make very good predictions of their behaviour. Simulation Theory, by contrast, will predict exactly what Proust is describing here: brilliant, intelligent (highly moral?) individuals will fail to predict the behaviour of others who do not possess those characteristics. And similarly, more ordinary mortals will be able to simulate much better and thus predict much better when the person to be predicted is more like the person doing the predicting.
The major objection to Simulation Theory is that it does not account for surprising results in social psychology, such as the infamous Stanford prison experiment. Here, people behave amazingly harshly, for no apparent reason. This behaviour is not predicted by anyone. Theory Theorists claim that Simulation Theory cannot explain this, because we should just be able to simulate being a guard in a fake prison and then predict the harsh behaviour.
I provide a response to this objection on behalf of Simulation Theory. I suggest that what is missing from the simulation is a cognitive bias. In the case of the Stanford Prison Experiment, the bias in question I propose is Conformity Bias. Simply put, this is just our tendency to do what we are told. This bias is a lot stronger than we suppose, in comfortable repose.
It is gratifying to find Swann also gesturing in the direction of this Bias Mismatch Defence, as I call it. Swann further observes that he:
“knew quite well as a general truth, that human life is full of contrasts, but in the case of any one human being he imagined all that part of his or her life with which he was not familiar as being identical with the part with which he was.”
This, if Swann is accurate in his self-perception here, is a description of a systematic Theory of Mind error. It is a form of synecdoche, if you like. Swann takes the part of the person he knows and assumes that all of the rest of that person is the same.
I have suggested that one of the biases which can throw off our simulations is the Halo Effect. This means we know one thing about a person or item which has a certain positive or negative perceived value, and we then assume that all of the attributes of the person or item have the same value. For instance, someone who is a good speaker is probably also honest etc. There is of course no strong reason to think this, rationally speaking.
I have discussed the implications of the Halo Effect on predicting behaviour in financial markets previously:
In that case, I called the Bitcoin bubble just before it burst by employing the Halo Effect and positing that it was affecting the judgement of buyers. It is encouraging to see that Swann is also on the same page as me here!
Note that I do not claim to be a Proust expert or even to have completed my reading yet! I do not therefore suggest that the above represents a radical new reading of the whole of Proust. I make only the modest claim that in this one paragraph, Proust describes a version of Theory of Mind which is more congenial to simulation than to theory. Since there are only these two developed candidate explanations of Theory of Mind, that is already interesting. (There is also a hybrid account which employs both simulation and theory, but that is a mess in my view, and there is no evidence for any theory in the above quotation and therefore no evidence for a hybrid account.)
*“IN SEARCH OF LOST TIME – Complete 7 Book Collection (Modern Classics Series): The Masterpiece of 20th Century Literature (Swann’s Way, Within a Budding … The Sweet Cheat Gone & Time Regained)” by Marcel Proust, C. K. Scott Moncrieff, Stephen Hudson.
**It might be argued that this view is not that favoured by Proust himself but by Swann, who is a character created by Proust. I will not pursue this sort of Plato/Socrates point, but merely observe that it is at the very least true that Proust considers the position worth discussing. Moreover, I think it is very clear that Swann is rather to be considered an intelligent, discerning individual, if perhaps somewhat afflicted by propensities for self-deception, and so the fact that this view is at least that of Swann is sufficient to make it interesting. (I am informed by someone who knows Proust better than me that I am likely to revise my view of Swann in a negative direction as my reading progresses.)
The Motor Theory of Speech Perception seeks to explain the remarkable fact that people have superior abilities to perceive speech as opposed to non-speech sounds. The theory postulates that people use their ability to produce speech when they perceive speech as well, through micro-mimicry. In other words, when we see someone speaking, we make micro replicas of the mouth movements we see, thus helping us to understand what is being said. A major objection to this explanation has been put forward by Mole (2010), who denies that there is anything special about speech perception as opposed to perception of non-speech sound. In this article I will defend the Motor Theory against Mole’s (2010) objection by arguing the contrary: there is something special about speech perception.
Our speech perception functions very well even in conditions where the signal is of poor quality. These abilities are markedly better than our perception of non-speech sounds. For example, consider how you can fairly easily pick out the words being uttered, even against a background of intense, and louder, traffic noise. This fact makes it seem that there is a special nature to speech perception as compared to perception of non-speech sounds.
The Motor Theory of Speech Perception (Liberman and Mattingly 1985) seeks to explain this special nature of speech perception. It postulates that the mechanical and neural elements involved in the production of speech are also involved in the perception of speech. On this view, speech perception is the offline running of the systems that, when online, actually produce speech. According to the Motor Theory, motor activation – i.e. micro-movements of mouth and tongue muscles or preparations thereto – is also occurring when perception of speech takes place. The idea is that if you make subliminal movements of the type you would make to produce an ‘S’ sound, you are thereby well-placed to understand that someone else whom you see making such movements overtly is likely to be producing an ‘S’ sound. This is how we understand one another’s speech so well. And so it is key to the Motor Theory of Speech Perception that speech perception is special.
In some ways, the position of the Motor Theory in explaining speech perception is analogous to the position of Simulation Theory (see Short, 2015) in explaining how we are often able to predict and explain the behaviour of other people (so-called Theory of Mind). In both cases, the account seeks to generate a maximally powerful explanation of the phenomenon using the minimum of additional “moving parts”. The Motor Theory notes that we already have complicated machinery to allow us to produce speech and suggests that that machinery may also be used to perceive and understand speech. The Simulation Theory account of Theory of Mind notes that we already have an immensely complex piece of machinery – a mind – and postulates that we may also use that mind to simulate others and thus understand them. I see value in these parsimonious and economical simulation approaches in both areas.
Mole (Ch. 10, 2010) challenges the Motor Theory. He agrees that speech perception is special, but not that it is special in such a way as to support the Motor Theory. In this article, I will offer responses on behalf of the Motor Theory to Mole’s (2010) challenge in five ways, as outlined below.
Mole (2010) claims that speech perception is not special. If that is true, then the Motor Theory cannot succeed because it proceeds from that assumption. I will first deny Mole’s (2010) claim that other perception also involves mapping from multiple percepts to the same meaning, so that such mapping is not unique to speech perception. Taking an example from speech, we understand the name “Sherlock” to refer to that detective even though it may be pronounced in a myriad of different ways. This phenomenon is known as invariance. Mole (2010) claims that there is nothing special about speech perception, because other types of perception (such as colour perception) also involve mapping from multiple external sources of perceptual data to the same single percept. I will show that the example from visual perception invoked by Mole (2010) is not of the type that would dismiss the need for a special explanation of speech perception provided by the Motor Theory.
Mole (2010) makes another claim which is also intended to challenge the idea that underpins the Motor Theory that there is a special invariance in speech perception. This special invariance is the way that we always understand “Sherlock” to refer to the detective whichever accent the name is spoken in, or whatever the background noise level is (provided of course that we can actually hear the name). Mole (2010) claims that invariances in speech perception are not special as similar invariances also occur in face recognition. Mole (2010) seeks to make out his face recognition point by discussing how computers perform face recognition; I will show that he does not succeed here.
In the famous McGurk experiment, so-called “cross-talk” effects are seen. These occur where visual and aural stimuli interact with each other and change how one of them is perceived. For example, subjects seeing a video of someone saying “ga” but hearing a recording of someone saying “ba” report that they heard “da.” Since the Motor Theory postulates that speech perception is special, such cross-talk effects will support the Motor Theory if they are in fact special to speech perception. Mole (2010) uses cross-modal data from two experiments with the aim of showing that such cross-talk also exists in non-speech perception. I will suggest that the experiments Mole (2010) cites do not provide evidence for the sort of cross-talk phenomenon that Mole (2010) needs to support his position.
I will refute Mole’s (2010) claim that Motor Theory cannot account for how persons who cannot speak can nevertheless understand speech by outlining how that could occur.
Finally, I will briefly consider a range of additional data that support the Motor Theory and therefore challenge the position espoused by Mole (2010). The Motor Theory explains all three of the following: cerebellar involvement in dyslexia, observed links between speech production and perception in infants, and why neural stimulation of speech production areas enhances speech perception.
Challenges To Mole (2010)
Mole’s (2010) Counterexample From Visual Perception Is Disanalogous To Speech Perception
A phoneme is a single unit of speech. It can be thought of, roughly, as the aural equivalent of a syllable. Any single phoneme will be understood by the listener despite the fact that there will be many different sound patterns associated with it. It is clearly a very useful ability of people to be able to ignore details about pitch, intensity and accent in order to focus purely on the phonemes which convey meaning. This invariance is a feature of speech perception but not of sound perception, which situation motivated the proposal of the Motor Theory.
It is important to be clear on where there is invariance and where there is lack of invariance in perception. There is invariance in the item which the perceiver perceives (for example, Sherlock) even though there is a lack of invariance in the perceptual data that allows the perceiver to have the perception. So we can see that it is Sherlock’s face (an invariance in what is understood) even though the face may be seen from different angles (a lack of invariance in perceptual input). Similarly, we may hear that it is Sherlock’s name that is spoken (an invariance in what is understood) even though the name may be spoken in different accents (a lack of invariance in perceptual input). Lack of invariance is of course the same as variance; this discussion however tends to be couched in terms of invariance and its absence.
For supporters of the Motor Theory, this invariance in what the listener reports that they have heard is evidence that the perceptual object in speech perception is a single gesture – the one phoneme that the speaker intended to pronounce. This single object is always reportable despite the fact that the phoneme could have been pronounced in a wide variety of accents. The accents can vary a great deal but there is still invariance in what the listener hears because most accents can be understood.
Mole (2010) denies that this invariance is evidence for the special nature of speech. Mole (p.217, 2010) writes: “[e]ven if speech were processed in an entirely non-special way, one would not expect there to be an invariant relationship between […] properties of speech sounds […] and phonemes heard for we do not […] expect perceptual categories to map onto simple features of stimuli in a one-to-one fashion.”
Mole’s (2010) argument is as follows. He allows that there is not a one-to-one mapping between stimulus and perceived phoneme in speech perception. I will also concede this. Mole (2010) then denies that this means that speech perception is special on the grounds that there is not in general a one-to-one mapping between stimulus and percept in perception (other than in speech). He produces a putative example in vision, by noting the existence of ‘metamers’. A metamer is one of two colours of slightly different wavelengths that are nevertheless perceived to be the same colour. Note that colour is defined here by wavelength rather than phenomenology.
Mole (2010) has indeed produced a further example of a situation where there is not a one-to-one mapping between stimulus and percept. However, this lack of one-to-one mapping is not exactly what is cited as the cause of the special nature of speech perception under the Motor Theory. Rather the relevant phenomenon is ‘co-articulation’ – i.e., the way in which we are generally articulating more than one phoneme at a time. As Liberman and Mattingly write (1985, p. 4), “coarticulation means that the changing shape of the vocal tract, and hence the resulting signal, is influenced by several gestures at the same time” so the “relation between gesture and signal […] is systematic in a way that is peculiar to speech”. So while it is indeed the case that there are multiple stimuli being presented which result in a single percept, it is the temporal overlap between those stimuli that is the key factor, not the mere fact of their multiplicity. In other words, the Motor Theory argument relies on the fact that a speaker is pronouncing more than one phoneme at a time during overlap periods.
This means that Mole’s (2010) metamer example is disanalogous, because it only deals with the multiplicity of the stimuli in the mapping and not with their temporal overlap. This is the case because there cannot in fact be a temporal overlap between two colour stimuli. We can see this using a thought experiment. Let us imagine a lighting rig that is capable of projecting any number of arbitrary colours and also of projecting more than one colour at the same time.
In that case, we could not say that the perception of a colour being projected at a particular time was changed by the other colours being projected with it. That situation would simply be the projection of a different colour. So a projection of red light with green light does not produce a modified red, it produces yellow light. It is not possible to have a “modified red,” because such a thing is not red any more. The rig would not be projecting a different sort of red; it would be projecting a different colour that was no longer red.
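The point about the lighting rig can be put as a toy calculation. What follows is only a sketch under an assumption of my own: that the rig's output can be modelled as 8-bit RGB triples which add channel by channel. The function name `mix_light` is hypothetical and stands in for whatever the rig actually does.

```python
# Toy model of additive light mixing. Representing colours as 8-bit RGB
# triples is an illustrative assumption, not a claim about human vision.
def mix_light(*colours):
    """Add the channels of each projected colour, clipping at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colours))

RED = (255, 0, 0)
GREEN = (0, 255, 0)
YELLOW = (255, 255, 0)

mixed = mix_light(RED, GREEN)
print(mixed == YELLOW)  # True: projecting red with green gives yellow
print(mixed == RED)     # False: the result is not a kind of red
```

On this model there is nothing left over that could count as a “modified red”: the two inputs are absorbed into a single, different output.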
I will illustrate this further with an example from a different sensory modality: hearing. The position I am taking about red (more exactly, a precise shade of red) is essentialist. On essentialist accounts, there are certain properties of an item which can be changed and will result in a modified version of that item. There are other properties, the essential ones, which cannot be modified consistent with the original item retaining its identity.
For example, some properties of an opera are essential to it being an opera. By definition, it is symphonic music with singing. A symphony requires only the musical instruments. Some properties of an opera can be changed and this will result in a modified opera. One could replace the glass harmonica scored for the Mad Scene in Lucia di Lammermoor with flute. One would then have a performance of a modified version of Lucia which would be a modified opera and would still be an opera.
What one could not do is change an opera into a symphony, strictly speaking. There could be a performance of the first act of Lucia as normal and one would be watching a performance of an opera. If in the second act the musicians came out and played without the singers, one would not have converted an opera into a symphony. One would have ceased to perform an opera and begun to perform a symphony, albeit one musically identical to the non-vocal parts of Lucia.
Returning to the lighting rig, we cannot say here that yellow is a modified red without abandoning any meaning for separate colour terms altogether – every colour would be a modified version of every other colour. This impossible lighting rig is what Mole (2010) needs to cite to have a genuine example, because it would be a case of multiple stimuli being projected at the same time and resulting in activation of the same perceptual category.
In sum, a metamer is an example where there is no one-to-one mapping between stimulus and perceptual category, but also where the different stimuli are not simultaneous. This is the case because we cannot be looking at both colours involved in a metamer at the same time. A co-articulation by contrast is an example of where there is no one-to-one mapping between stimulus and perceptual category, but where the different stimuli are indeed simultaneous. As it is that very simultaneity that is the key to the special nature of the systematic relation between gesture and signal under the Motor Theory, Mole (2010) does not have an example here that demonstrates that speech perception is not special.
Face Recognition Does Not Show A Similar Sort of Invariance Of Perception As Speech Recognition
Mole (2010) claims that face recognition is another example of invariance – for example, we can recognise that we are looking at Sherlock’s face from various angles and under different lighting conditions – thereby challenging the idea that invariance in speech perception is evidence for the special nature of speech perception. His claim is that the invariance in the way we can always report that we are looking at Sherlock’s face despite variance in input visual data is similar to the invariance in the way that we can always report we have heard Sherlock’s name despite variance in input aural data. If that is true, then Mole (2010) has succeeded in showing that speech perception is not special as the Motor Theory claims.
Mole (2010) allows that we use invariances in face recognition, but denies this could ever be understood by examination of retinal data. He writes: “[t]he invariances which one exploits in face recognition are at such a high level of description that if one were trying to work out how it was done given a moment-by-moment mathematical description of the retinal array, it might well appear impossible” (Nudds and O’Callaghan 2010, p. 216). What this means is that it would be difficult to get from the retinal array (displaying a great deal of lack of invariance) to the features we use in recognising Sherlock such as our idea of the shape of his nose (which is quite invariant).
However, this can be questioned as follows. Since the only thing that computers can do in terms of accepting data is to read in a mathematical array, Mole’s (2010) claim is in fact equivalent to the claim that it cannot be understood how computers can perform face recognition. That claim is false. To be very fair to Mole (2010), his precise claim is that the task might appear impossible, but I shall now show that since it is widely understood to be possible, it should not appear impossible either.
Fraser et al. (2003) describe an algorithm that performs the face recognition task better than the best algorithm in a ‘reference suite’ of such algorithms. Their computer is supplied with a gallery of pictures of faces and a target face and instructed to sort the gallery such that the target face is near the top. The authors report that their algorithm is highly successful at performing this task. Fraser et al. write (2003, p. 836): “[w]e tested our techniques by applying them to a face recognition task and found that they reduce the error rate by more than 20% (from an error rate of 26.7% to an error rate of 20.6%)”. So the computer recognized the target face around 80% of the time.
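The arithmetic behind the quoted figures can be checked with a short sketch. The function and variable names below are illustrative and do not come from Fraser et al.; the only inputs are the two error rates quoted above. The sketch assumes that "more than 20%" refers to the relative (not absolute) reduction in error rate.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after`."""
    return (before - after) / before


# Error rates in percent, as quoted from Fraser et al. (2003, p. 836).
baseline_error = 26.7
improved_error = 20.6

# Relative reduction in the error rate (assumed reading of "more than 20%").
reduction = relative_reduction(baseline_error, improved_error)

# Share of trials on which the target face was correctly identified.
success_rate = 100.0 - improved_error

print(f"relative reduction in error: {reduction:.1%}")
print(f"success rate: {success_rate:.1f}%")
```

On these inputs the relative reduction comes out at roughly 22.8% and the success rate at 79.4%, which is consistent with both "more than 20%" and "around 80% of the time".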
So we see firstly that the computer can recognize a face. [It is not an objection here to claim that strictly speaking, computers cannot ‘recognise’ anything. All that we require here is that computers can be programmed so as to distinguish faces from one another merely by processing visual input. It is this task which Mole (2010) claims appears impossible.] Then we turn to the claim that how the computer does this cannot be understood. That is refuted by the entire paper, which is an extended discussion of exactly that. Since this is an active area of research, we can take it that such understanding is widely to hand in computational circles, and should be more widespread.
It may be true in one sense that we could not efficiently perform the same feat as the computer – in the sense of physically taking the mathematical data representing the retinal array and explicitly manipulating it in a sequence of complex ways in order to perform the face recognition task. In another sense, we could, of course. It is what we do every time we actually recognize a face. The mechanics of our eyes and the functioning of our perceptual processing system have the effect of performing those same mathematical manipulations. We know this because we do in fact perform face recognition using only the retinal array as input data.
Mole (2010) has indeed provided an example of invariance (i.e., in face recognition) but the example does not damage the need for a special explanation of the speech perception invariances, because the face perception example can in fact easily be explained. Therefore Mole (2010) has not here provided a further example of an invariance that resists explanation, and he has not thereby questioned the specialness of speech perception. Speech perception continues indeed to exhibit a unique invariance which continues to appear in need of unique explanation.
Experimental Data Do Not Show Cross-Modal Fusion
Mole (2010) argues that an experiment on judgments made as to whether a cello was being bowed or plucked shows the same illusory optical/acoustic combinations as are seen in the McGurk effect. The McGurk effect (McGurk and MacDonald 1976) is observed in subjects hearing a /ba/ stimulus and seeing a /ga/ stimulus. The subjects report that they have perceived a /da/ stimulus. It is important to note that this is not one of the stimuli presented; it is a fusion or averaging of the two stimuli. So an optical stimulus and an acoustical stimulus have combined to produce an illusory result which is neither of them.
If Mole’s (2010) claim that the cello experiment shows McGurk-like effects is true, this would show that these illusory effects are not special to speech, thus challenging the claim that there is anything special about speech that the Motor Theory can explain. Mole (p. 221, 2010) writes: “judgments of whether a cello sounds like it is being plucked or bowed are subject to McGurk-like interference from visual stimuli”. However, the data Mole (2010) cites do not show the same type of illusory combination and so Mole (2010) is unable to discharge the specialness of speech perception as he intends.
The Motor Theory postulates that the gesture intended by the speaker is the object of the perception, and not the acoustical signal produced. The theory explains this by also postulating a psychological gesture recognition module which will make use of the speech production capacities in performing speech perception tasks. Thus the McGurk effect constitutes strong evidence for the Motor Theory, which explains it by saying that the module has considered both optical and acoustical inputs in deciding what gesture has been intended by the speaker. This strong evidence would be weakened if Mole (2010) can show that McGurk-like effects occur other than in speech perception, because the proponents of the Motor Theory would then be committed to the existence of multiple modules, and their original motivation, the observed specialness of speech as seen in the McGurk effect, would be put in question.
More specifically, the paper Mole (2010) cites, Saldaña and Rosenblum (1993), describes an experimental attempt to find non-speech cross-modal interference effects using a cello as the source of acoustic and optical stimuli. Remarkably, Saldaña and Rosenblum (1993) state prominently in their abstract that their work suggests “the nonspeech visual influence was not a true McGurk effect” in direct contradiction of Mole’s (2010) stated reason for citing them.
There are two ways to make a cello produce sound: it can be plucked or it can be bowed. The experimenters proceed by presenting subjects with discrepant stimuli – for example, an optical stimulus of a bow accompanied by an acoustical stimulus of a pluck. Saldaña and Rosenblum (1993) found that the reported percepts were adjusted slightly by a discrepant stimulus in the direction of that stimulus.
However, to see a McGurk effect, we need the subjects to report that the gesture they perceive is a fusion of a pluck and a bow. Naturally enough, this did not occur, and indeed it is unclear what exactly such a fusion might be. Therefore, Mole (2010) has not here produced evidence that there are McGurk effects outside the domain of speech perception.
Mole’s (2010) response is to dismiss this as a merely quantitative difference between the effects observed by the two experiments. Mole (p. 221, 2010) writes: “[t]he McGurk effect does reveal an aspect of speech that is in need of a special explanation because the McGurk effect is of a much greater magnitude than analogous cross-modal context effects for non-speech sounds”. As we have seen, Mole (2010) is wrong to claim there is only a quantitative difference between the McGurk effect observed in speech perception and the cross-modal effects observed in the cello experiment because only in the former were fusion effects observed. That is most certainly a major qualitative difference.
Mole’s (2010) claim that the cello results are only quantitatively different to the results seen in the McGurk effect experiment produces further severe difficulties when we consider in detail the experimental results obtained. The cello experimenters describe a true McGurk effect as being one where there is a complete shift to a different entity – the syllable is reported as clearly heard and is entirely different to the one in the acoustic stimulus. Saldaña and Rosenblum (1993, p. 409) describe these McGurk data as meaning: “continuum endpoints can be visually influenced to sound like their opposite endpoints”.
The cello data were not able to make a pluck sound exactly like a bow and in fact the discrepant optical stimuli were only able to slightly shift the responses in their direction, by less than a standard deviation, and in some cases not at all. This is not the McGurk effect at all and so Mole (2010) cannot say it is only quantitatively different. Indeed, Saldaña and Rosenblum (1993, p. 410) specifically note that: “[t]his would seem quite different from the speech McGurk effect”.
In sum, the cross-modal fusion effect that Mole (2010) needs is physically impossible in the cello case and the data actually found do not even represent a non-speech analog of the McGurk effect, as is confirmed by the authors. Once again, speech perception remains special and the special Motor Theory is needed to explain it.
Sound Localization Experiment
The other experiment relied on by Mole (2010) was conducted by Lewald and Guski (2003) and considered the ventriloquism effect. As above, the result that Mole (2010) needs to support his theory is an effect that is a good analogy to the McGurk effect in a non-speech domain. As I will show below, the data from the Sound Localisation Experiment also fail to bear out his claim that there are McGurk-like effects outside the domain of speech perception.
The Sound Localisation Experiment uses tones and lights as its acoustic and optical stimuli. It investigates the ventriloquism effect quantitatively in both the spatial and temporal domains. The idea is that separate optical and acoustic events will tend to be perceived as a unified single event with optical and acoustical effects. This will only occur if the spatial or temporal separation of the component events is below certain thresholds.
Lewald and Guski (2003, p. 469) propose a “spatio-temporal window for audio-visual integration” within which separate events will be perceived as unified. They suggest maximum values of 3° for angular or spatial separation and 100 ms for temporal separation. Thus a scenario in which a light flash occurs less than 3° away from the source of a tone burst will produce a unified percept of a single optical/acoustical event as will a scenario in which a light flash occurs within 100 ms of a tone burst. Since the two stimuli in fact occurred at slightly different times or locations, this effect entails that at least one of the stimuli is perceived to have occurred at a different time or location than it actually did.
To recap, in the McGurk effect, discrepant optical and acoustic stimuli result in a percept that is different to either of the two stimuli and is a fusion of them. We may allow to Mole (2010) that Lewald and Guski (2003) do indeed report subjects perceive a single event comprising a light flash and a tone burst. However, that is insufficient to constitute an analogy to the McGurk effect. Subjects do not report that their percept is some fusion of a light flash and a tone burst – as with the cello experiment, it is unclear what such a fusion could be – they merely report that an event has resulted in these two observable effects. [We may note that Lewald and Guski (2003) do not take themselves to be searching for non-speech analogs of the McGurk effect; the term does not appear in their paper or in the titles of any of their 88 citations, throwing doubt on the claim that they are working in the field at all.]
Indeed, the subjects were not even asked whether they perceived some fused event. They were asked whether the sound and the light had a common cause, were co-located or were synchronous. As Lewald and Guski write (p. 470, 2003): “[i]n Experiment 1, participants were instructed to judge the likelihood that sound and light had a common cause. In Experiment 2, participants had to judge the likelihood that sound and light sources were in the same position. In Experiment 3, participants judged the synchrony of sound and light pulses”. A ‘common cause’ might have been some particular event, but it is not the sound and the light, and they were the only things that were perceived. Therefore the instructions do not even admit the possibility that a fused event was perceived.
Since Lewald and Guski (2003) are measuring the extent to which participants agree that a light and a tone had a common cause, were co-located or were synchronous, it is puzzling that Mole (p. 221, 2010) cites them to support his claim that perceived flash count can be influenced by perceived tone count. We see this when Mole writes (p. 221, 2010): “[t]he number of flashes that a subject seems to see can be influenced by the number of concurrent tones that he hears (Lewald and Guski 2003)”.
Moreover, neither the Sound Localisation Experiment nor the cello experiment supports Mole’s (p. 221, 2010) summation that “[i]t is not special to speech that sound and vision can interact to produce hybrid perceptions influenced by both modalities” in the way he needs. Unlike with the McGurk effect, there are no hybrid perceptions in either case, where “hybrid” is understood to be ‘a perception of an event which is neither of the stimulus events’.
There are cross-modal effects between non-speech sound stimuli and optical stimuli but that is inadequate to support Mole’s (2010) claim that speech is not special. We still need the special explanatory power of the Motor Theory.
Mute Perceivers Can Be Accommodated
One of Mole’s (2010) challenges is that the Motor Theory cannot explain how some people can have the capacity to perceive speech that they lack the capacity to produce. Mole writes (p. 226, 2010) that “[a]ny move that links our ability to perceive speech to our ability to speak is an unappealing move, since it ought to be possible to hear speech without being able to speak oneself”. There is an equivocation here though on what is meant by ‘capacity to produce’. Mole (2010) is reading that term so that the claim is that someone who is unable to use their mouth to produce speech lacks the capacity to perceive speech. Since such mute people can indeed as he claims understand speech, he takes his claim to be made out.
However, in the article cited by Mole (2010), it is clear that this is not what is understood by ‘capacity to produce’. In the study by Fadiga et al. (2002) described, the neuronal activation related to tongue muscles is not sufficient to generate movement. This activation is a result of the micro-mimicry that takes place when people are perceiving speech. Fadiga et al. (2002) call this mimicry “motor facilitation.”
Fadiga et al. (p. 400, 2002) write: “The observed motor facilitation is under-threshold for overt movement generation, as assessed by high sensitivity electromyography showing that during the task the participants’ tongue muscles were absolutely relaxed”. Thus the question is whether the subject has the capacity to produce such a sub-threshold activation, and not the capacity to produce speech via a super-threshold activation. Naturally, since all the subjects had normal speech, they could produce both a sub-threshold and a super-threshold activation, with the latter resulting in speech.
However, someone could be able to activate their tongue muscles below the threshold to generate overt movement but not be able to activate those muscles above the threshold. That would mean that they lacked ‘capacity to produce’ in Mole’s (2010) sense, but retained it in Fadiga et al.’s (2002) sense. This would be a good categorization of the mute people who can understand speech they cannot utter. Those people would retain the ability to produce the neural activity that Fadiga et al. observe, which does not result in tongue muscle movement. This is a testable empirical claim to which my account is committed. It is possible that they may not be able to even produce the sub-threshold neural signals. If that turns out to be correct, it would be a problem for the Motor Theory and the defence I have offered for it here.
Similarly, we can resolve Mole’s (2010) puzzle about how one can understand regional accents that one cannot mimic; i.e. I can understand people who speak with an accent that is different to mine. The capacity to understand a particular accent could result from our ability to generate the necessary sub-threshold activations, but not the super-threshold ones. If we go on to acquire that regional accent, our super-threshold muscle activation capacities would be of the required form. This again is an empirical prediction which makes my account subject to falsification by data.
This hypothesis could have interesting implications in the field of developmental psychology. Mole (p. 216, 2010) outlines how infants can perceive all speech sound category distinctions, but eventually lose the ability to discriminate the ones that do not represent a phoneme distinction in their language. So it may be the case that all infants are born with the neural capacity to learn to generate super-threshold activations of all regional accents, but eventually retain that capacity only at the sub-threshold level – because they can later understand a wide range of regional accents – and lose the capacity at the super-threshold level – for those regional accents they cannot mimic.
Another implication here of the Motor Theory is to say that a listener’s vocal tract can function as a model of itself, just as a listener’s vocal tract can function as a model of a speaker’s vocal tract. This means that the sub-threshold activation functions as a model of the super-threshold activation. So, perceptual capacities involve the former modelling the latter exactly as the Motor Theory predicts. Such an approach does not commit the Motor Theory to the modelling/perception neurons controlling the sub-threshold activations being the same as the production neurons controlling speech production, so the account is not susceptible to falsification on that precise point.
Further Brief Challenges To Mole (2010)
The Motor Theory Explains Cerebellar Involvement In Dyslexia
Mole (2010) challenges the Motor Theory and in doing so, challenges the idea that speech production capacities are involved in speech recognition. For this reason, any data showing links between speech production capacities and speech recognition capacities will be a problem for him.
Ivry and Justus (2001) refer to a target article that shows that 80% of dyslexia cases are associated with cerebellar impairments. Since the cerebellum is generally regarded as a motor area, and dyslexia is most definitely a language disorder, we have clear evidence for a link between language and motor areas. That is naturally a result that can be clearly accommodated by the Motor Theory which links speech production and speech recognition.
It is not open to Mole (2010) to respond that the link is only between motor control areas and writing control areas, because although writing skills are the primary area of deficit for dyslexic subjects, the authors also found impairments in reading ability to be strongly associated with the cerebellar impairments. This can be explained on the Motor Theory because it says that motor deficits will result in speech recognition deficits. Mole (2010) needs to provide an explanation of this which does not rely on the Motor Theory.
The Motor Theory Explains Links Between Speech Production And Perception In Infants
Mole (2010) does not address some important results supplied by Liberman and Mattingly (1985: p. 18) that link perception and production of speech. These data show that infants preferred to look at a face producing the vowel they were hearing rather than the same face with the mouth shaped to produce a different vowel. That effect is not seen when the vowel sounds were replaced with non-speech tones matched for amplitude and duration with the spoken vowels. What this means is that the infants are able to match the acoustic signal to the optical one. In a separate study, the same extended looking effect was seen in infants when a disyllable was the test speech sound. These data cannot be understood without postulating a link between speech production and speech perception abilities, because differentiating between mouth shapes is a production-linked task – albeit one mediated by perception – and differentiating between speech percepts is a perceptual task.
The Motor Theory Explains Why Neural Stimulation Of Speech Production Areas Enhances Speech Perception
D’Ausilio et al. (2009) conducted an experiment in which Transcranial Magnetic Stimulation (“TMS”) was applied to areas of the brain known to be involved in motor control of articulators. Articulators are the physical elements that produce speech, such as the tongue and lips. After the TMS, the subjects were tested on their abilities to perceive speech sounds. It was found that the stimulation of speech production areas improved the ability of the subjects to perceive speech. The authors suggest that the effect is due to the TMS causing priming of the relevant neural areas such that they are more liable to be activated subsequently.
Even more remarkably, the experimenters find more fine-grained effects such that stimulation of the exact area involved in production of a sound enhanced perceptual abilities in relation to that sound. D’Ausilio et al. (2009, p. 383) report: “the perception of a given speech sound was facilitated by magnetically stimulating the motor representation controlling the articulator producing that sound, just before the auditory presentation”. This constitutes powerful evidence for the Motor Theory’s claim that the neural areas responsible for speech production are also involved in speech perception.
Special situations require special explanations. The Motor Theory of Speech Perception is a special explanation of speech perception which, as evidenced by the rejection of Mole’s objections, continues to be needed. One might say that such “specialness” means the Motor Theory stands in a vulnerable and isolated position, as it seeks to explain speech perception in a way that is very different to how we understand other forms of perception. Here, I would revert to my brief opening remarks about the similarities between the Motor Theory and Simulation Theory. Whilst the Motor Theory is indeed a special way to explain speech perception, it is at the same time parsimonious and explanatorily powerful because like Simulation Theory, it does not require any machinery which we do not already know we possess. This is perhaps what underlies the continued attractiveness of Motor Theory as a convincing account of how people perceive speech so successfully.
D’Ausilio, A et al. 2009 The Motor Somatotopy of Speech Perception. Current Biology, 19: pp. 381–385. DOI: 10.1016/j.cub.2009.01.017
Fadiga, L et al. 2002 Speech Listening Specifically Modulates the Excitability of Tongue Muscles: a TMS study. European Journal of Neuroscience, 15: pp. 399–402. DOI: 10.1046/j.0953-816x.2001.01874.x
Fraser, A M et al. 2003 Classification modulo invariance, with application to face recognition. Journal of Computational and Graphical Statistics, 12 (4): pp. 829–852. DOI: 10.1198/1061860032634
Ivry, R B and T C Justus 2001 A neural instantiation of the motor theory of speech perception. Trends in Neuroscience, 24 (9): pp. 513–5. DOI: 10.1016/S0166-2236(00)01897-X
Lewald, J and R Guski 2003 Cross-modal perceptual integration of spatially and temporally disparate auditory and visual stimuli. Brain Research. Cognitive Brain Research (Amsterdam), 16: pp. 468–478. DOI: 10.1016/S0926-6410(03)00074-0
Liberman, A and I G Mattingly 1985 The Motor Theory of Speech Perception Revised. Cognition, 21: pp. 1–36. DOI: 10.1016/0010-0277(85)90021-6
McGurk, H and J MacDonald 1976 Hearing lips and seeing voices. Nature, 264, (5588): pp. 746–748. DOI: 10.1038/264746a0
Mole, C 2010 The motor theory of speech perception in Sounds and Perception: New Philosophical Essays. Oxford: Oxford University Press. DOI: 10.1093/acprof:oso/9780199282968.001.0001
Saldaña, H M and L D Rosenblum 1993 Visual influences on auditory pluck and bow judgments. Perception And Psychophysics, 54 (3): pp. 406–416. DOI: 10.3758/BF03205276
Two preliminaries. One: I was there at the first night, in seat W14 at the back of the Orchestra Stalls. If you weren’t, then you will have to take my word for it in terms of what actually happened. Two: I am a philosophical psychologist (cf. http://www.psypress.com/authors/i9043-tim-short) so if you would like to respond, do so to exactly what I write below and not to something in the vicinity of what I say which annoys you. If you want to be formal about it, I suppose the proposition for which I am arguing is “the scene was appropriate.”
I will start by outlining the events I saw and then show that all of the objections aiming to show that the scene was inappropriate fail.
A foreign army is occupying Switzerland. At the point in the libretto of interest, we are told that some soldiers force the local women to dance with them. One woman is offered champagne, somewhat against her will. She acquiesces nervously. She is then doused in champagne. The leader of the occupying forces, Gesler, molests her by placing a pistol between her legs at around mid-thigh level. She moves on to the dining table, upon which is placed a large table-cloth. She disappears behind a group of perhaps 10-15 soldiers. Shortly afterwards, she reappears naked. The duration of the nudity was something like half a second. She partly wraps the table-cloth around her and moves away from the table. The hero, Tell, appears and ensures that she is fully covered.
That’s it for the stage action. There ensued enormous amounts of booing which interrupted the action. One man shouted out “one step too fucking far mate” and another shouted “Holten out”. (Kaspar Holten is Director of Opera at the ROH.) There were a number of noisy walkouts.
The objections I have seen are as below.
The scene was too long
I don’t really see how this objection works. People have spoken of a “five-minute gang rape”. I do not think you can get to five minutes even if you include all of the events I outline above in your duration. I would put it at two minutes; perhaps three at the outside. In any case, the nudity was momentary. This means at the outset we have to decide what constitutes a depiction of rape. That is a difficult question. Naturally, there was no sex or simulated sex on stage by anyone, so a fortiori there was no sex or simulated sex involving multiple men and the woman. However, it was clearly the intention of the director to depict rape in some sense and that intention we may assume was realized, because of the intense audience reaction. I think that this intense negative reaction meant that the “rape” that was perceived by the audience was too long simply because any duration was too long to be comfortable. But if we are purely talking about seconds on the clock, then it could not have been shorter and remained what it was. (You may wish to challenge me here by noting that the scene has now been cut and shortened. Is it still what it was?) You will also need to deal with the question as to how fictional objects get their properties; see my Sherlock piece: http://www.opticon1826.com/articles/10.5334/opt.bs/
The scene was gratuitous
This objection cannot succeed; it gains its initial plausibility by appearing to be the nearby objection “the scene had a negative effect”. To make out the claim that the scene was gratuitous, you have to show that the scene had no effect. In other words, the aesthetic impact of the piece would have been identical if the scene had been eliminated. This is transparently false since the audience reaction to the scene and the reaction of others who were not there was immense. You may well feel that the aesthetic effect of the scene was undesirable, but that is not consistent with saying that its inclusion was gratuitous.
The scene was unnecessary
I can again respond similarly to what I said to counter the previous objection. In addition, I can observe that nothing is necessary. Even claims like “everything is identical to itself” are questionable under certain circumstances.
We need to protect victims of rape from depictions of rape
Was this a depiction of rape? Do we also need to protect people who have had a family member murdered from depictions of murder? There were several of those in this piece; they aroused no comment.
The inclusion of the scene condones rape
I don’t understand this objection, so if you share it, you will have to explain it to me. One question is whether or not it matters that the perpetrators of the “rape” were the villains of the piece. If this is an alleviating factor, then it would have been an aggravating one to have had the hero Tell perpetrate it. Perhaps that would have been the provocative directorial choice.
The scene was “the last straw”
This is one of the more common objections. It seems to run approximately as follows: ‘this was a terrible production full of infantile symbolism, each scene was more offensive and unimaginative than the last, the “rape” scene was one step too far’. I happened to think that the production was brave and innovative, but that is not actually relevant to the argument. The problem with this objection is that it seems to entail the following: ‘this rape scene would have been appropriate in a more traditional production, or a production I liked more.’ That seems unmotivated and hard to argue for. It seems to be caused by the phenomenon of “moral licensing,” which is not a way to stand up an objection.
I conclude that all of the objections fail and the scene was appropriate. It is therefore unfortunate that the scene has now been modified by weakening it and shortening it. We may at least note that the Director of Opera did not insist on this; in fact he apologized for the offense that seemed to have been caused and explicitly did not apologize for the production. This is right and proper; I do not want what I can see at the Royal Opera House controlled by reactionary prudes who can only stomach totally traditional productions. The changes were made by the Director; so our regret should be that a courageous and ground-breaking production team have been forced to weaken the impact of their vision.
For me, the most dismaying part of the experience was seeing the change in the countenance of Malin Bystrom, who was superb. She was quite clearly delighted by the richly deserved approbation she received in her curtain call, but was still there for the booing of the production crew. This is what I call gratuitous. In fact, I can’t see any occasion on which booing is appropriate. Walk out silently if you must, but otherwise why not just stay at home. The ROH is generally sold out; we can do without your ticket money if you think you are going to decide what is appropriate in a production.
Review Of Soteriou: The Mind’s Construction Ch1 and Ch2
“Not all aspects of mind fill time in the same way. For example, some elements of our mental lives obtain over intervals of time, others unfold over time, some continue to occur” (p. 1)
Aim is to use these as individuation criteria for mental events/states/processes, which means it will be important that they are clearly definable and do not overlap, and then use those distinctions to illuminate ‘phenomenal consciousness’
(p. 9, p. 23) Distinction between the ‘manifest image’ of the mind and the ‘scientific image’ of the mind in Sellars 1962 is a bit like the distinction between folk psychology and scientific psychology. This is unsurprising since Sellars 1956 is credited with opening up the modern ToM debate in some ways. Similar questions arise. Is the former to be superseded by the latter, or is it to provide data for the latter? In other words, is introspection a legitimate means of enquiry?
(p. 9) Soteriou distinguishes between the legitimacy of introspection/phenomenology approaches to theorising about thought and about sensory experience. The idea is that the latter area seems to be more appropriate to the introspective mode of examination, because “conscious sensory experiences” have a “sensuous character” that “is somehow manifest to one”. This seems to approach but not reach a sort of Immunity to Error argument viz. my thinking some conscious states have certain features suffices to make it the case that they do have such features, such as if it seems to me to be raining, then there is something that seems to me to be the case. [Descartes at the root of this, presumably.]
(p. 9) Concession: introspection may not get us anywhere at all with the scientific image; nor will it (p. 11) alone resolve mental ontology
1.1 Introspection, ‘diaphanous’ experience, and the relation of perceptual acquaintance
(p. 12) The step from `you can introspect the sensuous character of a conscious experience’ to `you can introspect the sensuous character of a mental state’ looks innocent but isn’t.
(p. 13) Argument: Moore and diaphaneity. If you try to introspect an experience, you just get straight to the experience: the experience of blue is just the blue not `experience of blue’. Also, experiences of blue are not themselves blue.
(p. 14) A relational model of sensory experience raises more questions than it answers: what are the relata, what is the relation and how do we know introspection is any use for either question, given the Moore problem?
(p. 15) Relational accounts led to sense data theories to account for hallucination/error
1.2 Representational content and the properties of conscious experience
(p. 18) Introspection cuts both ways in the sense data debate. Looks like there is something relational going on; contra that it looks like the relation is between us and objects in the world not internal entities. Fashion dictates the winner; sense data theories not fashionable any more.
COP [Completeness of Physics]: “All physical effects have only physical causes”
P [Physicalism]: “all entities that exist are physical entities”
COP + P look problematic for sense data – are they physical or not?
(p. 18) “thoughts are to be individuated in terms of propositional contents”
(p. 19) “sensory experiences have intentional contents with veridicality conditions” cf. Frege, thoughts. Leads to: illusions are like false beliefs. We don’t think there needs to be anything in the world to correspond to a false belief so the argument from illusion for sense data looks less appealing. [Though of course this is a bit like `the problem is so big that it isn’t a problem anymore’.]
(p. 20) Fechner, psychophysics. Wittgenstein!
1.3 The re-emergence of relational views
(p. 25) This new consensus needs a response to questions such as how much of the character of conscious experience is caused by the relatum and how much by the relation [cf. Frege again].
(p. 26) Preview of next chapter: whether there is a stream of consciousness or not will [as promised in Introduction] throw light on mental ontology and also can be investigated using the Fregean framework under which thoughts are differentiated by propositional content.
(p. 27) Consider: James `there is a stream of consciousness’ vs. Geach `there is not a stream of consciousness’
(p. 27) Mental states obtain and mental processes occur over time; even if the time taken is the same, these two unfoldings are different
2.1 The temporal profiles of thought and experience
(p. 28) Geach’s argument is basically that the stream of consciousness is seen as illusory on the line that thoughts are individuated by propositional contents, because those propositions then pass through the mind sequentially and separately. [But how do we know that this separation is not an artifact or mere consequence of the individuation criterion? Also, this looks a bit like a contest between competing introspections.]
2.2 Geach on the discontinuous character of thought
(p. 30) Geach’s argument: you can’t half have a thought; it must all be present at once. There are no transitions. Therefore you can’t have two at once — two thoughts cannot overlap. Therefore there is no stream of consciousness. Soteriou aims to look at all these steps.
(p. 31) Non-succession basically flows from the propositional content model. Saying `John is tall’ takes time but thinking it doesn’t because you haven’t thought anything unless you think the whole proposition.
(p. 31) “S can’t simply have a belief that ‘John’”. Can’t he, in a way, have that? Could it not be that a belief with the content `John exists’ could have that form? Alternatively, imagine hearing someone unknown come in, and wondering who it is, with John being the most likely option. We might express the content of your mental state as being `John?’. When you see him a second later, you know it is him. The two mental states separated by a second are 1). `John?’ and 2). `John’. Soteriou is again assuming a propositional model of thought content (which may be fine), and one that disallows propositions like `John’. Soteriou can probably say here that the account doesn’t mind what sort of propositions are allowed, as long as they can’t have duration. You still have to think the whole proposition at once if you think it at all.
(p. 32) `the pack of cards is on the table’ is not thought in order with some bit of thought corresponding to `of’. [OK, but couldn’t there also be an ordering/division like `that’ `there’? Couldn’t you get half way through thinking the pack of cards is on the table when you realise that the thing on the table is a book and the cards are on the chair…?]
(p. 32) Geach: since there is no temporal order, there are also no transitions — because even if two propositions have a shared element, then they would not share a temporal part. [Can we think more than one proposition at once? Propositions entailed by a proposition thought. Subconscious propositions?]
(p. 34) Soteriou: however, there can be transitions between mental states, which is a problem for Geach. [Soteriou will try to fix the problem and adopt a modified version of Geach’s anti-stream of consciousness line. Is this consistent with Soteriou’s later commitment to a stream of sensory consciousness…?]
2.3 The ontology of the stream of consciousness
(p. 34) O’Shaughnessy: it is the necessity of flux that distinguishes the flow of the stream of consciousness, not just the flux itself, so experiences are not mental states
(p. 35) O’Shaughnessy: a mental state is like knowing that 9 + 5 = 14; it obtains
(p. 37) What distinguishes the cognitive from the sensory is not their properties but how they fill time [So that isn’t a property or reducible to one?]
2.4 Representational content and the ontology of experience
(p. 39) If over “t1–t5 S underwent an experience with the content ‘That F is G’, it would be a mistake to think that from t1 to t2 S underwent a conscious experience with the content that ‘That F’, and over the interval of time t3–t5 S underwent a conscious experience with the content that ‘is G’”. This is a restatement of the modified Geach anti-stream of consciousness line espoused by Soteriou. [The claim looks phenomenologically plausible. But does it still work if the t1 to t3 etc. time-slices become extremely small, of the order of nanoseconds? Soteriou handles this by saying that even so, the parts of the experience cannot be reduced to parts of the proposition.]
(p. 42) “the representational content of conscious sensory experience type-individuates a perceptual state of the subject”
2.5 Representational content and phenomenal character
[Qualia, or what it is like to be in a mental state, need to be accounted for. Since Soteriou is not going with a stream of consciousness approach, failure of such an account of qualia to be apt for inclusion in mental flow is no disqualification. Soteriou will now go on in 2.6 to outline the proposal he flagged in the introduction: we can categorise mental ontology by looking at the temporal underpinnings of phenomenal character.]
2.6 An ontological proposal: occurrence, state, and explanatory circularity
This will be Soteriou’s first outing of the major novelty in his approach.
(p. 47) The proposal: “individuate the kind of phenomenally conscious state that obtains in terms of the kind of mental event/process in virtue of whose occurrence the state obtains” — not a supervenience relation.
(p. 48) A circularity deriving from inter-dependence: “interdependent status of event/process and state introduces a certain kind of explanatory circularity” i.e. each depends on the other. [How vicious is this circle, and circles generally…? Later Soteriou will say that the circularity may not be vicious, and will perhaps use its difficulty to reinforce its plausibility by suggesting it explains the `explanatory gap’. This is clever, because it suggests that the circularity is there because reality is just like that, and we have to get on with it.]
[For Soteriou, there is a stream of sensory consciousness but it will not be made up of a stream of propositions.]
[So — a good start. Soteriou has told us what the background is, what he is assuming, and where he wants to get to.]
That looked like a decent case, but what struck me more about the five well-chosen quotes is that they really show that Holmes is very well aware of the problem of Confirmation Bias. This is prevalent everywhere in everyone and completely bedevils our reasoning abilities. Given that this is very modern psychology, it is remarkable that Holmes was on to it so quickly.
I will proceed as follows. I will give you the quotes; I will tell you what Confirmation Bias is; I will show how the quotes show that Holmes is aware of the problem, and I will close with some brief remarks as to why Confirmation Bias is a problem.
Quotes from Sherlock
Here are the quotes; again courtesy of the Umbel blog.
1. “There is nothing like first-hand evidence.”
2. “The world is full of obvious things which nobody by any chance ever observes.”
3. “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
4. “I never guess. It is a shocking habit,—destructive to the logical faculty.”
5. “‘Data! Data! Data!’ he cried impatiently. ‘I can’t make bricks without clay.’”
What is Confirmation Bias?
First cut: Confirmation Bias is the tendency to confirm what you already believe.
This of course is the enemy of good hypothesis formation. You should instead attempt to falsify what you believe. That is the only way of proving anything, because attempting to prove what you already believe just gives you an endless series of facts which are consistent with your hypothesis. You can have an infinite series of consistent observations but that proves nothing; whereas a single disconfirmatory observation disproves the hypothesis!
Given this asymmetry in the power of potential observations, it is remarkable that few people ever look where they ought to. Of course, one reason for that is that if you falsify a hypothesis you already hold, you will have to track through the ramifications of that for your whole belief structure. If, for instance, you find out that the man in the hat is not Moriarty, you will have to discard a large number of other beliefs. If you saw the man in the hat at the station, you now have to believe that the man at the station was not Moriarty, and so on, with potentially significant consequences for your picture of the world. This takes time and energy so people don’t want to do it.
Confirmation Bias comes in three main forms: a) not looking for disconfirmatory evidence; b) ignoring disconfirmatory evidence if it is pressed upon one; c) discounting disconfirmatory evidence.
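The asymmetry described above, between any number of consistent observations and a single disconfirmatory one, can be put in a few lines of Python. This is only a rough sketch: the swan hypothesis and the names `find_counterexample` and `all_swans_white` are my own illustrative assumptions, not anything from Holmes or the Umbel blog.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def find_counterexample(
    hypothesis: Callable[[T], bool], observations: Iterable[T]
) -> Optional[T]:
    """Return the first observation that contradicts the hypothesis, or None."""
    for observation in observations:
        if not hypothesis(observation):
            return observation
    return None


# Hypothetical universal claim, chosen purely for illustration:
# "every swan is white".
def all_swans_white(swan_colour: str) -> bool:
    return swan_colour == "white"


# Ten thousand consistent observations: no counterexample is found,
# but the hypothesis is still only "not yet disproved", never proved.
confirming = ["white"] * 10_000
print(find_counterexample(all_swans_white, confirming))

# One disconfirmatory observation is enough to disprove it.
print(find_counterexample(all_swans_white, confirming + ["black"]))
```

Note that the function can only ever report a failure. A result of `None` means that nothing has contradicted the hypothesis so far, which is exactly the position of someone who has only ever looked for confirmation.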
Holmes on the Case
The key is quote 3, which is basically a statement of the problem of Confirmation Bias. The facts you actually see are twisted by what you are expecting to find, and so you will then inexorably find what you were expecting. For that reason, guessing is a mistake, as Holmes points out in quote 4. A guess does not stand in a vacuum. It is formed from currently existing half-beliefs and things you are prepared or want to believe. So it is biased. Worse still, the guess becomes a hypothesis which by the twisted magic of Confirmation Bias will now find ways of becoming your truth. Holmes is right to call this a shocking abuse of logic.
Quote 2 speaks to the problem of ignoring data. Many obvious things are unremarkable merely because we have seen them so often. Take gravity. Why do we stick to the earth? Isn’t that odd? No-one thinks so, but how can it be explained? (Incidentally I object to the latest TV version having Holmes say he doesn’t know that the earth goes around the sun because it changes nothing here. We would, for example, be shocked by his failure to expose as an impostor a scientist who claimed the sun goes round the earth. So Holmes needs an excellent theory of the world in order to have the excellent Theory of Mind that he clearly enjoys.)
Quotes 1 and 5 speak to the primary importance of data, which as I have been saying must be impartially collected and not merely what makes it through after Confirmation Bias.
Why is Confirmation Bias a problem?
Think about just two things: religion and politics. Imagine that you have been trained from a young age to believe a set of random hypotheses and have then had a lifetime exercising Confirmation Bias to back up these hypotheses. Some people move on from religious fairy tales, but many do not. Also, have you noticed that most people vote the way their parents did? They seem to know *without listening* that everything that the other political party says is wrong. This sort of factor gives you the political polarisation currently visible in America and elsewhere.
This is not a good thing and Holmes is right to warn us strongly against it. Beware Confirmation Bias!