Self-monitoring: a review of current literature

The purpose of this paper is to review the current literature and ask which model best describes speech monitoring during dialogue, and self-monitoring in particular.

The brief for this critical review was simple: to ascertain the extent to which self-monitoring is similar to the speech comprehension carried out during dialogue. In other words, do you simply listen to yourself, detecting errors as you would when listening to an interlocutor?

To answer this brief we must first look at the models currently used to explain monitoring behaviour. The paper therefore gives some background on the research area as a whole, followed by a synopsis of each proposed model.

The paper will then review recent relevant literature, highlighting arguments for and against each of the proposed models, in an attempt to identify the best-fitting model and, more importantly, to answer the brief: is self-monitoring the same as language or speech comprehension?

Before any such review can take place, some background on the subject area is needed.

Arising from the study of language processing, particularly in dialogue, is the concept of self-monitoring or speech monitoring. Self-monitoring is the process of checking the content of one’s own speech output against one’s own intention, to detect any deficits which would cause confusion or miscommunication.

Spontaneous speech, conversation or chitchat: whatever its label, it contains errors, mistakes and slips of the tongue. To self-monitor is to check one’s own speech on-line for any departure from intention. If a speech error is detected, the speaker can interrupt or hesitate in order to rectify the mistake, that is, to ‘self-repair’.

We all know that we make mistakes when we speak, and we all know that we correct ourselves, but how? Most people will intuitively answer, “by listening to what we say”, i.e., by using our speech comprehension system just as we would when listening to another person making a mistake during conversation. The question is thus: is self-monitoring just like listening to yourself talk in order to detect errors? The answer appears to be both yes and no. Yes, error detection can succeed through speech comprehension (audition); and no, research has shown that some self-repairs occur too fast to be the result of auditory monitoring.

The paper begins to answer the question of how far self-monitoring resembles listening to others speaking by outlining the different models used to explain self-monitoring.

Recent reviews and evaluations have centred on three models, each attempting to explain the process of speech checking. The three models are a perception-based approach, a production-based approach and a node structure approach (Postma 2000).

As this paper deals mainly with the question of comprehension, the perception-based account warrants the most attention and remains the focus of this paper. The other two models are outlined alongside the criticisms that count against them as explanations of self-monitoring behaviour.

The name most abundant in the self-monitoring literature is that of Levelt. Levelt (1983, 1989) proposed a theory of speech monitoring based on feedback via the perceptual, or speech comprehension, system. He argued that speech is monitored via feedback ‘loops’ by a central monitor atop the hierarchy of the speech production system, a component he termed “the conceptualizer”. Levelt’s loops, however, are not only post-articulatory, as with auditory feedback, but also pre-articulatory; he termed these feedback channels “the outer loop” and “the inner loop” respectively.

Levelt’s “perceptual loop theory” (Levelt 1983, 1989) allows for monitoring at different levels, both pre- and post-articulatory, and argues for a central, attentional, resource-limited monitor which monitors the end products of the speech production sequences.

His theory was thus that you can monitor your own output by audition via the outer loop (post-articulatory), but also before articulation by parsing your ‘inner speech’ via the inner loop. Levelt also included a further loop, termed the conceptual loop; for the purposes of this paper, however, only the inner and outer loops are considered.

In contrast, the production-based account (Laver 1973, 1980; Schlenck, Huber & Willmes 1987) assumes multiple local, autonomous devices which can monitor the formulation components of speech production, not just its end products. In addition, these monitors may feed back information from speech motor execution, e.g., efferent, tactile and proprioceptive feedback. Schlenck et al. theorised that you could check your output automatically at each subcomponent of the language production sequence, using information from your own physiology.

The third theory proposed to explain self-monitoring behaviour is “Node Structure Theory” (MacKay 1987, 1992a, 1992b). “Node structure theory views error detection as a natural outflow of the activation patterns in the node system for speech production. Errors result in prolonged activation of uncommitted nodes, which in turn may incite error awareness” (Postma 2000).

It is not within the scope of this paper to review or assess each of these models individually, but rather to comment on recent reviews within the relevant literature. Suffice it to say that production-based monitoring theory and node structure theory found little support in Postma’s (2000) review of speech monitoring models. Postma (2000) reports “...Small but significant reductions in self-repair rates under dual task conditions clearly counter the idea that monitoring is a completely autonomous, self contained process, as proposed by both approaches [production and node structure approaches]...”, concluding, “...As such, the present evidence favours the perception theory...”.

With two of the three monitoring theories finding little support in recent literature, this paper now presents evidence from recent research in support of Levelt’s (1983, 1989) “perceptual loop theory”, in an attempt to use this model to answer the critical review question.

Central to Levelt’s theory of monitoring (1983, 1989) is the concept of loops. Our starting point in terms of supporting evidence is here.

Some of the first evidence supporting Levelt’s perceptual loop theory came from Blackmer & Mitton (1991). They reported error-to-cut-off times as short as 150 ms, too short for the error to have been detected after articulation, which supports Levelt’s idea of an inner loop.

Further support for perceptual loop theory comes from Lackner & Tuller (1979), Postma & Kolk (1992) and Postma & Noordanus (1996), who reported that white noise presented during speaking did not prevent speakers from detecting their errors. Because white noise blocks feedback through the auditory (outer) loop, this again supports the inclusion of an inner loop.

Hartsuiker & Kolk (2001) formalised Levelt’s perceptual loop theory as a computational model, concluding that the inclusion of an “inner” monitor is necessary to account for the empirical data.

Oomen & Postma (2002) lend further support to Levelt’s perceptual loop theory using a dual task paradigm. They reported that a smaller percentage of errors was detected in the dual task condition, supporting Levelt’s assumption of a resource-limited, attentional monitor.

As shown above, certain aspects of Levelt’s perceptual loop theory have been supported, putting it ahead in the race to explain monitoring. Not all of its assumptions, however, avoid criticism.

Levelt (1983, 1989) holds that both pre-articulatory and post-articulatory monitoring feed back via the speech comprehension system, as may well be the case according to Oomen & Postma (2002). However, the results of a phoneme monitoring experiment (Wheeldon & Levelt 1995) led Levelt et al. (1999) to amend perceptual loop theory to suggest that the comprehension system could also access phonological code. This access to a subcomponent of speech production does not sit well with the concept of a central monitor that accesses only the end products of speech production sequences and not individual subcomponents.

Levelt’s perceptual loop theory therefore does not entirely escape attack. The issue of subcomponent access has led recent researchers to hint at, if not suggest, a perceptual loop monitor complemented by production-based pre-articulatory monitoring devices (Postma 2000, Oomen & Postma 2002). This is not far from what Schlenck et al. proposed in 1987.

In summary, an observer of the recent literature in this field would probably lean towards a perception-based account of self-monitoring, as in Levelt’s perceptual loop theory (1983, 1989, 1999). A reader would not, however, be immune to the suggestion in the current literature that further research is required to provide a clear and conclusive model of self-monitoring behaviour. Current empirical evidence does argue for a “..Central perception-based monitor, potentially augmented with a few automatic, production-based error detection devices.” (Postma 2000).

The brief of this critical review asked, “To what extent is self-monitoring similar to listening to others speak?” This paper answers as follows. Self-monitoring is very much like listening to others speak, in that both behaviours utilise the speech comprehension system. On the other hand, research has shown that much monitoring takes place before articulation and so cannot be termed listening; in that respect self-monitoring is not similar to listening to others speak, although both do use the comprehension system.

Again, as noted above, current research suggests the possibility of production/perceptual complementarities which, if supported, would lead to a different answer to this question.

In conclusion, this paper first identified the different models which have been proposed over the last twenty years to explain the phenomenon of self-monitoring, particularly with respect to the study of dialogue. These models were given as much background as the paper’s scope could afford.

Examination of current literature reviews led this paper to argue in favour of a perception-based account of self-monitoring, as in Levelt’s (1983, 1989, 1999) perceptual loop theory.

In closing, this paper’s review of the current literature answers that self-monitoring in dialogue is very similar to listening to others speak, in that both use the speech comprehension system to detect errors via feedback loops rather than individual monitors at each level of speech production. In so saying, however, this paper wishes to make clear that it accepts the possibility of production/perceptual complementarities in pre-articulatory monitoring.

