Steven Morris 
 
The Influence of Working Memory on 
L2 Oral Fluency: An Exploratory Study 
 
Abstract 
The goal of this exploratory study is to investigate the effect of working memory 
capacity on L2 oral fluency in 79 learners of English as a foreign language. Three tasks 
were used as measures of working memory (the reading span task, letter span task and 
an attention-switching task). Twelve measures of fluency were used spanning across 
speed, breakdown and repair fluency. Positive correlations were found with measures of 
repair fluency, specifically morphosyntactic, differency, and other repairs whereas 
negative correlations were found for lexical repairs. When participants were divided 
into groups based on proficiency, potential relationships were found between working 
memory and speed/breakdown fluency suggesting the possible existence of proficiency 
thresholds affecting the relationship between working memory and fluency. The results 
are discussed in light of previous research and De Bot's (1992) model of L2 speech 
production. 
 
MA Thesis Applied Linguistics, Universitat de Barcelona 
Steven Morris 
Supervisor: Roger Gilabert 
 
1. INTRODUCTION 
 
   The cognitive revolution in Psychology has had a huge impact on the field of applied  
linguistics. The exploration of how individual differences in cognitive abilities can  
affect second language (L2) attainment and performance is gaining momentum as  
important predictors of L2 acquisition are being found (Sawyer & Ranta, 2001). The  
goal of this paper is to investigate the role of working memory (WM) capacity on L2  
oral fluency. The introduction outlines the main concepts and existing research into the  
area. Subsequently, the study itself will be described and the results will be presented  
and discussed in relation to the study's research questions, previous studies and future  
directions in this field. 
1.1. What is L2 fluency? 
   In order to measure something correctly it is necessary to first define what exactly it is  
that is being measured. The way in which fluency is measured depends on how it is  
conceptualised and defined. There exists a plethora of definitions and measures of  
fluency. People may use it to refer to global L2 proficiency, to the ability to speak publicly  
in an L2 or to the ability to express thoughts and opinions quickly (Fillmore, 1979). The term is also  
associated with the metaphorical movement-like aspects of spoken language such as  
how well language flows or how fluid it is (Segalowitz, 2010). These are considered  
broad, qualitative definitions of fluency (Kormos, 2006). However, it is the narrow  
sense of fluency which the present study will focus on. 
   Narrow definitions regard fluency as just one aspect of performance (Kormos, 2006).  
Lennon (1990) states that fluent speech is speech in which the listener receives the  
impression that the underlying psycholinguistic processes responsible are working  
effortlessly. This definition alludes to fluency as a listener's impression of efficiency in  
terms of a speaker's cognitive processes. Schmidt (1992) stated that fluency also relates  
to the cognitive processes themselves and that fluent speech emanates from automaticity  
and proceduralization. This led to Lennon re-defining fluency as "rapid, smooth,  
accurate, lucid and efficient translation of thought or communicative intention into  
language under the temporal constraints of on-line processing" (2000: 26).  
   Segalowitz (2010) outlined a new framework for the conceptualization of fluency  
comprising three types of fluency: cognitive, utterance and perceived fluency (see figure 1).  
This three-way distinction advises that the three types be thought of separately. Previous  
conceptions of L2 fluency have been imprecise and do not reflect the multi-dimensional  
nature of fluency as a construct. The three definitions provided by Segalowitz will be  
described below.  
   Cognitive fluency refers to the extent to which the underlying cognitive processes  
which mediate L2 speech are working efficiently and rapidly. Furthermore, it refers to  
the co-ordination and integration of the processes by the speaker and the extent to which  
there exists internal interference or crosstalk between them. It is believed that high  
levels of efficiency and speed paired with low levels of interference promote greater  
fluency. These mechanisms responsible for speech are described in section 1.2.  
   Utterance fluency refers to the measurable features of speech which indicate fluency.  
Features used to measure utterance fluency include speed, hesitation and repair  
phenomena. There are many potential features of fluency which can be found in speech  
and a number of ways to operationalize each one (reviewed in section 1.3,  
Measuring fluency).  
   Perceived fluency refers to an impression received by the listener of a speaker's  
cognitive fluency, a judgement of their cognitive processes based on the utterance  
fluency of their speech. Of course, as this is an "on-line" impression the listener may  
take into account different factors when judging how features of the utterance affect  
their fluency.  
   The current study will be focusing on and measuring utterance fluency. With fluency  
defined, this review turns to how individual differences in oral fluency can be explained  
using speech production models. 
1.2. Models of L2 speech and how fluency is manifest 
   De Bot's 1992 adaptation of Levelt's 1989 blueprint of first language (L1) speech  
production is widely cited by L2 researchers as an accurate model of the bilingual  
speaker (Segalowitz, 2010). The model outlines the act of speaking step by step. De  
Bot's model (1992) is presented below along with the 7 points at which fluency can be  
affected according to Segalowitz.  
   To begin with, the model assumes that there are three separate encoding modules: the  
conceptualizer, the formulator and the articulator (Kormos, 2006). Furthermore, there  
are three knowledge stores: encyclopaedic knowledge, the mental lexicon and the  
syllabary (Kormos, 2006). Speech begins with the conceptualizer and with  
macroplanning, where the speaker plans what is to be said. During macroplanning, the  
speaker uses information from their encyclopaedic knowledge of the outside world and  
knowledge of their interlocutor's thoughts and beliefs. Here, speech register and  
language choice are also selected. The knowledge used for macroplanning is not  
language-specific and as such does not affect L2 fluency. The next step is  
microplanning which is the preparation of speech and the process which leads to a  
preverbal message. While still being based on concepts, microplanning is much more  
specific than macroplanning because it only deals with concepts which can be  
lexicalized. The output of microplanning is the preverbal message. This is a conceptual  
code which can be, but has not yet been, lexicalized. The preverbal message takes into  
account the speaker's own perspective of the event. Microplanning is the first potential  
source of L2 dysfluency because in forming the preverbal message the speaker must  
use only the concepts which they know they can express in the L2. L2 knowledge  
limitations will therefore slow this process down and thus lead to dysfluency, even  
though formation of the preverbal message is not language-specific.  
   Grammatical encoding, or the formulator, is the next step. Here the preverbal message  
is encoded into appropriate words; that is, it is given linguistic shape. Grammatical encoding  
determines not only which words are to be used but also how they relate to each other to express the  
perspective of the speaker. This process requires the use of information from another  
knowledge store, the mental lexicon. The mental lexicon contains representations of all  
the different lemmas known in a language and their meanings. Syntactic information  
relating to the lemma is also called upon when it is selected and these syntactic  
characteristics are also stored in the mental lexicon. Here the second source of potential  
dysfluency is encountered, where the speaker may face difficulties in using or retrieving  
information from the mental lexicon because the system may be incomplete in the L2. 
   Grammatical encoding leads to formation of the surface structure, a verbalized plan.  
This must then be converted into overt or "real" speech in the articulator. It begins with  
morpho-phonological encoding which utilises morpho-phonological codes. These are  
stored together with lemmas in the mental lexicon. Again, difficulties in retrieval of  
information from the mental lexicon or when this retrieval is not automatized can lead  
to dysfluency in L2 speech (the fourth such point). The product of morpho-phonological  
encoding is the phonological score which is used for overt speech but firstly needs to be  
converted into an articulatory score. The articulatory score gives the articulatory system  
(speech apparatus) specific instructions in order to produce overt speech. Here,  
information is taken from the syllabary. The syllabary is a separate knowledge source  
which contains information on how to produce the specific motor responses which  
produce certain speech sounds, known as gestural scores. Here lies the fifth area for  
potential L2 dysfluency, occurring if the speaker does not select gestural scores  
automatically. Overt speech is the result of execution of the gestural scores relating to  
the articulatory score. Here there is also potential for dysfluency if the speaker does not  
carry out articulation of gestural scores automatically (the sixth such point). 
   Monitoring is where the speaker analyses what has been produced in the speech  
process. Kormos (2006) states that monitoring occurs in L2 speech just as it does in L1  
speech. She outlines three monitor loops. Firstly, after the pre-verbal message is  
produced, the speaker will compare it to the original concept produced during  
macroplanning. The second loop is the comparison of the phonological score with the  
speaker's intentions. Finally, overt, parsed speech is monitored against a speaker's  
original intentions. This monitoring system is the seventh and final point for potential  
dysfluency. Kormos (2006) proposed the bilingual speech production model which  
differs in some ways from the Levelt/de Bot model. For example, it is proposed that there  
is just one large memory store, long-term memory (LTM), which consists of sub- 
components analogous to the knowledge stores proposed by Levelt/de Bot but which  
includes an L2-specific store for declarative rules relating to L2 syntax and phonology. 
   The notion of automaticity in the underlying cognitive processes which carry out  
speech is a persistent factor mentioned above in explaining L2 oral fluency. According  
to Segalowitz (2010), automaticity is a difficult notion to define and is often left poorly  
operationalized, but in general it refers to a process which is functioning at a high  
level of efficiency. Unfortunately, explanations and definitions of the nature of  
automaticity are sparse and the area is under-researched. One theory is that automaticity  
can be defined as ballistic processing, the idea that once the specific process has begun  
it cannot be stopped (Favreau & Segalowitz, 1983). 
   Here it is worth noting that a key goal of this study is to measure participants'  
utterance fluency through analysing the speech produced. Utterance fluency will relate  
to the performance and automaticity of the underlying system at the seven potential  
"points of fluency vulnerability" and can be considered as being closely linked with  
cognitive fluency. L2 fluency is probably affected by many variables, which is why the  
dynamic systems approach described below may be extremely important to  
understanding L2 fluency. 
   At the beginning of the text it was suggested that in order to measure something it  
must firstly be defined. The current study defines fluency in the narrow sense, as the  
efficiency, speed and degree of automaticity of the underlying cognitive processes  
responsible for L2 speech found in de Bot's speech production model. It will take  
Segalowitz's (2010) idea of utterance fluency as a working definition of and way of  
measuring fluency. Now that fluency has been defined, measures of  
it can be identified. 
1.3. Measuring fluency 
   Many issues arise once researchers begin to consider how to assess L2 fluency.  
Firstly, which quantifiable measures can be used to measure fluency? Secondly, how do  
different measures tap into different aspects of fluency? Finally, can measures of  
fluency be indicative of other aspects of the L2 such as global proficiency? These issues  
will be analysed in order below. 
   Many measurable and quantifiable features of oral production have been used which  
are thought to indicate a speaker's L2 fluency (specifically, utterance fluency). Kormos  
(2006) provides an overview of common utterance fluency measures which have been  
used in the literature, including: speech rate (number of syllables/amount of time taken  
including pauses), which can be pruned (repetitions, self-corrections and false starts  
removed) or unpruned (all utterances included); articulation rate (same as speech rate  
but excluding pause time); phonation-time ratio (seconds spent speaking/total time);  
mean length of runs (number of syllables in utterances between pauses of over 0.25  
seconds); number of silent pauses over 0.2 seconds per minute; mean length of pauses over  
0.2 seconds; filled pauses per minute (e.g. uhms and errs); dysfluencies per minute  
(such as repetitions, restarts and repairs); pace (stressed words/minute); and space  
(stressed words/total number of words). Whilst extensive, this list is by no means  
exhaustive.  
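   Several of the measures listed above reduce to simple ratios once a recording has been segmented. The following is a minimal sketch only, not the procedure used in this study: it assumes speech has already been divided into stretches with a syllable count, a speaking time and the silent pause that follows, that rates are expressed per minute, and that total time is speaking time plus all silence. The names Segment, utterance_fluency and mean_length_of_run are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One stretch of speech plus the silent pause that follows it (hypothetical format)."""
    syllables: int        # syllables produced in the stretch
    speaking_s: float     # time spent speaking, in seconds
    pause_after_s: float  # silence after the stretch, in seconds


def utterance_fluency(segments: List[Segment], pause_threshold_s: float = 0.2) -> Dict[str, float]:
    """Speed and breakdown measures as defined in the list above (rates per minute)."""
    syllables = sum(s.syllables for s in segments)
    speaking = sum(s.speaking_s for s in segments)
    total = speaking + sum(s.pause_after_s for s in segments)
    # Only pauses longer than the threshold count as silent pauses.
    pauses = [s.pause_after_s for s in segments if s.pause_after_s > pause_threshold_s]
    return {
        "speech_rate": syllables / total * 60,           # pause time included
        "articulation_rate": syllables / speaking * 60,  # pause time excluded
        "phonation_time_ratio": speaking / total,
        "silent_pauses_per_min": len(pauses) / total * 60,
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
    }


def mean_length_of_run(segments: List[Segment], run_threshold_s: float = 0.25) -> float:
    """Mean number of syllables between pauses longer than run_threshold_s."""
    runs, current = [], 0
    for s in segments:
        current += s.syllables
        if s.pause_after_s > run_threshold_s:  # a long pause closes the run
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return sum(runs) / len(runs)


# Invented data: three stretches of speech.
sample = [Segment(10, 3.0, 0.5), Segment(6, 2.0, 0.1), Segment(8, 2.4, 1.0)]
measures = utterance_fluency(sample)
```

   Pruned speech rate would use the same formula after repetitions, self-corrections and false starts have been removed from the syllable count.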
   It has been suggested by Skehan (2009) that fluency measures should be classified  
into 3 categories (see figure 1) and that different measures indicate performance for  
these different aspects of fluency. Skehan distinguishes between breakdown  
(dys)fluency (referring to pausing behaviour), repair (dys)fluency (referring to restarts,  
repetitions and repairs) and speed fluency (for example, speech rate). He also points out  
that there are higher-order measures which can indicate the level of automatization  
within L2 speech such as mean length of run. Skehan believes that in studies of L2  
fluency all of these categories of fluency should be considered; doing so is one of the goals of the  
current study. 
 
Figure 1: Fluency types and types of fluency measures 
 
   3 types of fluency (Segalowitz, 2010): cognitive fluency, utterance fluency, perceived fluency 
 
   3 categories of fluency measure (Skehan, 2009): speed fluency, repair fluency, breakdown fluency 
 
   It has been shown that some common fluency measures can be indicative of  
global proficiency as well as oral fluency. Iwashita et al. (2008) compared L2 oral  
fluency (measured using 6 measures of L2 fluency) with the global proficiency ratings  
given to participants by raters. Oral productions were elicited using 5 oral production  
tasks. They found that unfilled pauses, total pause  
time and speech rate had a significant impact on global proficiency ratings but filled  
pauses, repair and mean length of run did not.  
   To summarise the above it can be seen that there are broad and narrow definitions of  
fluency, explanations of differences in L2 oral fluency based on L2 speech models and  
many methods of measuring oral fluency. One of the main goals of the current study is  
to measure the utterance fluency of participants' L2 speech. This will be carried out  
using a broad range of fluency measures spanning Skehan's 3 categories:  
breakdown, repair and speed fluency. With fluency thoroughly reviewed, the second  
key part of the current study is now described: working memory, a potential  
predictor of L2 oral fluency. 
1.4. Working memory 
   Working memory (WM) is the name given to the system in the brain which carries out  
the temporary storage of information and the dynamic processing and manipulation of  
this information. It is essential for the completion of higher-order cognitive tasks such  
as reasoning, learning and language comprehension (Baddeley, 2000). WM is seen to be  
at the very centre of cognition and all mental processing (Mota, 2003). The original  
model of WM proposed by Baddeley and Hitch (1974, cited in Baddeley, 2000)  
continues to be a widely accepted model of WM that integrates the limited capacity  
information storage aspect of previous models (e.g. Atkinson & Shiffrin's (1968)  
model of short-term memory) with dynamic processing (Juffs & Harrington, 2011).  
Since the first model was proposed in 1974 it has been subject to a number of revisions.  
Here the model as revised by Baddeley (2000) is outlined.  
   WM is a multi-component system. The central executive is responsible for attentional  
control and it coordinates two subsidiary slave systems. The central executive is the  
most important and central component of WM. Its subsidiary systems are the visuo- 
spatial sketchpad which holds visual and spatial information and the phonological loop  
which holds verbal and acoustic information (thus they are modality-specific). The  
phonological loop holds chunks of information for a few seconds. This information can  
be refreshed using the articulatory rehearsal process. This is the sub-vocal repetition of  
information used for example when trying to remember a phone number. This refreshes  
information and avoids decay (Kormos & Sáfár, 2008). The original 1974 model of  
WM consists of these three components. Subsequently, the episodic buffer has been  
integrated into the model (Baddeley, 2000). The buffer is assumed to be another slave  
component controlled by the central executive but which is not modality-specific. It  
temporarily stores information from the other subsidiary components and serves as an  
intermediary between them and long-term memory (LTM). It binds information from  
WM to episodic LTM. Episodic LTM refers to conscious, declarative memories related  
to specific events (Baddeley, 2001). The introduction of the episodic buffer helps  
explain how WM can represent and store modality-specific information and how this  
information can be bound to LTM, something which was lacking from the initial model  
(Baddeley, 2000). As mentioned in the opening paragraph of this study, individual  
differences in cognitive abilities such as WM and phonological short-term memory can affect L2 acquisition and  
performance. 
1.5. Individual differences and measures of phonological short-term memory/working  
memory and how they relate to language learning 
   The phonological loop has perhaps been the most extensively researched component  
of Baddeley and Hitch's model of WM (Kormos & Sáfár, 2008). As it is a component  
of WM it is important to this study. Phonological short-term memory (PSTM) capacity  
has been shown to have a strong relationship with various aspects of language learning,  
for example: L1 lexical acquisition (Gathercole et al., 1999); L2 lexical access and  
language aptitude (Kormos & Sáfár, 2008); and function word and subordinate clause  
usage (O'Brien et al., 2006). Simple span tasks such as the letter span task and the word  
span task are designed to measure the storage capacity of PSTM (Engle et al., 1999). 
   WM involves not only the storage of information but also the dynamic processing of  
information central to many cognitive processes (Baddeley, 2000). Complex tasks  
which tap not only storage but processing as well can be used to measure individual  
differences in WM. The storage of to-be-remembered (TBR) items is interspersed with  
some kind of mental operation. A number of complex span tasks including the reading  
span task (where participants are given various sentences and instructed to say if the  
sentence makes sense and then remember the final word) have been shown to be valid  
and reliable measures of WM along with the counting span, operation span and  
backward digit span tasks (Conway et al., 2005; Kormos & Trebits, 2008). WM and  
PSTM have been shown to be distinct constructs which are highly related to each other  
(Engle et al., 1999). Essential to memory testing is that tests are administered in the  
participants' L1 to remove L2 proficiency as a confound (Gilabert & Muñoz, 2010).  
   Individual differences in WM capacity have also been shown to be influential in many  
aspects of L2 learning. Examples include: overall L2 proficiency and the 3 major skills of  
reading, listening and speaking (Kormos & Sáfár, 2008); acquisition of morpho-syntax  
(French & O'Brien, 2008); and lexical complexity (Gilabert & Muñoz, 2010). 
   Mackey et al. (2010) suggest that increased WM capacity affords language learners  
sufficient cognitive resources to reflect on their output whilst speaking. They found  
that increased WM capacity was significantly correlated with modified output in L2  
speech, i.e. reformulations and restarts. Inevitably, WM has been considered to be one of  
the main predictors of language learning success (Juffs & Harrington, 2011).  
Correlations have been found between WM and language learning aptitude (Kormos &  
Sáfár, 2008; Hummel, 2009) and between WM and L2 proficiency (Hummel, 2009).  
   In summary, it has been shown that there are many measures of PSTM and WM and  
that they are related to many aspects of L2 learning (see Juffs & Harrington, 2011 for a  
complete review). However, it is oral production and specifically L2 fluency that will be  
the focus of attention in this study.  
1.6. L2 fluency and working memory 
   A number of studies have investigated a possible link between working memory and  
L2 oral fluency. Research within the area is scarce and no substantial body of evidence  
has accumulated for or against this link. Moreover, much of the research is limited by  
methodological problems.  
   Gilabert and Muñoz (2010) examined the influence of WM capacity on proficiency  
and performance in foreign language acquisition. It was found that WM capacity and L2  
fluency were significantly and positively correlated, although the correlation was fairly  
modest (r = .231). When the sample was split into high-proficiency and low-proficiency  
groups this correlation disappeared although this may have been because of decreased  
group size. WM capacity was measured using a reading span task. L2 fluency was  
measured by calculating unpruned speech rate (syllables per minute including all the  
utterances such as repetitions, self-corrections and false starts). Two limitations of this  
study are that analyses were correlational (thus non-inferential) and fluency  
measurement was limited to just speed fluency. 
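   The caveat about reduced group size can be made concrete. The following is a minimal sketch, assuming the usual t test for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)). The sample sizes used (79 and 40) are purely illustrative assumptions and are not the group sizes reported by Gilabert and Muñoz (2010); the function name is hypothetical.

```python
from math import sqrt


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r against zero with n participants."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Same coefficient, two hypothetical sample sizes.
t_full = t_for_correlation(0.231, 79)   # illustrative full sample
t_split = t_for_correlation(0.231, 40)  # illustrative subgroup of about half the size

# The two-tailed .05 critical value of t is roughly 2 at these degrees of freedom,
# so the same r can be significant in the larger sample and not in the smaller one.
print(round(t_full, 2), round(t_split, 2))
```

   Under these assumptions a correlation of this size sits close to the significance boundary, so halving the sample is enough to make it non-significant even if the underlying relationship is unchanged.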
   Mota (2003) too found some moderate correlations between WM capacity (as  
measured by the speaking span test) and L2 oral fluency using measures of speed  
fluency, filled/unfilled pauses, and mean length of run for 2 narratives. Filled and  
unfilled pauses however were not significantly correlated with WM. Using a simple  
linear regression analysis, WM capacity was found to account for 53%, 52% and 49%  
of the variance in unpruned speech rate, pruned speech rate and mean length of run respectively. This  
suggests that there is a link between WM and fluency. However, this study has severe  
limitations in that it used a very small sample (n=13) and proficiency in English was  
regarded as being homogeneous by the author (advanced) without it being measured and  
was not considered a potential covariate. Moreover, the use of the speaking span test as  
a valid measure of WM capacity is questionable as the task requires spoken production  
of sentences which have to be generated by the participants themselves. This is essential  
to test performance and therefore creativity is bound to be a confounding variable. Given  
that the test itself was administered in the L2, English L2 proficiency will also be a  
confound. 
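   The percentages reported by Mota (2003) can be related back to correlation coefficients. In a simple linear regression with one predictor, the proportion of variance accounted for (R squared) equals the square of the Pearson correlation, so 53% corresponds to a correlation of about .73 in absolute value. The sketch below checks this identity on invented data; the scores are hypothetical and are not taken from any of the studies reviewed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared(x, y):
    """Proportion of variance in y explained by a least-squares line on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Invented scores: WM span and speech rate for seven hypothetical speakers.
wm_span = [2, 3, 3, 4, 5, 5, 6]
speech_rate = [90, 110, 100, 130, 150, 140, 170]

r = pearson_r(wm_span, speech_rate)
assert abs(r ** 2 - r_squared(wm_span, speech_rate)) < 1e-9  # R squared equals r squared

implied_r = sqrt(0.53)  # correlation implied by 53% of variance explained
```

   Seen this way, the regression result restates a strong correlation rather than adding independent evidence, which is why the small sample size remains the main concern.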
   Mizera (2006) found scant evidence for a relationship between WM capacity and L2  
oral fluency. WM was measured using three measures: the speaking span test, the math  
span test and the non-word repetition test. A wide range of fluency measures was  
used, covering speed, breakdown, repetitions and morphosyntactic accuracy.  
Morphosyntactic accuracy is a measure normally associated with accuracy, not fluency;  
it was selected on the basis that it correlated with holistic judgements of fluency by  
raters in the first part of the study.  
Of the thirty-five correlations only three were significant and the correlations found  
were very weak. The same pattern was found when only low-proficiency participants  
were analysed. The study measured many aspects of fluency, which is rarely  
done and can be considered a major strength. However, limitations of the study are  
that the sample size used was fairly small (n=44), the speaking span test was used (see  
criticisms of Mota, 2003) and only non-inferential correlational analyses were  
performed. 
   Weissheimer and Mota (2009) found no overall correlations between working  
memory and L2 fluency. When the sample was split into a high and low group based on  
WM score (measured using a speaking span test) a significant correlation was found for  
the lower-WM group but not for the higher-WM group. This suggests that the effect of  
WM on fluency may differ depending on the capacity of WM itself. However, the higher-span  
group was very small (n=8) and was heterogeneous when compared to the lower-WM  
group, which may have resulted in a non-significant correlation. Moreover, the complete  
sample used was small (n=32). Again, the study only operationalized fluency in terms  
of speed (pruned and unpruned speech rate), used the speaking span test in the  
participants' L2, and used correlational analysis only.  
   Kormos and Trebits (2011) found that WM capacity as measured by the backward  
digit span task had no effect on oral fluency in two narrative tasks. Using a MANOVA  
test the study compared the fluency of four groups with differing WM capacities. The  
study however had many limitations: only one measure of fluency was used (unpruned  
speech rate), only one measure of WM was used, and the sample was limited in size and  
homogeneous in terms of proficiency. The possibility that WM may affect fluency  
depending on the proficiency level of the individual is an area the current study aims to  
explore. A positive aspect of the study is the use of the MANOVA as an inferential  
statistical tool where many of the above studies rely solely on correlational analysis.  
1.7. How might working memory affect fluency? 
   At this point, it is important to provide an explanation as to how L2 fluency may be  
influenced by WM. Current explanations are rooted in de Bot's model of the L2 speaker  
process. There is the potential for greater or lesser fluency at seven points in de Bot's  
model (see section 1.2) according to Segalowitz (2010). Higher WM capacity could  
increase the extent to which L2 speech processes are automatized and  
proceduralized. Higher automaticity and proceduralization at these seven points will  
mean that the system relies less on WM itself during the formulation of speech (Mota,  
2003) which would lead to greater speed and efficiency in the speech process (that  
which Segalowitz describes as cognitive fluency). This speed and efficiency would be  
manifest as greater utterance fluency - that which is measured.    
   A second theory relates to Sawyer and Ranta's (2001) suggestion that higher WM  
capacity allows more WM attentional resources to be freed up. This could also mean  
that throughout the learning process more cognitive resources can be dedicated to the  
learning of speech strategies and the rules related to production (Weissheimer & Mota,  
2009). The above ideas are hypotheses and are speculative in nature. 
1.8. Objectives and research questions 
   The current study will look further into this relationship between WM and L2 oral  
fluency. The majority of existing research has used just one or two measures of fluency  
(usually speech rate, an all-round measure of fluency) or has used questionable  
measures of WM capacity. These are methodological issues which the current study  
aims to overcome. The objectives of this study are: (a) to use a wide range of fluency  
measures (twelve in total) spanning the three categories outlined by Skehan  
(2009): speed, breakdown and repair fluency; (b) to use a wide range of valid measures  
of WM: the reading span task, the letter span task, and the attention-switching task; (c)  
to use inferential statistical analysis as well as correlational analysis; and (d) to sub-divide  
participants based on L2 proficiency and WM capacity in order to further explore the  
complexities of any relationship between L2 fluency and WM capacity. The study is  
exploratory in nature and will produce a large amount of data which can be used to  
guide further research into this specific area. 
 
   The research questions are the following: 
 
1) Does working memory capacity as measured by the reading span task, the letter  
span task and the attention-switching task affect L2 oral fluency? 
2) Does L2 proficiency affect the relationship between working memory and  
fluency? 
3) Does the relationship between working memory and fluency differ between 
measures of breakdown fluency, speed fluency and repair fluency?  
 
2. METHOD 
 
2.1. Participants 
    Seventy-nine participants took part in this study. Their mean age was 21.2. They were taken  
from a pool of participants from the Age, Input and Aptitude in foreign language  
learners project (GRAL group, University of Barcelona). They were mainly L1  
Spanish/Catalan speakers (89%; all other L1s 11%). Their proficiency in English  
ranged between intermediate and upper-intermediate. 
2.2. Tests and Tasks 
   L2 English proficiency was measured using the Oxford Placement Test (OPT), a  
standardized general proficiency test (mean = 44.7, standard deviation = 6.4, both to  
1 d.p.). 
   The automated reading span task, as developed by the GRAL group (in Spanish  
or Catalan), was used to test both reading span (RS) and letter span (LS). The test has been shown to have  
high internal consistency (Cronbach's alpha .755 (RS) and .787 (LS)). The LS task is  
considered a simple memory task. It is a test of the information storage aspect of  
working memory without requiring dynamic information processing. The reading span  
task (RS) is a complex memory task. This type of task taps into both the storage and the  
dynamic information manipulation aspects of WM. It can be expected that these tests  
will correlate weakly as they measure some of the same components of WM (storage)  
but not all (information processing). 
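   For readers unfamiliar with the internal consistency statistic quoted above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data are invented and the function name is hypothetical; this is not the GRAL group's scoring code, and how the span tests were split into items is not stated here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: one list per item, each holding that item's score
    for every participant (same participant order in every list).
    """
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]  # total score per participant
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented example: two items answered by four participants.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

   Values approaching 1 indicate that the items rank participants consistently, which is the sense in which the reported values of .755 and .787 are described as high.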
   In the automated span task participants have to take three practice tests followed by  
the actual experiment. In the first test participants are asked to memorize series of  
between 3 and 9 letters. The scores obtained by means of this practice task constitute  
Steven Morris 
 
the ?letter span score?. The second test asks participants to rate sentences as either  
making sense or not. The test also serves the purpose of automatically measuring the  
participants? mean reaction time in order present sentences in accordance with that  
mean (i.e. participants with longer reactions times during the practice test will observe  
the sentences on the screen for longer than participants with lower reactions times  
during the actual experiment). In the third practice test, participants are asked to practice  
rating sentences and memorizing letters simultaneously. This is followed by the actual  
experiment. It does this by giving participants to-be-remembered items and concurrently  
a distractor task requiring information processing (i.e. to decide whether the sentences  
make sense or not). 81 sentences in Spanish or Catalan were used here which were  
between 12-15 words long. They were organized into 15 sets ranging from 3 to 9  
sentences. Participants were a) given a sentence and asked to say if it made sense or not  
and b) asked to remember a letter given to them at the end. The final score is  
automatically calculated by the test and it factors in accuracy, order of the recalled  
letters and reaction times.  
   The attention-switching (AS) task (often referred to as the Trail Making Test) is a
measure of attentional control and requires use of the executive control component of
WM, but not information storage. It is therefore expected to correlate only weakly with
the RS and not at all with the LS. The task consisted of two timed parts. In the first
part, participants joined up consecutive numbers (from 1 to 25) which were scrambled
across a sheet of paper. The second part involved joining up numbers and letters (from
1 to 13 and from A to L), alternating between the two, i.e. A-1-B-2-C-3 etc., which were
again scrambled. Errors were pointed out by the experimenter and the time taken to
correct them was included. The time taken to complete the first part was taken as a
baseline and subtracted from the time taken to complete the second part. Both parts
require visual search, visual perceptual ability and motor skills, but only the second
requires task shifting and working memory. The remaining time was therefore considered
a measure of executive control functioning: the "time cost" of switching between
letters and numbers (Arbuthnott and Frank, 2000).
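   The subtraction described above can be sketched in a few lines of Python. This is
only an illustration of the arithmetic, not the scoring procedure used in the study;
the function name and the example times are hypothetical.

```python
def switch_cost(part_a_seconds, part_b_seconds):
    """Return the attention-switching 'time cost' in seconds.

    part_a_seconds: time to join the numbers only (the baseline).
    part_b_seconds: time to join alternating numbers and letters,
                    including any time spent correcting errors.
    """
    if part_a_seconds < 0 or part_b_seconds < 0:
        raise ValueError("completion times must be non-negative")
    # Visual search and motor demands are shared by both parts, so the
    # difference is taken to reflect the cost of switching alone.
    return part_b_seconds - part_a_seconds


# Hypothetical participant: 30.5 s on part one, 52.0 s on part two.
example_cost = switch_cost(30.5, 52.0)
```

   A negative value is possible when a participant completes the second part faster
than the first, which is consistent with the negative minimum reported for this task in
Appendix 1.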
2.3. L2 speech elicitation 
   The participants were given a film-retelling task in order to elicit speech in the L2,
English. Participants were asked to view two short excerpts from the Charlie Chaplin
film "Modern Times", each twice. The full excerpt lasted 8 minutes 30 seconds and can
be found at the following link (available July 2012):
http://www.youtube.com/watch?v=5njTU7a3__g between 0:50 and 9:20.
   After seeing the first half of the excerpt twice, participants were asked, "Can you
tell me what happened in the first part of the story?" After the second half they were
asked, "Can you tell me what happened in the second part of the story?" Responses were
all in English apart from asides in L1 Spanish/Catalan. They were recorded using a
microphone and a sound recorder. The length of the combined narratives varied from 49
to 564 seconds (mean = 273.7 s, s.d. = 106.7 s). The narratives were transcribed using
Computerized Language Analysis (CLAN) software, following the conventions and
principles of the CHAT transcription system. The transcriptions were then analysed
using measures of oral fluency.
2.4. Measures of fluency 
   Twelve measures of fluency were used. As all of these measures are sensitive to  
length of speech sample, they were divided by total speech time (seconds) and  
multiplied by 60 to give results per minute. 
   Speed fluency was measured using two speech rates: unpruned speech rate (total
number of syllables/minute; reported as Rate A) and pruned speech rate (total number of
syllables produced excluding repetitions, repairs and restarts/minute; reported as
Rate B), both taken from Mora and Valls-Ferrer (in press).
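   The per-minute normalisation and the two speech rates described above can be
sketched as follows. This is a minimal illustration rather than the procedure used in
the study, which worked from CLAN transcriptions; the function names and the example
counts are hypothetical.

```python
def per_minute(count, speech_time_seconds):
    """Convert a raw count into a rate per minute of speech."""
    if speech_time_seconds <= 0:
        raise ValueError("speech time must be positive")
    return count / speech_time_seconds * 60


def speech_rates(total_syllables, dysfluent_syllables, speech_time_seconds):
    """Return (unpruned, pruned) speech rate in syllables per minute.

    dysfluent_syllables: syllables belonging to repetitions, repairs
    and restarts, which are excluded from the pruned rate.
    """
    unpruned = per_minute(total_syllables, speech_time_seconds)
    pruned = per_minute(total_syllables - dysfluent_syllables,
                        speech_time_seconds)
    return unpruned, pruned


# Hypothetical narrative: 600 syllables in 240 s, 60 of them dysfluent.
example_unpruned, example_pruned = speech_rates(600, 60, 240)
```

   The same `per_minute` helper would apply to any of the count-based measures, such
as repairs or filled pauses.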
   Repair fluency was measured using seven measures (taken from Mora and Valls-Ferrer,
in press). Dysfluency ratio (dysfluencies/minute) was used as a global measure of
repair fluency. Dysfluencies were classified as repetitions (where a word or group of
words is immediately repeated by the speaker), repairs (where the speaker corrects
themselves) or restarts (where the speaker leaves a sentence unfinished and begins a
new one). Repetitions/minute, repairs/minute and restarts/minute were also calculated
as measures of repair fluency.
   Repairs were sub-classified into three categories (as in Gilabert, 2007): lexical
repairs (where the repaired word is judged to be incorrect at the lexical level, e.g.
her → him), morphosyntactic repairs (where the repaired word is part of a syntactic
structure, e.g. works → worked), and other repairs. Other repairs include appropriacy
repairs (where inappropriate or inadequate information is judged to have been supplied,
e.g. there is a man → a gentleman) and different repairs (where the original speech
plan is changed and different information is encoded).
   Breakdown fluency was measured using filled pauses/minute (such as "umm" or
"erm"). Filled pauses were sub-classified as intra-clausal filled pauses/minute (pauses
within clauses) and inter-clausal filled pauses/minute (pauses at the boundary between
two clauses). A clause was operationally defined as a grammatical unit consisting of a
verb plus any dependent elements linked to it. Thus, a subject, verb, object and
adverbial would together count as a single clause (as in Mora and Valls-Ferrer, in
press).
 
3. RESULTS 
 
   Descriptive statistics for the group as a whole can be found in Appendix #1. 
   The three measures of WM were correlated with each other in order to gain insight
into the extent to which they measure the same ability. The reading span (RS) and
letter span (LS) tasks showed a weak, positive correlation (r = .257, p = .020).
Neither the RS and attention-switching (AS) tasks nor the LS and AS tasks correlated
significantly. This supports the assumption that each test measures a different
ability, with some overlap between the RS and LS tasks (in line with Engle et al.,
1999).
   The first research question, whether working memory capacity affects L2 oral
fluency, was addressed first. The relationship between fluency and WM was explored
using partial correlations in which the effect of L2 proficiency was removed as an
extraneous variable (i.e. OPT scores were controlled for). These correlations can be
found in Table 1.
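   For readers unfamiliar with partial correlation, the idea of removing the effect of
a third variable can be sketched with the standard first-order formula, which combines
the three pairwise Pearson correlations. This is an illustrative sketch only; the
analyses in this study were not run with this code, and the function names and the toy
data in the comments are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return s_xy / sqrt(s_xx * s_yy)


def partial_correlation(x, y, z):
    """Correlation between x and y with the linear effect of z removed.

    For example: x = WM score, y = a fluency measure, z = OPT score.
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

   When the controlled variable is unrelated to both of the others, the partial
correlation equals the ordinary correlation; the more strongly it is related to both,
the more the two values diverge.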
   As can be seen, WM capacity as measured by the RS and LS tasks correlated
positively, albeit weakly, with morphosyntactic repairs, and the same was found for
other repairs and performance on the LS task. The correlation between dysfluencies/min
and the LS task was nearly significant. These results suggest that those with higher
WM capacity tend to make more morphosyntactic and other types of repair. Overall,
there are few significant relationships between fluency and the WM measures used here.
   In order to explore research question 1 further, the participants were divided for
each WM test into two groups, one higher-capacity and one lower-capacity, to see
whether closer analysis of the data would produce more findings. Partial correlations,
again controlling for OPT scores, were run for the lower-capacity and higher-capacity
groups (see Table 2).
   The groups were first split using k-means clustering (RS: lower-capacity N=31,
higher-capacity N=48; LS: lower-capacity N=60, higher-capacity N=5; AS: lower-capacity
N=65, higher-capacity N=11). For LS and AS, some of the groups created were too small.
A median split was therefore used for these two tasks, with the median value taken as
the cut-off point (LS: lower-capacity N=31, higher-capacity N=30; AS: lower-capacity
N=37, higher-capacity N=37).
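   A median split of the kind described above can be sketched as follows. The sketch
is illustrative only: the participant IDs and scores are invented, and the treatment
of scores that fall exactly on the median (here assigned to the lower group) is an
assumption, since it is not specified in this study.

```python
from statistics import median


def median_split(scores):
    """Split participants into lower and higher groups at the median.

    scores: dict mapping a participant ID to a WM task score.
    Assumption: a score equal to the median goes to the lower group.
    Returns (cutoff, lower_ids, higher_ids).
    """
    if not scores:
        raise ValueError("no scores supplied")
    cutoff = median(scores.values())
    lower = sorted(pid for pid, score in scores.items() if score <= cutoff)
    higher = sorted(pid for pid, score in scores.items() if score > cutoff)
    return cutoff, lower, higher


# Hypothetical letter span scores for six participants.
example_scores = {"p01": 42, "p02": 58, "p03": 66,
                  "p04": 71, "p05": 80, "p06": 93}
cutoff, lower_group, higher_group = median_split(example_scores)
```

   Unlike k-means clustering, which places the boundary where the scores naturally
separate, the median split always yields groups of roughly equal size, at the price of
a cut-off that need not correspond to any real gap in the data.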
 
Table 1. Partial correlations (controlling for OPT score) of three WM measures and twelve fluency
measures

                              Reading Span       Letter Span        Attention-Switching
                              Task (N=79)        Task (N=65)        Task (N=77)
Rate A                        .035               .025               -.056
  Sig (2-tailed)              .381               .423               .316
Rate B                        .030               .008               -.084
  Sig (2-tailed)              .399               .476               .238
Fp/min                        -.025              -.011              -.019
  Sig (2-tailed)              .415               .466               .437
Inter-clausal fp/min          .037               .036               .111
  Sig (2-tailed)              .373               .390               .172
Intra-clausal fp/min          -.058              -.038              -.088
  Sig (2-tailed)              .308               .384               .227
Repairs/min                   -.030              .146               .036
  Sig (2-tailed)              .398               .125               .380
Lexical repairs/min           -.132              -.076              -.056
  Sig (2-tailed)              .124               .275               .316
Morphosyntactic repairs/min   .195*              .273*              .147
  Sig (2-tailed)              .044               .014               .104
Other repairs/min             -.010              .213*              -.053
  Sig (2-tailed)              .467               .045               .325
Repetitions/min               .097               .016               .087
  Sig (2-tailed)              .199               .451               .229
Restarts/min                  .022               -.015              .164
  Sig (2-tailed)              .423               .453               .080
Dysfluencies/min              .078               .343               .101
  Sig (2-tailed)              .248               .051               .194
*p < 0.05, **p < 0.01, bold type = approaching significance, fp=filled pauses
 
 
   
Table 2. Partial correlations (controlling for OPT score) of fluency measures and WM measures divided
into lower- and higher-capacity groups

                              Lower-     Higher-    Lower-     Higher-    Lower-     Higher-
                              capacity   capacity   capacity   capacity   capacity   capacity
                              RS group   RS group   LS group   LS group   AS group   AS group
                              (N=31)     (N=48)     (N=31)     (N=30)     (N=37)     (N=37)
Rate A                        .073       -.148      -.073      .144       .116       .168
  Sig (2-tailed)              .350       .161       .341       .212       .251       .161
Rate B                        -.025      -.109      -.104      .174       .087       .114
  Sig (2-tailed)              .447       .234       .279       .167       .307       .252
Fp/min                        .178       -.127      -.071      -.087      -.128      .041
  Sig (2-tailed)              .173       .197       .463       .315       .229       .404
Inter-clausal fp/min          .236       -.014      -.017      -.095      .041       .095
  Sig (2-tailed)              .105       .462       .463       .299       .406       .288
Intra-clausal fp/min          .100       -.155      -.085      -.065      -.190      .003
  Sig (2-tailed)              .299       .149       .317       .359       .133       .494
Repairs/min                   .310*      -.057      .186       -.278      .198       -.085
  Sig (2-tailed)              .048       .353       .146       .059       .123       .308
Lexical repairs/min           .141       -.160      .110       -.385*     .120       -.018
  Sig (2-tailed)              .228       .141       .267       .013       .242       .457
Morphosyntactic repairs/min   .280       -.016      .249       -.088      .076       -.052
  Sig (2-tailed)              .067       .459       .078       .313       .329       .379
Other repairs/min             .374*      .239       .055       .238       .144       -.077
  Sig (2-tailed)              .021       .053       .379       .091       .202       .325
Repetitions/min               .483**     .069       -.029      -.050      .143       .018
  Sig (2-tailed)              .003       .323       .435       .392       .203       .459
Restarts/min                  .281       -.336*     -.029      .157       .030       .173
  Sig (2-tailed)              .066       .011       .435       .191       .432       .153
Dysfluencies/min              .518**     .011       .026       -.083      .184       .016
  Sig (2-tailed)              .002       .470       .442       .323       .142       .464
*p<0.05, **p<0.01, bold type=significant/approaching significance, fp=filled pauses
 
 
   Beginning with RS, the lower-capacity group showed significant positive correlations
for repairs, other repairs, repetitions and dysfluencies. Morphosyntactic repairs and
restarts showed a strong trend. For the higher-capacity RS group, a significant
negative correlation was found with restarts, and other repairs showed a positive,
near-significant correlation. Together, these results seem to suggest that repair
fluency decreases (i.e. more repairs are made) as WM increases, with the exception of
restarts, which showed the opposite effect in the higher-capacity group. No other
significant results were found.
   For the lower-capacity LS group, a near-significant, positive correlation was found
between morphosyntactic repairs and LS score. In the higher-capacity group a negative,
medium-strength correlation was found for lexical repairs. This second result seems to
contradict the above findings for RS. However, the two tests measure different
abilities.
   No significant results were found for AS.
   Inferential methods of analysis were then used. The higher- and lower-capacity WM
groups were compared on each measure of fluency in order to see whether greater WM
capacity was associated with greater oral fluency (research question 1). This was
carried out using the independent samples t-test (for normally distributed fluency
measures) and the Mann-Whitney U-test (for non-normally distributed fluency measures).
Normality was assessed using the Kolmogorov-Smirnov test. The findings are displayed in
Tables 3, 4 and 5.
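   To make the non-parametric comparison more concrete, the sketch below computes the
Mann-Whitney U statistic for two groups from their pooled ranks. It is an illustration
under simplifying assumptions, not the analysis reported here: it returns only the U
statistic (a statistics package would be needed for the p-value and for the normality
check), and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (the smaller of the two U values)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must contain at least one value")
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)
```

   A U value of zero means that every score in one group is lower than every score in
the other; values close to half the product of the two group sizes indicate heavily
overlapping groups.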
 
Table 3. Mann-Whitney U-tests and independent samples t-tests comparing higher- and lower-capacity
reading span task groups (2-split) on twelve fluency measures

                              Mann-Whitney U-test    Independent samples t-test
Rate A                        .129                   X
Rate B                        X                      .352
Fp/min                        X                      .687
Inter-clausal fp/min          .514                   X
Intra-clausal fp/min          X                      .687
Repairs/min                   X                      .312
Lexical repairs/min           .491                   X
Morphosyntactic repairs/min   .115                   X
Other repairs/min             .096                   X
Repetitions/min               X                      .831
Restarts/min                  .338                   X
Dysfluencies/min              X                      .729
fp=filled pauses, X=test not used for this measure
    
   Table 3 shows that for RS scores there was no significant difference between the
higher- and lower-capacity WM groups on any of the fluency measures.
 
Table 4. Mann-Whitney U-tests and independent samples t-tests comparing higher- and lower-capacity
letter span task groups on twelve fluency measures

                              Mann-Whitney U-test    Independent samples t-test
Rate A                        .784                   X
Rate B                        X                      .759
Fp/min                        X                      .986
Inter-clausal fp/min          .901                   X
Intra-clausal fp/min          X                      .901
Repairs/min                   X                      .261
Lexical repairs/min           .971                   X
Morphosyntactic repairs/min   .021*                  X
Other repairs/min             .090                   X
Repetitions/min               X                      .573
Restarts/min                  .744                   X
Dysfluencies/min              X                      .441
*p<0.05, bold type = approaching significance, fp=filled pauses, X=test not used for this measure
 
    
   Table 4 shows that for LS scores there was a significant difference between the two
groups for morphosyntactic repairs (repair fluency). Analysis of the means shows that
the higher-capacity group tended to make more morphosyntactic repairs.
   Table 5 shows that for AS scores there was no significant difference between the
higher- and lower-capacity WM groups on any of the fluency measures.
   For the RS task, k-means clustering also produced three groups well balanced in size
(low to high: 22, 35, 22). It was thought that comparing the highest and lowest groups
would be of interest, as there would be a clear distinction between them (they were
separated by the middle-capacity group). The same inferential tests were carried out.
The results are found in Table 6. Again, there were no significant differences between
the high- and low-capacity groups.
 
 
 
Table 5. Mann-Whitney U-tests and independent samples t-tests comparing higher- and lower-capacity
attention-switching task groups on twelve fluency measures

                              Mann-Whitney U-test    Independent samples t-test
Rate A                        .320                   X
Rate B                        X                      .105
Fp/min                        X                      .990
Inter-clausal fp/min          .446                   X
Intra-clausal fp/min          X                      .687
Repairs/min                   X                      .785
Lexical repairs/min           .851                   X
Morphosyntactic repairs/min   .352                   X
Other repairs/min             .475                   X
Repetitions/min               X                      .556
Restarts/min                  .389                   X
Dysfluencies/min              X                      .530
fp=filled pauses, X=test not used for this measure
   
Table 6. Mann-Whitney U-tests and independent samples t-tests comparing higher- and lower-capacity
reading span task groups (3-split) on twelve fluency measures

                              Mann-Whitney U-test    Independent samples t-test
Rate A                        .453                   X
Rate B                        X                      .854
Fp/min                        X                      .501
Inter-clausal fp/min          .372                   X
Intra-clausal fp/min          X                      .391
Repairs/min                   X                      .854
Lexical repairs/min           .851                   X
Morphosyntactic repairs/min   .213                   X
Other repairs/min             .847                   X
Repetitions/min               X                      .292
Restarts/min                  .709                   X
Dysfluencies/min              X                      .394
fp=filled pauses, X=test not used for this measure
 
   In order to answer research question 2, whether proficiency affects the relationship
between fluency and WM, the group was split by proficiency (measured by OPT score).
K-means clustering was used to create two groups: lower-proficiency (N=53) and
higher-proficiency (N=26). Partial correlations were again carried out for all three
WM tests (controlling for OPT score). These can be found in Tables 7, 8 and 9.
   Table 7 shows that the higher-proficiency group has significant negative
correlations between RS and filled pauses and intra-clausal filled pauses, and
near-significant negative correlations with all repairs and lexical repairs. The
lower-proficiency group has significant positive correlations between RS and repairs
and morphosyntactic repairs, and a near-significant positive correlation with rate A.
This suggests that, among higher-proficiency speakers, greater WM capacity is
associated with fewer filled pauses and fewer repairs, whereas among lower-proficiency
speakers it is associated with more repairs and possibly with greater speed fluency.
   Table 8 shows that the higher-proficiency group has significant positive
correlations between LS and morphosyntactic repairs and other repairs, the opposite of
the RS results. This suggests that, at higher proficiency, a greater letter span is
associated with more morphosyntactic repairs. The lower-proficiency group showed no
significant correlations between LS and any of the fluency measures.
   Table 9 shows that the higher-proficiency group had no significant correlations
between AS and the fluency measures. For the lower-proficiency group there was a
significant positive correlation between AS and morphosyntactic repairs, and a
near-significant correlation with restarts.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Table 7. Correlations for higher- and lower-proficiency groups: RS and twelve fluency measures

                              RS higher-proficiency    RS lower-proficiency
                              group (N=53)             group (N=26)
Rate A                        -.073                    .310
  Sig (2-tailed)              .303                     .066
Rate B                        -.058                    .225
  Sig (2-tailed)              .340                     .139
Fp/min                        -.243*                   .276
  Sig (2-tailed)              .041                     .091
Inter-clausal fp/min          -.125                    .262
  Sig (2-tailed)              .188                     .103
Intra-clausal fp/min          -.245*                   .240
  Sig (2-tailed)              .040                     .124
Repairs/min                   -.215                    .341*
  Sig (2-tailed)              .063                     .048
Lexical repairs/min           -.215                    .042
  Sig (2-tailed)              .063                     .422
Morphosyntactic repairs/min   .064                     .438*
  Sig (2-tailed)              .326                     .014
Other repairs/min             -.056                    .053
  Sig (2-tailed)              .345                     .400
Repetitions/min               .086                     .125
  Sig (2-tailed)              .273                     .276
Restarts/min                  -.048                    .157
  Sig (2-tailed)              .369                     .227
Dysfluencies/min              .199                     .215
  Sig (2-tailed)              .078                     .151
*p<0.05, fp=filled pauses
 
    
 
 
 
Table 8. Correlations for higher- and lower-proficiency groups: LS and twelve fluency measures

                              LS task higher-          LS task lower-
                              proficiency group        proficiency group
                              (N=43)                   (N=22)
Rate A                        .023                     -.147
  Sig (2-tailed)              .443                     .263
Rate B                        .010                     .107
  Sig (2-tailed)              .474                     .323
Fp/min                        -.136                    .109
  Sig (2-tailed)              .195                     .319
Inter-clausal fp/min          -.060                    .086
  Sig (2-tailed)              .354                     .355
Intra-clausal fp/min          -.142                    .109
  Sig (2-tailed)              .185                     .320
Repairs/min                   .139                     .135
  Sig (2-tailed)              .190                     .279
Lexical repairs/min           -.076                    -.118
  Sig (2-tailed)              .316                     .305
Morphosyntactic repairs/min   .307*                    .217
  Sig (2-tailed)              .024                     .172
Other repairs/min             .265*                    .189
  Sig (2-tailed)              .045                     .207
Repetitions/min               .065                     -.070
  Sig (2-tailed)              .341                     .381
Restarts/min                  -.105                    -.150
  Sig (2-tailed)              .253                     .259
Dysfluencies/min              .079                     -.016
  Sig (2-tailed)              .310                     .472
*p<0.05, fp=filled pauses
 
   
 
 
 
Table 9. Correlations for higher- and lower-proficiency groups: AS and twelve fluency measures

                              AS task higher-          AS task lower-
                              proficiency group        proficiency group
                              (N=51)                   (N=25)
Rate A                        -.053                    -.136
  Sig (2-tailed)              .357                     .262
Rate B                        -.069                    -.175
  Sig (2-tailed)              .317                     .206
Fp/min                        -.036                    -.095
  Sig (2-tailed)              .401                     .329
Inter-clausal fp/min          .010                     .189
  Sig (2-tailed)              .472                     .188
Intra-clausal fp/min          -.053                    .224
  Sig (2-tailed)              .357                     .136
Repairs/min                   -.083                    .191
  Sig (2-tailed)              .284                     .185
Lexical repairs/min           -.084                    -.048
  Sig (2-tailed)              .281                     .411
Morphosyntactic repairs/min   -.103                    .432*
  Sig (2-tailed)              .237                     .017
Other repairs/min             .019                     -.209
  Sig (2-tailed)              .448                     .163
Repetitions/min               .089                     .103
  Sig (2-tailed)              .269                     .316
Restarts/min                  .075                     .333
  Sig (2-tailed)              .302                     .056
Dysfluencies/min              .061                     .170
  Sig (2-tailed)              .337                     .214
*p<0.05, fp=filled pauses
 
 
 
 
4. DISCUSSION 
 
4.1. Research questions 3 and 1 
 
   Departing from the usual order for discussing results, this section begins by
answering the third research question posed by the study, "Does the relationship
between WM and fluency differ between measures of speed, breakdown and repair
fluency?", with a view to answering research question one, "Does WM capacity affect L2
oral fluency?", at the same time.
   The study suggests that WM capacity influences the three types of fluency (as
outlined by Skehan, 2009) to different extents. To begin with, there seems to be no
relationship between WM capacity and speed fluency (with one potential exception,
elaborated on in section 4.2, as it appears to be proficiency dependent). A more
concrete yet modest relationship was found between WM capacity and breakdown fluency,
specifically filled pauses. This may also be linked to proficiency and is discussed in
section 4.2. By far the strongest link between WM capacity and fluency was found for
repair fluency. The results suggest that learners with higher WM capacity tend to make
more morphosyntactic repairs, appropriacy repairs, different repairs and restarts.
This relationship was weak and was found in some of the correlational analyses and
comparisons of means.
   How could this be explained? A speaker's output is constantly monitored and
compared to what the speaker knows (the interlanguage system) by means of a feedback
loop (Kormos, 2006). When there is incongruence between speech and the speaker's
interlanguage system, i.e. a mistake has been recognised or something has not been
expressed as desired, the speaker will normally stop either to reformulate (in the case
of morphosyntactic repairs) or to reconceptualise (in the case of appropriacy repairs,
different repairs and restarts). All these processes require conscious effort. For
example, a morphosyntactic repair will usually require the adaptation of a word, e.g.
work → works, finded → found. This requires dynamic processing of information stored in
WM, attentional resources (in order to notice the incongruence) and memory resources
(to hold all the elements of the conversation whilst the information is being
manipulated). This is a complex process in which attentional and information
processing resources (in other words, WM resources) are stretched. Those with higher
WM capacity are therefore more likely to have resources "left over" (since speaking in
an L2 is a demanding task in itself) to make these conscious, effortful repairs. This
is in line with Sawyer and Ranta (2001), who suggested that higher WM capacity allows
more attentional resources to be freed up for L2 speech processing.
   Bearing the above in mind, it is now appropriate to turn to lexical repairs, for
which the opposite effect was found: higher-capacity WM participants made fewer lexical
repairs. It is suggested that lexical repairs may be qualitatively different from the
other types of repair. Lexical repairs also require WM resources in terms of storage
and attention to feedback. However, it is suggested here that dynamic processing is not
part of the process; the speaker must instead return to the mental lexicon to search
for a new lemma which better represents the preverbal message. It is hypothesised that
this requires fewer WM resources and that it is therefore not only higher WM capacity
learners who can carry it out. But how can the fact that higher-WM learners made fewer
lexical repairs be explained? The process of lexical access may be more automatized in
the higher-capacity group, which is in line with Segalowitz's (2010) idea that lexical
access is one point of "dysfluency vulnerability". On this account, lexical access is
more efficient in higher WM capacity learners, and lemmas that would be noticed as
incongruent are less likely to be selected in the first place. But if higher WM
capacity speakers have an advantage in lexical access, why do they not also have an
advantage in forming correct morphosyntactic structures? Their interlanguage system
may still be incomplete or underspecified: higher WM does not guarantee high
proficiency; it simply makes more resources available to deal with morphosyntactic
errors when they arise. Of course, an unrepaired utterance is not necessarily
"correct", as the error may not have been noticed. These are speculative hypotheses
and certainly warrant further research.
   These findings open up some interesting avenues. Schmidt's noticing hypothesis
(1993) posits that the features of a language will not be learned unless they have been
noticed. Repairs, being an overt indication of noticing one's own mistakes, are thus
essential for the restructuring of an interlanguage and hence for learning. This is
especially true for a learner using explicit, conscious learning mechanisms.
Understanding the different processes involved in repair behaviours, how they are
mediated by individual differences and how they relate to learning is therefore an
exciting area for future research.
   Whilst restarts were included with morphosyntactic and "other" repairs, their
relationship with WM was found to be modest and much more complex. In the
lower-capacity RS group there was a trend for a positive correlation and in the
higher-capacity RS group there was a significant negative correlation. This suggests
the existence of a non-linear relationship in which some kind of WM threshold marks a
change in the effect of WM on restarting behaviours. The statistical methods used here
were not appropriate for investigating non-linear relationships, but this finding
suggests that the relationship may be more complex than was imagined.
   To conclude this section, we return to research question one. It seems that WM
capacity does affect L2 fluency, but only some aspects of it.
4.2. Research question 2 
   Research question two asked whether proficiency had an effect on the relationship  
between WM capacity and fluency.  
   For speed fluency, there was a trend for a positive correlation between WM and
speech rate (speed of speech) for lower-proficiency learners. Although only a trend was
found here, it may be that WM makes a difference only at a lower level of proficiency.
Past a certain proficiency threshold, WM may no longer discriminate and all speakers
may speak at comparable speeds.
   For breakdown fluency, a weak, negative correlation was found between filled pauses
and WM for higher-proficiency learners, suggesting that WM increases the speed of
lexical access: these higher-WM, higher-proficiency learners were not forced to pause
to search for a word. This argument is supported by the fact that the pattern was
found for intra-clausal filled pauses and not for inter-clausal pauses. Inter-clausal
pauses are frequent amongst native speakers as they have a non-linguistic purpose: they
allow time for conceptualisation (Tavakoli, 2011). Intra-clausal pauses, however,
relate to lexical access and are much more frequent in non-native speakers, reflecting
a delay in the retrieval of the appropriate lemma. This delay seems to decrease as WM
capacity increases. Why might this relationship exist only in higher-proficiency
learners? WM capacity may only discriminate once a certain proficiency level is
reached. As vocabulary size increases, the speaker's interlanguage system becomes more
complex, as do their utterances. It might be that only at this point does WM start to
play a role. An important limitation of the study must be stated here: only filled
pauses were measured and pause duration was not taken into account. What is reported
is only part of the total picture, and learners with a tendency to pause silently are
not represented here.
   The positive relationships reported in section 4.1 between WM measures and
morphosyntactic and other repairs were also found when participants were divided by
proficiency. Significant results were spread across the higher-proficiency and
lower-proficiency groups with no clear pattern. It therefore seems that if proficiency
does affect the relationship between WM and repair fluency, it does so through a
complex process whose intricacies were not detected in this study. Such an effect
seems unlikely, however: Kormos (1999) found that self-repair frequency was unaffected
by proficiency and that only the type of repair made was affected.
4.3. Points of discussion and limitations 
   In order to probe the research questions more deeply, participants were split into
two or three groups based on task performance and proficiency scores, using either
k-means clustering or a median split. This could be considered a limitation of the
study for three reasons. Firstly, in the case of the median split, the cut-off could
be considered arbitrary. Secondly, there is no reason to believe that the participants
constitute a representative range of the attributes measured, e.g. it is unlikely that
the higher-proficiency group would count as high-proficiency in comparison to
standardised norms. Finally, the power of the statistical analyses was considerably
weakened by using fewer participants. However, the splitting of the groups can be
justified by the insightful findings which resulted from it. Given the exploratory
nature of the study and the embryonic status of this particular research area, this
was essential. Moreover, on splitting the groups the possible existence of non-linear
relationships and thresholds arose, as in the case of restarts.
   De Bot, Lowie and Verspoor (2007) describe a dynamic systems approach to language
which states that any speaking situation that is thought to be a measure of competence
will have a relationship to an individual's performance. This can be related to
fluency, where a dynamic relationship can reasonably be assumed between fluency and the
context of a speech act: they affect each other. Segalowitz (2010) proposed that L2
fluency is influenced by three interacting components: motivation to communicate, the
larger social context, and perceptual and cognitive experiences relating to fluency.
This is one example of the numerous interesting avenues of investigation in relation to
L2 fluency and shows that the foci of research into fluency should not be limited to
cognitive variables such as WM.
   In conclusion, it seems that WM capacity affects repair fluency and the type of
repair which a person makes. The difference between lexical repairs and the other types
of repair measured here may reflect a distinction between the underlying processes
mediating these repairs. Future research should further investigate repair frequency
and type in relation to WM capacity, as repairs are so important to L2 learning: they
are indicative of attention to form, interlanguage restructuring and progress.
     
 
APPENDIX 
 
Appendix 1 - A table showing the descriptive statistics for the participants

Measure                          N   Minimum  Maximum  Mean      Std. Deviation
Rate A                           79  99.17    298.52   154.9101  34.99874
Rate B                           79  86.78    256.23   138.6015  33.49441
Filled pauses/min                79  0.50     14.24    5.0749    2.75451
Inter-clausal filled pauses/min  79  0.00     6.67     1.6507    1.22267
Intra-clausal filled pauses/min  79  0.00     8.97     3.4241    1.97459
Repairs/min                      79  0.00     5.84     1.7993    1.03393
Lexical repairs/min              79  0.00     4.54     0.9636    0.80468
Morphosyntactic repairs/min      79  0.00     2.70     0.6106    0.56704
Other repairs/min                79  0.00     1.53     0.2460    0.35529
Repetitions/min                  79  0.00     18.37    5.2190    3.40399
Restarts/min                     79  0.00     1.82     0.3913    0.45280
Dysfluencies/min                 79  0.27     20.82    7.4095    3.97473
Reading span task                79  7.00     75.00    35.9620   16.83589
Letter span task                 65  5.00     93.00    66.9692   15.86365
Attention span task              76  -5.48    63.94    16.6897   15.16139
Oxford Proficiency Test          79  28.00    56.00    44.7089   6.40343
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
REFERENCES 
 
Arbuthnott, K. and Frank, J. (2000). Trail making test, part B as a measure of executive 
control: Validation using a set-switching paradigm. Journal of Clinical and 
Experimental Neuropsychology, 22, 518-528. 
 
Atkinson, R.C. and Shiffrin, R.M. (1968). Human memory: A proposed system and its
control processes. In K.W. Spence and J.T. Spence (Eds.), The psychology of learning and
motivation (Volume 2) (pp. 89-195). New York: Academic Press.
 
Baddeley, A. (2000). The episodic buffer: A new component of working memory? 
Trends in Cognitive Sciences, 4, 417-422. 
 
Baddeley, A. (2001). The concept of episodic memory. Philosophical Transactions of
the Royal Society of London B: Biological Sciences, 356(1413), 1345-1350.
 
De Bot, K. (1992). A bilingual production model: Levelt's 'speaking' model adapted.
Applied Linguistics, 13, 1-24.
 
Conway, A.R.A., Kane, M.J., Bunting, M.F., Hambrick, D.Z., Wilhelm, O. and Engle,
R.W. (2005). Working memory span tasks: A methodological review and user's guide.
Psychonomic Bulletin and Review, 12, 769-786.
 
Dörnyei, Z. and Kormos, J. (1998). Problem-solving mechanisms in L2 communication.
Studies in Second Language Acquisition, 20, 349-385.
 
Engle, R.W., Laughlin, J.E., Tuholski, S.W. and Conway, A.R.A. (1999). Working
memory, short-term memory, and general fluid intelligence: A latent-variable approach.
Journal of Experimental Psychology: General, 128, 309-331.
 
Iwashita, N., Brown, A., McNamara, T. and O'Hagan, S. (2008). Assessed levels of
second language speaking proficiency: How distinct? Applied Linguistics, 29, 24-49.
 
Juffs, A. and Harrington, M. (2011). Aspects of working memory in L2 learning.
Language Teaching, 44, 137-166.
 
Favreau, M. and Segalowitz, N. (1983). Automatic and controlled processes in the first
and second language reading of fluent bilinguals. Memory & Cognition, 11, 565-574.
 
Fillmore, C.J. (1979). On fluency. In D. Kempler and W.S.Y. Wang (Eds.), Individual
differences in language ability and language behaviour (pp. 85-102). New York:
Academic Press.
 
French, L.M. and O'Brien, I. (2008). Phonological memory and children's second
language grammar learning. Applied Linguistics, 29, 463-487.
 
Gathercole, S.E., Service, E., Hitch, G.J., Adams, A. and Martin, A.J. (1999).
Phonological short-term memory and vocabulary development: Further evidence on
the nature of the relationship. Applied Cognitive Psychology, 13, 65-77.
 
 
Gilabert, R. (2007). The simultaneous manipulation of task complexity along planning and
+/- Here-and-Now: Effects on L2 oral production. In M.P. García-Mayo (Ed.),
Investigating tasks in formal language learning. Clevedon: Multilingual Matters.
 
Gilabert, R. and Muñoz, C. (2010). Differences in attainment and performance in a
foreign language: The role of working memory. International Journal of English
Studies, 10, 19-42.
 
Hummel, K.M. (2009). Aptitude, phonological memory and second language 
proficiency in non-novice adult learners. Applied Linguistics, 30, 225-249. 
 
Kormos, J. (1999). Monitoring and self-repair in L2. Language Learning, 49(2),
303-342.
 
Kormos, J. (2006). Speech production and second language acquisition. Mahwah NJ: 
Lawrence Erlbaum. 
 
Kormos, J. and Denes, M. (2004). Exploring measures and perceptions of fluency in the 
speech of second language learners. System, 32, 145-164. 
 
Kormos, J. and Sáfár, A. (2008). Phonological short-term memory and foreign language
performance in intensive language learning. Bilingualism: Language and Cognition, 11,
261-271.

Kormos, J. and Trebits, A. (2011). Working memory capacity and narrative task
performance. In P. Robinson (Ed.), Second language task complexity (pp. 267-289).
Amsterdam: John Benjamins.

Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach. Language
Learning, 40, 387-417.
 
Lennon, P. (2000). The lexical element in spoken second language fluency. In H.
Riggenbach (Ed.), Perspectives on fluency (pp. 25-42). Ann Arbor: University of
Michigan Press.

Mackey, A., Adams, R., Stafford, C. and Winke, P. (2010). Exploring the relationship
between modified output and working memory capacity. Language Learning, 60,
501-533.
 
Mizera, G.J. (2006). Working memory and L2 oral fluency. PhD Dissertation. 
University of Pittsburgh. 
 
Mora, J.C. and Valls-Ferrer, M. (in press). Oral fluency, accuracy and complexity in
formal instruction and study abroad learning contexts. TESOL Quarterly. doi:
10.1002/tesq.034
 
Mota, M.B. (2003). Working memory capacity and fluency, accuracy, complexity, and 
lexical density in L2 speech production. Fragmentos, 24, 69-104. 
 
 
O'Brien, I., Segalowitz, N., Collentine, J. and Freed, B. (2006). Phonological memory
and lexical, narrative, and grammatical skills in second language oral production by
adult learners. Applied Psycholinguistics, 27, 377-402.
 
Sajavaara, K. (1987). Second language speech production: Factors affecting fluency. In
H.W. Dechert and M. Raupach (Eds.), Psycholinguistic models of production (pp.
137-174). Norwood, NJ: Ablex.
 
Sawyer, M. and Ranta, L. (2001). Aptitude, individual differences and instructional
design. In P. Robinson (Ed.), Cognition and second language instruction (pp. 319-353).
Cambridge: Cambridge University Press.
 
Schmidt, R. (1992). Psychological mechanisms underlying second language fluency. 
Studies in Second Language Acquisition, 14, 357-385. 
 
Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of 
Applied Linguistics, 13, 206-226. 
 
Segalowitz, N. (2010). The cognitive bases of second language fluency. New York:
Routledge.
 
Tavakoli, P. (2011). Pausing patterns: Differences between L2 learners and native
speakers. ELT Journal, 65, 71-79.
 
Weissheimer, J. and Mota, M.B. (2009). Individual differences in working memory 
capacity and the development of L2 speech production. Issues in Applied Linguistics, 
17, 93-112.