• Aspects of fluency across assessed levels of speaking proficiency

      Tavakoli, Parvaneh; Nakatsuhara, Fumiyo; Hunter, Ann-Marie (Wiley, 2020-01-25)
      Recent research in second language acquisition suggests that a number of speed, breakdown, repair and composite measures reliably assess fluency and predict proficiency. However, there is little research evidence to indicate which measures best characterize fluency at each assessed level of proficiency, and which can consistently distinguish one level from the next. This study investigated fluency in the performances of 32 speakers on four tasks of the British Council’s Aptis Speaking test, performances that had been awarded four different levels of proficiency (CEFR A2-C1). Using PRAAT, the performances were analysed for various aspects of utterance fluency across different levels of proficiency. The results suggest that speed and composite measures consistently distinguish fluency from the lowest to upper-intermediate levels (A2-B2), and that many breakdown measures differentiate between the lowest level (A2) and the rest of the proficiency groups, with a few differentiating between lower (A2, B1) and higher levels (B2, C1). The varied use of repair measures at different levels suggests that a more complex process is at play. The findings imply that a detailed micro-analysis of fluency offers a more reliable understanding of the construct and its relationship with assessment of proficiency.
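      As a rough illustration of the measure types named above, the sketch below computes one speed measure (syllables per minute) and one breakdown measure (silent pauses per minute) from pause-annotated timing data. The function names, the 0.25-second pause threshold and the toy input are assumptions for illustration only, not details of the study’s actual PRAAT analysis.

```python
# Minimal sketch of two common utterance-fluency measures, assuming
# syllable counts and silent-pause durations are already available.
# The 0.25 s threshold and the toy data are assumptions, not values
# taken from the study summarised above.

def speech_rate(n_syllables: int, total_time_s: float) -> float:
    """Speed measure: syllables produced per minute of speaking time."""
    return n_syllables / (total_time_s / 60)

def silent_pause_frequency(pause_durations_s, total_time_s: float,
                           threshold_s: float = 0.25) -> float:
    """Breakdown measure: silent pauses at or above the threshold, per minute."""
    long_pauses = [p for p in pause_durations_s if p >= threshold_s]
    return len(long_pauses) / (total_time_s / 60)

# Hypothetical 60-second response with 150 syllables and five silent pauses:
pauses = [0.4, 0.1, 0.8, 0.3, 0.2]
print(speech_rate(150, 60.0))                # 150.0 syllables per minute
print(silent_pause_frequency(pauses, 60.0))  # 3.0 pauses per minute
```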
    • A comparison of holistic, analytic, and part marking models in speaking assessment

      Khabbazbashi, Nahal; Galaczi, Evelina D. (SAGE, 2020-01-24)
      This mixed methods study examined holistic, analytic, and part marking models (MMs) in terms of their measurement properties and impact on candidate CEFR classifications in a semi-direct online speaking test. Speaking performances of 240 candidates were first marked holistically and by part (phase 1). On the basis of phase 1 findings – which suggested stronger measurement properties for the part MM – phase 2 focused on a comparison of part and analytic MMs. Speaking performances of 400 candidates were rated analytically and by part during that phase. Raters provided open comments on their marking experiences. Results suggested a significant impact of MM; approximately 30% and 50% of candidates in phases 1 and 2 respectively were awarded different (adjacent) CEFR levels depending on the choice of MM used to assign scores. There was a trend of higher CEFR levels with the holistic MM and lower CEFR levels with the part MM. While strong correlations were found between all pairings of MMs, further analyses revealed important differences. The part MM was shown to display superior measurement qualities particularly in allowing raters to make finer distinctions between different speaking ability levels. These findings have implications for the scoring validity of speaking tests.
    • Scaling and scheming: the highs and lows of scoring writing

      Green, Anthony; University of Bedfordshire (2019-12-04)
    • Research and practice in assessing academic English: the case of IELTS

      Taylor, Lynda; Saville, N. (Cambridge University Press, 2019-12-01)
      Test developers need to demonstrate that they have premised their measurement tools on a sound theoretical framework which guides their coverage of appropriate language ability constructs in the tests they offer to the public. This is essential for supporting claims about the validity and usefulness of the scores generated by the test. This volume describes differing approaches to understanding academic reading ability that have emerged in recent decades and goes on to develop an empirically grounded framework for validating tests of academic reading ability. The framework is then applied to the IELTS Academic reading module to investigate a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors demonstrate how a systematic understanding and application of the framework and its components can help test developers to operationalise their tests so as to fulfil the validity requirements for an academic reading test. The book provides: an up-to-date review of the relevant literature on assessing academic reading; a clear and detailed specification of the construct of academic reading; an evaluation of what constitutes an adequate representation of the construct of academic reading for assessment purposes; and a consideration of the nature of academic reading in a digital age and its implications for assessment research and test development. The volume is a rich source of information on all aspects of testing academic reading ability. Examination boards and other institutions who need to validate their own academic reading tests in a systematic and coherent manner, or who wish to develop new instruments for measuring academic reading, will find it of interest, as will researchers and graduate students in the field of language assessment, and those teachers preparing students for IELTS (and similar tests) or involved in English for Academic Purposes programmes.
    • Reflecting on the past, embracing the future

      Hamp-Lyons, Liz; University of Bedfordshire (Elsevier, 2019-10-14)
      In the Call for Papers for this anniversary volume of Assessing Writing, the Editors described the goal as “to trace the evolution of ideas, questions, and concerns that are key to our field, to explain their relevance in the present, and to look forward by exploring how these might be addressed in the future” and they asked me to contribute my thoughts. As the Editor of Assessing Writing between 2002 and 2017—a fifteen-year period—I realised from the outset that this was a very ambitious goal, one that no single paper could accomplish. Nevertheless, it seemed to me an opportunity to reflect on my own experiences as Editor, and through some of those experiences, offer a small insight into what this journal has done (and not done) to contribute to the debate about the “ideas, questions and concerns”; but also, to suggest some areas that would benefit from more questioning and thinking in the future. Despite the challenges of the task, I am very grateful to current Editors Martin East and David Slomp for the opportunity to reflect on these 25 years and to view them, in part, through the lens provided by the five articles appearing in this anniversary volume.
    • Developing tools for learning oriented assessment of interactional competence: bridging theory and practice

      May, Lyn; Nakatsuhara, Fumiyo; Lam, Daniel M. K.; Galaczi, Evelina D. (SAGE Publications, 2019-10-01)
      In this paper we report on a project in which we developed tools to support the classroom assessment of learners’ interactional competence (IC) and provide learning-oriented feedback in the context of preparation for a high-stakes face-to-face speaking test. Six trained examiners provided stimulated verbal reports (n=72) on 12 paired interactions, focusing on interactional features of candidates’ performance. We thematically analyzed the verbal reports to inform a draft checklist and materials, which were then trialled by four experienced teachers. Informed by both data sources, the final product comprised (a) a detailed IC checklist with nine main categories and over 50 sub-categories, with an accompanying detailed description of each area and feedback to learners, which teachers can adapt to suit their teaching and testing contexts, and (b) a concise IC checklist with four categories and bite-sized feedback for real-time classroom assessment. IC, a key aspect of face-to-face communication, is under-researched and under-explored in second/foreign language teaching, learning, and assessment contexts. This in-depth treatment of it, therefore, stands to contribute to learning contexts through raising teachers’ and learners’ awareness of micro-level features of the construct, and to assessment contexts through developing a more comprehensive understanding of the construct.
    • Towards a model of multi-dimensional performance of C1 level speakers assessed in the Aptis Speaking Test

      Nakatsuhara, Fumiyo; Tavakoli, Parvaneh; Awwad, Anas; British Council; University of Bedfordshire; University of Reading; Isra University, Jordan (British Council, 2019-09-14)
      This is a peer-reviewed online research report in the British Council Validation Series (https://www.britishcouncil.org/exam/aptis/research/publications/validation). The current study draws on the findings of Tavakoli, Nakatsuhara and Hunter’s (2017) quantitative study, which failed to identify any statistically significant differences between various fluency features in speech produced by B2 and C1 level candidates in the Aptis Speaking test. This study set out to examine whether there were differences between other aspects of the speakers’ performance at these two levels, in terms of lexical and syntactic complexity, accuracy and use of metadiscourse markers, that distinguish the two levels. In order to understand the relationship between fluency and these other aspects of performance, the study employed a mixed-methods approach to analysing the data. The quantitative analysis included descriptive statistics, t-tests and correlational analyses of the various linguistic measures. For the qualitative analysis, we used a discourse analysis approach to examining the pausing behaviour of the speakers in the contexts in which the pauses occurred in their speech. The results indicated that the two proficiency levels were statistically different on measures of accuracy (weighted clause ratio) and lexical diversity (TTR and D), with C1 level speakers producing more accurate and lexically diverse output. The correlation analyses showed that speed fluency correlated positively with weighted clause ratio and negatively with length of clause. Speed fluency was also positively related to lexical diversity, but negatively linked with lexical errors. As for pauses, frequency of end-clause pauses was positively linked with length of AS-units. Mid-clause pauses also positively correlated with lexical diversity and use of discourse markers. Repair fluency correlated positively with length of clause, and negatively with weighted clause ratio. Repair measures were also negatively linked with number of errors per 100 words and metadiscourse marker type. The qualitative analyses suggested that the pauses mainly occurred (a) to facilitate access and retrieval of lexical and structural units, (b) to reformulate units already produced, and (c) to improve communicative effectiveness. A number of speech excerpts are presented to illustrate these examples. It is hoped that the findings of this research offer a better understanding of the construct measured at B2 and C1 levels of the Aptis Speaking test, inform possible refinements of the Aptis Speaking rating scales, and enhance its rater training programme for the two highest levels of the test.
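      As a rough illustration of one lexical diversity index named above, the type–token ratio (TTR) divides the number of distinct word types by the number of word tokens. The sketch below works that calculation on a made-up sentence; the whitespace tokenisation and the example are assumptions, and the report’s own procedure (including the D measure) is not reproduced here.

```python
# Minimal sketch of the type-token ratio (TTR) named in the abstract above.
# Whitespace tokenisation and the sample sentence are assumptions; the
# report's own lexical-diversity procedure (including D) is not shown.

def type_token_ratio(text: str) -> float:
    """Lexical diversity: distinct word types divided by total word tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

sample = "the test measures how accurately the speakers use the language"
print(round(type_token_ratio(sample), 2))  # 0.8 -> 8 types over 10 tokens
```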
    • Second language listening: current ideas, current issues

      Field, John (Cambridge University Press, 2019-08-01)
      This chapter starts by mentioning the drawbacks of the approach conventionally adopted in L2 listening instruction – in particular, its focus on the products of listening rather than the processes that contribute to it. It then offers an overview of our present understanding of what those processes are, drawing upon research findings in psycholinguistics, phonetics and Applied Linguistics. Section 2 examines what constitutes proficient listening and how the performance of an L2 listener diverges from it; and Section 3 considers the perceptual problems caused by the nature of spoken input. Subsequent sections then cover various areas of research in L2 listening. Section 4 provides a brief summary of topics that have been of interest to researchers over the years; and Section 5 reviews the large body of research into listening strategies. Section 6 then covers a number of interesting issues that have come to the fore in recent studies: multimodality, levels of listening vocabulary, cross-language phoneme perception, the use of a variety of accents, the validity of playing a recording twice, text authenticity and listening anxiety. A final section identifies one or two recurring themes that have arisen, and considers how instruction is likely to develop in future.
    • Vocabulary explanations in beginning-level adult ESOL classroom interactions: a conversation analysis perspective

      Tai, Kevin W.H.; Khabbazbashi, Nahal; University College London; University of Bedfordshire (Linguistics and Education, Elsevier, 2019-07-19)
      Recent studies have examined the interactional organisation of vocabulary explanations (VEs) in second language (L2) classrooms. Nevertheless, more work is needed to better understand how VEs are provided in these classrooms, particularly in beginning-level English for Speakers of Other Languages (ESOL) classroom contexts where students have different first languages (L1s) and limited English proficiency, and the shared linguistic resources between the teacher and learners are typically limited. Based on a corpus of beginning-level adult ESOL lessons, this conversation-analytic study offers insights into how VEs are interactionally managed in such classrooms. Our findings contribute to the current literature in shedding light on the nature of VEs in beginning-level ESOL classrooms.
    • Measuring L2 speaking

      Nakatsuhara, Fumiyo; Inoue, Chihiro; Khabbazbashi, Nahal (Routledge, 2019-07-11)
      This chapter on measuring L2 speaking has three main focuses: (a) construct representation, (b) test methods and task design, and (c) scoring and feedback. We will briefly trace the different ways in which speaking constructs have been defined over the years and operationalized using different test methods and task features. We will then discuss the challenges and opportunities that speaking tests present for scoring and providing feedback to learners. We will link these discussions to the current understanding of SLA theories and empirical research, learning oriented assessment approaches and advances in educational technology.
    • Developing an advanced, specialized English proficiency test for Beijing universities

      Hamp-Lyons, Liz; Wenxia, Bonnie Zhang; University of Bedfordshire; Tsinghua University (2019-07-10)
    • Interactional competence with and without extended planning time in a group oral assessment

      Lam, Daniel M. K. (Routledge, Taylor & Francis Group, 2019-05-02)
      Linking one’s contribution to those of others is a salient feature demonstrating interactional competence in paired/group speaking assessments. Although such responses are to be constructed spontaneously while engaging in real-time interaction, the amount and nature of pre-task preparation in paired/group speaking assessments may influence how such an ability (or lack thereof) manifests in learners’ interactional performance. Little previous research has examined the effect of planning time on interactional aspects of paired/group speaking task performance. Within the context of school-based assessment in Hong Kong, this paper analyzes the discourse of two group interactions performed by the same four student-candidates under two conditions: (a) with extended planning time (4–5 hours), and (b) without extended planning time (10 minutes), with the aim of exploring any differences in student-candidates’ performance of interactional competence in this assessment task. The analysis provides qualitative discourse evidence that extended planning time may impede the assessment task’s capacity to discriminate between stronger and weaker candidates’ ability to spontaneously produce responses contingent on previous speaker contribution. Implications for the implementation of preparation time for the group interaction task are discussed.
    • The mediation and organisation of gestures in vocabulary instructions: a microgenetic analysis of interactions in a beginning-level adult ESOL classroom

      Tai, Kevin W.H.; Khabbazbashi, Nahal (Taylor & Francis, 2019-04-26)
      There is limited research on second language (L2) vocabulary teaching and learning which provides fine-grained descriptions of how vocabulary explanations (VEs) are interactionally managed in beginning-level L2 classrooms where learners have a limited L2 repertoire, and how the VEs could contribute to the learners’ conceptual understanding of the meaning(s) of the target vocabulary items (VIs). To address these research gaps, we used a corpus of classroom video-data from a beginning-level adult ESOL classroom in the United States and applied Conversation Analysis to examine how the class teacher employs various gestural and linguistic resources to construct L2 VEs. We also conducted a 4-month microgenetic analysis to document qualitative changes in learners’ understanding of the meaning of specific L2 VIs which were previously explained by the teacher. Findings revealed that the learners’ use of gestures allows for an externalization of thinking processes, providing visible output for inspection by the teacher and peers. These findings can inform educators’ understanding of L2 vocabulary development as a gradual process of controlling the right gestural and linguistic resources for appropriate communicative purposes.
    • Development of empirically driven checklists for learners’ interactional competence

      Nakatsuhara, Fumiyo; May, Lyn; Lam, Daniel M. K.; Galaczi, Evelina D.; University of Bedfordshire; Queensland University of Technology; Cambridge Assessment English (2019-03-27)
    • Restoring perspective on the IELTS test

      Green, Anthony (Oxford University Press, 2019-03-18)
      This article presents a response to William Pearson’s article, ‘Critical Perspectives on the IELTS Test’. It addresses his critique of the role of IELTS as a test for regulating international mobility and access to English-medium education, and evaluates his more specific prescriptions for improving the quality of the test itself.
    • Rethinking the second language listening test

      Field, John (Equinox, 2019-03-12)
      The book begins with an account of the various processes that contribute to listening, in order to raise awareness of the difficulties faced by second language learners. This information feeds into a new set of descriptors of listening behaviour across proficiency levels and informs much of the discussion in later chapters. The main body of the book critically examines the various components of a listening test, challenging some of the false assumptions behind them and proposing practical alternatives. The discussion covers: the recording-as-text, the recording-as-speech, conventions of test delivery, standard task formats and item design. Major themes are the critical role played by the recorded material and the degree to which tests impose demands that go beyond those of real-world listening. The following section focuses on two types of listener with different needs from those of the general candidate: those aiming to demonstrate academic or professional proficiency in English, and young language learners, for whom level of cognitive development is an issue in test design. There is a brief reflection on the extent to which integrated listening tests reflect the reality of listening events. The book concludes with a report of a study into how feasible it is to identify the information load of a listening text, a factor potentially contributing to test difficulty.