• Developing an advanced, specialized English proficiency test for Beijing universities

      Hamp-Lyons, Liz; Wenxia, Bonnie Zhang; University of Bedfordshire; Tsinghua University (2019-07-10)
    • The discourse of the IELTS Speaking Test : interactional design and practice

      Seedhouse, Paul; Nakatsuhara, Fumiyo (Cambridge University Press, 2018-02-15)
      The volume provides a unique dual perspective on the evaluation of spoken discourse in that it combines a detailed portrayal of the design of a face-to-face speaking test with its actual implementation in interactional terms. Using many empirical extracts of interaction from authentic IELTS Speaking Tests, the book illustrates how the interaction is organised in relation to the institutional aim of ensuring valid assessment. The relationship between individual features of the interaction and grading criteria is examined in detail across a number of different performance levels.
    • Exploring performance across two delivery modes for the IELTS Speaking Test: face-to-face and video-conferencing delivery (Phase 2)

      Nakatsuhara, Fumiyo; Inoue, Chihiro; Berry, Vivien; Galaczi, Evelina D. (IELTS Partners, 2017-10-04)
      Face-to-face speaking assessment is widespread, since it allows the elicitation of interactional skills. However, face-to-face speaking test administration is also logistically complex, resource-intensive and can be difficult to conduct in geographically remote or politically sensitive areas. Recent advances in video-conferencing technology now make it possible to engage in online face-to-face interaction more successfully than was previously the case, thus reducing dependency upon physical proximity. A major study was, therefore, commissioned to investigate how new technologies could be harnessed to deliver the face-to-face version of the IELTS Speaking test.  Phase 1 of the study, carried out in London in January 2014, presented results and recommendations of a small-scale initial investigation designed to explore what similarities and differences, in scores, linguistic output and test-taker and examiner behaviour, could be discerned between face-to-face and internet-based video-conferencing delivery of the Speaking test (Nakatsuhara, Inoue, Berry and Galaczi, 2016). The results of the analyses suggested that the speaking construct remains essentially the same across both delivery modes.  This report presents results from Phase 2 of the study, which was a larger-scale follow-up investigation designed to: (i) analyse test scores obtained using more sophisticated statistical methods than was possible in the Phase 1 study; (ii) investigate the effectiveness of the training for the video-conferencing-delivered test, which was developed based on findings from the Phase 1 study; (iii) gain insights into the issue of sound quality perception and its (perceived) effect; (iv) gain further insights into test-taker and examiner behaviours across the two delivery modes; and (v) confirm the results of the Phase 1 study. Phase 2 of the study was carried out in Shanghai, People’s Republic of China in May 2015.
Ninety-nine (99) test-takers each took two speaking tests under face-to-face and internet-based video-conferencing conditions. Performances were rated by 10 trained IELTS examiners. A convergent parallel mixed-methods design was used to allow for collection of an in-depth, comprehensive set of findings derived from multiple sources. The research included an analysis of rating scores under the two delivery conditions, test-takers’ linguistic output during the tests, as well as short interviews with test-takers following a questionnaire format. Examiners responded to two feedback questionnaires and participated in focus group discussions relating to their behaviour as interlocutors and raters, and to the effectiveness of the examiner training. Trained observers also took field notes from the test sessions and conducted interviews with the test-takers.  Many-Facet Rasch Model (MFRM) analysis of test scores indicated that, although the video-conferencing mode was slightly more difficult than the face-to-face mode, when the results of all analytic scoring categories were combined, the actual score difference was negligibly small, thus supporting the Phase 1 findings. Examination of language functions elicited from test-takers revealed that significantly more test-takers asked questions to clarify what the examiner said in the video-conferencing mode (63.3%) than in the face-to-face mode (26.7%) in Part 1 of the test. Sound quality was generally positively perceived in this study, being reported as 'Clear' or 'Very clear', although the examiners and observers tended to perceive it more positively than the test-takers. There did not seem to be any relationship between sound quality perceptions and the proficiency level of test-takers. While 71.7% of test-takers preferred the face-to-face mode, slightly more test-takers reported that they were more nervous in the face-to-face mode (38.4%) than in the video-conferencing mode (34.3%).  
All examiners found the training useful and effective, the majority of them (80%) reporting that the two modes gave test-takers equal opportunity to demonstrate their level of English proficiency. They also reported that it was equally easy for them to rate test-taker performance in face-to-face and video-conferencing modes.  The report concludes with a list of recommendations for further research, including suggestions for further examiner and test-taker training, resolution of technical issues regarding video-conferencing delivery and issues related to rating, before any decisions about deploying a video-conferencing mode of delivery for the IELTS Speaking test are made.
    • Exploring performance across two delivery modes for the same L2 speaking test: face-to-face and video-conferencing delivery: a preliminary comparison of test-taker and examiner behaviour

      Nakatsuhara, Fumiyo; Inoue, Chihiro; Berry, Vivien; Galaczi, Evelina D. (The IELTS Partners: British Council, Cambridge English Language Assessment and IDP: IELTS Australia, 2016-11-10)
      This report presents the results of a preliminary exploration and comparison of test-taker and examiner behaviour across two different delivery modes for an IELTS Speaking test: the standard face-to-face test administration, and test administration using internet-based video-conferencing technology. The study sought to compare performance features across these two delivery modes with regard to two key areas:  • an analysis of test-takers’ scores and linguistic output on the two modes and their perceptions of the two modes  • an analysis of examiners’ test management and rating behaviours across the two modes, including their perceptions of the two conditions for delivering the speaking test.  Data were collected from 32 test-takers who took two standardised IELTS Speaking tests under face-to-face and internet-based video-conferencing conditions. Four trained examiners also participated in this study. The convergent parallel mixed methods research design included an analysis of interviews with test-takers, as well as their linguistic output (especially types of language functions) and rating scores awarded under the two conditions. Examiners provided written comments justifying the scores they awarded, completed a questionnaire and participated in verbal report sessions to elaborate on their test administration and rating behaviour. Three researchers also observed all test sessions and took field notes.  While the two modes generated similar test score outcomes, there were some differences in functional output and in examiner interviewing and rating behaviours. This report concludes with a list of recommendations for further research, including examiner and test-taker training and resolution of technical issues, before any decisions about deploying (or not) a video-conferencing mode of delivery for the IELTS Speaking test are made.
    • Exploring the use of video-conferencing technology in the assessment of spoken language: a mixed-methods study

      Nakatsuhara, Fumiyo; Inoue, Chihiro; Berry, Vivien; Galaczi, Evelina D.; University of Bedfordshire; British Council; Cambridge English Language Assessment (Taylor & Francis, 2017-02-10)
      This research explores how internet-based video-conferencing technology can be used to deliver and conduct a speaking test, and what similarities and differences can be discerned between the standard and computer-mediated face-to-face modes. The context of the study is a high-stakes speaking test, and the motivation for the research is the need for test providers to keep under constant review the extent to which their tests are accessible and fair to a wide constituency of test takers. The study examines test-takers’ scores and linguistic output, and examiners’ test administration and rating behaviors across the two modes. A convergent parallel mixed-methods research design was used, analyzing test-takers’ scores and language functions elicited, examiners’ written comments, feedback questionnaires and verbal reports, as well as observation notes taken by researchers. While the two delivery modes generated similar test score outcomes, some differences were observed in test-takers’ functional output and the behavior of examiners who served as both raters and interlocutors.
    • Implementing a learning-oriented approach within English Language assessment in Hong Kong schools: practices, issues and complexities.

      Hamp-Lyons, Liz (Palgrave Macmillan, 2016-12-16)
      This paper provides an overview of the multiple studies carried out between 2005 and 2011 on the Hong Kong School-based Assessment (SBA), which was designed to implement an assessment for learning philosophy, and places the work within a learning-oriented language assessment (LOLA) paradigm (Hamp-Lyons & Green 2014) which is growing worldwide. The Hong Kong SBA continues to be used Hong Kong-wide to formatively assess the English as a second language speaking skills of all students in secondary years 4, 5 and 6. After discussing the structure and goals of this innovative assessment and its teacher language assessment literacy aims and processes, the chapter then discusses some of the constraints and issues that have inhibited the degree to which the intended consequences have transpired. It points to compulsory ‘statistical moderation’, which undermines teachers’ trust in the new system, and to local contextual issues such as heavy reliance on ‘cram schools’, competition among schools, and teachers’ perceptions of fairness as being ‘the same for everyone’.
    • Investigating examiner interventions in relation to the listening demands they make on candidates in oral interview tests

      Nakatsuhara, Fumiyo (John Benjamins, 2018-08-08)
      Examiners intervene in second language oral interviews in order to elicit intended language functions, to probe a candidate’s proficiency level or to keep the interaction going. Interventions of this kind can affect the candidate’s output language and score, since the candidate is obliged to process them as a listener and respond to them as a speaker. This chapter reports on a study that examined forty audio-recorded interviews of the oral test of a major European examination board, with a view to examining examiner interventions (i.e., questions, comments) in relation to the listening demands they make upon candidates. Half of the interviews involved candidates who scored highly on the test while the other half featured low-scoring candidates. This enabled a comparison of the language and behaviour of the same examiner across candidate proficiency levels, to see how they were modified in response to the communicative competence of the candidate. The recordings were transcribed and analyzed with regard to a) types of examiner intervention in terms of linguistic and pragmatic features and b) the extent to which the interventions varied in response to the proficiency level of the candidate. The study provides a new insight into examiner-examinee interactions, by identifying how examiners are differentiating listening demands according to the task types and the perceived proficiency level of the candidate. It offers several implications about the ways in which examiner interventions engage candidates’ listening skills, and the ways in which listening skills can be more validly and reliably measured when using a format based on examiner-candidate interaction.
    • An investigation into double-marking methods: comparing live, audio and video rating of performance on the IELTS Speaking Test

      Nakatsuhara, Fumiyo; Inoue, Chihiro; Taylor, Lynda (The IELTS Partners: British Council, IDP: IELTS Australia and Cambridge English Language Assessment, 2017-03-01)
      This study compared IELTS examiners’ scores when they assessed test-takers’ spoken performance under live and two non-live rating conditions using audio and video recordings. It also explored examiners’ perceptions towards test-takers’ performance in the two non-live rating modes.  This was a mixed-methods study that involved both existing and newly collected datasets. A total of six trained IELTS examiners assessed 36 test-takers’ performance under the live, audio and video rating conditions. Their scores in the three modes of rating were calibrated using the multifaceted Rasch model analysis.  In all modes of rating, the examiners were asked to make notes on why they awarded the scores that they did on each analytical category. The comments were quantitatively analysed in terms of the volume of positive and negative features of test-takers’ performance that examiners reported noticing when awarding scores under the three rating conditions.  Using selected test-takers’ audio and video recordings, examiners’ verbal reports were also collected to gain insights into their perceptions towards test-takers’ performance under the two non-live conditions.  The results showed that audio ratings were significantly lower than live and video ratings for all rating categories. Examiners noticed more negative performance features of test-takers under the two non-live rating conditions than the live rating condition. The verbal report data demonstrated how having visual information in the video-rating mode helped examiners to understand test-takers’ utterances, to see what was happening beyond what the test-takers were saying and to understand with more confidence the source of test-takers’ hesitation, pauses and awkwardness in their performance.  The results of this study have, therefore, offered a better understanding of the three modes of rating, and a recommendation was made regarding enhanced double-marking methods that could be introduced to the IELTS Speaking Test.
    • Language assessment literacy for learning-oriented language assessment

      Hamp-Lyons, Liz (Australian Association of Applied Linguistics, 2017-12-16)
       This small-scale, exploratory study examined a set of authentic speaking test video samples from the Cambridge: First (First Certificate in English) speaking test, in order to learn whether, and where, opportunities for language assessment literacy might be revealed in, or inserted into, formal speaking tests, both for language teachers teaching test preparation courses and for teachers training to become speaking test raters. We paid particular attention to some basic components of effective interaction that we would want an examiner or interlocutor to exhibit if they seek to encourage interactive responses from test candidates. Looking closely at body language (in particular eye contact), intonation, pacing and pausing, management of turn-taking, and elicitation of candidate-candidate interaction, we saw ways in which a shift in focus to view tests as learning opportunities is possible: we call this new focus learning-oriented language assessment (LOLA).
    • The mediation and organisation of gestures in vocabulary instructions: a microgenetic analysis of interactions in a beginning-level adult ESOL classroom

      Tai, Kevin W.H.; Khabbazbashi, Nahal (Taylor & Francis, 2019-04-26)
      There is limited research on second language (L2) vocabulary teaching and learning which provides fine-grained descriptions of how vocabulary explanations (VE) are interactionally managed in beginning-level L2 classrooms where learners have a limited L2 repertoire, and how the VEs could contribute to the learners’ conceptual understanding of the meaning(s) of the target vocabulary items (VIs). To address these research gaps, we used a corpus of classroom video-data from a beginning-level adult ESOL classroom in the United States and applied Conversation Analysis to examine how the class teacher employs various gestural and linguistic resources to construct L2 VEs. We also conducted a 4-month microgenetic analysis to document qualitative changes in learners’ understanding of the meaning of specific L2 VIs which were previously explained by the teacher. Findings revealed that the learners’ use of gestures allows for an externalization of thinking processes providing visible output for inspection by the teacher and peers. These findings can inform educators’ understanding about L2 vocabulary development as a gradual process of controlling the right gestural and linguistic resources for appropriate communicative purposes.
    • A new test for China? Stages in the development of an assessment for professional purposes.

      Jin, Yan; Hamp-Lyons, Liz; Shanghai Jiao Tong University; University of Bedfordshire (Taylor & Francis, 2015-03-22)
      It is increasingly recognised that attention should be paid to investigating the needs of a new test, especially in contexts where specific purpose language needs might be identified. This article describes the stages involved in establishing the need for a new assessment of English for professional purposes in China. We first investigated stakeholders’ perceptions of the target language use activities and the necessity of the proposed assessment. We then analysed five existing tests and six language frameworks to evaluate their suitability for the needs of the proposed assessment. The resulting proposal is for an advanced-level English assessment capable of providing a diagnostic evaluation of the proficiency of potential employees in areas of relevance to multinationals operating in China. The study has demonstrated the value of following a principled procedure to investigate the necessity for, and the needs of, a new test at the very beginning of test development.
    • Purposes of assessment

      Hamp-Lyons, Liz (De Gruyter/Mouton, 2016-11-01)
      Commissioned as the lead paper in the Volume, this chapter shows how the terms assessment, testing, examining and evaluation have become increasingly intertwined, and multiple new terms have emerged. The chapter takes a somewhat historical view as it lays out a picture of the ways that the field of language ‘assessment’ has expanded its knowledge and skills but also its socio-political effects. Taking the notion of ‘purposes’ as a broad organizing principle, the chapter is intended to introduce new readers in this field to the wide range of issues current in the field, and prepare them for the more specific chapters that follow.
    • Researching the comparability of paper-based and computer-based delivery in a high-stakes writing test

      Chan, Sathena Hiu Chong; Bax, Stephen; Weir, Cyril J. (Elsevier, 2018-04-07)
      International language testing bodies are now moving rapidly towards using computers for many areas of English language assessment, despite the fact that research on comparability with paper-based assessment is still relatively limited in key areas. This study contributes to the debate by researching the comparability of a high-stakes EAP writing test (IELTS) in two delivery modes, paper-based (PB) and computer-based (CB). The study investigated 153 test takers' performances and their cognitive processes on IELTS Academic Writing Task 2 in the two modes, and the possible effect of computer familiarity on their test scores. Many-Facet Rasch Measurement (MFRM) was used to examine the difference in test takers' scores between the two modes, in relation to their overall and analytic scores. By means of questionnaires and interviews, we investigated the cognitive processes students employed under the two conditions of the test. A major contribution of our study is its use, for the first time in the computer-based writing assessment literature, of data from research into cognitive processes within real-world academic settings as a comparison with cognitive processing during academic writing under test conditions. In summary, this study offers important new insights into academic writing assessment in computer mode.
    • Vocabulary explanations in beginning-level adult ESOL classroom interactions: a conversation analysis perspective

      Tai, Kevin W.H.; Khabbazbashi, Nahal; University College London; University of Bedfordshire (Linguistics and Education, Elsevier, 2019-07-19)
      Recent studies have examined the interactional organisation of vocabulary explanations (VEs) in second language (L2) classrooms. Nevertheless, more work is needed to better understand how VEs are provided in these classrooms, particularly in beginning-level English for Speakers of Other Languages (ESOL) classroom contexts where students have different first languages (L1s) and limited English proficiency, and the shared linguistic resources between the teacher and learners are typically limited. Based on a corpus of beginning-level adult ESOL lessons, this conversation-analytic study offers insights into how VEs are interactionally managed in such classrooms. Our findings contribute to the current literature in shedding light on the nature of VEs in beginning-level ESOL classrooms.
    • What counts as ‘responding’? Contingency on previous speaker contribution as a feature of interactional competence

      Lam, Daniel M. K. (Sage, 2018-05-10)
      The ability to interact with others has gained recognition as part of the L2 speaking construct in the assessment literature and in high- and low-stakes speaking assessments. This paper first presents a review of the literature on interactional competence (IC) in L2 learning and assessment. It then discusses a particular feature – producing responses contingent on previous speaker contribution – that emerged as a de facto construct feature of IC oriented to by both candidates and examiners within the school-based group speaking assessment in the Hong Kong Diploma of Secondary Education (HKDSE) English Language Examination. Previous studies have, similarly, argued for the importance of ‘responding to’ or linking one’s own talk to previous speakers’ contributions as a way of demonstrating comprehension of co-participants’ talk. However, what counts as such a response has yet to be explored systematically. This paper presents a conversation analytic study of the candidate discourse in the assessed group interactions, identifying three conversational actions through which student-candidates construct contingent responses to co-participants. The thick description about the nature of contingent responses lays the groundwork for further empirical investigations on the relevance of this IC feature and its proficiency implications.