• Rating scale development: a multistage exploratory sequential design

      Galaczi, Evelina D.; Khabbazbashi, Nahal; Cambridge English Language Assessment (Cambridge University Press, 2016-03-01)
      The project chosen to showcase the application of the exploratory sequential design in second/foreign (L2) language assessment comes from the context of rating scale development and focuses on the development of a set of scales for a suite of high-stakes L2 speaking tests. The assessment of speaking requires assigning scores to a speech sample in a systematic fashion by focusing on explicitly defined criteria which describe different levels of performance (Ginther 2013). Rating scales are the instruments used in this evaluation process, and they can be either holistic (i.e. providing a single global assessment) or analytic (i.e. providing an independent evaluation for each of a number of assessment criteria, e.g. Grammar, Vocabulary, Organisation). The discussion in this chapter is framed within the context of rating scales in speaking assessment. However, it is worth noting that the principles espoused, stages employed and decisions taken during the development process have wider applicability to performance assessment in general.
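      To illustrate the holistic/analytic distinction concretely, the sketch below models the two record types in a few lines of Python. The criteria, band range and averaging rule are hypothetical illustrations, not the scales developed in the project.

```python
from dataclasses import dataclass, field

@dataclass
class HolisticRating:
    """A single global judgement of the whole performance."""
    overall: int  # hypothetical 0-5 band

@dataclass
class AnalyticRating:
    """Independent judgements on explicitly defined criteria."""
    scores: dict = field(default_factory=dict)  # criterion name -> band

    def average(self) -> float:
        """One common way to derive a reportable overall band."""
        return sum(self.scores.values()) / len(self.scores)

# Hypothetical example with three criteria on a 0-5 band scale
analytic = AnalyticRating(scores={"Grammar": 4, "Vocabulary": 3, "Organisation": 4})
holistic = HolisticRating(overall=4)
print(round(analytic.average(), 2))  # 3.67
```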
    • Reading in a second language: process, product and practice

      Urquhart, A.H.; Weir, Cyril J. (Routledge, 2014-01-01)
      Reading in a Second Language sets the testing and teaching of reading against a theoretical background, discussing research from both applied linguistics and cognitive psychology. Where possible, it focuses on research into second language readers. It distinguishes different kinds of reading, particularly expeditious as opposed to careful reading, and emphasizes the validity of each. Sandy Urquhart and Cyril Weir relate testing to teaching, discussing similarities and differences and providing a comprehensive survey of methods in both, with the emphasis on those which have been substantiated or supported by research evidence. Finally, the book proposes specific research topics, offers detailed advice on how to construct tests of language for academic purposes, and makes suggestions for further research.
    • Recommending a nursing-specific passing standard for the IELTS examination

      O'Neill, Thomas R.; Buckendahl, Chad W.; Plake, Barbara S.; Taylor, Lynda (Taylor & Francis, 2007-12-05)
      Licensure testing programs in the United States (e.g., nursing) face an increasing challenge of measuring the competency of internationally trained candidates, both in relation to their clinical competence and their English language competence. To assist with the latter, professional licensing bodies often adopt well-established and widely available international English language proficiency measures. In this context, the National Council of State Boards of Nursing (NCSBN) sought to develop a nursing-specific passing standard on the International English Language Testing System that U.S. jurisdictions could consider in their licensure decisions for internationally trained candidates. Findings from a standard setting exercise were considered by NCSBN's Examination Committee in conjunction with other relevant information to produce a legally defensible passing standard on the test. This article reports in detail on the standard setting exercise conducted as part of this policy-making process; it describes the techniques adopted, the procedures followed, and the outcomes obtained. The study is contextualized within the current literature on standard setting. The latter part of the article describes the nature of the policy-making process to which the study contributed and discusses some of the implications of including a language literacy test as part of a licensure testing program.
    • Reflecting on the past, embracing the future

      Hamp-Lyons, Liz; University of Bedfordshire (Elsevier, 2019-10-14)
      In the Call for Papers for this anniversary volume of Assessing Writing, the Editors described the goal as “to trace the evolution of ideas, questions, and concerns that are key to our field, to explain their relevance in the present, and to look forward by exploring how these might be addressed in the future” and they asked me to contribute my thoughts. As the Editor of Assessing Writing between 2002 and 2017—a fifteen-year period—I realised from the outset that this was a very ambitious goal, one that no single paper could accomplish. Nevertheless, it seemed to me an opportunity to reflect on my own experiences as Editor, and through some of those experiences, offer a small insight into what this journal has done (and not done) to contribute to the debate about the “ideas, questions and concerns”; but also, to suggest some areas that would benefit from more questioning and thinking in the future. Despite the challenges of the task, I am very grateful to current Editors Martin East and David Slomp for the opportunity to reflect on these 25 years and to view them, in part, through the lens provided by the five articles appearing in this anniversary volume.
    • The relative significance of syntactic knowledge and vocabulary breadth in the prediction of reading comprehension test performance

      Shiotsu, Toshiko; Weir, Cyril J.; Kurume University, Japan; University of Bedfordshire (SAGE, 2007-01-01)
      In the componential approach to modelling reading ability, a number of contributory factors have been empirically validated. However, research on their relative contribution to explaining performance on second language reading tests is limited. Furthermore, the contribution of knowledge of syntax has been largely ignored in comparison with the attention focused on vocabulary. This study examines the relative contribution of knowledge of syntax and knowledge of vocabulary to L2 reading in two pilot studies in different contexts – a heterogeneous population studying at the tertiary level in the UK and a homogeneous undergraduate group in Japan – followed by a larger main study, again involving a homogeneous Japanese undergraduate population. In contrast with previous findings in the literature, all three studies offer support for the relative superiority of syntactic knowledge over vocabulary knowledge in predicting performance on a text reading comprehension test. A case is made for the robustness of structural equation modelling compared to conventional regression in accounting for the differential reliabilities of scores on the measures employed. © 2007 SAGE Publications.
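      The methodological point about differential reliabilities can be made explicit with a standard classical test theory result (offered here as background, not as a formula from the study): the correlation between two sets of observed scores understates the correlation between the underlying abilities whenever the measures are unreliable, as in Spearman's correction for attenuation,

\[
r_{T_X T_Y} \;=\; \frac{r_{XY}}{\sqrt{r_{XX'}\, r_{YY'}}},
\]

      where \(r_{XY}\) is the observed correlation and \(r_{XX'}\), \(r_{YY'}\) are the reliabilities of the two measures. Structural equation modelling estimates relations among latent variables and so builds this correction into the model, which is why it handles predictors measured with differing reliabilities (here, the syntax and vocabulary tests) more robustly than regression on raw scores.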
    • Relevance Theory and ‘the’ in second language acquisition

      Žegarac, Vladimir (SAGE, 2004-07-01)
      This article considers the implications of Sperber and Wilson's (1986/95) Relevance Theory for the acquisition of English ‘the’ by second language (L2) learners whose first language (L1) does not have an article system. On the one hand, Relevance Theory provides an explicit characterization of the semantics of ‘the’, which suggests ways of devising more accurate guidelines for teaching/learning than are available in current textbooks. On the other hand, Relevance-Theoretic assumptions about human communication, together with some effects of transfer from L1, provide the basis for a number of predictions about the types of errors L2 learners make in their use of ‘the’. I argue that data from previous research (Trenkić, 2002) lend support to these predictions, and I try to show that examples drawn from the data I have collected provide evidence for the view that L2 learning is not influenced only by general pragmatic principles and hypotheses about L2 based on transfer from L1, but that learners also devise and test tacit hypotheses which are idiosyncratic to them.
    • Repeated test-taking and longitudinal test score analysis: editorial

      Green, Anthony; Van Moere, Alistair; University of Bedfordshire; MetaMetrics Inc. (Sage, 2020-09-27)
    • Research and practice in assessing academic English: the case of IELTS

      Taylor, Lynda; Saville, N. (Cambridge University Press, 2019-12-01)
      Test developers need to demonstrate they have premised their measurement tools on a sound theoretical framework which guides their coverage of appropriate language ability constructs in the tests they offer to the public. This is essential for supporting claims about the validity and usefulness of the scores generated by the test. This volume describes differing approaches to understanding academic reading ability that have emerged in recent decades and goes on to develop an empirically grounded framework for validating tests of academic reading ability. The framework is then applied to the IELTS Academic Reading module to investigate a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors demonstrate how a systematic understanding and application of the framework and its components can help test developers to operationalise their tests so as to fulfil the validity requirements for an academic reading test. The book provides: an up-to-date review of the relevant literature on assessing academic reading; a clear and detailed specification of the construct of academic reading; an evaluation of what constitutes an adequate representation of the construct of academic reading for assessment purposes; and a consideration of the nature of academic reading in a digital age and its implications for assessment research and test development. The volume is a rich source of information on all aspects of testing academic reading ability. Examination boards and other institutions who need to validate their own academic reading tests in a systematic and coherent manner, or who wish to develop new instruments for measuring academic reading, will find it of interest, as will researchers and graduate students in the field of language assessment, and those teachers preparing students for IELTS (and similar tests) or involved in English for Academic Purposes programmes.
    • Research and practice in assessing academic reading: the case of IELTS

      Weir, Cyril J.; Chan, Sathena Hiu Chong (Cambridge University Press, 2019-08-29)
      The focus for attention in this volume is the reading component of the IELTS Academic module, which is principally used for admissions purposes into tertiary-level institutions throughout the world (see Davies 2008 for a detailed history of the developments in EAP testing leading up to the current IELTS). According to the official website (www.cambridgeenglish.org/exams-and-tests/ielts/test-format/), there are three reading passages in the Academic Reading Module with a total of c.2,150–2,750 words. Individual tasks are not timed. Texts are taken from journals, magazines, books, and newspapers. All the topics are of general interest and the texts have been written for a non-specialist audience. The readings are intended to be about issues that are appropriate to candidates who will enter postgraduate or undergraduate courses. At least one text will contain detailed logical argument. One of the texts may contain non-verbal materials such as graphs, illustrations or diagrams. If a text contains technical terms which candidates may not know, a glossary is provided. The texts and questions become more difficult through the paper. A number of specific critical questions are addressed in applying the socio-cognitive validation framework to the IELTS Academic Reading Module:
      * Are the cognitive processes required to complete the IELTS Reading test tasks appropriate and adequate in their coverage? (Focus on cognitive validity in Chapter 4.)
      * Are the contextual characteristics of the test tasks and their administration appropriate and fair to the candidates who are taking them? (Focus on context validity in Chapter 5.)
      * What effects do the test and test scores have on various stakeholders? (Focus on consequential validity in Chapter 6.)
      * What external evidence is there that the test is fair? (Focus on criterion-related validity in Chapter 7.)
    • A research report on the development of the Test of English for Academic Purposes (TEAP) writing test for Japanese university entrants

      Weir, Cyril J.; University of Bedfordshire (Eiken Foundation of Japan, 2014-01-01)
      Rigorous and iterative test design, accompanied by systematic trialling procedures, produced a pilot version of the test which demonstrated acceptable context and cognitive validity for use as an English for academic purposes (EAP) writing test for students wishing to enter Japanese universities. A study carried out on the scoring validity of the rating of the TEAP Writing Test indicated acceptable levels of inter- and intra-marker reliability and demonstrated that receiving institutions could depend on the consistency of the results obtained on the test. A further study of the contextual complexity parameters (lexical, grammatical, and cohesive) of scripts allocated to different bands on the TEAP Writing Test rating scale indicated that there were significant differences between scripts at adjacent band levels, with the band B1 scripts produced by students being more complex than the band A2 scripts across a broad set of indices.
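      As background to the kind of complexity analysis described, a lexical index can be as simple as a type-token ratio; the sketch below is a generic illustration in Python, not the index set actually used in the report.

```python
def type_token_ratio(text: str) -> float:
    """Lexical diversity: distinct word forms over total word tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

print(type_token_ratio("the cat sat on the mat"))  # 5 types / 6 tokens = 0.83
```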
    • Researching L2 writers’ use of metadiscourse markers at intermediate and advanced levels

      Bax, Stephen; Nakatsuhara, Fumiyo; Waller, Daniel; University of Bedfordshire; University of Central Lancashire (Elsevier, 2019-02-20)
      Metadiscourse markers refer to aspects of text organisation or indicate a writer’s stance towards the text’s content or towards the reader (Hyland, 2004:109). The CEFR (Council of Europe, 2001) indicates that one of the key areas of development anticipated between levels B2 and C1 is an increasing variety of discourse markers and a growing acknowledgement of the intended audience by learners. This study represents the first large-scale investigation of metadiscourse in general second language learner writing, based on the analysis of 281 metadiscourse markers in 13 categories across 900 exam scripts at CEFR B2-C2 levels. The study employed the online text analysis tool Text Inspector (Bax, 2012), in conjunction with human analysts. The findings revealed that higher-level writers used fewer metadiscourse markers than lower-level writers, but used a significantly wider range of markers within 8 of the 13 classes. The study also demonstrated the crucial importance of analysing not only the behaviour of whole classes of metadiscourse items but also the individual items themselves. The findings are of potential interest to those involved in the development of assessment scales at different levels of the CEFR, and to teachers interested in supporting learners’ development.
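      To make the analysis concrete, the sketch below shows the general shape of a category-based marker count in Python. The marker lists and category names are invented for illustration; they are not Text Inspector's actual inventory or implementation.

```python
import re
from collections import Counter

# Hypothetical marker inventory: category -> example markers
MARKERS = {
    "transitions": ["however", "therefore", "in addition"],
    "hedges": ["perhaps", "might", "possibly"],
    "boosters": ["clearly", "certainly"],
}

def count_markers(text: str) -> Counter:
    """Tally occurrences of each metadiscourse category in a script."""
    counts = Counter()
    lowered = text.lower()
    for category, markers in MARKERS.items():
        for marker in markers:
            pattern = r"\b" + re.escape(marker) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts

script = "However, this might be true. Clearly, perhaps we should ask."
counts = count_markers(script)
print(counts)                                # frequency per category
print(sum(1 for v in counts.values() if v))  # range: categories actually used
```

      Frequency (how often markers occur) and range (how many classes are drawn on) can then be compared across proficiency levels, which is the distinction the findings above turn on.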
    • Researching metadiscourse markers in candidates’ writing at Cambridge FCE, CAE and CPE levels

      Bax, Stephen; Waller, Daniel; Nakatsuhara, Fumiyo; University of Bedfordshire; University of Central Lancashire (2013-09-07)
      This paper reports on research funded through the Cambridge ESOL Funded Research Programme, Round Three, 2012.
    • Researching participants taking IELTS Academic Writing Task 2 (AWT2) in paper mode and in computer mode in terms of score equivalence, cognitive validity and other factors

      Chan, Sathena Hiu Chong; Bax, Stephen; Weir, Cyril J. (British Council and IDP: IELTS Australia, 2017-08-01)
      Computer-based (CB) assessment is becoming more common in most university disciplines, and international language testing bodies now routinely use computers for many areas of English language assessment. Given that, in the near future, IELTS also will need to move towards offering CB options alongside traditional paper-based (PB) modes, the research reported here prepares for that possibility, building on research carried out some years ago which investigated the statistical comparability of the IELTS writing test between the two delivery modes, and offering a fresh look at the relevant issues. By means of questionnaires and interviews, the current study investigates the extent to which 153 test-takers’ cognitive processes, while completing IELTS Academic Writing in PB mode and in CB mode, compare with the real-world cognitive processes of students completing academic writing at university. A major contribution of our study is its use – for the first time in the academic literature – of data from research into cognitive processes within real-world academic settings as a comparison with cognitive processing during academic writing under test conditions. The most important conclusion from the study is that, according to the 5-facet MFRM analysis, there were no significant differences in the scores awarded by two independent raters for candidates’ performances on the tests taken under the two conditions, one paper-and-pencil and the other computer-based. Regarding the analytic scoring criteria, the differences in three areas (i.e. Task Achievement, Coherence and Cohesion, and Grammatical Range and Accuracy) were not significant, but the difference reported in Lexical Resources was significant, if slight. In summary, the difference in scores between the two modes is at an acceptable level. With respect to the cognitive processes students employ in performing under the two conditions of the test, results of the Cognitive Process Questionnaire (CPQ) survey indicate a similar pattern between the cognitive processes involved in writing on a computer and writing with paper-and-pencil. There were no noticeable major differences in the general tendency of the means of the questionnaire items reported on the two test modes. In summary, the cognitive processes were employed in a similar fashion under the two delivery conditions. Based on the interview data (n=30), it appears that the participants reported using most of the processes in a similar way between the two modes. Nevertheless, a few potential differences indicated by the interview data might be worth further investigation in future studies. The Computer Familiarity Questionnaire survey shows that these students are in general familiar with computer usage and that their overall reactions towards working with a computer are positive. Multiple regression analysis, used to find out whether computer familiarity had any effect on students’ performances in the two modes, suggested that test-takers who do not have a suitable familiarity profile might perform slightly worse in computer mode than those who do. In summary, the research reported here offers a unique comparison with real-world academic writing, and presents a significant contribution to the research base which IELTS and comparable international testing bodies will need to consider, if they are to introduce CB test versions in future.
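      For readers unfamiliar with the technique, the many-facet Rasch model underlying a 5-facet analysis of this kind (in Linacre's standard formulation; the report's exact facet specification is assumed here) expresses the log-odds of receiving adjacent rating categories as an additive combination of facet measures:

\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) \;=\; B_n - D_i - C_j - F_k,
\]

      where \(B_n\) is the ability of test-taker \(n\), \(D_i\) the difficulty of facet element \(i\) (for example, the delivery mode), \(C_j\) the severity of rater \(j\), and \(F_k\) the threshold between categories \(k-1\) and \(k\). A delivery-mode facet whose measure does not differ significantly from zero is what supports a score-equivalence conclusion of the kind reported above.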
    • Researching the cognitive validity of GEPT High-Intermediate and Advanced reading: an eye-tracking and stimulated recall study

      Bax, Stephen; Chan, Sathena Hiu Chong (Language Training and Testing Center (LTTC), 2016-07-01)
      It is important for any language test to establish its cognitive validity in order to ensure that the test elicits from test takers those cognitive processes which correspond to the processes they would normally employ in the target real-life context (Weir 2005). This study investigates the cognitive validity of the GEPT Reading Test at two levels, High-Intermediate (CEFR B2) and Advanced (CEFR C1), using innovative eye-tracking technology together with detailed stimulated recall interviews and surveys. Representative reading items were carefully selected from across all parts of the GEPT High-Intermediate Level Reading Test and the GEPT Advanced Level Reading Test. Taiwanese students (n=24) studying on Master's-level programmes at British universities were asked to complete the test items on a computer, while the Tobii X2 Eye Tracker was used to track their gaze behaviour during completion of the test items. Immediately after they had completed each individual part, they were asked to report the cognitive processes they had employed using a Reading Process Checklist, and a further group (n=8) then participated in a detailed stimulated recall interview while viewing video footage of their gaze patterns. Taking into account all these sources of data, it was found that the High-Intermediate section of the GEPT test successfully elicited and tested an appropriate range of lower and higher cognitive processes, as defined in Khalifa and Weir (2009). It was also concluded that the Advanced sections of the test elicited the same set of cognitive processes as the High-Intermediate test, with the addition, in the final section, of the most difficult process of all in Khalifa and Weir's scheme. In summary, the two elements of the GEPT test researched in this project were successful in requiring of candidates the range of cognitive processing activity commensurate with High-Intermediate and Advanced reading levels respectively, which is an important element in establishing the cognitive validity of the GEPT test.
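      As a rough sketch of how such gaze data are commonly summarised (column names and export format are assumptions, not Tobii's actual schema or the study's pipeline), fixations can be aggregated into dwell time per area of interest (AOI):

```python
import pandas as pd

# Hypothetical fixation export: one row per fixation
fixations = pd.DataFrame({
    "participant": ["P01", "P01", "P01", "P02"],
    "aoi": ["question", "text", "text", "question"],  # area of interest
    "duration_ms": [180, 420, 260, 200],
})

# Total dwell time per participant and AOI: a simple index of where
# visual attention is directed during item completion
dwell = (fixations
         .groupby(["participant", "aoi"])["duration_ms"]
         .sum()
         .unstack(fill_value=0))
print(dwell)
```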
    • Researching the comparability of paper-based and computer-based delivery in a high-stakes writing test

      Chan, Sathena Hiu Chong; Bax, Stephen; Weir, Cyril J. (Elsevier, 2018-04-07)
      International language testing bodies are now moving rapidly towards using computers for many areas of English language assessment, despite the fact that research on comparability with paper-based assessment is still relatively limited in key areas. This study contributes to the debate by researching the comparability of a high-stakes EAP writing test (IELTS) in two delivery modes, paper-based (PB) and computer-based (CB). The study investigated 153 test takers' performances and their cognitive processes on IELTS Academic Writing Task 2 in the two modes, and the possible effect of computer familiarity on their test scores. Many-Facet Rasch Measurement (MFRM) was used to examine the difference in test takers' scores between the two modes, in relation to their overall and analytic scores. By means of questionnaires and interviews, we investigated the cognitive processes students employed under the two conditions of the test. A major contribution of our study is its use – for the first time in the computer-based writing assessment literature – of data from research into cognitive processes within real-world academic settings as a comparison with cognitive processing during academic writing under test conditions. In summary, this study offers important new insights into academic writing assessment in computer mode.
    • Restoring perspective on the IELTS test

      Green, Anthony (Oxford University Press, 2019-03-18)
      This article presents a response to William Pearson’s article, ‘Critical Perspectives on the IELTS Test’. It addresses his critique of the role of IELTS as a test for regulating international mobility and access to English-medium education, and evaluates his more specific prescriptions for improving the quality of the test itself.
    • Rethinking the second language listening test : from theory to practice

      Field, John (Equinox, 2019-03-01)
      The book begins with an account of the various processes that contribute to listening, in order to raise awareness of the difficulties faced by second language learners. This information feeds into a new set of descriptors of listening behaviour across proficiency levels and informs much of the discussion in later chapters. The main body of the book critically examines the various components of a listening test, challenging some of the false assumptions behind them and proposing practical alternatives. The discussion covers: the recording-as-text, the recording-as-speech, conventions of test delivery, standard task formats and item design. Major themes are the critical role played by the recorded material and the degree to which tests impose demands that go beyond those of real-world listening. The following section focuses on two types of listener with different needs from the general candidate: those aiming to demonstrate academic or professional proficiency in English, and young language learners, for whom level of cognitive development is an issue for test design. There is a brief reflection on the extent to which integrated listening tests reflect the reality of listening events. The book concludes with a report of a study into how feasible it is to identify the information load of a listening text, a factor potentially contributing to test difficulty.
    • Reviewing the suitability of English language tests for providing the GMC with evidence of doctors' English proficiency

      Taylor, Lynda; Chan, Sathena Hiu Chong (The General Medical Council, 2015-05-13)
      The research project described in this report set out to identify English language proficiency (ELP) test(s) which might be considered comparable to IELTS in terms of their suitability for satisfying the General Medical Council (the GMC) of the English language proficiency of doctors applying for registration and licensing in the UK. Through a process of consultation between CRELLA and the GMC, the specific aims of the IELTS Equivalence Research Project were established as follows:
      1. To identify a comprehensive list of other available tests of English language proficiency and/or communication skills apart from IELTS, including any that are specifically used within a medical context (UK and international).
      2. To consider how other professional regulatory bodies (both UK and international) check for and confirm an acceptable level of English language proficiency prior to entry into a technical, high-risk profession.
      3. To compare the tests identified in (1) above to IELTS, with respect to their suitability on a range of essential quality criteria. IELTS was recognised, therefore, as constituting the criterion or standard of suitability against which other potentially suitable English language proficiency tests should be compared.
      4. To identify, should one or more tests be considered at least as suitable as IELTS, what the equivalent of the GMC’s current requirements for the academic version of IELTS would be for those test(s), as well as how the equivalent scores identified on alternative tests compare to the levels of the Common European Framework of Reference for Languages (2001).