• Opening the black box: exploring automated speaking evaluation

      Khabbazbashi, Nahal; Xu, Jing; Galaczi, Evelina D. (Springer, 2021-02-10)
      The rapid advances in speech processing and machine learning technologies have attracted language testers’ strong interest in developing automated speaking assessment in which candidate responses are scored by computer algorithms rather than trained human examiners. Despite its increasing popularity, automatic evaluation of spoken language is still shrouded in mystery and technical jargon, often resembling an opaque "black box" that transforms candidate speech to scores in a matter of minutes. Our chapter explicitly problematizes this lack of transparency around test score interpretation and use and asks the following questions: What do automatically derived scores actually mean? What are the speaking constructs underlying them? What are some common problems encountered in automated assessment of speaking? And how can test users evaluate the suitability of automated speaking assessment for their proposed test uses? In addressing these questions, the purpose of our chapter is to explore the benefits, problems, and caveats associated with automated speaking assessment touching on key theoretical discussions on construct representation and score interpretation as well as practical issues such as the infrastructure necessary for capturing high quality audio and the difficulties associated with acquiring training data. We hope to promote assessment literacy by providing the necessary guidance for users to critically engage with automated speaking assessment, pose the right questions to test developers, and ultimately make informed decisions regarding the fitness for purpose of automated assessment solutions for their specific learning and assessment contexts.
    • Opposing tensions of local and international standards for EAP writing programmes: who are we assessing for?

      Bruce, Emma; Hamp-Lyons, Liz; City University of Hong Kong; University of Bedfordshire (Elsevier Ltd, 2015-04-24)
      In response to recent curriculum changes in secondary schools in Hong Kong including the implementation of the 3-3-4 education structure, with one year less at high school and one year more at university and the introduction of a new school leavers' exam, the Hong Kong Diploma of Secondary Education (HKDSE), universities in the territory have revisited their English language curriculums. At City University a new EAP curriculum and assessment framework was developed to fit the re-defined needs of the new cohort of students. In this paper we describe the development and benchmarking process of a scoring instrument for EAP writing assessment at City University. We discuss the opposing tensions of local (HKDSE) and international (CEFR and IELTS) standards, the problems of aligning EAP needs-based domain scales and standards with the CEFR and the issues associated with attempting to fulfil the institutional expectation that the EAP programme would raise students' scores by a whole CEFR scale step. Finally, we consider the political tensions created by the use of external, even international, reference points for specific levels of writing performance from all our students and suggest the benefits of a specific, locally-designed, fit-for-purpose tool over one aligned with universal standards.
    • Opposing tensions of local and international standards for EAP writing programmes: who are we assessing for?

      Bruce, Emma Louise; Hamp-Lyons, Liz; City University of Hong Kong; University of Bedfordshire (Elsevier, 2015-04-24)
      In response to recent curriculum changes in secondary schools in Hong Kong including the implementation of the 3-3-4 education structure, with one year less at high school and one year more at university and the introduction of a new school leavers' exam, the Hong Kong Diploma of Secondary Education (HKDSE), universities in the territory have revisited their English language curriculums. At City University a new EAP curriculum and assessment framework was developed to fit the re-defined needs of the new cohort of students. In this paper we describe the development and benchmarking process of a scoring instrument for EAP writing assessment at City University. We discuss the opposing tensions of local (HKDSE) and international (CEFR and IELTS) standards, the problems of aligning EAP needs-based domain scales and standards with the CEFR and the issues associated with attempting to fulfil the institutional expectation that the EAP programme would raise students' scores by a whole CEFR scale step. Finally, we consider the political tensions created by the use of external, even international, reference points for specific levels of writing performance from all our students and suggest the benefits of a specific, locally-designed, fit-for-purpose tool over one aligned with universal standards.
    • The origins and adaptations of English as a school subject

      Goodwyn, Andrew (Cambridge University Press, 2019-12-31)
      This chapter will consider the particular manifestation of English as a ‘school subject’, principally in the country called England and using some small space for significant international comparisons, and it will mainly focus on the secondary school version. We will call this phenomenon School Subject English (SSE). The chapter will argue that historically SSE has gone through phases of development and adaptation, some aspects of these changes inspired by new theories and concepts and by societal change, some others, especially more recently, entirely reactive to external impositions (for an analysis of the current position of SSE, see Roberts, this volume). This chapter considers SSE to have been ontologically ‘expanded’ between 1870 and (about) 1990, increasing the ambition and scope of the ‘subject’ and the emancipatory ideology of its teachers. This ontological expansion was principally a result of adding ‘models’ of SSE, models that each emphasise different epistemologies of what counts as significant knowledge, and can only exist in a dynamic tension. In relation to this volume, SSE has always incorporated close attention to language but only very briefly (1988–1992) has something akin to Applied Linguistics had any real influence in the secondary classroom. However, with varying emphasis historically, there has been attention (the Adult Needs/Skills model, see later) to the conventions of language, especially ‘secretarial’ issues of spelling and punctuation, some understanding of grammar, and a focus on notions of Standard English, in writing and in speech; but these have never been the driving ideology of SSE. Of the two conceptual giants ‘Language’ and ‘Literature’, it is the latter that has mattered most over those 120 years.
    • Paper-based vs computer-based writing assessment: divergent, equivalent or complementary?

      Chan, Sathena Hiu Chong (Elsevier, 2018-05-16)
      Writing on a computer is now commonplace in most post-secondary educational contexts and workplaces, making research into computer-based writing assessment essential. This special issue of Assessing Writing includes a range of articles focusing on computer-based writing assessments. Some of these have been designed to parallel an existing paper-based assessment, others have been constructed as computer-based from the beginning. The selection of papers addresses various dimensions of the validity of computer-based writing assessment use in different contexts and across levels of L2 learner proficiency. First, three articles deal with the impact of these two delivery modes, paper-based or computer-based, on test takers’ processing and performance in large-scale high-stakes writing tests; next, two articles explore the use of online writing assessment in higher education; the final two articles evaluate the use of technologies to provide feedback to support learning.
    • Phatic communication and relevance theory: a reply to Ward & Horn

      Žegarac, Vladimir; Clark, Billy (Cambridge University Press, 1999-11-01)
      In Žegarac & Clark (1999) we try to show how phatic communication can be explained within the framework of Relevance Theory. We suggest that phatic communication should be characterized as a particular type of interpretation, which we call ‘phatic interpretation’. On our account, an interpretation is phatic to the extent that its main relevance lies with implicated conclusions which do not depend on the explicit content of the utterance, but rather on the communicative intention (where ‘depends on X’ means: ‘results from an inferential process which takes X as a premise’).
    • Phatic interpretations and phatic communication

      Žegarac, Vladimir; Clark, Billy (Cambridge University Press, 1999-07-01)
      This paper considers how the notion of phatic communication can best be understood within the framework of Relevance Theory. To a large extent, we are exploring a terminological question: which things which occur during acts of verbal communication should the term 'phatic' apply to? The term is perhaps most frequently used in the phrase 'phatic communication', which has been thought of as an essentially social phenomenon and therefore beyond the scope of cognitive pragmatic theories. We suggest, instead, that the term should be applied to interpretations and that an adequate account of phatic interpretations requires an account of the cognitive processes involved in deriving them. Relevance Theory provides the basis for such an account. In section 1, we indicate the range of phenomena to be explored. In section 2, we outline the parts of Relevance Theory which are used in our account. In section 3, we argue that the term 'phatic' should be applied to interpretations, and we explore predictions about phatic interpretations which follow from the framework of Relevance Theory, including the claim that phatic interpretations should be derived only when non-phatic interpretations are not consistent with the Principle of Relevance. In section 4 we consider cases where cognitive effects similar to those caused by phatic interpretations are conveyed but not ostensively communicated. © 1999 Cambridge University Press.
    • Placement testing

      Green, Anthony (TESOL International Association and Wiley, 2018-01-01)
    • Placing construct definition at the heart of assessment: research, design and a priori validation

      Chan, Sathena Hiu Chong; Latimer, Nicola (Cambridge University Press, 2020-04-01)
      In this chapter, we will first highlight Professor Cyril Weir’s major research into the nature of academic reading. Using one of his test development projects as an example, we will describe how the construct of academic reading was operationalised in the local context of a British university by theoretical construct definition together with empirical analyses of students’ reading patterns on the test through eye-tracking. As we progress through the chapter we reflect on how Weir’s various research projects fed into the development of the test and a new method of analysing eye-tracking data in relation to different types of reading.
    • Preparing for admissions tests in English

      Yu, Guoxing; Green, Anthony; University of Bristol; University of Bedfordshire (Taylor & Francis, 2021-05-06)
      Test preparation for admissions to education programmes has always been a contentious issue (Anastasi, 1981; Crocker, 2003; Messick, 1982; Powers, 2012). For Crocker (2006), ‘No activity in educational assessment raises more instructional, ethical, and validity issues than preparation for large-scale, high-stakes tests.’ (p. 115). Debate has often centred around the effectiveness of preparation and how it affects the validity of test score interpretations; equity and fairness of access to opportunity; and impacts on learning and teaching (Yu et al., 2017). A focus has often been preparation for tests originally designed for domestic students, for example, SATs (e.g., Alderman & Powers, 1980; Appelrouth et al., 2017; Montgomery & Lilly, 2012; Powers, 1993; Powers & Rock, 1999; Sesnowitz et al., 1982) and state-wide tests (e.g., Firestone et al., 2004; Jäger et al., 2012), but the increasing internationalisation of higher education has added a new dimension. To enrol in higher education programmes which use English as the medium of instruction, increasing numbers of international students whose first language is not English are now taking English language tests, or academic specialist tests administered in English, or both. The papers in this special issue concern how students prepare for these tests and the roles in this process of the tests themselves and of the organisations that provide them.
    • Purposes of assessment

      Hamp-Lyons, Liz (De Gruyter/Mouton, 2016-11-01)
      Commissioned as the lead paper in the Volume, this chapter shows how the terms assessment, testing, examining and evaluation have become increasingly intertwined, and multiple new terms have emerged. The chapter takes a somewhat historical view as it lays out a picture of the ways that the field of language ‘assessment’ has expanded its knowledge and skills but also its socio-political effects. Taking the notion of ‘purposes’ as a broad organizing principle, the chapter is intended to introduce new readers in this field to the wide range of issues current in the field, and prepare them for the more specific chapters that follow.
    • Rating scale development: a multistage exploratory sequential design

      Galaczi, Evelina D.; Khabbazbashi, Nahal; Cambridge English Language Assessment (Cambridge University Press, 2016-03-01)
      The project chosen to showcase the application of the exploratory sequential design in second/foreign (L2) language assessment comes from the context of rating scale development and focuses on the development of a set of scales for a suite of high-stakes L2 speaking tests. The assessment of speaking requires assigning scores to a speech sample in a systematic fashion by focusing on explicitly defined criteria which describe different levels of performance (Ginther 2013). Rating scales are the instruments used in this evaluation process, and they can be either holistic (i.e. providing a global overall assessment) or analytic (i.e. providing independent evaluations for a number of assessment criteria, e.g. Grammar, Vocabulary, Organisation, etc.). The discussion in this chapter is framed within the context of rating scales in speaking assessment. However, it is worth noting that the principles espoused, stages employed and decisions taken during the development process have wider applicability to performance assessment in general.
    • Reading in a second language: process, product and practice

      Urquhart, A.H.; Weir, Cyril J. (Routledge, 2014-01-01)
      Reading in a Second Language sets the testing and teaching of reading against a theoretical background, discussing research from both applied linguistics and cognitive psychology. Where possible, it focuses on research into second language readers and distinguishes different kinds of reading, particularly expeditious as opposed to careful reading, and emphasizes the validity of each. Sandy Urquhart and Cyril Weir relate testing and teaching, discussing similarities and differences, providing a comprehensive survey of both methods with the emphasis on those which have been substantiated or supported by research evidence. Finally, the book proposes specific research topics, and detailed advice on how to construct tests of language for academic purposes and suggestions for further research.
    • Recommending a nursing-specific passing standard for the IELTS examination

      O'Neill, Thomas R.; Buckendahl, Chad W.; Plake, Barbara S.; Taylor, Lynda (Taylor & Francis, 2007-12-05)
      Licensure testing programs in the United States (e.g., nursing) face an increasing challenge of measuring the competency of internationally trained candidates, both in relation to their clinical competence and their English language competence. To assist with the latter, professional licensing bodies often adopt well-established and widely available international English language proficiency measures. In this context, the National Council of State Boards of Nursing (NCSBN) sought to develop a nursing-specific passing standard on the International English Language Testing System that U.S. jurisdictions could consider in their licensure decisions for internationally trained candidates. Findings from a standard setting exercise were considered by NCSBN's Examination Committee in conjunction with other relevant information to produce a legally defensible passing standard on the test. This article reports in detail on the standard setting exercise conducted as part of this policy-making process; it describes the techniques adopted, the procedures followed, and the outcomes obtained. The study is contextualized within the current literature on standard setting. The latter part of the article describes the nature of the policy-making process to which the study contributed and discusses some of the implications of including a language literacy test as part of a licensure testing program.
    • Reflecting on the past, embracing the future

      Hamp-Lyons, Liz; University of Bedfordshire (Elsevier, 2019-10-14)
      In the Call for Papers for this anniversary volume of Assessing Writing, the Editors described the goal as “to trace the evolution of ideas, questions, and concerns that are key to our field, to explain their relevance in the present, and to look forward by exploring how these might be addressed in the future” and they asked me to contribute my thoughts. As the Editor of Assessing Writing between 2002 and 2017, a fifteen-year period, I realised from the outset that this was a very ambitious goal, one that no single paper could accomplish. Nevertheless, it seemed to me an opportunity to reflect on my own experiences as Editor, and through some of those experiences, offer a small insight into what this journal has done (and not done) to contribute to the debate about the “ideas, questions and concerns”; but also, to suggest some areas that would benefit from more questioning and thinking in the future. Despite the challenges of the task, I am very grateful to current Editors Martin East and David Slomp for the opportunity to reflect on these 25 years and to view them, in part, through the lens provided by the five articles appearing in this anniversary volume.
    • The relative significance of syntactic knowledge and vocabulary breadth in the prediction of reading comprehension test performance

      Shiotsu, Toshiko; Weir, Cyril J.; Kurume University, Japan; University of Bedfordshire (SAGE, 2007-01-01)
      In the componential approach to modelling reading ability, a number of contributory factors have been empirically validated. However, research on their relative contribution to explaining performance on second language reading tests is limited. Furthermore, the contribution of knowledge of syntax has been largely ignored in comparison with the attention focused on vocabulary. This study examines the relative contribution of knowledge of syntax and knowledge of vocabulary to L2 reading in two pilot studies in different contexts - a heterogeneous population studying at the tertiary level in the UK and a homogeneous undergraduate group in Japan - followed by a larger main study, again involving a homogeneous Japanese undergraduate population. In contrast with previous findings in the literature, all three studies offer support for the relative superiority of syntactic knowledge over vocabulary knowledge in predicting performance on a text reading comprehension test. A case is made for the robustness of structural equation modelling compared to conventional regression in accounting for the differential reliabilities of scores on the measures employed. © 2007 SAGE Publications.
    • Relevance Theory and ‘the’ in second language acquisition

      Žegarac, Vladimir (SAGE, 2004-07-01)
      This article considers the implications of Sperber and Wilson's (1986/95) Relevance Theory for the acquisition of English ‘the’ by second language (L2) learners whose first language (L1) does not have an article system. On the one hand, Relevance Theory provides an explicit characterization of the semantics of ‘the’, which suggests ways of devising more accurate guidelines for teaching/learning than are available in current textbooks. On the other hand, Relevance Theoretic assumptions about human communication together with some effects of transfer from L1 provide the basis for a number of predictions about the types of L2 learners' errors in the use of ‘the’. I argue that data from previous research (Trenkić, 2002) lend support to these predictions, and I try to show that examples drawn from the data I have collected provide evidence for the view that L2 learning is not influenced only by general pragmatic principles and hypotheses about L2 based on transfer from L1, but that learners also devise and test tacit hypotheses which are idiosyncratic to them.
    • Repeated test-taking and longitudinal test score analysis: editorial

      Green, Anthony; Van Moere, Alistair; University of Bedfordshire; MetaMetrics Inc. (Sage, 2020-09-27)
    • Research and practice in assessing academic English: the case of IELTS

      Taylor, Lynda; Saville, N. (Cambridge University Press, 2019-12-01)
      Test developers need to demonstrate they have premised their measurement tools on a sound theoretical framework which guides their coverage of appropriate language ability constructs in the tests they offer to the public. This is essential for supporting claims about the validity and usefulness of the scores generated by the test. This volume describes differing approaches to understanding academic reading ability that have emerged in recent decades and goes on to develop an empirically grounded framework for validating tests of academic reading ability. The framework is then applied to the IELTS Academic reading module to investigate a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors demonstrate how a systematic understanding and application of the framework and its components can help test developers to operationalise their tests so as to fulfill the validity requirements for an academic reading test. The book provides: an up-to-date review of the relevant literature on assessing academic reading; a clear and detailed specification of the construct of academic reading; an evaluation of what constitutes an adequate representation of the construct of academic reading for assessment purposes; and a consideration of the nature of academic reading in a digital age and its implications for assessment research and test development. The volume is a rich source of information on all aspects of testing academic reading ability. Examination boards and other institutions who need to validate their own academic reading tests in a systematic and coherent manner, or who wish to develop new instruments for measuring academic reading, will find it of interest, as will researchers and graduate students in the field of language assessment, and those teachers preparing students for IELTS (and similar tests) or involved in English for Academic Purposes programmes.