Volume 6, 2006, Paper 2
Paper 2: An examination of the rating process in the revised
IELTS Speaking Test
Annie Brown
Ministry of Higher Education and Scientific Research
United Arab Emirates
This study examines the validity of the analytic rating scales
used to assess performance in the IELTS Speaking Test through an
analysis of verbal reports produced by IELTS examiners when rating
test performances and of their responses to a subsequent
questionnaire.
ABSTRACT
In 2001 the IELTS interview format and criteria were revised. A
major change was the shift from a single global scale to a set of
four analytic scales focusing on different aspects of oral
proficiency. This study is concerned with the validity of the
analytic rating scales. Through a combination of stimulated verbal
report data and questionnaire data, this study seeks to analyse how
IELTS examiners interpret the scales and how they apply them to
samples of candidate performance.
This study addresses the following questions:
- How do examiners interpret the scales and what performance
features are salient to their judgements?
- How easy is it for examiners to differentiate levels of
performance in relation to each of the scales?
- What problems do examiners identify when attempting to make
rating decisions?
Experienced IELTS examiners were asked to provide verbal reports
after listening to and rating a set of interviews. Each
examiner also completed a detailed questionnaire about their
reactions to the approach to assessment. The data were transcribed,
coded and analysed according to the research questions guiding the
study.
Findings showed that, in contrast with their use of the earlier
holistic scale (Brown, 2000), the examiners adhered closely to the
descriptors when rating. In general, the examiners found the scales
easy to interpret and apply. Problems that they identified related
to overlap between the scales, a lack of clear distinction between
levels, and the inference-based nature of some criteria. Examiners
reported the most difficulty with the Fluency and Coherence scale,
and there were concerns that the Pronunciation scale did not
adequately differentiate levels of proficiency.