Volume 5, 2003, Paper 3
Paper 3: 'What's your score?': An investigation into language
descriptors for rating written performance
Peter Mickan
This study addresses the problem of inconsistency in rating
IELTS exams and the need for valid criteria for rating levels of
written performance.
Determining written performance involves a series of complex
semiotic events or processes. On the part of examination candidates
these include the interpretation of prompts and the composition of
written responses. On the part of raters, the process includes
interpreting criteria for rating, making sense of candidates'
scripts and attributing rating criteria to text features in the
scripts. Each of these events contributes to variation in scoring
individual scripts. Scoring is a linguistically mediated activity,
and for this reason I planned the study as an investigation into
the language features of candidates' scripts at different levels of
written performance, to see whether those features might delineate
performance level.
The data for this study came from students who were non-native
speakers of English studying general English and English for
academic purposes. The students wrote responses to Tasks 1 and 2 of
the IELTS General Training Writing Module. Their texts were graded
into three performance levels—basic, intermediate and upper
intermediate.
The study examined language features of the subjects' texts as
possible indicators of written language development at different
levels of performance. The analysis identified numerous linguistic
options writers chose in response to the prompts. Some of these
stemmed from misunderstandings of cultural knowledge implicit in
the topics of the prompts. Different interpretations of the task
resulted in observable differences in responses. However, the
analysis revealed that less developed texts expressed the
interpersonal function with difficulty, using familiar and personal
terminology when more formal linguistic choices would have been
appropriate. Lower-level texts were limited in their organisation
of factual information, which was expressed with a restricted
range of technical terminology. More developed texts demonstrated fluency through the
use of clearly linked concepts. Even though upper intermediate
texts contained structural errors, these did not interfere with
understanding candidates' meanings.
The analysis indicates that it is not easy to identify specific
lexicogrammatical features that distinguish different levels of
performance. It is the sum of language features, integrated
textually, that creates successful scripts. This suggests that
assessment that focuses on isolated language elements, such as
vocabulary or sentence structures, detracts from the semiotic
processes of composing texts and of interpreting, and therefore
rating, texts. The study proposes that as rating is a complex,
meaning-making activity, raters respond first and foremost to texts
as a whole rather than to individual components. A possible
explanation for this is that raters have expectations which stem
from sociocultural conventions in terms of text types and social
purposes of texts. Assessment is a response to the conventional
combinations of linguistic elements in texts. These elements are
numerous and vary considerably across texts.
The analysis of lexicogrammatical elements of texts provides
evidence of how social purposes are realised in texts and raises
questions about how these influence rating processes. The study
recommends further research into the use of holistic descriptors
for scoring written texts.