How good is teacher assessment?

Stefan Johansson discusses the role of teacher assessment in primary school and evaluates its effectiveness as a measure of pupil achievement.

Although there are numerous ways of assessing pupil knowledge and skills, teacher assessment is the most common way to do this in primary school, at least in European countries.

Teachers’ assessments are not used only for reporting and documenting pupils’ results: assessment is a prominent aspect of teaching, crucial for promoting learning, and perhaps particularly so in the primary school years. Because of the vital role played by teachers in assessment, there is often particular interest in the trustworthiness of teachers’ judgements.

For high-stakes assessment and other summative assessments, the validity of teacher assessment has been questioned, and standardised tests have sometimes been preferred. Indeed, if summative assessments differ between teachers, it is likely that teachers’ formative feedback will be different too, since in practice these concepts often work together.

Our current research seeks answers about the strengths and weaknesses of teacher assessment in primary school. It uses data from one of the international studies of reading literacy, PIRLS (Progress in International Reading Literacy Study) 2001. In connection with PIRLS 2001, Sweden had a national extension in which teachers assessed their pupils’ knowledge and skills in the Swedish language. Instead of writing a qualitative judgement about their pupils’ ability, teachers quantified their judgements on a scale from 1 to 10 in relation to 12 predefined aspects of reading and writing. To investigate the trustworthiness of teachers’ assessments, their judgements were related to their pupils’ reading test results in PIRLS, and to background information supplied by pupils, their parents and teachers.

The comparison between teacher judgements and PIRLS 2001 results addressed two questions. One concerned how well teachers assessed their own pupils’ knowledge and skills (i.e. how well they managed to rank pupils within their own class), while the other concerned whether teachers assessed pupils similarly across classes (i.e. whether a class’s average assessment corresponded to its average PIRLS result). The analysis was conducted with some 11,000 pupils and their teachers in grades 3 and 4 in primary school. Within classes, the correlation between teacher judgements and PIRLS results was about 0.65 in grade 3 and about 0.60 in grade 4.

These results accord reasonably well with other correlational studies of the same relationship, where the researchers typically consider teacher judgements trustworthy. However, the current analysis also reveals that teachers’ assessments are related to pupils’ gender and socioeconomic status in ways that the PIRLS test results are not. This could indicate that teachers are also taking account of motivation, effort and behaviour, which should not be assessed. What might be even more threatening to equality in assessment is our finding that teacher assessment was not consistent across classes. A class with a high average result in the PIRLS test could be assessed lower by its teacher than a middle- or low-performing class was by its teacher. This means that the correlation between PIRLS scores and teacher scores across classes was very modest.
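The difference between agreement within classes and agreement across classes can be illustrated with a small sketch. It is not the method used in the study: the data are invented, the function names `pearson` and `within_and_between` are hypothetical, and the approach (centring each pupil on the class mean, then correlating class means separately) is only one simple way of separating the two levels.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def within_and_between(records):
    """Return (within-class r, between-class r).

    records: iterable of (class_id, test_score, teacher_judgement).
    """
    by_class = defaultdict(list)
    for class_id, score, judgement in records:
        by_class[class_id].append((score, judgement))

    centred_scores, centred_judgements = [], []
    mean_scores, mean_judgements = [], []
    for pupils in by_class.values():
        class_mean_score = sum(s for s, _ in pupils) / len(pupils)
        class_mean_judgement = sum(j for _, j in pupils) / len(pupils)
        mean_scores.append(class_mean_score)
        mean_judgements.append(class_mean_judgement)
        # Centring removes each class's overall level, so only the
        # ranking of pupils inside the class is left.
        for s, j in pupils:
            centred_scores.append(s - class_mean_score)
            centred_judgements.append(j - class_mean_judgement)

    within_r = pearson(centred_scores, centred_judgements)
    between_r = pearson(mean_scores, mean_judgements)
    return within_r, between_r


# Invented example: every teacher ranks their own pupils in line with the
# test, but class C (lowest test average) gets the highest judgements.
records = [
    ("A", 480, 5), ("A", 520, 7), ("A", 560, 9),
    ("B", 430, 4), ("B", 470, 6), ("B", 510, 8),
    ("C", 380, 6), ("C", 420, 8), ("C", 460, 10),
]

within_r, between_r = within_and_between(records)
print(f"within-class r = {within_r:.2f}, between-class r = {between_r:.2f}")
```

In this made-up case the within-class correlation is perfect while the between-class correlation is negative, which is an exaggerated version of the pattern described above: teachers can rank their own pupils well and still differ in how generous their overall level of judgement is.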

The results also showed that third-grade teachers’ judgements corresponded more closely to pupil achievement within classes than fourth-grade teachers’ judgements did. Fourth-grade teachers had typically taught their pupils for only about one semester. The trustworthiness of teacher judgements therefore seems to be positively related to the length of contact with their pupils. Perhaps it is like visiting the doctor. Whom do you trust: the doctor who knows your full medical history, or the doctor who assesses you ‘cold’?

Finally, coming back to the question of whether it is possible to trust teachers’ assessments or not: we find that when teachers have spent a fair amount of time with their pupils then yes, they are in a good position to assess their pupils’ knowledge and skills. However, the level of grades or judgements they make may vary across classes, even when pupils’ levels of achievement are similar. Instruments for calibrating teacher assessment therefore seem crucial if equality in assessment is to be achieved.

Stefan Johansson works at the Department for Education and Special Education at the University of Gothenburg where he lectures on assessment in education, teaches courses on applied statistics and supervises undergraduate students. His research interests focus on issues of validity in educational assessments and the effects of teacher competence, pupil gender and socioeconomic status on different forms of assessment.

