An evaluation of an online comparative judgement platform on teacher workload and pupils’ English outcomes in Years 7, 8 and 9

Description of the innovation

During intervention lessons, pupils were supported to make multiple comparative judgements of both older pupils’ work and the work of their peers. This was done using No More Marking software in four lessons over two cycles of work.

Summary of the evaluation

The study involved 24 classes of Year 7–9 pupils from four urban secondary schools with a lower-than-national-average proportion of disadvantaged pupils. Two classes from each year group in each school took part. Assignment to treatment was carried out at whole-class level, using key stage 2 (KS2) writing scores to minimise the difference in prior attainment between the control and intervention cohorts.
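
As an illustration only, class-level assignment that minimises a prior-attainment gap could be sketched as below. This summary does not describe the exact procedure the study used, so the function name `assign_classes`, the brute-force search over class pairs and the example scores are all assumptions made for the sketch, not the authors' method or data.

```python
from itertools import product


def assign_classes(pairs):
    """Choose one class per pair for the intervention arm (hypothetical helper).

    pairs: list of (class_a, class_b); each class is a list of prior scores.
    Returns (choice, gap), where choice[i] is 0 or 1 (the index of the class
    in pair i sent to the intervention arm) and gap is the absolute difference
    in mean prior score between the two arms.
    """
    best_choice, best_gap = None, float("inf")
    # Try every way of splitting the pairs; 12 pairs give only 4,096 options.
    for choice in product((0, 1), repeat=len(pairs)):
        treat = [s for pair, c in zip(pairs, choice) for s in pair[c]]
        ctrl = [s for pair, c in zip(pairs, choice) for s in pair[1 - c]]
        gap = abs(sum(treat) / len(treat) - sum(ctrl) / len(ctrl))
        if gap < best_gap:
            best_choice, best_gap = choice, gap
    return best_choice, best_gap


# Made-up prior scores for two class pairs (NOT data from the study).
example_pairs = [
    ([100, 104, 108], [98, 102, 106]),
    ([95, 99, 103], [97, 101, 105]),
]
choice, gap = assign_classes(example_pairs)
```

Because whole classes are assigned, the arms cannot be balanced pupil by pupil; the search only picks the split of intact classes whose pooled means are closest.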

The pre-test, immediate post-test and delayed post-test (one month after completing the second cycle of work) were all questions in the style of GCSE English Paper 1 Section B – a descriptive piece of writing based on a visual stimulus.

Summary of results

The study found that pupils’ use of No More Marking for two cycles of key stage 3 (KS3) descriptive writing lessons, over a period of one to two months, led to outcomes in descriptive writing comparable to those achieved with conventional teacher marking (delayed post-test effect size = -0.06, n = 466). However, effect sizes at the delayed post-test varied between boys (-0.23), girls (+0.14) and disadvantaged pupils (-0.20). In almost all cases the effect size was larger for the delayed post-test than for the immediate post-test.
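
For readers unfamiliar with effect sizes, the sketch below shows one common definition: a standardised mean difference using a pooled standard deviation (Cohen's d). This summary does not say which effect size measure or sign convention the study used, so the formula, the function name `standardised_mean_difference` and the example scores are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(intervention, control):
    """Cohen's d: difference in group means divided by the pooled SD.

    Under this (assumed) convention, a negative value means the
    intervention group scored lower on average than the control group.
    """
    n1, n2 = len(intervention), len(control)
    # Pooled variance weights each group's sample variance by its
    # degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(intervention) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(intervention) - mean(control)) / sqrt(pooled_var)


# Made-up post-test scores (NOT data from the study).
d = standardised_mean_difference([10, 12, 14], [11, 13, 15])
```

On this definition, an effect size of -0.06 would correspond to a difference between groups of about six hundredths of a standard deviation.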

The study also found that the intervention reduced teachers’ perception of their workload (t-test p-value < 0.001) compared with the work involved in conventional marking and feedback. Qualitative pupil responses indicated greater enjoyment of lessons than normal among the intervention cohort. Pupil responses also indicated that, despite the withdrawal of the (often labour-intensive) conventional feedback provided by the teacher, the intervention cohort felt equally able both to describe what a good piece of work looked like and to produce a better-quality piece of work in the future. These results are important because they could have positive implications for teacher retention.


Lead school

  • Notre Dame Catholic High School, Sheffield

Main findings

  • overall delayed post-test effect size = -0.06
  • delayed post-test effect sizes by subgroup: -0.23 for boys, +0.14 for girls and -0.20 for disadvantaged pupils

