The numbers game

As Ofqual launches a consultation on setting standards for the new GCSEs, Ben Jones considers the complexities of establishing a new system.

Earlier this month Ofqual published its consultation on Setting the Grade Standards of new GCSEs in England, the first of which – English, English Literature and mathematics – will be awarded in summer 2017.

While some aspects appear already decided – currently a numerical, 9-point grade range and the ‘pass grade’ to be more demanding than the current grade C ‘to reflect that of high-performing jurisdictions’ – there is much to be resolved.

Phase 1: Stats and standards

At first reading the proposals appear uncontroversial and unproblematic, a good basis for building a new consensus. But beneath the surface, competing – even conflicting – approaches to setting grade standards exist. These reflect different definitions of standards, and the interests and perspectives of the key stakeholders (e.g. government, the regulator, teachers, students, parents, further and higher education, employers and exam boards). Each of these has legitimate expectations of what the grade standards mean and how they are defined, and exam boards, which have to implement whatever is decided, have a key role to play.

Take grade 5, for example, the likely new ‘good pass’ grade. Teachers and employers will legitimately want to know what level of student performance is required to be awarded it. The Secretary of State will want the grade to be benchmarked against the assessments of (high-performing) international competitors. Exam boards, FE colleges and others will want to ensure standards are comparable across successive series. It is to be hoped that these – and other – requirements coincide, but there is no guarantee that they will, especially in the early series of the new qualification.

"Changing the grade range doesn't necessarily mean better outcomes any more than measuring temperature in Celsius rather than Fahrenheit changes the heat of the environment."

The proposal to link the standards of the old and new specifications statistically, probably at three grades – C=4, A=7 and F (or G)=1 – is sensible, as it provides important points of comparison in a significantly changing landscape. Nevertheless, the performance standard at all the new grades is unknown, and will likely improve in subsequent years as students and teachers become more familiar with the new requirements. Thus, although teachers in particular can legitimately expect to be told the performance standard of the new grades – e.g. what will a grade 5 “look like”? – it is impossible to produce accurate descriptions in the initial years. Only once stability has returned can these be precisely derived.
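To make the idea of a statistical link concrete, here is a minimal sketch of one way it could work: place the new grade boundary at the mark where the share of students at or above it is closest to the share who reached the corresponding old grade. This is an illustration only, not Ofqual's or any exam board's actual procedure. The function name, the ten-student cohort and the target shares are all hypothetical.

```python
from collections import Counter


def boundary_matching_share(marks, target_share, max_mark):
    """Return the mark whose at-or-above share is closest to target_share.

    Hypothetical helper for illustration; ties go to the higher mark.
    """
    n = len(marks)
    counts = Counter(marks)
    best_mark, best_gap = None, None
    at_or_above = 0
    # Walk down from the top mark, accumulating students at or above each mark.
    for mark in range(max_mark, -1, -1):
        at_or_above += counts.get(mark, 0)
        gap = abs(at_or_above / n - target_share)
        if best_gap is None or gap < best_gap:
            best_mark, best_gap = mark, gap
    return best_mark


# Invented data: marks for ten students on a new paper marked out of 100.
marks = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Assumed shares of the previous cohort at C and above, and at A and above.
old_shares = {"4": 0.7, "7": 0.2}

boundaries = {
    grade: boundary_matching_share(marks, share, max_mark=100)
    for grade, share in old_shares.items()
}
```

A link of this kind fixes the proportion of students at each anchored grade. It says nothing about what a student at that grade can actually do, which is why the performance standard remains unknown in the early series.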

It is, however, on the second year of awarding that the focus will fall.

Phase 2: Stability

At this stage, there are two main concerns about possible scenarios. All standard setting – whether statistical or judgmental – is ultimately comparative in nature and the consultation refers to several bases of comparison: previous cohorts’ outcomes, international benchmarking, reference testing, grade descriptors. But the similarity here is relatively superficial, and attempting to force what in reality are very different approaches into one single standard-setting process risks creating a hodge-podge system rather than a coherent whole.

There is a strong case for maintaining a consistent approach across all the grades – certainly the key ones – within a given exam series. (The experience of how A* is awarded at GCE should give one pause for thought at least.) There are, however, suggestions that grade 5 will be treated differently, anchored to an international benchmark (the derivation of which itself is not uncontroversial). Such a mixed economy risks introducing uncontrollable and unpredictable relationships between adjacent grade boundaries and outcomes, and thus increases the chances of students being awarded inappropriate grades. There is no reason why grade 5 (or indeed any grade) could not be compared to the international benchmark scale; it does not have to be tied to it. Moreover, tying a grade to a specific place on an international benchmark does not improve the test, nor does it, of itself, improve learning.

The second concern regards whether what is decided in terms of grade standards will be fit for purpose. In particular, do the grades differentiate student ability sufficiently across the grade range (i.e. no excessive bunching at particular grades) and do they minimise the chances of students falling into the wrong grade bracket (by ensuring suitable grade widths in terms of marks)?

A couple of final thoughts.

If the last 15 years have taught us anything it is that, even with the best of intentions and maximum forethought, modifying qualifications is fraught with problems, many unforeseen and unanticipated. It will require hard work and expertise to create a new system that will work effectively, and serve students and other stakeholders well.

Improving the assessment and changing the grade range do not necessarily mean better performance outcomes, any more than measuring temperature in Celsius rather than Fahrenheit changes the heat of the environment. Making the assessment harder does not, in itself, raise (educational or performance) standards. Of course it is in all our interests that learning and performance outcomes improve as a result of these reforms, and that these improvements are recognised in the grades awarded. The proposals in the consultation make a good start, but to say that reform on this scale is complex is perhaps an understatement. Cooperation across the industry, guidance from assessment experts and the regulator, and a pragmatic approach will help ensure that we get our numbers right first time, and make the transition as smooth as possible.

Ben Jones
