Big data or small data?

We hear a great deal about big data and its potential to revolutionise education. It's time to analyse the reality: what data should we be looking at, and is this potential ready to be realised?

In England, we are blessed with a phenomenal amount of publicly available data about our education system, released by the Department for Education (DfE) and its partner agencies. The most impressive source of information is the National Pupil Database (NPD), which securely holds pupil-level data on every child in publicly funded education.

The potential applications of the NPD are vast, and the DfE recently broadened the criteria for using NPD data so that they now cover usage “for the purpose of promoting the education or well-being of children in England”. School-level and local authority-level data, which are available to download, also offer tremendous potential.

Interrogating such datasets can provide valuable insights. For example, we have used NPD data to project demand for post-16 provision for children with learning difficulties, and to understand why students drop out of school at 17. Our Skills Route project uses value-added data from performance tables to provide students with post-16 careers advice, and won the recent Education Open Data Challenge.

Keeping things small

While these large-scale, nationwide datasets are powerful and vital for central policy and accountability purposes, they are arguably not the most important sources of data in our education system. And here is where we must exercise caution in our excitement about ‘big data’.

In terms of informing day-to-day pedagogical decisions, there is no substitute for schools’ own in-house tracking data and individual question-level analysis. While the datasets may be smaller, and therefore their technical requirements lower, this is the only real-time way of identifying which children need more support, and in which particular areas of the curriculum. This is why in Wandsworth, where I work, virtually all primary schools use analysis from Optional Year 3, 4 and 5 tests, which allows them to set up targeted interventions for individuals or groups of pupils before they reach the end of the primary phase. Outcomes at Key Stage 2 have benefitted as a result, and Wandsworth now ranks amongst the top 5 local authorities in England for progress from Key Stage 1 to 2 in each of reading, writing and maths.
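To make the idea of question-level analysis concrete, here is a minimal sketch in Python. It is not the method used in Wandsworth schools: the pupil names, curriculum areas, marks and the 50% threshold are all hypothetical, chosen only to show how a small in-house dataset can flag which pupils may need support and in which areas.

```python
from collections import defaultdict

# Hypothetical question-level results (illustrative data only):
# (pupil, curriculum area, marks scored, marks available)
results = [
    ("Pupil A", "fractions", 1, 4),
    ("Pupil A", "fractions", 0, 2),
    ("Pupil A", "geometry", 3, 4),
    ("Pupil B", "fractions", 4, 4),
    ("Pupil B", "fractions", 2, 2),
    ("Pupil B", "geometry", 1, 4),
]


def areas_needing_support(results, threshold=0.5):
    """Return {pupil: [areas]} for areas where the pupil scored
    below `threshold` (an assumed cut-off) of the marks available."""
    scored = defaultdict(int)
    available = defaultdict(int)

    # Total up marks per pupil and curriculum area across all questions.
    for pupil, area, marks, out_of in results:
        scored[(pupil, area)] += marks
        available[(pupil, area)] += out_of

    # Flag each pupil/area pair whose share of marks falls below the threshold.
    flagged = defaultdict(list)
    for (pupil, area), total in available.items():
        if total and scored[(pupil, area)] / total < threshold:
            flagged[pupil].append(area)

    return {pupil: sorted(areas) for pupil, areas in flagged.items()}


print(areas_needing_support(results))
```

In practice the threshold and the grouping of questions into curriculum areas would be a judgement for teachers, which is exactly where practitioner knowledge comes in.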

The Future…

As online testing and technologies like tablet PCs become more commonplace in the classroom, the potential for efficiently gathering ever bigger sets of data will increase and, hand in hand with it, so will opportunities for personalised learning. Longer term, there may be opportunities for seamlessly linking educational outcomes to other sources of personal data such as health, nutrition and exercise, but privacy concerns and data sharing practicalities are likely to mean this aspiration is some way off.

However, before we start craving more and more data, we need to focus on getting the most out of the data we already have. Teachers often tell us that they are already swamped with too many sources of data; the way to get the most out of this data is by improving how it is consolidated, communicated and visualised, and in turn understood and applied by those in the classroom.

There is much talk about big data in education, and there is no doubt that it has potential to revolutionise teaching and learning, but the key to generating actionable insights is having the skills to interrogate and understand data, both big and small. As public funding continues to be scaled back, getting the most out of the data we already have, when combined with deep practitioner knowledge, will ensure we focus resources on the pupils who need them most.

Steve Preston is Director of MIME Consulting, an educational data consultancy, as well as Performance Information Manager in the London Borough of Wandsworth.
