Many ways to skin a cat: Delivering the same school workforce data for different users
I’m Callum, and I head up a data science team at the Department for Education (DfE), where we’ve been working on adapting data products to increase their usefulness and value for different user groups – ‘democratising our data’.
Government publishes a lot of data. In 2017 we published almost 3,600 different datasets, on everything from the number of unduly lenient criminal sentences to the wholesale price of fruit and vegetables. So the data is out there – theoretically anybody with a computer can go on the Government’s statistics page and access the publication (typically a static PDF with high-level descriptions of the key figures and trends, and an accompanying underlying dataset from which the graphs and trends were derived). But how easy is it for them to actually use the data, in a form that suits them, to get the answers they need?
Before I get into my data story we’ll take a bit of a diversion from Government data – I want to briefly mention a model of how people consume data. For any one piece of data put out into the big wide world to be consumed, this (unnamed) theory says that you’ll have three types of consumer (see here and here for the blogs from which I’ve taken some of the wording):
- Skimmers – those who only want to briefly engage with easily accessible content, getting the most newsworthy ‘who, what, why, where, when, how’ answers straight away
- Dippers – engaging more, but casually, to get ‘the important information’ behind the skimmers’ headlines
- Divers – those who want the ‘full story experience’, complete with background information such as methodologies and assumptions, or underlying data
So, what data am I talking about, and how does our work relate to ‘data democratisation’ and ‘skimmers, dippers, and divers’? On the first Thursday of every November DfE collects the ‘School Workforce Census’ (SWFC) – data on every single teacher in the country. Where they’re teaching, what they’re teaching, and whether there are any vacancies in their schools are some of the main areas it covers. This is then cleaned and compiled into a statistical publication that’s typically released in the following June or July. As with other government statistical publications, the SWFC’s two key products are a static PDF report (see 2016’s here), and an accompanying underlying dataset, published at school, Local Authority, and regional level (see 2016’s here). The PDF report caters for the skimmers – containing high-level findings like ‘between 2011 and 2016, the rate of entry into teaching has remained higher than the percentage of qualified teachers leaving the profession’ – and the underlying dataset could be exploited, visualised, modelled, or filtered by a ‘diver’ if they wanted to really get into it. But what about our dippers, who want to get into the detail, but not load the data into Excel, R, or Python? Or our less technical divers, who want to merge and compare data within the dataset, but don’t have the computing skills to do so? And what about the skimmers who just want or need that one headline figure for a school, Local Authority, or region?
We began exploring different options for putting this data in the hands of users in ways which were more useful for them and that allowed them, ultimately, to make better decisions. I won’t go into the details of our design and build process, because that’s not the theme of this blog. Instead I’ll talk about our two new data products and how they more effectively put data in the hands of the end user – the citizen.
Our first product (see here) is one that puts the SWFC data firmly in the hands of the ‘divers’. Previously, the underlying Excel spreadsheet wasn’t particularly conducive to getting an overview of one school, let alone comparing two or more schools. However, we knew anecdotally that users, particularly schools, were interested in being able to see how their workforce compared against schools with similar characteristics. So, the tool lets them do that. They’re able to either compare themselves against schools that are similar to them on any combination of 19 characteristics, or select specific schools to compare themselves against. By making comparisons of the data easier for schools, we’re hoping that, coupled with their contextual knowledge of their own school and of schools that are nearby or similar to them, they’re able to use this tool to improve the deployment of their workforce. Workforce planning isn’t a trivial task, so presenting the data in this format is designed to help the divers access detailed, granular information easily.
We’ve developed our second product in conjunction with Microsoft – it’s a good example of public/private collaboration! The product is a dashboard (see here for Microsoft’s blog about it and a link at the bottom to the dashboard) containing Local Authority and regional aggregated data about schools within each geographical boundary. This is one very much for the dippers – it doesn’t provide the headline statements which appeal more to skimmers, but it does give easily accessible higher-level time series and comparisons, without requiring too much work or technical knowledge to draw the findings out. Users (of which we foresee there being a wide range, both internal and external) are able to access data on workforce volumes, the characteristics of teaching and support staff, and absences and vacancies. There’s no new data here beyond what’s already been published; it’s just presented in a more accessible manner.
That’s the clincher – making our data as accessible as possible for our key users, to allow them to draw as many useful conclusions as they can from it in a manner suitable for them. We’re definitely not finished with this process, and we’ll be looking to refine and develop our product offering through feedback from users. However, the availability of powerful visualisation and data sharing tools has given us a fantastic opportunity to be innovative in the way we put our data in the hands of the citizen – it’s their data, so it’s only right we keep ‘democratising’ it.