Michael Gove won new friends at a 'hack weekend' on pupil data. Tony Parkin reports
As the nation forlornly pinned its hopes, via television, on the Men's Wimbledon final, the men who watch sheep were sitting attentively in the Westminster Hub in the Haymarket, ignoring the tennis.
They were at the 'show and tell' at the end of Rewired State's National Pupil Data Appathon, listening to Zoe Rose, a young information architect, tell them you don't improve sheep by weighing them. And foremost among those men who watched sheep, though not sheepishly, was Michael Gove MP, secretary of state for education, along with a group of advisers and Department for Education (DfE) officials.
The usual pizza marathon and sleepless night of the opening Saturday at a 'hack weekend' was followed by the Sunday afternoon session given by some slightly disappointed developers. But while Andy Murray was snatching defeat from the jaws of victory at Wimbledon, the Rewired State crew achieved the exact opposite. And the unexpected presence of the secretary of state at the 'show and tell' fazed them not at all.
With the benefit of hindsight, it would have been hard to choose a worse weekend to run a hack event. Conflicting events in San Francisco, and by both Google and Mozilla in London, were not known about when the date was set, and they helped thin the ranks of Rewired State developers to a handful. Who knows, there may even have been Grand Prix and Wimbledon fans in the hacker community who might otherwise have attended. Happily the Young Rewired State crew turned up in their usual numbers – perhaps also because pupil data is of direct relevance to them, and of course they could use the hack day to start trying to find themselves, in all possible meanings of that phrase.
DfE data custodians included both the positive and the wary
The DfE data custodians were in two minds about the whole idea. While some were enthusiastic about the transparency and open data promoted in the Cabinet Office White Paper, others were very wary. Short of National Health Service records, it is hard to think of more potentially sensitive data than the DfE's pupil-level database.
A huge dataset, it contains the individual performance of each and every child passing through our educational system. It can be used as a stick with which to beat schools, to highlight their lack of 'value-add', or it can provide a school with powerful information to help it personalise learning and get the best out of every child – and teacher. So perhaps the element of caution was not surprising, given the prospect of scare-mongering Daily Mail headlines and teenage hackers being able to identify their classmates. Which, of course, is exactly what some of the teenage YRS crew attempted.
But did they succeed? Well, one at least probably did. Kush had the advantage of having taken Latin and got an 'A' at a school where the other three candidates in the subject were girls. Since the pupil data supplied included school identifiers, plus pupil gender, subjects and grades, that particular task wasn't too challenging.
However, in the discussion as the event drew to a close the feeling seemed to be "So what?" The young people and the parents in the room seemed quite relaxed about the idea. To be able to narrow down the records to identify one student in this way you needed to know so much about that person that the dataset was unlikely to then reveal anything that you didn't already know. And if, say, you were a prospective employer checking up on a potential employee's grades, if they had 'adjusted' the grades on their CV you wouldn't be able to identify them anyway.
One young developer astutely pointed out that disguising the school a pupil attended would be sufficient to guarantee anonymity. But the sceptics in the room, knowing that beating schools with the data was probably one of the DfE objectives, shook their heads knowingly, thinking "That'll never happen."
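The kind of re-identification Kush managed, and the effect of hiding the school, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the records are invented, and the field layout (school identifier, gender, subject, grade) only mirrors the fields the article says were supplied, not the real DfE schema.

```python
from collections import Counter

# Entirely made-up records: (school_id, gender, subject, grade).
# The layout is an illustrative assumption, not the DfE's actual schema.
RECORDS = [
    ("S001", "M", "Latin", "A"),
    ("S001", "F", "Latin", "B"),
    ("S001", "F", "Latin", "A"),
    ("S001", "F", "Latin", "C"),
    ("S002", "M", "Latin", "A"),
    ("S002", "M", "Latin", "B"),
    ("S002", "F", "Latin", "A"),
]


def group_sizes(records, keep_school=True):
    """Count how many records share each combination of visible attributes."""
    counts = Counter()
    for school, gender, subject, _grade in records:
        # Replacing the school with "*" stands in for disguising it.
        key = (school if keep_school else "*", gender, subject)
        counts[key] += 1
    return counts


def unique_groups(records, keep_school=True):
    """Return the attribute combinations that match exactly one record."""
    sizes = group_sizes(records, keep_school)
    return sorted(key for key, n in sizes.items() if n == 1)


with_school = unique_groups(RECORDS, keep_school=True)
without_school = unique_groups(RECORDS, keep_school=False)

print("Unique with school shown:", with_school)
print("Unique with school hidden:", without_school)
```

In this toy data, the only boy taking Latin at school S001 stands alone as soon as the school is visible, so anyone who knows those facts about him can read off his grade. With the school hidden, no combination is unique here, although on real data that would depend on how rare the remaining attributes are.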
Data needs to be a little more free to create useful apps
The developers' work had been made extremely difficult by the challenging conditions that cautious DfE officials had imposed on their access to the data. The hackers were not allowed to download the data locally, and were confined to handling a mere 20 records at a time, which had the unfortunate effect of tying one hand behind their backs and made it nigh on impossible to come up with a functioning tool in a weekend.
The mind-boggling size of the dataset also threw many of the novice hackers – there were millions of records and thousands of fields, many of which were empty as they weren't applicable to particular individuals. Undaunted, during the hacking sessions a number of the developers created tools that helped them access the data and clean it up into a more presentable and accessible form.
The fact that most developers hadn't managed to achieve what they had hoped to was not unusual for a hack day, but this time the odds really had been stacked against them, and the pickings were slimmer than usual. But it is to Michael Gove's great credit that, even though warned of this fact in advance, and offered an alternative feedback session later in the week, he still chose to turn up for the session. And not only turned up, but engaged, listened attentively, asked questions and above all did not patronise these young people who had given up a weekend for the joy of handling pupil data.
For openers Edward Woodhead gave a quick overview of what exactly an API (application programming interface) is, with a great analogy linked to food distribution. I did get confused when it came to the bit about restaurateurs and chefs, but I think that was my lack of knowledge about the food industry rather than a weakness in the explanation. This was the first of several analogies that were to illuminate the show and tell... sheep and a death-match still lay ahead.
Postcodes don't have to be a lottery when there's an app to hand
Edward Woodhead also showed a working Schoolfinder application that would inform residents, via their postcode, of the relative performance of their nearby schools. This neat idea highlighted the fact that these were real developers drawn from the field, and not from an education background. Clearly no-one had the heart to point Edward to the DfE's Schools and Statistics site, already live on the web and aimed at this requirement, but he is to be congratulated on identifying and successfully coding a tool for which there is clearly a demand.
Information architect Zoe Rose turned to the sheep analogy in her impassioned analysis of the learning from the weekend. Her own project idea had been stymied when she found that the restricted data access prevented her from following anonymised students through time; she had hoped to look for trends that could help individual pupils improve. Pointing out that quite often you don't know what valuable gems are hidden in the data, she said the key challenge is to try to find patterns and trends, and then work out what they mean.
Zoe Rose seemed to know a lot about sheep, just as Edward had known food distribution, and built up a picture of the importance of understanding how to derive relevant information by describing data-handling by sheep farmers. Gathering data is not enough. "You don't improve sheep just by weighing them," said Zoe Rose. It's a line often used in education, though usually about pigs, when criticising politicians who seem to believe that ever more extensive and repetitive assessment and testing of pupils is essential.
Many eyes glanced nervously towards Michael Gove at this point, who seemed unfazed and intrigued. Especially when she went on to say that knowledge of the weight of a sheep, and of its fleece, can be crucial when matched with other data that helps inform a farmer who is trying to improve wool yields. Some of us were wondering if she was going to move on to talking about sorting sheep from goats at this point, but wisely she avoided the temptation to raise the contentious grammar school issue when it was all going so well (covered in Debbie Davies' video below).
Matthew Pickering was up next, and clearly knew the way to a minister's heart. He had built a tool which looked for relationships between a pupil doing well in one subject, and their performance in other subjects. He started by showing that a high performance in maths was an excellent indicator of probable high performance in other subjects, then switched to Latin and showed that it was also a reliable indicator of excellence – smart move, bright lad. Of course we could all think of factors that helped explain that particular data finding, but top marks to Matthew for knowing how to tailor your presentation for your audience. Michael Gove was positively beaming now he was off the unfamiliar territory of sheep and on to the more familiar ground of Latin.
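The idea behind a tool like Matthew's can be sketched roughly as follows. This is not his code: the pupil results are invented, the function name `top_rate_elsewhere` is hypothetical, and treating 'A*' and 'A' as "doing well" is an assumption made purely for illustration.

```python
# Invented results: pupil id -> {subject: grade}. Not real pupil data.
RESULTS = {
    "p1": {"Maths": "A", "English": "A", "History": "B"},
    "p2": {"Maths": "A", "English": "B", "Latin": "A"},
    "p3": {"Maths": "C", "English": "C", "History": "D"},
    "p4": {"Maths": "A", "English": "A", "Latin": "A", "History": "A"},
    "p5": {"Maths": "D", "English": "C"},
}

# Assumption: "doing well" means one of these grades.
TOP = {"A*", "A"}


def top_rate_elsewhere(results, subject):
    """Share of top grades in a pupil's *other* subjects, split by whether
    the pupil got a top grade in `subject`. Pupils who did not take
    `subject` are ignored. A group with no grades at all maps to None."""
    tallies = {True: [0, 0], False: [0, 0]}  # [top grades, all grades]
    for grades in results.values():
        if subject not in grades:
            continue
        did_well = grades[subject] in TOP
        for other, grade in grades.items():
            if other == subject:
                continue
            tallies[did_well][1] += 1
            tallies[did_well][0] += int(grade in TOP)
    return {
        group: (top / total if total else None)
        for group, (top, total) in tallies.items()
    }


maths = top_rate_elsewhere(RESULTS, "Maths")
latin = top_rate_elsewhere(RESULTS, "Latin")
print("Top-grade rate elsewhere, by Maths result:", maths)
print("Top-grade rate elsewhere, by Latin result:", latin)
```

A gap between the two rates shows only an association in the data. It says nothing about why pupils who shine in Latin also do well elsewhere, which is exactly the caveat the audience had in mind.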
Like Zoe Rose, a number of the other developers had not been able to turn their ideas into digital flesh. Paul Rissen had at least had the chance to explore the data structures, and outlined the potential for linking with Edubase and other data. Issy had wanted to explore the performance of pupils who had moved between private and state schools, but unfortunately some of the key data for such a project was held by the DfE and had not been included in the released data. At least one of the assembled special advisers looked particularly rueful at this omission.
Tom, Harry and crew had developed a rudimentary tool linking the GCSE subjects people had studied to their success in subjects at 'A' level, but had unfortunately managed to delete the crucial code just prior to the presentation. They were able to show just enough of it, hastily rebuilt, for us to grasp the concept.
The 'Crèche', as the Young Rewired State's younger members are lovingly called, as ever provided information and amusement in equal measure at the end of the session. Overwhelmed by data overload, they had nonetheless built a couple of tools that helped create a URL to extract the data, and tidy it up. Their tour de force though was Versus, a World of Warcraft-inspired head-to-head Death Match between rival schools. This came with a sort of Top Trumps game-play based on schools' relative performances in English, maths etc, though disguised by suitable gamer language. Feed in the school postcodes, choose the 'realm of interest', and watch the mighty animated GIFs fight to the death. Only later did they admit that the code wasn't quite finished, so in the demo the one on the left always won.
Superb tea and cakes were followed by an equally tasty discussion about the implications of the weekend. While no new apps were forthcoming, it was clear that the cause of open data had taken a great step forward, as had the reputation of Michael Gove for many in the room. As he left, after a relaxed speech of thanks, the secretary of state happily posed on a flight of stairs for photographs with the developers to whom he had listened attentively for more than two hours.
Arranging them on the stairs, the photographer told Zoe Rose to "stand next to Michael and be Mrs Gove for the photo". Looking at the bright and blooming developer, Michael Gove laughed and quipped, "I should be so lucky!" He might be in trouble with Mrs Blurt for that line, but nobody cringed, or looked embarrassed, because he had been accepted by this group who had spent a weekend having fun hacking around with the DfE's treasured pupil data. Truly Michael Gove's finest hour (actually, two).
Developers with Gove, Zoe with sheep – Emma Mulqueeny
YouTube Video: Debbie Davies
Feedback – Tony Parkin