Is the digital past at risk from time meddlers?

In the 1965 Doctor Who story The Time Meddler, William Hartnell’s Doctor takes on a meddling monk, portrayed by the reliably brilliant Peter Butterworth.

Although hanging out in 1066, the Monk is a time traveller and plans to use an ‘atomic cannon’ to avert the battles of Fulford and Stamford Bridge by wiping out the Viking fleet of Harald Hardrada. His aim is to ensure that King Harold Godwinson, not William of Normandy, triumphs at Hastings and that history is thereby rewritten – the Hundred Years War avoided and “jet airliners by 1320”. Of course, Hartnell’s Doctor swiftly puts a stop to this “disgusting exhibition” and the Monk’s plan to disrupt history is foiled. The past is immutable.

But history, or rather the body of evidence we use to construct history, isn’t immutable. It’s prone to fire, flood and other kinds of natural and unnatural disaster. Collecting books in a library so they can be cared for and consulted together is a fabulous idea unless you lose the library. We’re reasonably comfortable with the idea that there might be gaps in the documentary record. What’s a little more unusual is if people start filling them in.

In 2008, the National Archives announced that it had foiled an attempt by a historian to insert faked documents amongst files in its collections. The historian cited the documents he had himself manufactured in support of his claims that Heinrich Himmler was murdered by British intelligence and that the Duke of Windsor (the former Edward VIII) was instrumental in the fall of France in 1940. This was a highly unusual case. Tampering with the documentary record in this way has always been possible – medieval monks have been accused of beefing up references to Jesus in their copies of Josephus – but normally libraries and archives have trouble with people making off with documents rather than adding them.

The digital realm seems to offer a solution to at least this latter problem: the infinite duplicability of digital material suggests that data can never really be lost. In practice this is simply not the case. Luke McKernan has shown how ephemeral YouTube content can be, and the great work of the self-proclaimed “rogue archivists, programmers, writers and loudmouths” at Archive Team only emphasises that when no one gets to a closing site in time, its data is gone for good. And with dynamic content much harder to capture than static text and images, many archived sites are shadows of their former selves. They are representations of a digital artifact but the artifact itself has in some sense been lost. In some cases an archived website is a facsimile akin to a photocopied parchment.

We need to be mindful of these discrepancies. But there are others. As I write this, while it is possible to delete a tweet it’s not possible to edit one – and for good reasons. On Wikipedia I can see every edit that’s ever been made to a page. And digital archives such as those held at Kew or in Washington DC use a range of techniques to ensure the integrity and fixity of their data. But no such checks exist on data before it has been archived, and the context of a digital item is consequently even more crucial than that of a physical one, while simultaneously being less straightforward to determine. I can edit this blog post at any time without leaving a record. In the UK, the PM’s speeches online may well be edited to remove party political content. The White House will tidy things up when the President fumbles exactly how Aretha Franklin told us to find out what respect meant to her. So when we read a transcript online, is it a transcript or a ‘transcript’? In film preservation, we can trust the AFI and the BFI to hold original 1977 prints of Star Wars. But most of us have to make do with compromised, edited versions. Special Editions aren’t terribly useful if the question we’re trying to answer relates to the content or reception of the original.

What is true is that the ease with which a digital artifact can be reproduced also makes it far easier for a source item which has been tampered with to propagate its inaccuracy. Sometimes these propagated inaccuracies are completely innocent. Monica Green and her colleagues have described how an image of leprosy from James le Palmer’s Omne Bonum became widely used (by the Museum of London, documentarians and in journals such as Nature and Past and Future) to illustrate plague, to mildly red faces all round. Once separated from its metadata, the nature of a digital image can become mutable. We’re a long way here from Stalinist photo editing – although I do like the Telegraph’s version of Kim Jong-Un as astronaut produced in ironic celebration of the Korean Central News Agency’s reputed fondness for a nice bit of Photoshop.

For the digital present, the advice to historians remains as it has always been: know your sources. Rigorous methods will protect the historical narrative from time meddling.

Jo Pugh is a doctoral student working with the University of York and the National Archives. His work focuses on how researchers navigate large digital collections of cultural heritage material.