Models for preserving news archives that long served the industry leave digital content in peril

Edward McCain, digital curator of journalism at the Donald W. Reynolds Journalism Institute, spoke about the growing loss of born-digital content at Dodging the Memory Hole: Beyond NDNP, a meeting of concerned archivists, journalists and other stakeholders held on Sept. 16 in Washington, D.C. Below are his remarks.

How many of you read George Orwell’s novel Nineteen Eighty-Four in school?

Winston Smith, the story’s protagonist, works in London at the Ministry of Truth rewriting news articles, revising history in support of the ever-changing Party narrative. Newly written articles, doctored photos and other tampered evidence replace originals; previous documentation must be erased. The author describes the process:

“In the walls of the cubicle there were three orifices. … The last was for the disposal of waste paper. Similar slits existed in thousands or tens of thousands throughout the building, not only in every room but at short intervals in every corridor. For some reason they were nicknamed memory holes. When one knew that any document was due for destruction, or even when one saw a scrap of waste paper lying about, it was an automatic action to lift the flap of the nearest memory hole and drop it in, whereupon it would be whirled away on a current of warm air to the enormous furnaces which were hidden somewhere in the recesses of the building.”

Today we face a very real memory hole of our own making, especially when it comes to journalism. The past five decades have brought a sea change in the way news content is produced, distributed and preserved. The move from analog to digital has disrupted the print and broadcast revenue models and seems likely to keep doing so for the foreseeable future. Just as these legacy economic models have been upended, the old ways of preserving and maintaining access to journalistic content are in flux. The story of the Tucson Citizen’s shutdown in 2009 provides an instructive case of how the born-digital memory hole manifests itself today.

In her epitaph for the Tucson Citizen, editor Jennifer Boice wrote: “A newspaper doesn’t close, it dies, and the death leaves a hole in the community.” The Gannett Corporation closed the Tucson Citizen after months of discussions and heated controversy. The librarians had only a few hours to prepare for the abandonment of the morgue they had spent decades nurturing.

After many other options were pursued, the Arizona Daily Star, another local paper, offered to take the Citizen’s morgue, as there might be some uses for it — such as for the Arizona statehood centennial supplement in 2012. The Citizen’s library had coverage of Arizona dating back to when the area was still the Arizona Territory. Its news library was a singular source of information about southern Arizona history, with stories and photos about Native Americans including Geronimo and the Chiricahua Apache people, cavalry campaigns, raids by Pancho Villa, the shootout at the OK Corral and rapid changes to the Arizona people and landscape over a period of 139 years.

Sadly, the Citizen’s digital content sustained significant damage during the paper’s shutdown process. A power surge fried a server with videos linked to Citizen stories. Photos linked to stories went missing. Human error played a role in the accidental erasure of a server that indexed stories. The resurrected Citizen website uses a database that was not the original online version of the paper. It’s doubtful that an authentic version of the original website still exists or could be reconstituted.

The loss of the Tucson Citizen and significant parts of what could be its archive is not only a great loss for the city of Tucson, the state of Arizona and our nation, but also a clear example of how models for preserving news archives that served us for years now leave digital content in peril.

The long list of defunct newspapers — and their potential archives — continues to grow almost daily. How many community histories will vanish down the memory hole with them?

Dangers to digital content come in many forms. Take broken or missing links, the cause of the all-too-familiar “404 Page Not Found” error message. According to a recent GDELT Project/Internet Archive estimate, as many as 14 million news articles have been lost due to link rot in the past six months. To put that in context, take the total output of The New York Times over the last half century, then double it.

One of the most devastating cases of born-digital news loss happened at the Columbia Missourian, the daily paper serving mid-Missouri, published by Missouri School of Journalism students and faculty. In 2002 the out-of-date server hosting the Missourian’s obsolete content management system crashed. The backup system failed. In the end, the newspaper’s digital inventory, 15 years of stories and seven years of photojournalism, was gone forever.

That painful lesson wasn’t lost on the Missouri School of Journalism. In collaboration with MU Libraries and the new Donald W. Reynolds Journalism Institute, the school created a position to advance the knowledge and practice of preserving born-digital news content. In July of 2013, I was fortunate to be named MU’s first digital curator of journalism.

Since then, we’ve launched the Journalism Digital News Archive (JDNA) systems change agenda, a comprehensive strategy to address the complex and dynamic set of problems driving the loss of born-digital news content. One of JDNA’s key programs is the Dodging the Memory Hole initiative. Thus far, we’ve had two Dodging the Memory Hole events, the first at Missouri and the second in North Carolina. We also have a third event planned at UCLA in 2016. But at its core, Dodging the Memory Hole’s purpose is to foster outreach and awareness, which is why I’m here with you at this moment.

Today, I want to spend some time with you exploring what I see as an opportunity to make meaningful progress in preserving an important part of American heritage: its born-digital news content. In the next 90 minutes or so I want to invite you to envision how we can leverage the existing NDNP infrastructure to go beyond that 1923 date (and I know some of you already are) to incorporate contemporary journalism. Can we go even beyond collecting and preserving pre-print PDFs and find ways to save the massive amount of online news that has been largely neglected since it emerged in the 1990s?

I implore you to keep an open mind, to remember the importance of the work we will be doing together today and, please, let’s also have some fun as we tap into our collective imagination this morning in service of saving America’s “first rough draft of history” in the digital age.
