EDWARD MCCAIN: [Slide 1, 00:00] How many people here saw the film “Spotlight”? [Looks at hands raised.] I mean, I would think so. As you were sitting there, did you notice how many times — and how central a role — the news archive, the morgue, played in that movie? Would that story have been the same? No.
How many of you have had — and that was largely a paper archive because of the time period, but — how many of you have lost some digital content? From your phone, from your computer? It’s so easy, and it happens so fast — and sometimes, you don’t even know that it’s happened. The same situation applies to our content management systems and the other systems we use for creating and distributing electronic digital news content today.
[Slides 2-3, 01:05] A little example right here in our backyard (and glad Tom Warhover’s not here, because he accuses me of making him the poster child for our cause): There was a server crash. The Missourian newspaper offices are just across the street. It’s a daily paper that serves the community; it has a long history that goes back to the founding of the school. So, there was a server crash in 2002, and in a matter of a few seconds, 15 years of news content was obliterated. It was gone, irretrievably.
Now, part of the problem was that the content was held in an obsolete software system. And, I mean, it was really obsolete. It was so obsolete that even if you had the data, you probably couldn’t make this thing work again. The company that made it did not have a way to bring the stuff back to life. And so, for all intents and purposes, that digital record of mid-Missouri for those 15 years is still gone. We do have a print version of it, but how many of you have tried to, lately, search through a paper archive or microfilm? Anybody want to volunteer for that duty? [laughter] Right. For better or for worse, these digital systems are really great for providing access, they’re great for creators, but we haven’t quite figured out what to do with them for the long term.
[Slide 4, 03:03] One little myth I’d like to dispel today: Your content management system is not an archive. It never will be an archive; it’s not made to be an archive. It’s perfectly fine that it’s not an archive, but it is not an archive, and you can back it up, and it’s still not an archive.
[Slide 5, 03:21] Where I used to work, The Tucson Citizen. It closed in 2009. [Slide 6, 03:30] I have a visual representation of the morgue, the news library. The physical news library looked like this afterward. I can’t show you, but, believe me, the digital archive … you might as well have set a match to this, and maybe sprayed water all over it. [Slide 8, 03:56] Because what happened was — and you can go back and look today — that when they closed the paper, they shut down the servers, they shut down the different indexes and systems, and when they tried to bring them back up, guess what? They were broken, and they’ve still not been able to fix that. So, you don’t have those links to photos and videos; the other resources are not connected. We basically have no idea what that archive or what that newspaper originally looked like. There’s just no way to go back there and do that.
[Slide 9, 04:29] In terms of the values of news archives, Tom Warhover, my poster child for this, made a statement about it. Obviously, if you don’t have an archive, in terms of your journalistic content you’re going to suffer. My belief is that you’re also throwing away something that in the future is going to be very valuable to society and potentially, and this is one of the things that we’re working on, I think you’re also throwing away some money. Publishers, come on: Now I’ve got your attention. Don’t throw away any money.
KATHERINE SKINNER: [Slide 10, 05:19] Right on. So, what we’re talking about here isn’t just the Missourian; it’s not just the Tucson Citizen. It’s a huge problem, and it’s one that is systemwide. We’re not looking at a single-stakeholder problem here; we’re looking at a multi-stakeholder problem. It’s not just the publishers that are having issues. It’s not just the CMS runners, the folks who are doing the platforms. It’s not their problem entirely, or their solution entirely. And then, you’ve got this whole set of stakeholders over on the side: the libraries, the archives, et cetera, which historically have performed a function of taking care of, archiving and making accessible the news records.
Each of these stakeholders, plus others — press associations, et cetera — have some role to play in the crisis that’s unfolding. But it is a crisis, and it’s not just two newspapers out there that have had this kind of failure; it’s all kinds of newspapers, and it’s a quiet failure. So, unlike physical archives, where you can actually look at it and go, “Oh, OK, so we’ve lost half of our work here to whatever disaster,” you don’t know that it’s gone — you don’t know that it’s going — until it’s already happened. Many, many newspapers — many, many news sources — have lost content that we don’t know about right now, and it is this very quietly unfolding crisis that really starts to matter when we try to piece together history or try to understand the legacy of particular journalistic moves, business model changes. I mean, have there been any of those in the last, say, 15 years? Just a few. It would be nice to be able to actually get back and look at that, analyze that and say, “Oh, hey, look at this experiment over here; look at this innovation over here. Look at how those wound up leading to the type of environment that we have today.” We aren’t going to have that kind of record because of what’s happening today, again, quietly and behind the scenes.
[Slide 11, 07:20] So Edward and I are part of a team effort that is trying to bring together not just the journalists, not just the librarians and archivists who used to take care of this in a print world, and not just the platforms, but all of those folks together, plus folks who need to use this content — the genealogists and the researchers alike — to figure out how to mend the relationship breaks and the technological issues that are here both in the social and technological infrastructure.
MCCAIN: So you may recognize this room. This room was a little newer; this was in 2011. Probably the first such widespread gathering that included the news industry because we had been talking about this at least since 2009; the Library of Congress had been having conversations about it. But one of the things that we’re conscious of here at the journalism school is that you’ve got to have all the people at the table, and guess what? The people with the content, they weren’t at the table. And they’re hard to get to the table.
SKINNER: They had other things to worry about, like surviving. [laughs]
MCCAIN: Like keeping the lights on. [Slide 12, 07:20] The expression “trying to sell insurance to a drowning person”? Sometimes that’s the position we’re in, and we’re trying to get ahead of that so that we catch them before they jump in the water. It may be too late.
But this group of publishers, editors … on the left there, Marc Wilson from TownNews.com, who’s really been an advocate for us, one of our champions. He runs TownNews.com, which is a big content management system. They’ve got almost 2,000 newspapers in the United States, and he’s telling me that for just their e-content pages, right now they’re losing 2.4 million pages a year.
[Slide 13, 09:30] This was the distribution of attendees to Dodging the Memory Hole in 2014. We realize that we need more journalists and more IT people to be joining with us in this, and we’ve made efforts to reach out to those communities.
SKINNER: [Slide 14, 09:49] So, the purpose of the 2014 meeting — which, again, took place here — was to bring all the stakeholders together and start to talk about the problem area. Establishing that there is a problem and starting to talk about some ways of moving forward to do something about it. Edward and RJI had generous funding from the Mizzou Advantage fund, which is a local foundation fund that allowed that gathering to take place. Simultaneously, I run a small nonprofit that builds bridges, basically, between libraries, archives and museums. I’m fighting for the survival of an almost extinct species that doesn’t realize it’s going extinct and is now in competition with groups like Google.
My work dovetailed into Edward’s, in part, because I got an NEH award to host a gathering that was very synergistic, and so we worked together on these applications, and we had the funding for two events. We got together and we said, “All right. We could just host two events, and that’d be great; bring the stakeholders together, have these conversations, start to build relationships. But could we do more?”
What we decided is that the real challenge from the 2014 event, where we’d come together and start to build that foundation for collaboration, and the 2015 event, which we were going to hold in Charlotte, North Carolina — it happened this past May — was to turn potential energy, which we knew we could generate — I mean, we had a bunch of smart people coming together into a room; we knew we could generate a lot of passion, a lot of interest, a lot of enthusiasm. How do you take that potential energy and turn it kinetic when all of us were going to go back to our daily lives?
So what we decided was we can force collaboration by moving from taking action to jumping in. What I did was I stood here at the end of the event, and I said, “All right, so here are some of the ideas that we’ve discussed.” I put them up on a slide, and I said, “All right, who’s going to volunteer?” I used peer pressure to elicit from the audience actual contributors who promised on video that they would do something about these different problem areas. We wound up with six of these. One of them was around an environmental scan to understand better what is actually happening both on the news side and on the library archive side. How much are we losing? Can we start to really chart that out and do it in a state-by-state or national way? Others looked at workflow; what do we already have existing technologically that could be put to work in this space?
MCCAIN: You’ve got to watch this one because she will call you out and make you do things. [SKINNER laughs.] That’s great; that’s her role. But prior to that in the groups that we had together, we had people agreeing, getting consensus about what are our priorities, and how important is this really? When you say that, then you’re in the position where somebody can call you and say, “Well, then, OK.”
SKINNER: “Your turn.”
MCCAIN: “You’ve got to do something.”
SKINNER: [Slide 15, 12:56] So, the challenge that we were working with was, how do you move from act to impact? Both Edward and I come from a history that is overlaid with some social movement kind of work. For me, one of the things that I wanted to bring to this multi-stakeholder problem was an understanding of what we know right now from sociology (that’s my background), what we know from social movements, from various efforts both on the business side and on the public good side, to help guide the way we collaborate.
What we’re going to turn to here at the end of this conversation is a particular model — and this is just one model. It’s a buzzword right now; this really became popular around 2011, and right now, this is the thing if you’re working in social collaboration spaces on environmental issues, or on reducing teen pregnancy, or name your issue. There are lots and lots of collective impact issues going on—
MCCAIN: Fortunately, it actually works.
SKINNER: That’s why there are a lot of these going on. It’s not a magic bullet, it’s not something that was invented in 2011; it is a buzzword that has been applied to it. [Slide 16, 14:03] It’s a packaging of social methodologies for wide-scale collaborations, which happen, again, both on the business side and on the public good side. [Slide 17, 14:15] So, some of the things that we’ve tried to bring, recognizing that we only have limited funding, limited ability to bring people together, is some of the basic tenets of collective impact methodology.
The thing that differentiates it from a lot of other things are these five pieces. These five pieces are key in multi-stakeholder initiatives, which is what we’re talking about when we’re talking about trying to preserve the news, or archive the news, or have this kind of record. Everybody has to come together from all of these different stakeholder perspectives and agree on a common agenda. This is not unlike what we were just hearing a couple minutes ago about the business world in trying to bring together newspapers from lots of different geographical regions to say, OK, what’s a common agenda, and how are we going to move that forward? Establishing that common agenda and making it not a platitude — platitudes don’t work — but something that you can actually measure progress toward is really crucial here.
Then establishing, No. 2, the shared measurement systems that you need in order to really manage the collaboration, show progress and keep buy-in on the parts of all the different people that are engaged in something like this.
Then the third magic bullet in a collective impact kind of environment is to have these mutually reinforcing activities. Each of the stakeholders are going to go back — to take our very concrete example of these folks who have been engaged in this digital news, and how do we make it persist over time, issue — each of us go back to our normal day-to-day lives; the news editors that we had from The Dallas Morning News and other places went back to their daily lives. I went back to my life working with libraries, archives and museums. We had educators in the room, we had press associations in the room.
MCCAIN: I think a good example is Ben Welsh from the Los Angeles Times, who got together at the conference with Herbert Van de Sompel. They looked at the situation, and they built a tool, a plugin for WordPress that can help you archive your WordPress things.
SKINNER: Yep. So, on each of these fronts we’re making progress, and our progress is mutually reinforcing instead of competing, and competing is what often happens with innovative work. Especially when you’re funding lots of innovation, which does happen, especially on the academic side of the realm, one of the things that happens is all of those innovations are coming in competition with each other, and the only way that one can emerge is if there is a network (not just a bright light bulb that goes off over somebody’s head, but an actual network) to distribute and really incorporate that innovation into a normal workflow. The mutually reinforcing activities, really emphasizing the ways that the things that Ben is doing implicate others within the cycle, have been really important to this.
Other things that are, just to breeze through, continuous communication. If you don’t have continuous communication between a multi-stakeholder effort, then nothing’s going to happen. This can mean lots of meetings — hopefully not lots of meetings that don’t actually make progress, and that’s where the real key comes in. Then, finally, you’ve got to have backbone support. Somebody’s got to have their eye on the collaboration as their purpose or else that collaboration is not going to be healthy, it’s not going to flourish; it’s going to get misdirected, and it’s going to get tabled and moved to the background.
MCCAIN: We’re trying to take on some of that role, but we’re looking for help.
SKINNER: [laughs] We’re looking for lots of help.
MCCAIN: For that backbone.
SKINNER: Right. [Slide 18, 17:42] So, we’re two relatively small organizational pods that are trying to help bridge a lot of different players into a set of relationships that are uncomfortable. When you talk about multi-stakeholder initiatives, especially in something as fraught as journalism and the changing industries and the changing technologies and everything else that we’re all dealing with, it requires us to really build bridges and keep those communication lines open so that as each one of our stakeholder communities starts to make progress toward goals, as each one of our stakeholder communities starts to realize, OK, so now we’ve got AMPs; what do we do with that? What does that mean within this context? Can we take that AMP and start to save yet? Is there a conduit for that? As these conversations are going on, one of the powerful things that happens is the innovations start to be adopted not just by one stakeholder group, but by all of the stakeholder groups simultaneously, and that winds up reinforcing the activities that are at hand. A lot of what we’re trying to do right now is set the stage for that to happen.
[Slide 19, 18:50] And when that happens, what you get is this; “cascading levels of collaboration” is what they call it in collective impact. So you have a backbone. It doesn’t have to be an organization; it can be a person, a facilitator, it can be a group. It doesn’t have to be standalone. But you have this networking backbone that helps you keep all of these players in check and in concert with one another as they’re doing their daily jobs, as they’re doing their specialized projects. Then, eventually you get all of this cascade of work that’s going on on the periphery that winds up contributing back to the overall initiative. [Slide 20, 19:28] So in terms of collaboration, we don’t get to have a whole room full of superheroes. If we did, maybe we could do things a little bit differently, and we could really—
MCCAIN: We have them right here.
SKINNER: [laughs] Yeah, that’s us.
MCCAIN: We can do it!
SKINNER: Right. But since we don’t have the magical uniforms and the ability to transform ourselves into these key figures that can move and change things, we really have to bond together and figure out ways to make it worth all of our while, to stay engaged so that the historical record actually does exist, so that we do have the ability, as Edward started this conversation, when something like “Spotlight” happens, we’re able to go back and actually see the history of events as journalism has shown it as a rough draft of history.
MCCAIN: [Slide 21, 20:24] Our organizations — JDNA, which is based here at RJI, and Educopia Institute — it’s been a really fruitful collaboration. We look forward to doing much more. I do think this room is full of people who can be our champions, who can bring this message and build awareness. We’re at that stage now where I think this community can make a difference. Thank you.
SKINNER: Thank you so much.
JIM FLINK: Thank you, and we do have one quick question here.
AUDIENCE MEMBER 1: Hi, guys. So, this seems like an infinitely solvable problem. There are two companies in the world, three companies in the world, four companies in the world. This is a storage issue. At the end of the day, we need to store this stuff. So what I want to know is, how are you approaching Amazon, Microsoft and Google?
MCCAIN: With open arms. [laughs] Or maybe it’s more like—
SKINNER: We’re on our knees.
AUDIENCE MEMBER 1: So understanding that the cloud is not free—
SKINNER: It’s actually not just storage. Can I give you the two-second spiel on why it’s not just storage? It’s a combination of things. You can store everything, and that’s great, but we’ve got too much stuff, so storage doesn’t work. What we have to have is good metadata and good ability to actually locate those things that we’ve stored, good ability to tell what file formats they’re in so that we can migrate them forward as we need to or build emulation tools that can actually render them. You think about how much has gone obsolete — I mean, just technological obsolescence over the course of the last 15 years — and then look at the speed of that; there are lots of computer scientists who’ve done these beautiful graphs on how quickly. It hasn’t been something where it started off fast and then we plateaued; it has moved from almost a plateau up exponentially where we’re just moving like that. [snaps fingers] So it’s not just storage.
MCCAIN: The New York Times is storing, I don’t know how much content, from the ’90s that is no longer accessible because it’s in HTML 2 or 3. Just as an example.
SKINNER: That would take a lot of work—
AUDIENCE MEMBER 1: So, to go back to … this is a storage issue that companies like Amazon, Google, and Microsoft have solved. They’ve got it solved. We’ve got to get this partnership going because you’re right; we’re losing our digital past.
SKINNER: The partnership does need to be there, but I will again challenge that statement that it has been solved. It has not been solved, and internationally it hasn’t been solved. We’ve been working with the Library of Congress, with the British Library, Internet Archive … there’s a huge international movement around even just the basics of web archiving, like what Internet Archive does, and even that isn’t solved. We’ve got big problems to solve.
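[Illustration, not part of the talk: Skinner’s point in this exchange, that preservation needs metadata and format identification on top of storage, can be sketched roughly as below. This is a minimal sketch under stated assumptions. The function names `describe_file`, `build_manifest` and `changed_files` are hypothetical, and guessing the media type from the file extension is a stand-in for real format identification, which would inspect file contents.]

```python
from __future__ import annotations

import hashlib
import mimetypes
from pathlib import Path


def describe_file(path: Path, root: Path) -> dict:
    """Record what is needed to locate, verify and later migrate one file."""
    data = path.read_bytes()
    # Assumption: the extension is trusted here; this is only a rough guess.
    media_type, _ = mimetypes.guess_type(path.name)
    return {
        "path": path.relative_to(root).as_posix(),
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "media_type": media_type or "unknown",
    }


def build_manifest(root: Path) -> list[dict]:
    """Describe every regular file under root, in a stable order."""
    return [describe_file(p, root) for p in sorted(root.rglob("*")) if p.is_file()]


def changed_files(root: Path, manifest: list[dict]) -> list[str]:
    """Return manifest paths that are missing or no longer match their checksum."""
    problems = []
    for entry in manifest:
        target = root / entry["path"]
        if not target.is_file():
            problems.append(entry["path"])
        elif hashlib.sha256(target.read_bytes()).hexdigest() != entry["sha256"]:
            problems.append(entry["path"])
    return problems
```

[Run periodically against a stored manifest, a check like `changed_files` would surface the kind of quiet loss described earlier in the talk, and the recorded media types give a starting list of formats that may need migration or emulation.]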
FLINK: Since Bill’s from Microsoft, and since Nick’s from Google … I think we could have some substantive conversations today. I have to take a break, so I think I’m going to go ahead and do that, but if you do have a question, please, please ask. And thank you guys very much; their contact information is up on the screen.
Edward McCain, RJI and MU Libraries
Edward McCain is digital curator of journalism at the Donald W. Reynolds Journalism Institute and University of Missouri Libraries. In this capacity, he founded the Journalism Digital News Archive (JDNA) agenda and its related “Dodging the Memory Hole” outreach initiative. JDNA’s purpose is to preserve and ensure access to born-digital journalism. Through the JDNA agenda, McCain proposes to address the complexities of preserving born-digital news by recognizing and engaging the stakeholders and the systems they employ in order to define a pathway toward long-term, sustainable change. His research has been supported by the Mizzou Advantage and the John S. and James L. Knight Foundation.
Katherine Skinner, Educopia Institute
Dr. Katherine Skinner is the executive director of the Educopia Institute, a not-for-profit educational organization that builds networks and collaborative communities to help cultural, scientific and scholarly institutions achieve greater impact. In this role, she supports three membership communities – BitCurator Consortium, Library Publishing Coalition and MetaArchive Cooperative – and a thriving research division that focuses on digital publishing, access, preservation, and sustainability issues for libraries, archives and museums. Skinner received her Ph.D. from Emory University. She has co-edited three books and has authored and co-authored numerous reports and articles. She is currently principal investigator for research projects on continuing education (Nexus, Mapping the Landscapes), digital preservation (ETD plus, Chronicles in Preservation) and scholarly communication (Chrysalis). She regularly teaches graduate courses and workshops in digital librarianship and preservation topics, and provides consultation services to groups that are planning or implementing digital scholarship and digital preservation programs.