If files aren’t findable, they might as well be gone
I heard about a mysterious disappearance not long ago. The folks in the newsroom at the Columbia Missourian mentioned that some of the newspaper’s published image files were not showing up as expected in our Digital Asset Management (DAM) system.
This situation got my attention. As a journalist and digital news curator, I don’t like the idea of the digital first draft of history vanishing into thin air. Chances are, if those image files weren’t findable in the DAM, they might be lost for good. And even if those digital objects weren’t deleted or corrupted, if they couldn’t be easily found, they might as well be gone.
Newsrooms with good, solid information about the past can provide valuable context for current news stories. News archives are one way a news organization can demonstrate added value for its readers, listeners or viewers.
In this age of digital information, the history of our society will mostly be captured by the zeros and ones contained in JPEGs, MOVs and other media formats. If future historians are going to know about us, they will need persistent and trustworthy records. That is something news archives can deliver, if we care enough to keep them.
A DAM, unlike a content management system (CMS), is designed to provide long-term access to digital content such as photos, audio and video. As our MU-based research team observed in the “Endangered But Not Too Late” digital preservation report last year, a DAM is a much better choice for storing and finding news content for the long haul.
With that in mind, I put on my detective hat and began some digital sleuthing.
I started looking for clues. To begin, I learned that the missing digital assets were photos that had been published online but not in the print edition. Due to certain technical constraints, only photos that appear in print are queued up to be ingested into the DAM. As a result, web-only photos need to be manually uploaded directly to our DAM by the assistant directors of photography, student journalists who help staff the Missourian.
I thought it might help if there were a way to find out which files had been uploaded and by whom. Since I had an administrative account, it turned out there was a fairly easy way to generate a report that would tell me what I needed to know. In addition, tech support set things up so that I, as an administrator, could search the DAM for all the assets uploaded by a particular user group, such as the assistant directors of photography. By doing that search and sorting by the date ingested, I could see that the files uploaded by the assistant directors were actually in the system. But if I logged in as a non-photographer, those files didn’t seem to be visible. That was my cue to look at the differences between the settings of the two accounts.
When I checked the non-photographer account, I could see that there were relatively few ways to sort the assets. The default sort order was set to Publication Date. By browsing hundreds of recent digital assets, I could see that, beyond a certain point in the list, the Publication Date field was empty. Many of those undated assets turned out to be from the assistant directors of photography. When the non-photographers logged in, the system defaulted to sorting by Publication Date, and the files without that crucial metadata sank so far down the list that they seemed to have disappeared.
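For readers who like to see the mechanics, here is a minimal Python sketch of how a sort on a sometimes-empty field can bury records. It is not how our DAM is implemented. The `Asset` class, the file names, the user-group labels and the dates are all hypothetical, and I am assuming a newest-first sort that places undated records last.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Asset:
    # Hypothetical record; a real DAM stores many more fields.
    filename: str
    uploaded_by: str
    publication_date: Optional[date]  # None models an empty Publication Date


def sort_by_publication_date(assets: List[Asset]) -> List[Asset]:
    """Newest first; assets with no Publication Date land at the end."""
    dated = [a for a in assets if a.publication_date is not None]
    undated = [a for a in assets if a.publication_date is None]
    dated.sort(key=lambda a: a.publication_date, reverse=True)
    return dated + undated


# Made-up sample data: two print photos and one web-only upload.
assets = [
    Asset("print_a.jpg", "print-queue", date(2019, 3, 1)),
    Asset("web_only_b.jpg", "assistant-director", None),
    Asset("print_c.jpg", "print-queue", date(2019, 3, 3)),
]

ordered = sort_by_publication_date(assets)
for asset in ordered:
    print(asset.filename, asset.publication_date)
```

With three records the undated file is easy to spot at the bottom. With thousands of dated records ahead of it, it is effectively out of sight.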
But there was more to do to ensure that images and other records could be sorted and found by Publication Date. Again working with support from our DAM vendor, we made some small but meaningful changes. First, we found a way for the assistant directors to update the Publication Date. In addition, other accounts were given the option to sort by the date an asset was ingested, which often ended up being within a day or two of the actual date published. As time permits, users of those accounts will be able to search and sort by ingest date, do a bit of research and assign an actual Publication Date to the images.
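The idea of leaning on the ingest date when the Publication Date is missing can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the vendor's implementation: the `ArchivedAsset` class, the helper names and the sample dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ArchivedAsset:
    # Hypothetical record with the two dates discussed above.
    filename: str
    ingest_date: date
    publication_date: Optional[date] = None


def effective_date(asset: ArchivedAsset) -> date:
    """Use the Publication Date when present, otherwise the ingest date."""
    if asset.publication_date is not None:
        return asset.publication_date
    return asset.ingest_date


def newest_first(assets: List[ArchivedAsset]) -> List[ArchivedAsset]:
    """Sort so that undated assets stay near their ingest-date neighbors."""
    return sorted(assets, key=effective_date, reverse=True)


def needs_publication_date(assets: List[ArchivedAsset]) -> List[ArchivedAsset]:
    """Work list for later research: assets still missing a Publication Date."""
    missing = [a for a in assets if a.publication_date is None]
    return sorted(missing, key=lambda a: a.ingest_date)


# Made-up sample data.
archive = [
    ArchivedAsset("print_a.jpg", date(2019, 3, 2), date(2019, 3, 1)),
    ArchivedAsset("web_only_b.jpg", date(2019, 3, 4)),
    ArchivedAsset("print_c.jpg", date(2019, 3, 4), date(2019, 3, 3)),
]

visible_order = [a.filename for a in newest_first(archive)]
todo = [a.filename for a in needs_publication_date(archive)]
print(visible_order)
print(todo)
```

The fallback keeps undated files visible alongside their contemporaries, while the work list preserves the goal of eventually recording a real Publication Date.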
A word about DAMs
When it comes to organizing photos, audio and video, a digital asset manager is pretty hard to beat. While finding strings of text is relatively easy to do because you can search using the actual words you are looking for, trying to locate an image or sound can be much more challenging. Traditionally, DAMs tried to solve this problem by attaching certain words or sets of words — metadata — to media files using a database. Even just naming the media files in a consistent way can be considered metadata and can be a big help. With the advancements in machine learning and artificial intelligence, some modern DAMs are able to identify human faces and many other kinds of objects without textual metadata being attached to them.
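To show how a consistent file name can act as metadata, here is a small Python sketch. The naming convention it assumes (date, story slug, creator and sequence number, as in `20190304_city-council_jdoe_001.jpg`) is purely hypothetical and is not the Missourian's actual scheme. The pattern and function names are also my own.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical convention: YYYYMMDD_story-slug_creator_NNN.ext
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<slug>[a-z0-9-]+)_(?P<creator>[a-z]+)"
    r"_(?P<seq>\d{3})\.(?P<ext>jpg|png|mov|wav)$"
)


def parse_filename(name: str) -> Optional[dict]:
    """Return metadata recovered from the file name, or None if it does not fit."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        created = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None
    return {
        "date": created,
        "slug": match.group("slug"),
        "creator": match.group("creator"),
        "sequence": int(match.group("seq")),
        "format": match.group("ext"),
    }


parsed = parse_filename("20190304_city-council_jdoe_001.jpg")
unparsed = parse_filename("IMG_4032.jpg")
print(parsed)
print(unparsed)
```

A name like `IMG_4032.jpg` yields nothing, which is the point: a file named by the camera carries no searchable context, while a file named by convention brings a date, a topic and a creator along with it.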