The Archives of the International Informatics Institute


Multimedia Databases

By Jack Powers

Published: December 15, 1995

Who needs everything on-line? What’s the real value of digitizing all of your text and graphics for instant retrieval at the touch of a button? What kind of market is there for “re-purposed content”?

I’ve been spending a lot of time lately thinking about publishing databases, both for internal production use and for external re-selling schemes. It occurs to me that, for most publishers, putting everything on-line is more trouble than it’s worth. Text archiving is easy enough; it’s the pictures and other non-textual elements that are the problem.


The biggest problem with multimedia archiving--photos, illustrations, animation, video and sound--is that the computer doesn’t know what it has. With a collection of text files, you simply dump data into an indexing program and you’ll be able to retrieve any string of characters you need. With images, the computer can capture some things automatically, such as the input date, the graphic format, and the image size and resolution, but no computer can tell the difference between a picture of Sigourney Weaver and a picture of Dennis Weaver.

For a database to be useful at all, a human photo editor has to review each image and create a list of keywords that is attached to the media file on disk. The searching software doesn’t care about the image files at all; it simply scans the text that is attached to each image.
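To make that point concrete, here is a minimal sketch of a keyword-only search. The file names, keywords and function names are all invented for illustration and do not describe any particular product. Notice that the code never opens an image file: it only looks at the words an editor attached.

```python
from collections import defaultdict

# Hypothetical catalog: file names and keywords are invented for illustration.
CATALOG = {
    "img_0001.tif": ["Sigourney Weaver", "actress", "portrait"],
    "img_0002.tif": ["Dennis Weaver", "actor", "western"],
    "img_0003.tif": ["cow", "pasture", "fence"],
}


def build_index(catalog):
    """Map each lowercased keyword word to the set of files tagged with it."""
    index = defaultdict(set)
    for filename, keywords in catalog.items():
        for keyword in keywords:
            for word in keyword.lower().split():
                index[word].add(filename)
    return index


def search(index, query):
    """Return the files whose attached keywords contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(CATALOG)
print(sorted(search(INDEX, "weaver")))            # ['img_0001.tif', 'img_0002.tif']
print(sorted(search(INDEX, "sigourney weaver")))  # ['img_0001.tif']
print(sorted(search(INDEX, "farm animal")))       # []
```

The last query comes back empty even though a cow picture is in the catalog, because nobody typed the words the searcher chose.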

Multimedia databases fail when those keywords are input badly, when there are not enough words to fully describe each item, or when the words that are there don’t match the words that searchers are likely to ask for. There’s nothing more frustrating than a database that has the items you’re searching for but has them cataloged in such a way that no combination of query terms you dream up will retrieve them. The images are just as irretrievable on-line as they used to be when they were stuffed into filing cabinets. Some image databases like stock photo CD-ROMs and web servers try to bash the keyword problem with brute force, listing dozens and even hundreds of terms for each image. In the stock photo business, vendors are now competing on the size of their keyword files as much as they are on the quality of their pictures.

Clever databasing software makes building the keywords easier. Editors can pick the terms from a menu, which reduces simple typing mistakes, and they can build thesaurus files that automatically cross-reference a term: “cow,” for example, links to “farm animal,” “rural scene” and “dairy” without the editor having to add those words manually.
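A rough sketch of how a pick list and a thesaurus file might work together follows. Only the “cow” entry comes from the example above; the other vocabulary terms and the function name are assumptions made up for illustration.

```python
# Hypothetical thesaurus file: only the "cow" entry comes from the article.
THESAURUS = {
    "cow": ["farm animal", "rural scene", "dairy"],
}

# Hypothetical pick list standing in for the editor's menu of allowed terms.
VOCABULARY = {"cow", "barn", "tractor"}


def expand_keywords(picked, thesaurus=THESAURUS, vocabulary=VOCABULARY):
    """Reject terms that are not on the menu, then add the linked terms."""
    unknown = [term for term in picked if term not in vocabulary]
    if unknown:
        raise ValueError(f"not in the pick list: {unknown}")

    expanded = []
    for term in picked:
        # Keep the picked term first, followed by its cross-references.
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_keywords(["cow", "barn"]))
# ['cow', 'farm animal', 'rural scene', 'dairy', 'barn']
```

A misspelling such as “cwo” raises an error instead of slipping into the keyword file, which is the point of picking from a menu.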

Still, building a database of non-textual items is skilled, labor-intensive work: looking carefully at each item, picking the right term, proofreading and correcting the input, and verifying the copyright and royalty status of each image. And it goes on forever, with new items dumped into the inbox every week on top of the old work that goes back years and years.


Why should we do all of this work? For some jobs like catalog, reference and direct mail work, investing the time and money up front makes sense because you know you’ll be using the images over and over again, and the indexing is generally very simple: product number, description, category. Additionally, having the full resolution prepress images on-line saves time and money in this kind of production.

But in more creative applications like publishing and advertising, multimedia databases have more subtle, less measurable advantages. Internally, they make the jobs of the creative staff more efficient: they don’t need to spend as much time hunting up images and flipping through contact sheets. They also make it easier to create better layouts since lots of appropriate images are all just a keystroke away.

Measuring creative benefits like these is a judgment call. Large photo departments that have installed picture databases report increases in photo editor productivity of 20% to 30% or more, and, as with desktop publishing, most database users would not want to give up their image machines. In multimedia projects, a good database can make customized and multiplatform production easier, but most publications are made up mostly of new images, and it’s difficult to pinpoint hard-dollar savings from putting all the old stuff on-line.


Conventional graphic arts industry wisdom has it that printers will manage the image databases for their customers. They have the high res files, they have the computers and they know how to do production work. The problem is that the printer usually has the images that are least needed in a database since prepress files contain the images already used in print, not the images that will be used to make new print. Maybe there’s some value in the reprint business, but the big creative payback comes from giving the designers and art directors total access to all available images on file, not just the stale ones that will probably never be used again.

Printers also have too much resolution. All the customer needs to capture the creative advantage is the low resolution view file (which the prep shop often provides free of charge with every scan). A responsive, well-indexed image database doesn’t need high res prepress files at all: after the creative work is done, it’s a lot easier and cheaper to dig the slide out of a shoebox and pouch it to the printer. Most photos will have to be re-scanned for size and crop anyway, so you may as well save the substantial hardware investment needed for a full res database.
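Some back-of-the-envelope arithmetic shows why the hardware bill differs so much. The pixel dimensions, channel counts and library size below are assumptions picked only for illustration, and the figures are for uncompressed data; real files will vary with format and compression.

```python
def uncompressed_bytes(width_px, height_px, channels, bytes_per_channel=1):
    """Raw size of one image: one value per pixel per color channel."""
    return width_px * height_px * channels * bytes_per_channel


# Assumed sizes, for illustration only:
# a screen-sized RGB view file versus a CMYK prepress scan.
VIEW_FILE = uncompressed_bytes(640, 480, channels=3)
PRODUCTION_SCAN = uncompressed_bytes(3000, 2000, channels=4)

LIBRARY_SIZE = 10_000  # assumed number of images in the archive


def gigabytes(n_bytes):
    """Convert bytes to decimal gigabytes."""
    return n_bytes / 1_000_000_000


print(f"view files:       {gigabytes(VIEW_FILE * LIBRARY_SIZE):.1f} GB")
print(f"production scans: {gigabytes(PRODUCTION_SCAN * LIBRARY_SIZE):.1f} GB")
print(f"ratio:            {PRODUCTION_SCAN / VIEW_FILE:.0f}x")
```

Under these assumptions the full res library needs roughly 26 times the disk space of the view-file library, before counting the network and backup capacity to match.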


In the panic over new media, many publishers are scanning and indexing everything so that they can “re-purpose” their content. But who the hell wants to look at 10-year-old magazine photos, and how much can you charge for the illustrations you’ve already printed or for the ones that your art director rejected for the initial publication? World-class photo libraries like the Bettmann Archive, Time-Life and the Associated Press will eventually be on-line and available to the public, but unless you’ve got a uniquely compelling set of images, “re-purposing” may just be an expensive way to avoid nurturing your core publishing business.


A multimedia database is useful if you want to use more images more freely in your work. Scan everything (or use a Photo CD), but at view-file resolution, not production resolution. Be attentive to the power and importance of excellent indexing. Forget prepress, unless you’re using the same images over and over. And be critical about recycling, since having the content on-line is the easy part; creating a good publishing idea still takes talent.



© 1995-2002 International Informatics Institute, Inc. All Rights Reserved.