Image Handling Workflow and Analysis

DRAFT

Assume: We should try to constrain this analysis to collection item photos, but there is a broader context here.

Currently: Collection item data is extracted from FMPro and attached to the image metadata (IPTC) using a script. They want the metadata to be attached to the image, but it can get out of sync with what is in the system of record (the collection management system).
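
The out-of-sync risk above could be checked with a periodic comparison. The following is a minimal sketch, not the existing script: the field names (title, artist, accession_number) and both dictionaries are hypothetical stand-ins for a record exported from the system of record and the metadata read back from an image.

```python
def find_metadata_drift(record, embedded):
    """Return {field: (record_value, embedded_value)} for fields that differ.

    Both arguments are plain dicts. Values are compared as strings after
    stripping surrounding whitespace. A field missing from the image
    counts as drift.
    """
    drift = {}
    for field, record_value in record.items():
        embedded_value = embedded.get(field)
        if embedded_value is None or str(record_value).strip() != str(embedded_value).strip():
            drift[field] = (record_value, embedded_value)
    return drift


# Hypothetical example data, for illustration only.
record = {"title": "Untitled", "artist": "Unknown", "accession_number": "0000.0.0"}
embedded = {"title": "Untitled ", "artist": "Anonymous"}
print(find_metadata_drift(record, embedded))
```

Any image that comes back with a non-empty result would be a candidate for re-running the metadata script.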

Currently: Only smaller JPEG derivatives are attached to FMPro records. The image goes into Research Hub. One big batch was taken by sneakernet (NLG) to CDL/Merritt. Scripts that use watch folders create derivatives and put the image file path in the CollectionItem table; FMPro then renders the path as an image.
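
To make the watch-folder step concrete, here is a rough sketch of the scanning part only. It assumes, hypothetically, that masters are TIFF files in one folder and that each derivative is a JPEG with the same base name in another folder; the actual scripts may work differently.

```python
from pathlib import Path

TIFF_SUFFIXES = {".tif", ".tiff"}


def pending_derivatives(watch_dir, derivative_dir):
    """List (master, derivative) path pairs for TIFFs that lack a JPEG derivative."""
    watch_dir, derivative_dir = Path(watch_dir), Path(derivative_dir)
    pending = []
    for master in sorted(watch_dir.iterdir()):
        # Only consider TIFF files; ignore subfolders and other file types.
        if master.is_file() and master.suffix.lower() in TIFF_SUFFIXES:
            derivative = derivative_dir / (master.stem + ".jpg")
            if not derivative.exists():
                pending.append((master, derivative))
    return pending
```

In this sketch, the derivative path in each pair is the value a script would write to the CollectionItem table once the JPEG has been created.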

Some requirements:

  • They need to see images in CSpace, but these could be derivatives only, or ingests from URLs that create JPEGs in CSpace.
  • They want to limit redundant effort here.  Orlando and Andrew can write scripts, and other staff know Adobe Bridge, etc.
  • They want, but won't get, a full digital asset management system.
  • TIFFs need to be accessible somewhere for local access (for publications, calendars, web site).
  • They will be using an Amazon CloudFront CDN for their Drupal web site.
  • They care a lot about the specifics of the derivatives created.
  • They can have multiple images per object and need to identify the "preferred" image.
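
The "preferred" image requirement could be modeled with a simple flag per image. This is a sketch only: the ObjectImage class, its field names, and the fallback rule (use the first image when none is flagged) are assumptions, not anything defined in FMPro or CSpace.

```python
from dataclasses import dataclass


@dataclass
class ObjectImage:
    object_id: str           # hypothetical identifier of the collection item
    path: str                # location of the image or derivative
    preferred: bool = False  # True for the image to show by default


def preferred_image(images, object_id):
    """Return the preferred image for an object, or None if it has no images.

    Assumption: if no image is flagged, the first one listed is used;
    if several are flagged, the first flagged one wins.
    """
    candidates = [img for img in images if img.object_id == object_id]
    for img in candidates:
        if img.preferred:
            return img
    return candidates[0] if candidates else None
```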

Tools at our disposal:

  • CSpace UI: "Link to External Image" can do ingest.  (There is a service call for that.)
  • CSpace UI: "External URL" does not display image.  (It is probably possible to display an image in CSpace just given a URL, but that would be significant work.)
  • Bulk Media Uploader: a web app that might be helpful.
  • ImageMagick
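
As an illustration of how ImageMagick might be scripted for derivatives, the sketch below builds a command line for a JPEG no larger than a given box. The size and quality defaults are placeholders, not the specifications BAM/PFA cares about, and the executable name is an assumption (ImageMagick 7 installs "magick"; version 6 uses "convert").

```python
import subprocess


def derivative_command(master, derivative, max_px=1024, quality=85, executable="magick"):
    """Build an ImageMagick command line that writes a JPEG derivative.

    The trailing '>' in the geometry only shrinks images larger than the
    box; smaller images are left at their original size.
    """
    return [
        executable,
        str(master),
        "-resize", f"{max_px}x{max_px}>",
        "-quality", str(quality),
        str(derivative),
    ]


def make_derivative(master, derivative, **options):
    """Run the command; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(derivative_command(master, derivative, **options), check=True)
```

Passing the command as a list, rather than a shell string, keeps the '>' in the geometry from being treated as a shell redirect.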

Questions to BAM/PFA:

Q: Are you currently planning to put TIFFs of collection items on the CDN, or just derivatives?

Andrew: "To Grace: Please correct me if I am wrong. In terms of putting files in CDN for new web site, we'll put high resolution image files in CDN, and Drupal will create the derivatives automatically from the high resolution image files. Then Drupal will put the derivatives back to the Pantheon or leave them in the CDN depending on the performance. Therefore, the high resolution images are TIFFs or JPEGs?"

Orlando: "While it would be good to have Large TIFF images on ResearchHub and have high derivatives copied to CDN, it's important to have very clear the issue about the cost implications as well as workflows implications.  I would really like to see a front-end DAM system using Amazon Web Services as the back-end."

Grace: "As Orlando advises, our current intention is to store high/full-resolution jpg derivatives on the CDN and allow Drupal to create its own derivatives for public consumption."

Q: What are you attaching image metadata to -- TIFFs, JPEGs, all of the above?

Andrew: "I think we are attaching image metadata to TIFFs and JPEGs. Moreover, please help me to answer what metadata we're attaching."

Michael: "To answer part of your question, for the art collection images we are using a Java script that Andrew wrote to embed metadata drawn from our FileMaker database. The script writes to XMP fields in the Dublin Core namespace. If you need a specific list I can get you our mapping tomorrow, just let me know. This applies only to master TIFFs and those derivatives already created in-house and stored on Research Hub, and only to the art collection images."
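
For reference, XMP fields in the Dublin Core namespace look roughly like the packet built below. This is a sketch of the XMP structure only, not Andrew's script: the choice of dc:title and dc:description is an assumption, since the actual mapping is not given here, and a separate tool would still be needed to embed the packet in an image file.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix, local):
    """Return the qualified tag name ElementTree expects: {uri}local."""
    return f"{{{NS[prefix]}}}{local}"


def build_xmp(title, description):
    """Serialize a minimal XMP packet holding dc:title and dc:description.

    XMP stores both properties as language alternatives (rdf:Alt) with
    an 'x-default' entry.
    """
    root = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(root, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    for name, value in (("title", title), ("description", description)):
        prop = ET.SubElement(desc, q("dc", name))
        alt = ET.SubElement(prop, q("rdf", "Alt"))
        li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
        li.text = value
    return ET.tostring(root, encoding="unicode")
```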

Michael: "I added an excerpt from our metadata mapping as a comment on the wiki page. It shows the FileMaker data that Andrew's program pulls as well as the XMP fields that it writes to the images. To clarify a bit, some of the tags include text prefixes that we use in Research Hub, especially in the DESCRIPTION tag, where we concatenate lots of data. Rick, last year we came up with new mappings to replace the ones that Rick Rinehart had developed. There's more here than RH can handle yet, so the data is richer...."
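
The prefixed, concatenated DESCRIPTION value described above might be assembled along these lines. The prefixes, separator, and sample values are hypothetical; the real ones are in the mapping excerpt on the wiki page.

```python
def build_description(fields, separator="; "):
    """Join non-empty values into one string, each with its text prefix."""
    parts = [f"{prefix}: {value}" for prefix, value in fields if value]
    return separator.join(parts)


# Hypothetical prefixes and values, for illustration only.
fields = [
    ("Medium", "oil on canvas"),
    ("Dimensions", ""),
    ("Credit", "Gift of the artist"),
]
print(build_description(fields))
```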