Initial observations based on team discussion on 20 June 2013

Notes from John Lowe

Data model, data content

  • In the current CineFiles, the user can see most values ("related records") on the main screen without switching to a tab. Some screens have a button that brings up a separate screen for "relations".
  • Some read-only fields show data from other tables, e.g. "other directors for a film"
  • Entity-Relation diagram
    • Make a big one. Glen will send Lam the file to render to PDF. We will post it here.
  • Vocabularies are all stored in individual tables, each with a foreign-key (FK) relation to collectionobjects (= documents): LCSH subheadings, ISO country names, person names, organization names.
  • pfa_denorm2: maybe use this db for migration? (about 500k to 1M rows).
    • (John: but is all the data there? It seems a lot of the metadata about the data is not: e.g. dates of update, who did the update?)
    • 2 scripts: one looks for records that have changed...one regenerates denorm altogether.
  • films table is a controlled vocabulary of film titles.
  • reports? several "courtesy reports" exist.
  • hierarchical searching? not much!
  • images
    • 500K images (each page is an image);
    • image creation has slowed down...
    • pages (images) are stored on the webfarm, organized in the file system by docid. Look ma, no database! just a directory structure.
  • relatively static collection, at the moment.
  • They may want a DAM...
  • documents are the collectionobjects
  • is a page a media record? if so, how do we order pages...?
  • are films citations?
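
On the page-ordering question above: if pages stay in a directory-per-docid layout, one option is to derive page order from the file names. The following is only a sketch under assumptions not confirmed in the discussion: that each document's pages sit directly in a directory named after its docid, and that file names carry a page number. The function names and the example file names are hypothetical.

```python
import re
from pathlib import Path


def natural_key(name):
    """Sort key that compares digit runs numerically, so 'p10' follows 'p2'."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions are always digit runs.
    parts = re.split(r"(\d+)", name)
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


def pages_for_document(root, docid):
    """Return the page image paths for one document, in page order.

    Assumes a layout of <root>/<docid>/<page files>; this layout is a guess.
    """
    doc_dir = Path(root) / str(docid)
    if not doc_dir.is_dir():
        return []
    files = [p for p in doc_dir.iterdir() if p.is_file()]
    return sorted(files, key=lambda p: natural_key(p.name))
```

If the real file names do not encode page order, an explicit sequence number per page would have to come from somewhere else, which is part of the "is a page a media record?" question.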

Migration observations, options, and suggestions (technical)

  • Consider some sort of intermediate warehouse for use in the migration (see pfa_denorm2 above)
  • Could we simply generate exactly the same 3 denormalized tables from CSpace, and then have the existing portal "just work"? use existing jsp, which is based mainly on views?
    • Perhaps we should try this: copy the tables to pfa-dev (when it exists), and get the existing JSP webapp working there.
    • Then the task of supporting the public portal comes down to making denormalized versions as we are doing for Delphi and (probably) UCJEPs.
  • Customization
    • lots of Sybase stored procedures => need for event handlers and batch procedures
    • Glen will get stored procedures and triggers and put them someplace for viewing
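
The two denormalization approaches mentioned in these notes (regenerate everything, or refresh only records that have changed) can be sketched as follows. This is a minimal illustration using SQLite, not the actual pfa_denorm2 scripts: every table and column name here (document, film, document_film, denorm, updated_at) is hypothetical, and the real CineFiles and CSpace schemas will differ.

```python
import sqlite3

# Hypothetical, much-simplified schema for illustration only.
SCHEMA = """
CREATE TABLE document (docid INTEGER PRIMARY KEY, title TEXT, updated_at TEXT);
CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE document_film (docid INTEGER, filmid INTEGER);
CREATE TABLE denorm (docid INTEGER PRIMARY KEY, doc_title TEXT, film_titles TEXT);
"""

# One denormalized row per document, with related film titles concatenated.
SELECT_ROWS = """
SELECT d.docid, d.title, group_concat(f.title, '; ')
  FROM document d
  LEFT JOIN document_film df ON df.docid = d.docid
  LEFT JOIN film f ON f.filmid = df.filmid
"""
GROUP_BY = " GROUP BY d.docid, d.title"


def rebuild_denorm(conn):
    """Regenerate the denormalized table from scratch."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM denorm")
        conn.execute("INSERT INTO denorm " + SELECT_ROWS + GROUP_BY)


def refresh_changed(conn, since):
    """Refresh only the rows for documents updated after `since` (ISO date string)."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO denorm "
            + SELECT_ROWS
            + " WHERE d.updated_at > ?"
            + GROUP_BY,
            (since,),
        )
```

Note that this incremental version only notices changes recorded on the document row itself. A renamed film or a deleted document would be missed, which is one reason a periodic full rebuild is usually kept alongside it.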

Customer involvement

  • this is really a research database. specialized. used by film researchers, not casual IMDB users!
  • who will we be training?
    • we dunno! people Glen trained are gone; there are a few part-time replacements
    • Chris: looks like Nancy and one other person who is helping with the cataloging; no other SMEs, users.

Other speculation

  • ResearchHub
    • What about a retrospective / lessons learned on the ResearchHub effort?
    • Can see an eventual need for "integration" with ResearchHub.
    • They had lots of fields, we mapped them to 8.