Legacy CineFiles data sourcing in Sybase
This documents how the CineFiles web site (a series of JSPs) gets data from Sybase.
CineFiles web site data documentation
Searching documents and films is supported by two Java programs:
1) DocList.java (src/edu/berkeley/mip/cinefiles/entity) constructs queries against views in the cinefiles schema in Sybase.
2) FilmList.java (src/edu/berkeley/mip/cinefiles/entity) constructs queries against views in the cinefiles schema in Sybase.
These views in the cinefiles schema in turn refer to
a) other views in the cinefiles schema (which eventually point to tables in the pfa schema)
b) tables in the pfa_denorm2 schema (which are populated by pfa_denorm2_restore.sh, see below)
c) tables in the pfa schema (transactional data tables)
Display of Film and Document citations is accomplished in:
3) DocDetail.java, which calls the Sybase stored procedure doc_detail_summary (in the cinefiles schema), which makes calls to numerous views in the cinefiles schema. These in turn query back to tables in the pfa schema.
4) FilmDetail.java, which calls the Sybase stored procedure film_detail_summary (in the cinefiles schema), which makes calls to numerous views in the cinefiles schema. These in turn query back to tables in the pfa schema.
Four tables and two views in the pfa_denorm2 schema are used in this process.
The two views (www_docs_dv and www_filmx_dv) are built against the transactional tables in the pfa schema.
a) www_docs_dv is a denormalized view of documents
b) www_filmsx_dv is a large table that contains the cartesian product of films and repeating information: country, language, subject, and genre. (Note that several film_id's are excluded from the view because they contain many languages and countries, resulting in millions of rows. These are added in from text files by the pfa_denorm2_restore.sh shell script, see below). The cartesian product pattern is used to support the search page (applying an AND search on a multi-field search page).
Two of the four tables in the pfa_denorm2 schema are populated by steps in the pfa_denorm2_restore.sh shell script which copy out the www_docs_dv and www_filmsx_dv views to disk, sort them; add records for the excluded films; and load them into corresponding tables (doclist from www_docs_dv and filmlist from www_filmsx_dv).
Two of the four tables in the pfa_denorm2 schema are populated by a stored procedure, pfa_denorm_alltitles (also in the pfa_denorm2 schema) that is invoked by the pfa_denorm2_restore.sh shell script.
A second shell script, pfa_denorm2_refresh.sh, was developed to do a faster nightly update to the tables in pfa_denorm2. This complex process is not being used now given the light amount of data entry taking place. It will not be rebuilt for the CollectionSpace deployment.