This is a top level page for documentation related to web applications that UC Berkeley is developing for its CollectionSpace deployments.

Extending CollectionSpace with Django-based webapps

The UC Berkeley CollectionSpace deployment team has extended the functionality of CollectionSpace using lightweight web applications built with the Python-based Django framework. Berkeley has created a reusable Django project that authenticates against CollectionSpace, providing a starting point for further app development. This code, available on GitHub, simplifies the creation of custom CollectionSpace webapps.

At this point, UC Berkeley has produced a dozen such webapps. Screenshots for a few of them are shown below:

  • A public search portal for the UC and Jepson herbaria collections. The UCJEPS search portal opens up the herbaria's vast collection of plant specimens to researchers and other interested parties.
  • A point-and-click interface to a variety of "non-CSpace" reports for the UC Botanical Garden. (login required)
  • A quick-and-easy interface for PAHMA to upload images and associate them with collection objects. (login required)
  • A quick-and-easy generalized search interface, to be used initially by PAHMA and UCBG. (login required in both cases)

 

 

A few details about these webapps

The UCJEPS search portal queries a Solr datastore that holds a nightly export of data from CollectionSpace. The speed of Solr and the denormalized nature of the datastore allow the portal to return hundreds of records in seconds.
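As a rough illustration of what a query against such a Solr datastore might look like, the sketch below builds a request URL for Solr's standard select handler. The base URL, the core name (ucjeps) and the field name (genus_s) are assumptions made up for the example, not values taken from the Berkeley deployment.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, core, query, rows=100, start=0):
    """Build a URL for Solr's /select handler that returns JSON.

    base_url and core are hypothetical; substitute the values for the
    Solr instance that holds the nightly export.
    """
    params = {
        "q": query,      # Solr query string, e.g. "field:value"
        "rows": rows,    # maximum number of records to return
        "start": start,  # offset of the first record (for paging)
        "wt": "json",    # ask Solr for a JSON response
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"
```

For example, build_solr_select_url("http://localhost:8983/solr", "ucjeps", "genus_s:Phlox", rows=50) yields a URL that could be fetched with any HTTP client.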

Included within the search portal is a second webapp called imageserver. Imageserver makes authenticated requests to the UCJEPS CSpace instance to access specimen images. For each search result returned, i.e., for each specimen (cataloging) record that matches the query parameters, a REST services call retrieves images from the related media handling records.

The Botanical Garden's iReports webapp is a "standard component" which will eventually become part of the basic suite. It provides access to those iReports for an institution that require parameters CSpace proper cannot provide (i.e., non-CSID values such as text input).

The "Bulk Media Upload" webapp addresses a long-standing need to be able to upload batches of image files and connect them to collection objects. While the approach taken for the implementation supporting PAHMA requires adherence to specific conventions (e.g., the image files must be named using the exact value of the object's Museum Number), the actual application is tiny and easy to apply to other deployments.
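The naming convention described above can be sketched as follows. This is only an illustration of the idea, assuming the Museum Number is everything in the file name before the final extension; the function names and the sample numbers are hypothetical and are not taken from the actual Bulk Media Upload code.

```python
from pathlib import PurePath


def museum_number_from_filename(filename):
    """Return the Museum Number implied by an image file name.

    Assumes the convention that the file is named with the exact
    Museum Number followed by a single extension, e.g. "1-12345.jpg".
    """
    return PurePath(filename).stem


def group_images_by_object(filenames):
    """Group uploaded file names by the Museum Number they refer to."""
    groups = {}
    for name in filenames:
        groups.setdefault(museum_number_from_filename(name), []).append(name)
    return groups
```

Each key of the resulting dictionary could then be looked up in CollectionSpace to find the collection object the images should be related to.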

Background

In 2012-2013, John Lowe developed a set of Python CGI applications for PAHMA to meet some rapidly evolving needs related to the major move that the museum is conducting. In about March 2013, the UCB team decided that it was time to select a more enabling framework and build an environment that would provide an excellent platform for web applications that connect to our CollectionSpace instances. The framework selected was Django. Richard Millet then built a project that uses Django's "authentication backend" mechanism to permit apps to authenticate with CollectionSpace servers. That project is called cspace_django_project.
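To give a feel for what an authentication backend of this kind does, here is a minimal sketch of the credential-checking half only. It is not the code in cspace_django_project. It assumes the CollectionSpace server accepts HTTP Basic authentication, the login URL is a placeholder, and the class is kept free of Django imports so it runs on its own. A real Django backend would also implement get_user() and return a Django User object rather than a user name.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


class CSpaceBackendSketch:
    """Hypothetical stand-in for a Django-style authentication backend."""

    def __init__(self, login_url, opener=urllib.request.urlopen):
        # login_url: assumed to be any CSpace services URL that requires login.
        self.login_url = login_url
        # opener is injectable so the check can be exercised without a server.
        self.opener = opener

    def authenticate(self, request, username=None, password=None):
        """Return the user name if the server accepts the credentials, else None."""
        if not username or not password:
            return None
        req = urllib.request.Request(
            self.login_url,
            headers={"Authorization": basic_auth_header(username, password)},
        )
        try:
            with self.opener(req) as response:
                accepted = response.status == 200
        except urllib.error.URLError:
            # Covers HTTP errors such as 401 as well as connection failures.
            return None
        return username if accepted else None
```

The point of the design is that Django only ever sees the authenticate() call; where the credentials are actually verified (here, a remote server) is hidden inside the backend.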

The cspace_django_project serves as the starter project for local, custom CSpace-Django projects. Using Git and GitHub, local CollectionSpace instances can fork the code to their own repository, clone it, and create a custom project – containing one or more web applications – by making modifications to the clone. The cspace_django_project, a fork of which will reside unchanged in each deployer's repository, can serve as the conduit for general bug fixes and enhancements.

A set of wiki pages, linked to in the next section, documents the procedures involved in extending a CollectionSpace instance using Django webapps.

Installation

Functionality requirements

  • Security: Web apps must prevent SQL injection and must run under HTTPS.
  • Some applications will require login with CSpace credentials. Others will be public portals that will use a proxy CSpace login with appropriate permissions.
  • Searching
    • Will be supported using Solr, Postgres, or NXQL queries as appropriate.
    • Performance will certainly need to be considered in how queries are done.
    • We will need to allow hierarchical searching (e.g., "find all specimens within the genus Phlox", "find all artifacts from Colombia").
    • Term completion or type-ahead will be needed in some search fields.
  • Images: images may or may not need to be publicly accessible (depending on the webapp). This issue will require some analysis.
  • Save results as data file: list results should be available as a text (.csv) file for download.
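As one possible approach to the "save results as data file" requirement, the sketch below serializes a list of result records to CSV text using Python's standard csv module. The function name and the field names in the usage note are illustrative assumptions. In a Django view, the returned text could be sent back as the body of a response with a CSV content type.

```python
import csv
import io


def results_to_csv(rows, fieldnames):
    """Serialize search results (a list of dicts) to CSV text.

    Keys not listed in fieldnames are ignored; missing keys are
    written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For instance, results_to_csv(records, ["csid", "name"]) produces a header row followed by one line per record.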

Open questions

  • Do web apps run on the CollectionSpace application server or on separate VMs?
  • Do we query the Nuxeo database directly or build out a snapshot elsewhere?
  • Django has an ORM.  Should we use it, or write raw SQL?  There seems to be some significant discussion of the advantages and disadvantages (e.g. vs. SQLAlchemy, here, here).
  • If performing SQL queries directly, credentials need to be proxied and secured.  Postgres views can provide some isolation of the data.
  • Do we need to perform pagination of large search results?  What do other sites do?
  • For our first prototype application, should we demonstrate hierarchical searching, or should we start with the simplest scenario?
  • When there are multiple images related to a collection object, should we show only the "preferred" image (PAHMA customization?) or show them in some order with a prev-next widget?
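If raw SQL turns out to be the choice, two of the questions above (injection safety and pagination) can be handled together with a parameterized query that uses LIMIT and OFFSET. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so it can run anywhere. The table and column names are invented for the example, and a Postgres driver such as psycopg2 uses %s placeholders instead of ?.

```python
import sqlite3


def fetch_page(conn, genus, page, page_size=2):
    """Return one page of (id, name) rows for a genus.

    Values are passed as bound parameters rather than being pasted
    into the SQL string, so user input cannot alter the query.
    """
    offset = (page - 1) * page_size
    cursor = conn.execute(
        "SELECT id, name FROM specimens "
        "WHERE genus = ? ORDER BY id LIMIT ? OFFSET ?",
        (genus, page_size, offset),
    )
    return cursor.fetchall()
```

Because the search term is bound as a value, an input such as "Phlox' OR '1'='1" is treated as a literal genus name and simply matches nothing.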

The following links illustrate some of the efforts to implement Django-based webapps that support CSpace deployments:

  • Preliminary wireframes for a prototype UCJEPS portal, with notes about UCBG differences.
  • Design and implementation of the Bulk Media Upload facility, currently used by PAHMA.
  • "Generalized Web Portal" design and implementation.

 

 
