Essig Public Facing Browser Specs

Based on features at http://essigdb.berkeley.edu

Advanced Search

At a high level, we will need to provide an advanced search interface to:

  • Cataloging Objects
  • Default Taxonomy Authority
  • Default Collecting Event Authority (or whatever authority is overloaded to act as the Collecting Event Authority, since this is not yet implemented in CollectionSpace)
  • Person Authority
  • Media Handling records (assuming the label images in the Essig system will be mapped to media handling records)
  • Accessions

These searches typically match on fields of the record type being searched, though they occasionally match on fields in related records, particularly related taxonomy records. For example, finding all species under a particular class requires not only finding taxon terms with a rank of species, but also walking up the broader-term chain an arbitrary number of levels to find a term with a rank of class whose name matches the search string. The searches also provide options for ordering the results. The ordering usually depends on fields of the record type being searched, but occasionally depends on fields of related records. The CollectionSpace REST API is sufficient for searching and ordering on fields of the record type being searched. Searching and ordering on related records, especially records reached through an arbitrary number of relations, will be a pain point in developing the search functionality.
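
The "species under a class" example can be illustrated with a minimal in-memory sketch. The Taxon structure, its field names, and the sample terms are hypothetical and are not the CollectionSpace schema. In a real implementation each step up the broader chain would likely be a separate lookup against the taxonomy authority, which is where the cost comes from.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Taxon:
    name: str
    rank: str
    broader: Optional[str] = None  # name of the parent term, if any


# Hypothetical sample terms; note that one chain skips the family level.
TAXA: Dict[str, Taxon] = {
    t.name: t
    for t in [
        Taxon("Insecta", "class"),
        Taxon("Arachnida", "class"),
        Taxon("Coleoptera", "order", "Insecta"),
        Taxon("Carabidae", "family", "Coleoptera"),
        Taxon("Carabus", "genus", "Carabidae"),
        Taxon("Carabus nemoralis", "species", "Carabus"),
        Taxon("Araneae", "order", "Arachnida"),
        Taxon("Latrodectus", "genus", "Araneae"),
        Taxon("Latrodectus hesperus", "species", "Latrodectus"),
    ]
}


def has_ancestor(taxon: Taxon, rank: str, name: str, taxa: Dict[str, Taxon]) -> bool:
    """Walk the broader chain, however long, looking for a rank/name match."""
    seen = set()  # guards against accidental cycles in the data
    current = taxa.get(taxon.broader) if taxon.broader else None
    while current is not None and current.name not in seen:
        seen.add(current.name)
        if current.rank == rank and current.name.lower() == name.lower():
            return True
        current = taxa.get(current.broader) if current.broader else None
    return False


def species_under_class(class_name: str, taxa: Dict[str, Taxon]) -> List[str]:
    """Return the names of all species whose broader chain reaches the class."""
    return sorted(
        t.name
        for t in taxa.values()
        if t.rank == "species" and has_ancestor(t, "class", class_name, taxa)
    )
```

Because the loop runs until the chain ends rather than for a fixed number of steps, skipped ranks do not affect the result.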

Search Results Lists

These advanced searches return search results lists, which can be:

  • Downloaded as tab-delimited files
  • Mapped with BerkeleyMapper
  • Drilled down to reveal a specific record's details

Some of these search results lists include not only information from the record being queried but also details from related records, again particularly related taxonomy records. Additionally, for searches on the taxonomy authority, the fields displayed in the search results can be configured per search. Typically the results include a species' order, family, genus, species, and subspecies, the taxon's author and year, and a camera icon indicating whether there is a photo in CalPhotos for that species. On the search page, the user can also choose to include the species' suborder, superfamily, subfamily, tribe, and subgenus.
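
A minimal sketch of the per-search column configuration combined with the tab-delimited download follows. The column keys and the function name are assumptions made for illustration. The camera icon is left out because it has no obvious representation in a text file.

```python
import csv
import io
from typing import Dict, Iterable, Sequence

# Hypothetical column keys mirroring the typical and optional fields listed above.
DEFAULT_COLUMNS = ["order", "family", "genus", "species", "subspecies", "author", "year"]
OPTIONAL_COLUMNS = ["suborder", "superfamily", "subfamily", "tribe", "subgenus"]


def to_tab_delimited(rows: Iterable[Dict[str, str]], extra_columns: Sequence[str] = ()) -> str:
    """Render result rows as tab-delimited text with the selected columns."""
    unknown = [c for c in extra_columns if c not in OPTIONAL_COLUMNS]
    if unknown:
        raise ValueError(f"unsupported columns: {unknown}")
    columns = DEFAULT_COLUMNS + list(extra_columns)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",
        extrasaction="ignore",  # drop fields the user did not select
        restval="",             # leave missing fields (e.g. subspecies) blank
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling to_tab_delimited(rows, ["tribe", "subgenus"]) would append those two columns after the default ones.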

Record Detail Views

Displaying a specific record's detail view will be much like querying on a particular CSID, fanning out where necessary to retrieve additional information from related records. There seem to be few hitches here.

Browse

In addition to the advanced search interface, we will have to provide a browse interface to:

  • Cataloging Objects
  • Default Taxonomy Authority

The browse interface will be built on the taxonomy hierarchy. One specific pain point will be navigating broader and narrower terms. For example, when browsing the Cataloging Objects, we will want to first display all taxon authority records with a rank of class, along with their narrower terms with a rank of order, but only those that have an affiliated cataloging object somewhere below them in the taxonomic hierarchy. The number of levels between a term and its cataloging objects varies: in some instances hierarchy levels such as 'tribe' are skipped, while in others they are included. A Cataloging Object is directly related to a taxonomic term with the rank of class only occasionally, when no further levels are defined. More commonly, it is directly related to a term with the rank of genus, species, or subspecies. Furthermore, Essig makes extensive use of counts of the cataloging objects related to each level of the hierarchy. Here we run into the same problem in a harder form: finding one related cataloging record down the hierarchical chain is enough to establish that a class or order should be listed in the browse interface, but a count requires finding all of them. This will require significant fan-out. Alternatively, these values could be calculated nightly via cron jobs. In fact, the browse lists themselves could be derived nightly, which is how the current Essig system works.
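
The nightly calculation could take roughly the following shape. This is a sketch under stated assumptions, not a description of the current Essig job: the broader map (term name to parent name) and the list of determinations (one taxon name per cataloging object) are hypothetical inputs that would be extracted from the taxonomy authority and the cataloging records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def rollup_counts(
    broader: Dict[str, Optional[str]],
    determinations: Iterable[str],
) -> Dict[str, int]:
    """Count cataloging objects at or below every term in the hierarchy.

    Each object is credited to the term it is directly related to and to
    every broader term above it, regardless of how many levels are skipped.
    """
    counts: Dict[str, int] = defaultdict(int)
    for taxon in determinations:
        seen = set()  # guards against accidental cycles in the data
        current: Optional[str] = taxon
        while current is not None and current not in seen:
            seen.add(current)
            counts[current] += 1
            current = broader.get(current)
    return dict(counts)
```

A term that is absent from the result has no cataloging objects anywhere beneath it and can be omitted from the browse list, so one pass yields both the counts and the list membership.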

Label Maker Utility

We will also have to create a Label Maker utility, which amounts to querying Cataloging Objects and formatting the results in a specified manner. This seems fairly straightforward, since it queries only specific fields of the Cataloging Object. Essig staff will be able to create labels inside CollectionSpace, but entomologists from other institutions sometimes work on Essig field projects and need to create labels as well. The current Essig workflow supports this activity.

Taxonomy Supersoup

Finally, we will have to query what the current Essig system calls the 'Taxonomy Supersoup'. It is not clear to me where the taxonomy supersoup comes from. The current system allows the user to specify a namesoup, country, and collector, then queries the table eme_supersoup for the namesoup, country, eme_table, and collector_list, grouped and ordered by the same fields. The table seems to be an aggregate of several tables, though exactly how and what kinds of records are aggregated is not immediately clear.

The taxonomy supersoup aggregates taxonomic terms, locations, and collectors into a single searchable field to facilitate broader keyword search. Because the current Essig specimen table is highly denormalized, creating this index field is simple (it is probably rebuilt nightly via a cron script).
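
As an illustration only, such an index field could be built from a denormalized specimen row as sketched below. How eme_supersoup is actually constructed is not known; the field names in SOUP_FIELDS and both function names are assumptions.

```python
from typing import Dict, Optional, Sequence

# Hypothetical column names on a denormalized specimen row.
SOUP_FIELDS = ("family", "genus", "species", "country", "collector")


def build_soup(row: Dict[str, Optional[str]], fields: Sequence[str] = SOUP_FIELDS) -> str:
    """Concatenate the non-empty fields into one lowercase keyword string."""
    parts = []
    for field in fields:
        value = row.get(field)
        if value:
            parts.append(str(value).strip().lower())
    return " ".join(parts)


def matches(soup: str, query: str) -> bool:
    """True when every whitespace-separated query term occurs in the soup."""
    return all(term in soup for term in query.lower().split())
```

The match here is a plain substring test, which keeps the sketch short. A real index would more likely rely on the database's full-text search.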

Taxonomic Hierarchy

The taxonomic hierarchy will likely cause the most problems for us. Hidden, calculated fields that keep track of the taxonomic hierarchy, updated on create/update of cataloging records (or, more specifically, on change of the determination history name field), could help to alleviate the fan-out.
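
A minimal sketch of such calculated fields follows. The taxon_ field prefix, the determination_name key, and the shape of the taxa table (term name mapped to a rank and a broader term) are all assumptions for illustration. In CollectionSpace the update would presumably be triggered by whatever mechanism runs on record save.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup table: term name -> (rank, broader term name or None).
TaxonTable = Dict[str, Tuple[str, Optional[str]]]


def hierarchy_fields(taxon_name: str, taxa: TaxonTable) -> Dict[str, str]:
    """Flatten a term and all of its broader terms into rank-keyed fields."""
    fields: Dict[str, str] = {}
    seen = set()  # guards against accidental cycles in the data
    current: Optional[str] = taxon_name
    while current is not None and current in taxa and current not in seen:
        seen.add(current)
        rank, broader = taxa[current]
        fields.setdefault(f"taxon_{rank}", current)  # nearest term wins per rank
        current = broader
    return fields


def on_determination_change(record: Dict[str, str], taxa: TaxonTable) -> Dict[str, str]:
    """Return a copy of the record with its hidden hierarchy fields recomputed."""
    updated = {k: v for k, v in record.items() if not k.startswith("taxon_")}
    updated.update(hierarchy_fields(record["determination_name"], taxa))
    return updated
```

With these fields stored on the cataloging record, a search or count by class becomes a filter on a single field such as taxon_class, with no walk up the hierarchy at query time. The trade-off is that the fields go stale if the hierarchy itself is edited, so changes to taxonomy terms would also need to trigger a recalculation.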