Collections Systems Integration Use Cases

University of California Berkeley

Research and system integration for biodiversity sciences

  • Provide link to image of object stored in separate image dissemination system (e.g., CalPhotos)
  • Provide link to separate system that records protein sequences about a particular specimen (e.g., GenBank)
  • Display images stored in separate image dissemination system (e.g., CalPhotos)
  • Allow other systems to link to a specific object record in CollectionSpace.  E.g., CalPhotos or GenBank can have a link that pulls up the CollectionSpace record for a specific specimen.
  • Allow other systems to query object records in CollectionSpace by scientific name.  E.g., CalPhotos, ITIS, or the Encyclopedia of Life could query a CollectionSpace instance for all specimens whose scientific name is 'Abronia umbellata ssp. breviflora'.
  • Integration with a field information management system.  E.g., the Moorea-Biocode project is developing a field collecting system (FIMS) that allows scientists from different research teams to collect specimens, take images, identify geolocations and collectors.  That information is currently gathered using a variety of means (Excel spreadsheets, digital camera image metadata, GPS) and goes through an import process into the Biocode database.  There are complications of course: The different kinds of information from one collecting event can come in separately.  Some information might need updating.  Field scientists are working in conditions that are challenging in many ways!
  • Integration with a downstream laboratory information management system (LIMS, for tissue sampling and genetic data).  E.g., the Moorea-Biocode project is partnering with Biomatters to integrate the Biocode specimen and field data with Geneious, a set of DNA and protein sequence analysis tools.  The process of creating tissue samples for laboratory analysis is managed in the Biocode database.  Matching up racks of tissues to specimens is a challenge.
  • Data export to what natural scientists call distributed databases.  Museums routinely ship data in certain formats to data aggregators to help provide data sets that cover larger regions or for the benefit of consortia.  Data formats and methods of accessing data are constantly in flux and vary from domain to domain and provider to provider.
  • Ability to import and host other data sets resulting from research or provided by other scientists.  E.g., the UC and Jepson Herbaria combine their collections data with other herbaria in California to create a searchable state-wide collection set.  This information is not hosted in SMaSCH (used by the UC and Jepson Herbaria) but in a separate set of tables that can be queried via a separate interface.
  • Related to this is the need to allow for metadata synchronization among systems that contain metadata for specific collected objects (e.g., between the collection management system, the digital asset management system, the data broker, GenBank, and so on).  This is a big challenge and is the subject of various major grant proposals.
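
Two of the use cases above, linking to a specific object record and querying by scientific name, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Specimen` type, the `BASE_URL` value, the record identifier format, and both function names are hypothetical and do not describe an actual CollectionSpace interface.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class Specimen:
    record_id: str        # hypothetical identifier format
    scientific_name: str


# Hypothetical base URL, used for illustration only; not a real endpoint.
BASE_URL = "https://collections.example.edu/records"


def normalize_name(name: str) -> str:
    """Collapse runs of whitespace and ignore case so that names compare reliably."""
    return " ".join(name.split()).casefold()


def record_link(specimen: Specimen) -> str:
    """Build a stable link that another system could store to reach one record."""
    return f"{BASE_URL}/{quote(specimen.record_id, safe='')}"


def find_by_scientific_name(specimens, name):
    """Return every specimen whose scientific name matches after normalization."""
    wanted = normalize_name(name)
    return [s for s in specimens if normalize_name(s.scientific_name) == wanted]


# Made-up sample records for the usage example below.
SAMPLE = [
    Specimen("UC-0001", "Abronia umbellata ssp. breviflora"),
    Specimen("UC-0002", "Abronia latifolia"),
]

matches = find_by_scientific_name(SAMPLE, "abronia umbellata  ssp. breviflora")
links = [record_link(s) for s in matches]
```

Matching on a normalized form of the name is a design choice made here so that differences in case or spacing between systems do not hide a record; a real integration would also need to handle synonyms and changed determinations, which this sketch does not attempt.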