Natural Science Taxonomy Use Cases

Though they are being developed on a separate page, these taxonomy use cases are organized under the general category of Vocabulary and Authority and should relate closely to Authorities and Controlled Vocabulary Use Cases.

University and Jepson Herbaria (UC Berkeley)

Organization and containers:

Specimens in the UC and Jepson Herbaria are usually mounted on specimen sheets, which are organized in folders and filed in cabinets that are ordered by taxonomy.  Folders can contain one or more specimen sheets and are labeled with the accepted taxonomic determination of the specimens they contain.  Depending on the specimens in a folder, its label can also carry modifiers such as “cf.”, “aff.”, or “indet.”

Scientific Taxonomy

The SMASCH database uses approximately 86,200 verified and 11,600 unverified taxon names for approximately 545,700 specimens.  In addition, the database contains approximately 32,100 synonyms and 26,700 common names which are maintained outside of taxonomy.  The database also maintains several forms of each taxon name.  For example, for Quercus stellata var. margarettiae:

  1. Full name: Quercus stellata Wangenh. var. margarettiae (Ashe ex Small) Sarg.
  2. Short name: var. margarettiae (Ashe ex Small) Sarg.
  3. Name without author: Quercus stellata var. margarettiae
  4. Name without ex author: Quercus stellata Wangenh. var. margarettiae (Small) Sarg.
  5. Name with rank: Quercus stellata var. margarettiae

The example above illustrates the use of infraspecific rank and authors for plant taxonomy.
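As a rough illustration of how these name forms could be derived from one structured record, here is a minimal Python sketch. The class TaxonName, its field names, and the helper drop_ex are assumptions made for this example and do not describe how SMASCH stores names. The "name with rank" form is omitted because, in the list above, it is identical to the name without author.

```python
from dataclasses import dataclass


def drop_ex(author: str) -> str:
    """Keep only the author after "ex" (e.g. "Ashe ex Small" -> "Small")."""
    return author.split(" ex ")[-1]


@dataclass(frozen=True)
class TaxonName:
    """Structured parts of a plant name (hypothetical field names)."""
    genus: str
    species: str
    species_author: str = ""
    rank: str = ""             # infraspecific rank marker, e.g. "var."
    infra_epithet: str = ""
    basionym_author: str = ""  # parenthetical author, may contain "ex"
    infra_author: str = ""

    @staticmethod
    def _author(value: str, keep_ex: bool) -> str:
        if not value:
            return ""
        return value if keep_ex else drop_ex(value)

    def _infra(self, keep_ex: bool = True) -> str:
        basionym = self._author(self.basionym_author, keep_ex)
        parts = [
            self.rank,
            self.infra_epithet,
            f"({basionym})" if basionym else "",
            self._author(self.infra_author, keep_ex),
        ]
        return " ".join(p for p in parts if p)

    def full_name(self) -> str:
        parts = [self.genus, self.species, self.species_author, self._infra()]
        return " ".join(p for p in parts if p)

    def short_name(self) -> str:
        # Assumption: the short form is just the infraspecific part.
        return self._infra()

    def name_without_author(self) -> str:
        parts = [self.genus, self.species, self.rank, self.infra_epithet]
        return " ".join(p for p in parts if p)

    def name_without_ex_author(self) -> str:
        parts = [
            self.genus,
            self.species,
            self._author(self.species_author, keep_ex=False),
            self._infra(keep_ex=False),
        ]
        return " ".join(p for p in parts if p)


name = TaxonName(
    genus="Quercus",
    species="stellata",
    species_author="Wangenh.",
    rank="var.",
    infra_epithet="margarettiae",
    basionym_author="Ashe ex Small",
    infra_author="Sarg.",
)
print(name.full_name())
print(name.short_name())
print(name.name_without_author())
print(name.name_without_ex_author())
```

Storing the parts once and deriving each form on demand is only one option; a system could equally store every form as a separate column, as the list above suggests SMASCH does.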

Here are some other use cases that could be developed:

  • Collection management system stores taxonomy locally.
  • Collection management system refers to taxonomy information stored in a repository such as uBio or GlobalNames, employing services offered by those repositories.
  • Collection management system relies on accepted terms and relationships from an external provider for 99% of its specimens and determinations.  However, it employs local terms and identifications for specializations and where local scientists are performing research that might lead to refining the domain-specific taxonomy.
  • Taxonomic ordering: The order in which taxonomies present terms at different ranks (levels in the hierarchy) is important and not necessarily alphabetical.
  • When a specimen is identified (or re-identified) as belonging to a taxon (at any level in the hierarchical tree), the "object name" effectively becomes that taxon's name (in whichever form of name is appropriate, e.g., the binomial name such as Homo sapiens, or the appropriate identification; see Taxonomic Identifications below).
    When the curator or museum scientist is entering the object, s/he can search or browse the hierarchical taxonomic tree, expanding and collapsing levels until the appropriate identifier is located; the object is then associated with that taxon, and the user can annotate the identification with additional information.
  • When the user is changing the identification, the previous state is maintained somewhere.
    In this domain, the identification needs to be managed separately from the taxon ID or code.  That is, it's likely that the identification will need to be treated as a separate association between object and taxon (ID) given the amount of change that happens here.  (This needs elaboration by Susan or Lam).
  • When searching for the appropriate taxon, the system must handle relationships of different terms, e.g., synonyms and common names, but it must be clear what is considered the accepted or valid term.
    Batch updates of identifications and changes in classification should be facilitated, ideally at the user interface level though this may be impractical and would require specific permissions.  E.g., Jim Patton wants to update some of the higher classification of mammals. He wants to move all the New World taxa now in the family Muridae to Cricetidae.
  • User (whether museum scientist or the public user) should be able to see the hierarchical taxonomic tree while viewing the specimen.
  • User should be able to browse or search the tree in order to find or explore the collection.  E.g., I'm looking at a specimen of this family-genus-species.  Now show me a list of all specimens in that same family.
  • Multiple classifications: Is it a requirement that multiple versions of the tree are needed in order to maintain history?
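The batch-update use case above (moving taxa from Muridae to Cricetidae) can be sketched as a re-parenting operation on a classification tree that also logs the previous state. This is a minimal Python sketch under stated assumptions: the classes Taxon and Classification, their methods, and the tiny sample tree are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Taxon:
    name: str
    rank: str
    parent: Optional[str] = None  # name of the parent taxon, None at the root


@dataclass
class Classification:
    taxa: Dict[str, Taxon] = field(default_factory=dict)
    # Each entry is (taxon name, old parent, new parent).
    history: List[Tuple[str, Optional[str], str]] = field(default_factory=list)

    def add(self, name: str, rank: str, parent: Optional[str] = None) -> None:
        self.taxa[name] = Taxon(name, rank, parent)

    def move(self, names: Iterable[str], new_parent: str) -> None:
        """Re-parent each named taxon and record the previous parent."""
        if new_parent not in self.taxa:
            raise KeyError(f"unknown parent taxon: {new_parent}")
        for name in names:
            taxon = self.taxa[name]
            self.history.append((name, taxon.parent, new_parent))
            taxon.parent = new_parent

    def lineage(self, name: str) -> List[str]:
        """Return the taxon and its ancestors, lowest rank first."""
        chain: List[str] = []
        current: Optional[str] = name
        while current is not None:
            chain.append(current)
            current = self.taxa[current].parent
        return chain


tree = Classification()
tree.add("Rodentia", "order")
tree.add("Muridae", "family", "Rodentia")
tree.add("Cricetidae", "family", "Rodentia")
tree.add("Peromyscus", "genus", "Muridae")  # a New World genus
tree.add("Mus", "genus", "Muridae")         # an Old World genus, left in place

tree.move(["Peromyscus"], "Cricetidae")
print(tree.lineage("Peromyscus"))
print(tree.history)
```

Because specimens point at a taxon rather than at a family, a move like this changes the higher classification of every affected specimen without touching the individual identifications.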

Here are some service-based use cases that were identified by John Wieczorek (MVZ) in February 2007 in a discussion about a name resolver service:

  • Get all synonyms for name x at same rank as x
  • Get accepted valid name for name x
  • Get all accepted valid names for children of rank y for name x
  • Get accepted valid parent name for name x at rank y
  • Get full accepted valid name hierarchy of name x starting at the same rank as x
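The five operations above can be sketched against an in-memory name table. This is a minimal Python sketch, not the interface of any existing resolver service: NameRecord, NameResolver, and all method names are hypothetical, the taxon names are made-up placeholders, and it assumes each name has at most one accepted name and that parents are always accepted names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class NameRecord:
    name: str
    rank: str
    parent: Optional[str]           # accepted parent name, None at the root
    accepted: Optional[str] = None  # None means this name is itself accepted


class NameResolver:
    def __init__(self, records: Iterable[NameRecord]) -> None:
        self._by_name: Dict[str, NameRecord] = {r.name: r for r in records}

    def accepted_name(self, name: str) -> str:
        """Get accepted valid name for name x."""
        rec = self._by_name[name]
        return rec.accepted or rec.name

    def synonyms(self, name: str) -> List[str]:
        """Get all other names at the same rank that share x's accepted name."""
        rec = self._by_name[name]
        target = self.accepted_name(name)
        return sorted(
            r.name for r in self._by_name.values()
            if r.name != name
            and r.rank == rec.rank
            and self.accepted_name(r.name) == target
        )

    def hierarchy(self, name: str) -> List[str]:
        """Get the accepted name hierarchy, starting at x's own rank."""
        rec = self._by_name[self.accepted_name(name)]
        chain = [rec.name]
        while rec.parent is not None:
            rec = self._by_name[rec.parent]
            chain.append(rec.name)
        return chain

    def accepted_parent(self, name: str, rank: str) -> Optional[str]:
        """Get accepted valid parent name for name x at rank y."""
        for ancestor in self.hierarchy(name)[1:]:
            if self._by_name[ancestor].rank == rank:
                return ancestor
        return None

    def accepted_children(self, name: str, rank: str) -> List[str]:
        """Get all accepted valid names of rank y below name x."""
        root = self.accepted_name(name)
        return sorted(
            r.name for r in self._by_name.values()
            if r.accepted is None
            and r.rank == rank
            and root in self.hierarchy(r.name)[1:]
        )


# Placeholder names only; these are not real taxa.
resolver = NameResolver([
    NameRecord("Familia", "family", None),
    NameRecord("Genusa", "genus", "Familia"),
    NameRecord("Genusa alpha", "species", "Genusa"),
    NameRecord("Genusa beta", "species", "Genusa"),
    NameRecord("Genusa vetus", "species", "Genusa", accepted="Genusa alpha"),
])
print(resolver.accepted_name("Genusa vetus"))
print(resolver.accepted_children("Familia", "species"))
```

A real service would sit in front of a database or an external provider, but the same five queries would apply.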

Taxonomic Identifications

Identifications or determinations are maintained as annotations for the UC and Jepson Herbaria.  One or more taxon names are applied to a specimen, but one name is accepted.  An identification can be made up of a single taxon name or two or more names in the case of hybrids.

  1. Single taxon names: The majority of specimens will have a single taxon name applied to them.
    1. Single part names: For ranks of genus and above, the taxon name is a single part name.  Examples:
      1. Betula for the birch genus
      2. Liliaceae for the lily family
      3. Marchantiophyta for liverworts
    2. Two part names: Binomials are used for taxon names for ranks of species and above, and genus and below.  Examples:
      1. Arctostaphylos uva-ursi
      2. Monstera deliciosa
    3. Three part names: Trinomials are used for taxon names for ranks below species. Examples:
      1. Iris tenuissima subsp. purdyiformis
      2. Carex setacea var. setacea
      3. Phacelia tanacetifolia subvar. tenuisecta
  2. Multiple taxon names:  Hybrids can occur naturally or arise from horticultural activity.  They can be given a normal botanical name or a special name.  Specially named hybrids are also known as “named hybrids”.  Hybrids can also be named by listing the parents with the “×” character separating the parent names.  Interspecific hybrids occur between different species of the same genus while intergeneric hybrids occur between two different genera.  Examples:
    1. Named Hybrids
      1. Quercus × subconvexa J. M. Tucker is a Quercus durata × Quercus garryana hybrid
      2. Salix × rubens Schrank is a Salix alba × Salix fragilis hybrid
    2. Unnamed hybrids:
      1. Quercus douglasii Hook. & Arn. × Quercus × alvordiana Eastw. × Quercus john-tuckeri Nixon & C. H. Müll. (In this case one of the parents is a named hybrid.)
      2. Ceanothus incanus Torr. & A. Gray × Ceanothus thyrsiflorus Eschsch. × Ceanothus velutinus Hook. var. hookeri M. C. Johnst. × Ceanothus foliosus Parry
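One way to model the single-name and hybrid-formula cases above is to store an identification as an ordered list of parent names and build the display string from it. This is a minimal Python sketch: the Identification class and its methods are hypothetical, and it assumes each parent name begins with its genus, which would not hold for intergeneric named hybrids that begin with "×".

```python
from dataclasses import dataclass
from typing import Set, Tuple

HYBRID_SIGN = "×"


@dataclass(frozen=True)
class Identification:
    # One name for an ordinary identification (including a named hybrid),
    # two or more names for an unnamed hybrid written as a formula.
    parents: Tuple[str, ...]

    def is_hybrid_formula(self) -> bool:
        return len(self.parents) > 1

    def display_name(self) -> str:
        return f" {HYBRID_SIGN} ".join(self.parents)

    def genera(self) -> Set[str]:
        # Assumption: the first word of each parent name is the genus.
        return {parent.split()[0] for parent in self.parents}

    def is_intergeneric(self) -> bool:
        return self.is_hybrid_formula() and len(self.genera()) > 1


named = Identification(("Quercus × subconvexa J. M. Tucker",))
unnamed = Identification((
    "Quercus douglasii Hook. & Arn.",
    "Quercus × alvordiana Eastw.",
    "Quercus john-tuckeri Nixon & C. H. Müll.",
))
print(named.display_name())
print(unnamed.display_name())
```

Keeping the parents as separate values avoids having to parse the formula back out of a string, which is ambiguous when a parent is itself a named hybrid containing "×".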

Here are some additional requirements related to taxonomic identification:

  • a specimen with multiple identifications/determinations (e.g. based on field id, expert id, dna analysis)
  • an identification/determination with multiple scientific names (e.g. A X B, A or B, A and B) (not named hybrids)
  • an identification/determination that is a temporary/unpublished/working name (e.g. Bolitoglossa sp. nov. weird feet, Liolaemus sp. nov. neo G)
  • an identification/determination with modifiers (e.g., cf., aff., sp.)
  • it's fairly common for specimens to only have a "generic" id (e.g., identified only to family)
  • it is possible for a specimen to have multiple "current" determinations -- from the collector, from another scientist studying the specimen, from a DNA sequencing lab.  The convention (someone please confirm or update!) is that the collector is the system of record for the scientific determination.

Here is some additional schema information regarding an identification service:

  • specimen id number - id number of the specimen being identified
  • determination agent id - id number of the person who made the determination (may include order if determined by group)
  • determination date - date determination was made
  • determination type - qualifies determination (e.g. field id, molecular data, ID of kin)
  • accepted determination flag - flags "accepted" identification
  • scientific name - scientific name for the identification, including modifiers, etc.
  • taxon id - id number of taxon name(s) used for the identification
  • determination remarks - additional notes/remarks
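The schema fields above map naturally onto a record type. This is a minimal Python sketch: the class and attribute names are hypothetical, the id values and scientific names in the usage are made up for illustration, and the rule that accepting a new determination clears the flag on earlier ones is an assumption based on the statement that one name is accepted per specimen.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Determination:
    specimen_id: int
    agent_ids: List[int]             # ordered when determined by a group
    determined_on: Optional[date]
    determination_type: str          # e.g. "field id", "molecular data"
    scientific_name: str             # including modifiers such as "cf."
    taxon_ids: List[int]             # more than one for hybrid formulas
    accepted: bool = False
    remarks: str = ""


@dataclass
class SpecimenDeterminations:
    specimen_id: int
    determinations: List[Determination] = field(default_factory=list)

    def add(self, det: Determination, accept: bool = False) -> None:
        """Append a determination; earlier ones are kept as history."""
        if det.specimen_id != self.specimen_id:
            raise ValueError("determination belongs to a different specimen")
        if accept:
            # Assumption: only one determination is flagged as accepted.
            for earlier in self.determinations:
                earlier.accepted = False
        det.accepted = accept
        self.determinations.append(det)

    def accepted(self) -> Optional[Determination]:
        for det in self.determinations:
            if det.accepted:
                return det
        return None


# Hypothetical ids and names, for illustration only.
record = SpecimenDeterminations(specimen_id=1)
record.add(
    Determination(1, [10], date(2007, 2, 1), "field id", "Quercus sp.", [100]),
    accept=True,
)
record.add(
    Determination(1, [11, 12], date(2007, 6, 15), "molecular data",
                  "Quercus cf. durata", [101], remarks="re-identified"),
    accept=True,
)
print(record.accepted().scientific_name)
print(len(record.determinations))
```

Nothing is ever deleted here, so the previous state of an identification remains available after a re-identification.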

Lam wrote:

SMASCH handles the above except for determination agent, date, and type.  There are fields to keep track of modification date and modification agent, but not the identification authority.  As I recall from the demo, identification/taxonomy is handled similarly in Specify and SMASCH.  Arctos handles the above and also uses taxa formulae for creating the scientific names for the determination.