Vocabulary Service Questions

Questions

How do we model the use-of/reference-to a vocabulary item/taxon?

We want to be able to use a vocabulary term (especially for flat vocabularies that are basically enumeration values) in fields of a form (e.g., for the responsible department or person on a Loan). This raises the question of whether, how, and where, we should maintain information that relates the vocabulary to a given entity and/or field. We need such information if we wanted to find all uses of some vocabulary item, across the entire application.

The dumb and brittle approach is to hard code the search to check all the fields where it is used.
Another approach is to model all uses of vocabulary like RDF as a relationship. This makes broad searching easy, but it will likely slow down structured search, and somewhat complicates the model (compared to a direct reference from the entity schema).
Yet another approach is to describe where the vocabulary can be used, allowing for a reflection-like mechanism that search can use to construct field-based search through an entity. This makes the main information modeling simpler, but makes the search model more complex. OTOH, if our users demand a traditional structured search model in which they (can) specify each field to search within, then we'll have to do this anyway.
Where the reference is already within an association (e.g., in a semantic index, or in a Determination service that formally describes a specimen as being of a given taxon), then it would be awkward to impose another level of indirection by associating the vocabulary item with the association, and in turn to the specimen.
OTOH, if we are importing data that is more of the nature of text than associations to vocabularies, and for which the desire is to impose an authority for consistency, then we may want to model the association separately, for several reasons:
1. The process of association may be subject to refinements over time, resulting in a revision of the association.
2. Several algorithms may be used, and may yield varying results that should be modeled independent of one another, and independent of any association made by users.
3. The original text imported should be preserved.

For now, we will assume that fields can be direct references to a vocabulary value, and that we will solve the search problem as part of the general approach to search and reflection. In addition, we will separately consider the problem of indexing a collection (which builds associations between objects and entries in a vocabulary/authority), which has more in common with the Life Science Determination case, than with the case of a structured reference in an object, to some vocabulary.

Can we model rank as an attribute of a Vocabulary Item?

It seems as though rank is a quality of a vocabulary item, however one of the changes a research might make is to move a taxon within a taxonomy. This raises the question of whether that is still the original taxon in a new rank, or whether it is a new taxon. If it is not intrinsic to the taxon/item, then it has to be part of the relation that links it to others, which is kind of messy.

In general, I [Patrick] would argue that a taxon is defined by rank as well as name, publication, etc. There are both species and genus taxa for "bufo", for "buteo", etc., so the rank is distinctive, IMO. This simplifies at least one part of the model.