UCJEPS-CollectionSpace data mapping, v2.0
With CollectionSpace 2.0, the schemas have changed, so the data mapping and customizations we employ in the UCJEPS deployment of CollectionSpace 1.0 will change as well. This is not a final data mapping for the University and Jepson Herbaria; it is the next iteration. Changes from the previous mapping are in blue.
This iteration focuses on object cataloging data, persons, organizations, taxonomy, and SMaSCH vouchers (which are really text annotations), along with the relationships needed to support them. Mapping information about Loans Out is also included.
Object cataloging data
Field mapping
SMaSCH | CollectionSpace object schema | CS User Interface Title (** = different from UI) | Note |
---|---|---|---|
accession.accession_id | object_number | Accession Number ** | Change field title |
institution.inst_name | collection | Collection | Dropdown (Collection values) |
accession.coll_num_prefix + coll_number + coll_num_suffix | other_number | Number | Moved to Field Collection Number. |
"collector number" | other_number_type | Number Type | Not needed if we move this to Field Collection Number. |
accession.notes | comments | Comments | |
objkind.kind | form (Object Description Information) | Form | Dropdown (Form values) |
taxon_fullname.fullname | title | Taxon ** | Change field title |
accession.phenology (decoded) | phase (Object Description Information) | Phenology ** | Change field title. Dropdown (Phenology values) |
accession.early_jdate (calc) | field_collection_date_earliest | Field Collection Date Earliest | Custom |
accession.late_jdate (calc) | field_collection_date_latest | Field Collection Date Latest | Custom |
accession.datestring | field_collection_date | Field Collection Date | We are converting this to the structured date format |
committee.committee_abbr | field_collection_collector | Field Collection Collector | |
accession.loc_lat_decimal | field_loc_lat_decimal | Field Location Latitude Decimal | |
accession.loc_long_decimal | field_loc_long_decimal | Field Location Longitude Decimal | |
accession.loc_place (calc) | field_collection_place (?) | Field Location Verbatim ** | Will map to verbatim field in schema extension for now. When the Place Authority is available, we will work on creating authority records as well. |
accession.loc_county | field_loc_county | Field Location County | Custom |
accession.loc_state | field_loc_state | Field Location State | Custom |
accession.loc_country | field_loc_country (temp custom) | Field Location Country | Custom |
accession.loc_elevation | field_loc_elevation (temp custom) | Field Location Elevation | Custom |
accession.coll_num_prefix + coll_number + coll_num_suffix | ? | Field collection number | |
committee.committee_abbr | ? | Field collection collector | |
"catalog date" | date_association | Date Association | Fixed text. Do not load into 1.0. May end up in Associated Date information eventually, or create date/modify date. |
accession.catalog_date | catalog_date | Date Text | Fixed text. Do not load into 1.0. May end up in Associated Date information eventually, or create date/modify date. |
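The concatenated mapping `accession.coll_num_prefix + coll_number + coll_num_suffix` could be sketched roughly as follows. This is only an illustration in Python, not the view definition: the function name, the handling of missing parts, and the sample values are all assumptions.

```python
from typing import Optional


def field_collection_number(prefix: Optional[str],
                            number: Optional[str],
                            suffix: Optional[str]) -> str:
    """Join the three SMaSCH collector-number parts into one string.

    Assumptions (not confirmed by the mapping): missing or empty parts
    are skipped, surrounding whitespace is trimmed, and no separator is
    inserted between the parts.
    """
    parts = (prefix, number, suffix)
    return "".join(p.strip() for p in parts if p)


# Hypothetical sample values, for illustration only.
print(field_collection_number("A", "1234", "b"))
print(field_collection_number(None, "1234", None))
```

If SMaSCH separates the parts with a space or hyphen, only the join string would need to change.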
SQL query
view: cspace_collobj_2_0
Notes
- loc_county, loc_state, and loc_country will not be supported by dropdowns for now
- Collecting event information is being added as custom temporary fields. CollectionSpace will have a separate authority system for collecting events and collectors, so these mappings will change in the future.
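As a rough sketch of how the calculated `early_jdate` and `late_jdate` values might be turned into the earliest and latest field collection dates: the code below assumes these columns hold integer Julian Day Numbers, which this mapping does not state. The function names are made up for illustration.

```python
from datetime import date
from typing import Tuple

# Difference between a Julian Day Number and Python's proleptic
# Gregorian ordinal (date.toordinal()). JDN 2451545 is 2000-01-01.
JDN_ORDINAL_OFFSET = 1721425


def jdn_to_date(jdn: int) -> date:
    """Convert an integer Julian Day Number to a calendar date."""
    return date.fromordinal(jdn - JDN_ORDINAL_OFFSET)


def collection_date_range(early_jdate: int, late_jdate: int) -> Tuple[str, str]:
    """Return (earliest, latest) as ISO 8601 date strings.

    Assumption: both inputs are Julian Day Numbers and early <= late.
    """
    earliest = jdn_to_date(early_jdate)
    latest = jdn_to_date(late_jdate)
    if earliest > latest:
        raise ValueError("early_jdate is later than late_jdate")
    return earliest.isoformat(), latest.isoformat()


print(collection_date_range(2451545, 2451575))
```

If the jdate columns turn out to use a different epoch, only the offset constant would change.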
Loans Out
Field mapping
SMaSCH | CollectionSpace loans out schema | CS User Interface Title (** = different from UI) | Note |
---|---|---|---|
loan_event.uc_loan_num + loan_event.jeps_loan_num | Loan out number | | SMaSCH has two types of loan identifiers. |
loan_event.curr_inst | Borrower | | Probably should keep this as borrowing institution (organization authority) |
loan_event: | Borrower's contact | | Probably should keep this as borrowing agent (person authority) |
loan_event.inhouse_notes + loan_event.noteworthy_inclusions | Loan out note | | |
loan_event.loan_status + | Loaned object status | | It looks like this field is not currently active. SMaSCH keeps track of two status types: |
Loan date? | | Loan Out Date | Does SMaSCH have a loan date? |
SQL query
view:
Notes
...
Object to Loan Out Relationships
- On the way
Data sets for dropdowns and controlled lists
- Collection values:
- Jepson Herbarium
- University Herbarium, UC Berkeley
- University of California
- Form values:
- Illustration
- Mounted on Paper
- Photocopy
- Photograph
- Stored in a Box or Bag
- Phase (Phenology) values:
- Cone
- Flowering
- Flowering/Fruiting
- Fruiting
- Spores/Sporangia
- Vegetative (non-reproductive)
- Loaned Object Status values:
- Active (Unknown)
- Active (All Out)
- Active (All In)
- Active (Partial)
- Active (Discrepancy)
- Cancelled (Unknown)
- Cancelled (All Out)
- Cancelled (All In)
- Cancelled (Partial)
- Cancelled (Discrepancy)
- Other Number Type values:
- Other herbaria accession number
- Copied from accession number
- Exsiccatae number
- Project number
- Genbank number
- Destructive sampling number
- Internal cross reference
- Miscellaneous
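The Loaned Object Status values above look like every combination of two status types. Assuming that reading is right, the dropdown list could be generated rather than typed by hand. The names `LOAN_STATES` and `ITEM_STATES` are invented for this sketch and do not come from SMaSCH.

```python
from itertools import product
from typing import List

# Hypothetical names for the two status types; the values are taken
# from the Loaned Object Status list in this document.
LOAN_STATES = ["Active", "Cancelled"]
ITEM_STATES = ["Unknown", "All Out", "All In", "Partial", "Discrepancy"]


def loaned_object_status_values() -> List[str]:
    """Build the combined dropdown values, e.g. 'Active (Partial)'."""
    return [f"{loan} ({item})" for loan, item in product(LOAN_STATES, ITEM_STATES)]


for value in loaned_object_status_values():
    print(value)
```

Generating the list this way keeps the two status types in one place if either gains a new value.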
Name Authority: Person
Field mapping
SMaSCH | cSpace Person Schema | CS User Interface Title (** = different from UI) | Note |
---|---|---|---|
person.person_id | Legacy ID | | Borrower is not part of SMaSCH; it was newly created to generate primary key IDs for borrowers in the loan_event table. |
person.first_name + | Display Name | | |
person.first_name | Forename | | |
person.other_name | Middle Name | | |
person.last_name | Surname | | |
person.bio_notes | Biographical Note | | |
person.birth_year | Birth Date | | |
person.death_year | Death Date | | |
loan_event.curr_for_position | Occupation | | |
loan_event.curr_inst | Group | | |
'collector' + '; name source: ' + data_source.data_src_name | Name Note | | |
SQL query
View: cspace_person_2_0
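For reference, the Name Note expression in the field mapping (`'collector' + '; name source: ' + data_source.data_src_name`) produces strings like the one below. This is a small Python illustration of the concatenation, not the view's SQL; the function name and the sample source name are hypothetical.

```python
def name_note(data_src_name: str) -> str:
    """Build the Name Note text from a data source name.

    Mirrors the mapping expression:
    'collector' + '; name source: ' + data_source.data_src_name
    """
    return "collector" + "; name source: " + data_src_name


# Hypothetical data source name, for illustration only.
print(name_note("Example Source"))
```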
ETL
Talend job GET_PERSON_V1 (Hoffman):
- Input: Query view
- See SMaSCH data mapping notes for some special data handling
- Output: XML file for loading (batches of 3000)
- Output: List of CSIDs (to facilitate deleting)
- Output: CSV file of basic identifiers (for record keeping and as input to one of the Org loading jobs).
- 8945 rows
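To make the batching concrete: 8945 rows written in batches of 3000 gives three files, the last one partial. The sketch below shows that split in Python. It is not the Talend job itself, and the `batches` helper is a made-up name.

```python
from typing import Iterator, List, Sequence


def batches(rows: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive slices of `rows`, each at most `size` long."""
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])


# Stand-in for the 8945 person rows returned by the view.
rows = list(range(8945))
sizes = [len(batch) for batch in batches(rows, 3000)]
print(sizes)  # [3000, 3000, 2945]
```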
Notes
TBD
Name Authority: Organization
Field mapping
SMaSCH | cSpace Organization Schema | CS User Interface Title (** = different from UI) | Note |
---|---|---|---|
committee.committee_id | Legacy ID | | |
committee.committee_abbr | Display Name | | |
committee.committee_name | Long Name | | |
committee.committee_abbr | Short Name | | |
institute.inst_state + | Foundation Place | | |
committee.committee_func | Function | | |
SQL query
view: cspace_org_2_0
ETL
Talend job GET_ORG_V2 (Hoffman):
- See SMaSCH data mapping notes for some special handling
- Input: Query view
- Input: ucjepspersonout-ids.txt from GET_PERSON_V1 (to get refnames of organization contacts from person authority)
- Output: XML file for loading (batches of 4000)
- Output: List of CSIDs (to facilitate deleting)
- Output: CSV file of basic identifiers (for record keeping).
- 49845 rows
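The lookup step, getting refnames of organization contacts from the person identifiers file, might look something like the sketch below. The column names (`legacy_id`, `csid`, `refname`) and the sample rows are assumptions, since the layout of ucjepspersonout-ids.txt is not described here. The refname strings are placeholders, not real CollectionSpace refnames.

```python
import csv
import io
from typing import Dict, Optional, TextIO

# Hypothetical stand-in for the identifiers file written by the person
# job. The real column layout is not documented in this mapping.
SAMPLE_IDS = """legacy_id,csid,refname
101,csid-placeholder-1,refname-placeholder-1
102,csid-placeholder-2,refname-placeholder-2
"""


def load_refnames(handle: TextIO) -> Dict[str, str]:
    """Map each person's legacy ID to its refname."""
    reader = csv.DictReader(handle)
    return {row["legacy_id"]: row["refname"] for row in reader}


def contact_refname(refnames: Dict[str, str], person_legacy_id) -> Optional[str]:
    """Return the refname for a contact, or None if the person is unknown."""
    return refnames.get(str(person_legacy_id))


refnames = load_refnames(io.StringIO(SAMPLE_IDS))
print(contact_refname(refnames, 101))
print(contact_refname(refnames, 999))  # not in the sample file
```

Returning `None` for unknown people leaves it to the caller to decide whether a missing contact should be skipped or reported.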
Talend job (Cheng):
- Input: Text file from Dick Moe, non-colliding-collectors.in, containing names of seaweed collectors
- 1546 rows
Notes
...
Organization Contact schema
Field mapping
SMaSCH | cSpace Contact Schema | CS User Interface Title (** = different from UI) | Note |
---|---|---|---|
institute.email | Email | | |
institute.fax | Fax | | |
institute.url | URL | | |
institute.address1 + | Place | | |
"street" | Address Type | | |
institute.phone | Phone | | |