With this iteration, we are targeting the data migration mappings for the migration of SAGE data into a CollectionSpace instance, based on the version 2.0 release of CollectionSpace. This is an interim release Changes from the previous mapping are in blue.
Person
Field mapping
SQL
View: cspace_person_2_0
Notes
- Temporarily excluding shortIdentifers 0, 60397 because they are not unique in the view. They are associated with multiple Orgs (which are going into the Groups field). Break into a repeating group and merge in separately.
Talend/ETL
bgimportperson1 0.2: Revising earlier talend job (based on 1.8) to change refname format, remove ampersand handling, and include collectionspace_core:uri element
Organization
Field mapping
SQL
View:
Notes
- Skipping for now
- Multiple records per org because many orgs have multiple contacts
- Probably need to extend Org to have repeating PersonContacts information
Talend/ETL
Taxonomy
Field mapping
SQL
View: cspace_taxonomy
Notes
- Not getting author shortIDs in view, just names. For now putting them in Taxon Notes. Adjust view or get them separately in SQL.
- Do we need to adjust the format of scientific names?
- Lots of other Sage data for Names to add
Talend/ETL
bgtaxonimport2 0.2
Accessions
Field mapping
SQL
View: xdb_accession_qv
Notes
- Need dropdown values for form
- Dead Flag Using expression (row1.dead_flag==0) ? "no" : "yes" but should be "alive" "dead"
- Need to add dead_date, structured
- Need to add accession date
- Need hybrid flag
- Need requested by
- Is Source from a vocab? Need field (wait for Donor block being added for UCJEPS)
- Add More Notes to Accession Notes
- Geog location is often Earth so we know geography needs some work
- For now, putting Source in Brief Description along with Material type
- Put Accession Notes in Comments: Rename field
- What is Plant Status?
- What is Data Quality?
- record_status drop down "1"
- Do we need to fix the form of taxonomic names?
Talend/ETL
bgaccessionimport 0.1
Garden Locations
Field mapping
SQL
View: cspace_location
Notes
- Type (storage location type) field expects a dynamic controlled list with URN-formatted values. For now just loading the raw data fields from the locationType field. Nothing will display.
Talend/ETL
bgimportgardenlocation 0.1
Plant Locations (Movement and Location procedure)
Field mapping
SQL
View: xdb_plant_location_qv
Notes
- xdb_plant_location_qv.action_code map to Reason For Move, and rename? Right now it's a numeric code in Sage. What are decode values? Ignoring for now.
Talend/ETL
bgplantlocations 0.1
Relationships between Accessions (Cataloging) and Plant Locations (Movement)
Field mapping
SQL
Text file: plantlocids.txt, produced in Talend bgplantlocations job
Notes
- Text file has CSID from Plant Location record and CSID from Accession record
Talend/ETL
bgplantlocations 0.1
- Talend job creates two XML outputs (cataloging-to-movement, movement-to-cataloging)