PAHMA data migration ETL questions

This is a page for questions and observations we come up with.

  • For things like dimension units, we need to import the short ID value, right?  How do we ensure that the values allowed by the customized CSpace screen include all the values needed?  Presumably, Michael looked at TMS to get these.  Do we need to double-check?
    • Could you clarify what you mean by short ID value?
    • I populated the CSpace dimension unit controlled list based on values in TMS (TMS.DimensionUnits) plus units we might use in the near-term future.
    • A double-check is always a good idea!
  • Here is an example of the import XML for a record (catalog number 5-15957):
          <collectionobjects_common:dimensions>
            <dimensionGroup>
              <dimension>Length</dimension>
              <valueDate></valueDate>
              <measuredPart></measuredPart>
              <measurementUnit>centimeters</measurementUnit>
              <value>130</value>
              <valueQualifier></valueQualifier>
            </dimensionGroup>
          </collectionobjects_common:dimensions>
    On pahma.cspace, you can see the "130" and "cm (centimeters)" but not "Length" on the "Dimensions" line.  It looks like either the short ID or the long display name will work for the choice list; however, matching seems to be case-sensitive (I've verified this with record 3-26953a, which uses "Height" instead of "height" for dimension type).
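    As a rough aid for the double-check discussed above, here is a minimal Python sketch that scans import XML for dimension types and units that do not exactly match a controlled list, and suggests the correctly cased value when only the case differs.  The allowed values, the function names, and the un-prefixed <dimensions> wrapper element in the sample are assumptions for illustration; the real lists would come from CSpace and TMS.DimensionUnits.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled-list values, for illustration only.
ALLOWED_DIMENSIONS = {"length", "height", "width"}
ALLOWED_UNITS = {"centimeters", "cm (centimeters)"}


def check_value(field, value, allowed):
    """Return None if value matches exactly, else (field, value, suggestion)."""
    if not value or value in allowed:
        return None
    # Case-insensitive lookup, used only to suggest a likely fix.
    by_lower = {a.lower(): a for a in allowed}
    return (field, value, by_lower.get(value.lower()))


def find_invalid_dimensions(xml_text):
    """List dimension/unit values that fail a case-sensitive match."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    for group in root.iter("dimensionGroup"):
        checks = (
            ("dimension", ALLOWED_DIMENSIONS),
            ("measurementUnit", ALLOWED_UNITS),
        )
        for field, allowed in checks:
            value = (group.findtext(field) or "").strip()
            problem = check_value(field, value, allowed)
            if problem is not None:
                problems.append(problem)
    return problems


# Simplified sample: the namespace prefix is dropped so the fragment parses alone.
SAMPLE = """
<dimensions>
  <dimensionGroup>
    <dimension>Length</dimension>
    <measurementUnit>centimeters</measurementUnit>
    <value>130</value>
  </dimensionGroup>
</dimensions>
"""

if __name__ == "__main__":
    for field, value, suggestion in find_invalid_dimensions(SAMPLE):
        print(f"{field}: {value!r} not in list; did you mean {suggestion!r}?")
```

    Running something like this over the generated import files before loading would flag values such as "Length" that only differ by case from a list entry.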

PAHMA data mapping discussion - Oct 27, MTB, Yuteh, John K, Chris H

  • Persons and Organizations: Persons affiliated with Tribal orgs.  Need to create org records for the Tribes first (either in TMS or in CSpace).  Then get the refname/unique TMS ID into a table so we can map persons to tribal orgs.
  • Tribal orgs might be a separate Org authority because they have about 30 more fields.
  • ConstituentID in TMS person should be sufficient for CSpace person and org records.
  • Dates on person and organization records will need to be structured dates instead of calendar dates.  Fortunately, structured dates should be allowed in multi-value/repeatable rows in 1.13.
  • Some person records were marked as Delete in the Kettle jobs.  Michael wants to reevaluate that.  If there are relationships between objects, places, or procedures and those persons, then we can't delete them.  Maybe use the Term Status field to flag these or make them inactive.
  • Estates and families: Import as person records?
  • Audit: TMS has an audits table with 400K records.  Could build a procedure or wait for CSpace to build this, or park data in a table so it's available for testing.
  • When Michael identifies fields that need to be added, he's indicating that in his FM table in the mapping_by_field column.
  • We will try to use the core, anthropology, and pahma schemas for new modifications.  We have to make sure we can get 3-schema configurations working.  We might not refactor NAGPRA claim at this time.  We will test this in some kind of lab session on 1.12 once it is stable, probably on person and organization records.
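
One way to approach the reevaluation of persons marked as Delete could be sketched as below.  This is only an illustration: the function name, the ID values, and the shape of the relationship rows are hypothetical, and the real data would come from the Kettle jobs and TMS relationship tables.

```python
def triage_deletions(marked_for_delete, relationships):
    """Split person IDs marked Delete into (safe_to_delete, keep_inactive).

    marked_for_delete: iterable of person IDs (e.g. TMS ConstituentIDs).
    relationships: iterable of (person_id, related_record) pairs, where the
        related record is an object, place, or procedure.
    """
    referenced = {person_id for person_id, _ in relationships}
    marked = set(marked_for_delete)
    # Persons still referenced elsewhere cannot be deleted; flag them instead.
    keep_inactive = sorted(marked & referenced)
    safe_to_delete = sorted(marked - referenced)
    return safe_to_delete, keep_inactive


if __name__ == "__main__":
    # Made-up IDs and relationship rows, for illustration only.
    marked = [101, 102, 103]
    rels = [(102, "object 5-15957"), (104, "some place record")]
    safe, inactive = triage_deletions(marked, rels)
    print("delete:", safe)
    print("keep, set Term Status to inactive:", inactive)
```

Persons in the second list would be candidates for the Term Status flag rather than deletion.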