Mapping PAHMA data to CSpace

... note: this document is very rough and will be revised repeatedly over the next few weeks ...

There are several data sources at PAHMA that should be mapped to CollectionSpace (CSpace):

  • TMS (The Museum System, our primary collections management system)
  • TMS Thesaurus (a repository of hierarchical controlled vocabularies)
  • Osteoinformatics (a FileMaker DB containing detailed information on human skeletal remains)
  • Sites (a FileMaker DB with GIS information for over 10,000 sites and proveniences)
  • Tribes (a FileMaker DB with rich information on Native American tribes, including contact information)

TMS

There are 352 tables (comprising 3418 columns) in the PAHMA-TMS instance. Of these 3418 columns:

  • 833 columns are empty or essentially empty:
    • 781 are empty
    • 50 are essentially empty, as they have only a single (meaningless) default value
    • 3 are essentially empty, as they have only two (meaningless) default values
  • 1604 columns contain internal information needed by the TMS application
    • 924 columns contain either primary or foreign keys (some of which probably should be kept)
    • 249 columns contain revision dates (which should be migrated as versioning/auditing data)
    • 138 columns contain system timestamps (which should be migrated as versioning/auditing data)
      • 110 of these are not empty, and so may need to be migrated
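
The "empty" and "essentially empty" categories above can be checked mechanically by counting the distinct non-blank values in each column. The following is only a minimal sketch of that idea, not the procedure actually used for the counts in this document. It runs against an in-memory SQLite database for illustration (the real TMS instance lives elsewhere), and the table name, column names, function name and the max_defaults threshold are all hypothetical. A column with only one or two distinct values is reported as a candidate only, since deciding whether those values are meaningless defaults still takes a human.

```python
import sqlite3


def classify_column(conn, table, column, max_defaults=2):
    """Classify a column by how many distinct non-blank values it holds.

    Returns 'empty', 'default-only candidate', or 'populated'.
    Identifiers are interpolated directly, so pass trusted names only.
    """
    # Fetch at most max_defaults + 1 distinct values; that is enough to decide.
    query = (
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL AND TRIM("{column}") <> \'\' '
        f"LIMIT {int(max_defaults) + 1}"
    )
    distinct_values = conn.execute(query).fetchall()
    if not distinct_values:
        return "empty"
    if len(distinct_values) <= max_defaults:
        # Needs manual review: are these values meaningless defaults?
        return "default-only candidate"
    return "populated"


# Hypothetical sample table standing in for one TMS table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Objects (ObjectID INTEGER, Notes TEXT, IsPublic INTEGER, Title TEXT)"
)
conn.executemany(
    "INSERT INTO Objects VALUES (?, ?, ?, ?)",
    [(1, None, 0, "Basket"), (2, "", 0, "Bowl"), (3, None, 0, "Mask")],
)

results = {
    col: classify_column(conn, "Objects", col)
    for col in ("ObjectID", "Notes", "IsPublic", "Title")
}
for col, label in results.items():
    print(f"{col}: {label}")
```

In this sample, Notes holds only NULLs and blank strings, IsPublic holds a single value, and the other two columns hold three distinct values each.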

... to be continued ...

Apart from the columns that hold attribution, revision, or auditing information, approximately 662 columns contain data that will need to be mapped to CollectionSpace. Of these, 389 columns appear to map straightforwardly to existing CSpace fields, while the remaining 273 columns have been flagged as potentially problematic.
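
As a sanity check on the figures quoted in this section, the counts can be tallied with a few lines of Python. This is just a bookkeeping sketch: the variable names are made up for illustration and every number is copied from the text above. Two things stand out. The three sub-counts listed for empty columns add up to 834 rather than the stated 833, so one of those figures probably needs rechecking. The sub-counts listed so far for internal columns add up to 1311 of the stated 1604, which is expected while that list is still incomplete.

```python
# All figures below are copied from this document; names are illustrative only.
total_columns = 3418

empty_reported = 833
empty_parts = {
    "empty": 781,
    "single default value": 50,
    "two default values": 3,
}

internal_reported = 1604
internal_parts = {
    "primary or foreign keys": 924,
    "revision dates": 249,
    "system timestamps": 138,
}

to_map_reported = 662
to_map_parts = {
    "straightforward": 389,
    "potentially problematic": 273,
}

empty_sum = sum(empty_parts.values())
internal_sum = sum(internal_parts.values())
to_map_sum = sum(to_map_parts.values())

# Columns that are neither empty nor internal, using the reported totals.
remaining = total_columns - empty_reported - internal_reported

print(f"empty: parts sum to {empty_sum}, reported {empty_reported}")
print(f"internal: parts listed so far sum to {internal_sum}, reported {internal_reported}")
print(f"to map: parts sum to {to_map_sum}, reported {to_map_reported}")
print(f"neither empty nor internal: {remaining}")
```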