QA: Data model/design checks

This QA work is done by a member of the CollectionSpace team, not community members.

Prepare for Untangling

Gather QA JSON configs

Make a new branch of the collectionspace/cspace-config-untangler repository on GitHub.

Make a directory for the new release under: untangler/data/config_holder/community_profiles.

Download each config into that directory and name it following the pattern:

profile_#-#-#.json

Use each profile's own version number, not the CollectionSpace release number. For example, for CS release 7.1, the profiles are at anthro_6-0-2, herbarium_2-0-0, and core_7-1-0.

https://anthro.qa.collectionspace.org/cspace/anthro/config
https://bonsai.qa.collectionspace.org/cspace/bonsai/config
https://botgarden.qa.collectionspace.org/cspace/botgarden/config
https://core.qa.collectionspace.org/cspace/core/config
https://fcart.qa.collectionspace.org/cspace/fcart/config
https://herbarium.qa.collectionspace.org/cspace/herbarium/config
https://lhmc.qa.collectionspace.org/cspace/lhmc/config
https://materials.qa.collectionspace.org/cspace/materials/config
https://publicart.qa.collectionspace.org/cspace/publicart/config
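The download-and-rename step can be scripted. The following is a minimal Python sketch, not part of the untangler. It assumes the QA config URLs return the JSON config to a plain GET request, and the function names (config_url, config_filename, download_config) are hypothetical helpers introduced here for illustration. The profile version for each profile still has to be supplied by hand.

```python
from pathlib import Path
from urllib.request import urlopen

# Profile names taken from the QA config URLs listed above.
PROFILES = ["anthro", "bonsai", "botgarden", "core", "fcart",
            "herbarium", "lhmc", "materials", "publicart"]


def config_url(profile: str) -> str:
    """Build the QA config URL for a profile, following the URL pattern above."""
    return f"https://{profile}.qa.collectionspace.org/cspace/{profile}/config"


def config_filename(profile: str, version: str) -> str:
    """Turn ("core", "7.1.0") into "core_7-1-0.json" (the profile_#-#-#.json pattern)."""
    return f"{profile}_{version.replace('.', '-')}.json"


def download_config(profile: str, version: str, release_dir: Path) -> Path:
    """Download one profile config into release_dir and return the saved path.

    Assumption: the endpoint serves the config without authentication.
    """
    release_dir.mkdir(parents=True, exist_ok=True)
    target = release_dir / config_filename(profile, version)
    with urlopen(config_url(profile)) as response:
        target.write_bytes(response.read())
    return target
```

For example, download_config("core", "7.1.0", release_dir) would save core_7-1-0.json, where release_dir is the directory you made for the new release under untangler/data/config_holder/community_profiles.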

Add new release to CCU config

Edit the :releases setting in lib/cspace_config_untangler.rb to add the new release. The new release is the hash key, and its previous release is the hash value.

Change your active configs

If the new release under QA is 7.2, then:

ccu profiles switch_release -r 7_2

IMPORTANT: Make sure any configs in the config directory are safe to be deleted (i.e. that they exist somewhere in config_holder) before running this command!
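One way to make that safety check before switching releases is to confirm that every file in the active config directory has an identical copy somewhere under config_holder. The sketch below is a hedged illustration, not an untangler feature. The function names are hypothetical, and it treats "exists in config_holder" as "a byte-identical .json file exists anywhere under that directory", which is an assumption about what counts as safe.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def unsaved_configs(configs_dir: Path, holder_dir: Path) -> list:
    """List config file names in configs_dir with no byte-identical copy under holder_dir."""
    # Digest every held config once, searching subdirectories recursively.
    held = {file_digest(p) for p in holder_dir.rglob("*.json")}
    return sorted(
        p.name for p in configs_dir.glob("*.json") if file_digest(p) not in held
    )
```

If unsaved_configs(Path("untangler/data/configs"), Path("untangler/data/config_holder")) returns an empty list, every active config has a saved copy. Any names it returns should be copied into config_holder before running the switch.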

Lint/reformat the hideous one-line JSON files

Untangler will work without doing this, but if you need to look at the configs or diff them against a previous version to troubleshoot or verify anything, it’s a nightmare if you haven’t done this.

You definitely need to do this before committing/pushing your branch!

Untangler has a command to do this:

ccu profiles readable -p all

This reformats the profiles in untangler/data/configs. Make sure to copy/paste the reformatted versions back to the appropriate directory in untangler/data/config_holder/community_profiles so the reformatted versions are put under version control.
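The copy-back step is easy to forget, so it may be worth scripting. This is a minimal sketch under stated assumptions: copy_reformatted is a hypothetical helper, and it treats any valid JSON file that spans more than one line as "reformatted". It does not call or replace the ccu command.

```python
import json
import shutil
from pathlib import Path


def copy_reformatted(configs_dir: Path, release_dir: Path) -> list:
    """Copy multi-line (reformatted) JSON configs into release_dir.

    Returns the names of the files copied. Files still on a single line are
    skipped so an unformatted config never overwrites the held copy.
    """
    copied = []
    for source in sorted(configs_dir.glob("*.json")):
        text = source.read_text(encoding="utf-8")
        json.loads(text)  # raises ValueError if the file is not valid JSON
        if len(text.strip().splitlines()) < 2:
            continue  # still one-line JSON, so not reformatted yet
        shutil.copy2(source, release_dir / source.name)
        copied.append(source.name)
    return copied
```

Compare the returned list with the files you expected to copy. Anything missing from it was still one-line JSON and needs the readable command run on it first.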

General field/data model checks

Generate QA reports:

ccu reports qa -r 7_2

This will:

  • Generate the “all_fields” CSV for the release version, with structured date fields collapsed

This frequently blows up for some reason or other.
For 7.1, it was because a new audit record type was defined, but it had no fields. If you are not Kristina, pray to god that she improved the code before you became responsible for doing this… For 7.2 it worked with no issues. Cross your fingers!

  • Generate qa_all_fields_#_#.csv

    • This flags new fields in the new? column and adds the dumbfieldname column, which is useful for pivot table checks

  • Generate qa_deleted_fields_#_#.csv

    • This should be nothing but headers unless fields were supposed to be deleted

  • Generate qa_changed_fields_#_#.csv

    • If any data in a row has changed, this will include the full row from both the previous version’s all fields CSV and the current (under QA) all fields CSV

    • Go through this carefully to verify all changes are expected/intentional

  • Generate qa_nonunique_field_names.csv

    • This reports duplicate field names within the same record type. These will work if the fields have different xpaths, but can be confusing in CSV templates and discussions with clients.

    • Look for any with status = new and suggest a less confusing name.

  • Checks for duplicate field paths (name and xpath the same)

    • These will not work and need to be fixed!

    • (Currently the check runs, but no duplicates like this exist. If any are found, you will need to implement code to generate appropriate output.)

  • Checks for fields with xml_path (i.e. xpath) depth greater than 4.

    • Deeper xpath nesting is not technically a problem for the CollectionSpace application, but deeper data structures will require additional work in both cspace-config-untangler and collectionspace-mapper.
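Several of the checks above can be spot-checked by hand against the generated CSVs. The sketch below illustrates the logic only and is not the untangler's own implementation. The column name xml_path comes from the reports, but the path separator ("/") and the key columns passed to duplicated_keys are assumptions to adjust to the real CSV layout. All function names are hypothetical.

```python
import csv
import io
from collections import Counter


def read_rows(csv_text: str) -> list:
    """Parse CSV text (with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def has_only_headers(csv_text: str) -> bool:
    """True when the CSV has a header row and no data rows (the expected
    state of the deleted-fields report when nothing was deleted)."""
    return csv_text.strip() != "" and read_rows(csv_text) == []


def duplicated_keys(rows: list, key_columns: list) -> list:
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)


def too_deep(rows: list, max_depth: int = 4, separator: str = "/") -> list:
    """Return rows whose xml_path has more than max_depth segments.

    Assumption: path segments are delimited by `separator`.
    """
    deep = []
    for row in rows:
        segments = [s for s in row["xml_path"].split(separator) if s.strip()]
        if len(segments) > max_depth:
            deep.append(row)
    return deep
```

Passing the profile, record type, field name, and xml_path columns as key_columns approximates the duplicate field path check. Dropping xml_path from the key approximates the nonunique field name report.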

Inconsistency/pattern check

What/why

A lot of things are already inconsistent within CollectionSpace, and it adds confusion/complexity that isn’t actually necessary on top of all the complexity that is.

In general we are trying to move toward things being more consistent.

How

Summarize the newfields table with a pivot table set up like:

Limit the new? filter to y

Then it is a matter of scanning through this table, looking for oddities. What is or is not odd depends on stuff like:

  • what changes/additions were specified for the release

  • which fields already existed

  • a bunch of other hand-wavy things

Basically you are looking for things like:

You may need to flip back to the newversion tab to verify only 7 profiles have these fields in the media records (since here we are limited to new fields). If so, and you can’t find evidence that two specific profiles were not supposed to get these fields, then this is an inconsistency. Make a ticket in the release board/sprint!

When you are looking at newversion to verify context, also check stuff like:

  • datasource (optionlist, vocab, or authority)

  • ui_info_group

Things do not always have to be the same, but we do NOT want to introduce accidental, gratuitous inconsistency.
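The "which profiles did not get this new field" question can also be answered without a pivot table. The following is a rough Python sketch of that cross-check, assuming the QA all-fields CSV has one row per profile and field. The new? and dumbfieldname columns come from the report described above, while the profile and record_type column names and the profile_coverage function are hypothetical.

```python
import csv
import io
from collections import defaultdict


def profile_coverage(csv_text, all_profiles, profile_col="profile",
                     record_col="record_type", field_col="dumbfieldname"):
    """Map each new (record type, field) pair to the profiles that lack it.

    A pair counts as new if any row for it has new? == "y". Presence is
    judged from all rows, not only the new ones.
    """
    new_keys = set()
    present = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row[record_col], row[field_col])
        present[key].add(row[profile_col])
        if row["new?"] == "y":
            new_keys.add(key)
    missing = {}
    for key in sorted(new_keys):
        absent = sorted(set(all_profiles) - present[key])
        if absent:
            missing[key] = absent
    return missing
```

Each entry in the result is only a candidate inconsistency. Whether it needs a ticket still depends on what was specified for the release.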

Other

  • Check for typos in field names, labels, and xml_path

  • Check that ui_info_group, ui_path, and ui_field_label are being extracted correctly

    • If any are not, this might indicate issues with messages for fields. Or you may need to tweak code in the cspace-config-untangler to fix the UI label extraction. Problems with this are often related to form field info not matching up with field definition info because the ns_for_id isn’t getting set correctly.
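A quick first pass on the extraction check is to list rows where any of the three UI columns is blank. This sketch assumes a blank value is a sign of failed extraction, which may not hold for every field, so treat the output as a list to review rather than a list of errors. The rows_missing_ui_info function is a hypothetical helper.

```python
import csv
import io

# Column names as they appear in the check above.
UI_COLUMNS = ["ui_info_group", "ui_path", "ui_field_label"]


def rows_missing_ui_info(csv_text, ui_columns=UI_COLUMNS):
    """Return (row, blank column names) pairs for rows with any blank UI column."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        blank = [col for col in ui_columns if not (row.get(col) or "").strip()]
        if blank:
            flagged.append((row, blank))
    return flagged
```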

Generate and validate record mappers from new configs

Record mappers are JSON files containing the information needed to programmatically map data into CollectionSpace XML records. See data documentation for more details.

This process is documented in the cspace-config-untangler application documentation.

You do not need to generate a mapper manifest until the release goes live and its mappers need to be added to the CSV Importer.

This process will highlight anything preventing creation of usable mappers. Usually this is related to whether a CS XML namespace has been identified properly for each field.

This may require updates to the UI config to express field namespaces in a consistent way, or it may require tweaks to the untangler code to properly pull out the namespaces.