Importing media files with Media Handling procedure records

...

If you are using the CSV Importer at importer.collectionspace.org, the URI for each media file must be publicly accessible on the Web and must not require any authentication credentials. Otherwise, the Services API cannot ingest the file.

You can achieve this by moving your files to a Google Drive folder, Dropbox folder, AWS S3 Bucket, etc., and making the permissions of that folder/bucket (and all files in it) public (or “available if you have the URL”, if that is an option) for the duration of the ingest.
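Before starting an ingest, it can help to confirm that each URI really is readable without logging in. The following is a minimal sketch, not part of the CSV Importer: the function name `is_publicly_readable` and its `opener` parameter are hypothetical, and it uses only the Python standard library. A 200 response is necessary but not sufficient, since some hosts answer with an HTML sign-in or preview page instead of the file itself.

```python
import urllib.error
import urllib.request


def is_publicly_readable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if an anonymous request for `url` answers with HTTP 200.

    `opener` is injectable so the check can be exercised without a network.
    No credentials or cookies are sent, which approximates an unauthenticated
    fetch. This is an illustration, not the Services API's own logic.
    """
    try:
        request = urllib.request.Request(url, method="GET")
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers HTTP errors (401, 403, 404...), DNS failures and timeouts;
        # ValueError covers strings that are not valid URLs.
        return False
```

Running this over the media URI column of your CSV and fixing any URI that returns `False` may save a failed ingest later.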

Note

We do not recommend putting files into Google Drive for ingest via the CSV Importer. The Google Drive documentation is not very transparent about the quotas and limits placed on file downloads, and users have reported problems ingesting files from Google Drive. The pattern we see is that the first several files ingest fine, but all the rest fail because Google Drive rejects the CSpace API’s request for the file. The number of files that succeed varies, but is relatively small (fewer than 25).

If you are using a locally-provided instance of the CSV Importer, you may be able to provide URIs using the file:// protocol to ingest locally-available files. Consult with the person who administers your CollectionSpace and CSV Importer for details.

...

Deleting will work with a CSV that includes more data, but the Processing step will take longer.

Deleting field values when updating existing records

In order to prevent accidental data destruction, if you leave a cell in your CSV blank, the CSV Importer does nothing to existing data in the corresponding field in the existing CollectionSpace record.

If you actually want to completely delete the value of a field in an existing CollectionSpace record, you must deploy the bomb 💣:

Copy/paste the bomb emoji 💣 into any cell of your CSV to completely clear that field in that record on update.

The bomb must be the only thing in the CSV cell, with no other characters before or after it. This ensures you can round-trip a note such as “The artist known as 🧶💣 wraps urban objects and structures in knitted nets” without accidental data loss. The exception is multivalued fields, where you can do something like “value|💣|value”. The value of the second instance of the field would be removed because, after splitting on the multivalue delimiter “|”, the bomb is the only thing in the second instance.
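The rule can be illustrated with a short sketch. This is not the CSV Importer's actual code: the function name `classify_cell` and the labels it returns are hypothetical, and trimming surrounding whitespace is an assumption made here for convenience.

```python
BOMB = "\U0001F4A3"  # the bomb emoji
DELIMITER = "|"      # the multivalue delimiter used in the examples on this page


def classify_cell(cell, multivalued=False):
    """Label each value in a CSV cell as 'blank', 'bomb' or 'value'.

    A value counts as a bomb only when the bomb is the sole content of that
    value. Whitespace trimming is an assumption of this sketch.
    """
    parts = cell.split(DELIMITER) if multivalued else [cell]
    labels = []
    for part in parts:
        text = part.strip()
        if text == "":
            labels.append("blank")
        elif text == BOMB:
            labels.append("bomb")
        else:
            labels.append("value")
    return labels


print(classify_cell("The artist known as \U0001F9F6\U0001F4A3 wraps urban objects"))
print(classify_cell("value|\U0001F4A3|value", multivalued=True))
```

The first call reports a single ordinary value, because the bomb is embedded in other text. The second reports a bomb only for the second instance.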

Info

We chose to use the bomb emoji for this because:

  • It is pretty difficult to accidentally insert a 💣 into your data;

  • If you do, it is very easy to see that you have done so;

  • It should evoke the appropriate amount of caution, because using it is potentially quite destructive; and

  • It amused us.

Bombing boolean fields (like “Preferred for lang” in authority records)

This applies to boolean values in general, most of which are formatted in the application web forms as checkboxes, but may also show up as radio buttons or other widgets.

If you create an authority manually in the application and do not check the “Preferred for lang” box, the application produces the following in the underlying XML record:

<termPrefForLang>false</termPrefForLang>

We’ll call this False Value.

If you batch import authority terms via the CSV Importer and you do not fill in the termPrefForLang field in the CSV, the resulting XML record has:

<termPrefForLang/>

We’ll call this NULL Value.

Using 💣 in a boolean field results in a False Value, not a NULL Value.
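If you need to tell the two cases apart when inspecting exported XML, a sketch along these lines may help. The function name `pref_for_lang_state` is hypothetical, and the input is assumed to be the single `termPrefForLang` element rather than a whole record.

```python
import xml.etree.ElementTree as ET


def pref_for_lang_state(xml_fragment):
    """Classify a termPrefForLang element using the terms defined above."""
    element = ET.fromstring(xml_fragment)
    text = (element.text or "").strip()
    if text == "":
        return "NULL Value"   # <termPrefForLang/>
    if text == "false":
        return "False Value"  # <termPrefForLang>false</termPrefForLang>
    return "other: " + text


print(pref_for_lang_state("<termPrefForLang>false</termPrefForLang>"))
print(pref_for_lang_state("<termPrefForLang/>"))
```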

Bombing multivalued fields

Given an existing Object record with:

...

And CSV data like:

Code Block
| objectNumber | objectName           | objectNameCurrency | objectNameLevel   |
|--------------+----------------------+--------------------+-------------------|
| mvbombtest   | name 1|name 2|name 3 | current|💣|unknown | 💣|group|subgroup |

Results in:

...

NOTE: You will get an unknown_option_list_value warning because the bomb is not a known term for the objectNameCurrency or objectNameLevel fields. This is expected.

If I manually reset the record to its original values and this time ingest the following CSV:

Code Block
| objectNumber | objectName           | objectNameCurrency | objectNameLevel   |
|--------------+----------------------+--------------------+-------------------|
| mvbombtest   | name 1|name 2|name 3 | 💣                 | 💣|group|subgroup |

We get:

...

objectNameCurrency has been treated as a single-valued field, so only the first value has been removed. To completely clear the Currency column, the objectNameCurrency value in the CSV would need to be: 💣|💣|💣
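When preparing a CSV in a script, the fully clearing value can be built from the number of existing instances. This is a minimal sketch; the helper name `clear_all_instances` is hypothetical, and you need to know how many instances the existing record has.

```python
BOMB = "\U0001F4A3"  # the bomb emoji


def clear_all_instances(count, delimiter="|"):
    """Build a cell value with one bomb per existing instance of the field."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return delimiter.join([BOMB] * count)


# Three existing Object name rows need three bombs in objectNameCurrency.
print(clear_all_instances(3))
```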

For the sake of example, I manually edited this record to have:

...

Now, if I ingest the following CSV:

Code Block
| objectNumber | objectName           | objectNameCurrency    | objectNameLevel   |
|--------------+----------------------+-----------------------+-------------------|
| mvbombtest   | name 1|💣|name 3|💣 | current|💣|unknown|💣 | |💣|subgroup|💣  |

We get:

...

For a number of reasons, the CSV Importer cannot currently identify fully empty rows in a repeating field group and intelligently remove them. You would need to manually remove those empty Object name rows in the application.
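To know in advance which rows you may need to clean up, you can check your CSV for positions where every field in the group carries a bomb. The sketch below is hypothetical and works only on the CSV values, not on CollectionSpace itself. It assumes the “|” delimiter and treats a position as fully bombed only if every column has a bomb there.

```python
BOMB = "\U0001F4A3"  # the bomb emoji


def fully_bombed_positions(group_cells, delimiter="|"):
    """Return 1-based positions where every column in the group has a bomb.

    `group_cells` maps column names to raw CSV cell values for one record.
    A column with fewer values than the others has no bomb at the missing
    positions, so those positions are not reported.
    """
    columns = [[v.strip() for v in cell.split(delimiter)]
               for cell in group_cells.values()]
    if not columns:
        return []
    longest = max(len(values) for values in columns)
    positions = []
    for index in range(longest):
        if all(index < len(values) and values[index] == BOMB
               for values in columns):
            positions.append(index + 1)
    return positions


row = {
    "objectName": "name 1|\U0001F4A3|name 3|\U0001F4A3",
    "objectNameCurrency": "current|\U0001F4A3|unknown|\U0001F4A3",
    "objectNameLevel": "|\U0001F4A3|subgroup|\U0001F4A3",
}
print(fully_bombed_positions(row))  # prints [2, 4]
```

The reported positions are candidates for empty Object name rows that you would then remove by hand in the application.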