Data Cleanup that can be automated. See also Objects data cleanup completed by hand and notes on OC objects data cleanup.
- ARTIFACT CLASS, WORK TYPE, COLLECTION CATEGORY: Many uncategorized objects can be categorized using the attributes "Category" and "Subcategory." See "Classification Walkover - simple" and the more detailed, object-by-object walkover, which shows how many objects are in each artifact class/work type.
- DIMENSIONS: In OC, "Dimensions" is a single field containing up to three values. In CS, there will be multiple fields for dimensions that will be named according to type (e.g., height, width, depth, diameter, etc.). We will need to separate each of the three values into its own field.
- Assigning measurement types:
- In most cases, each value will go in its own field WITHOUT a measurement type. This is because, while most OC records list the values in a specific order (height, width, then depth), many records do not follow this rule.
- Fields that contain all three characters (H, W, and D): We can assign a measurement type to the value based on the character.
- Fields that contain the word "Diameter" or "Dia": We can assign a measurement type to the value based on that word. (Given that the letter "D" repeats in both "D" for "Depth," as above, and in "Diameter," will it be possible to automate measurement type?)
- CS Field: Dimension Measured By: Is there a way to pull this information from "Object Histories"?
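The type-assignment rules above could be sketched roughly as follows. This is only an illustration under assumptions not confirmed by the OC data: values are assumed to be separated by " x ", and labels are assumed to appear as whole words ("H", "W", "D", "Dia", "Diameter") next to the value. The function name label_dimensions is hypothetical. Matching whole words, with "Diameter" and "Dia" tried before "D", is one possible answer to the question about the repeated letter "D".

```python
import re

# Whole-word labels; "diameter" and "dia" are listed before "d" so they win.
LABEL = re.compile(r"\b(diameter|dia|h|w|d)\b\.?", re.IGNORECASE)
TYPES = {"h": "height", "w": "width", "d": "depth",
         "dia": "diameter", "diameter": "diameter"}


def label_dimensions(raw):
    """Return a (measurement_type_or_None, value_text) pair for each value.

    Assumption: values are separated by " x " (not confirmed for all records).
    """
    parts = [p.strip() for p in re.split(r"\s+x\s+", raw, flags=re.IGNORECASE)]
    found = [LABEL.search(p) for p in parts]
    letters = {m.group(1).lower() for m in found if m}
    # H/W/D are trusted only when all three appear in the same field.
    has_hwd = {"h", "w", "d"} <= letters
    result = []
    for part, m in zip(parts, found):
        key = m.group(1).lower() if m else None
        if key in ("dia", "diameter") or (key and has_hwd):
            value = LABEL.sub("", part).strip(" :")
            result.append((TYPES[key], value))
        else:
            # No reliable label: keep the text untouched, with no type.
            result.append((None, part))
    return result
```

For example, label_dimensions("H 5 x W 3 x D 2") yields height, width, and depth, while label_dimensions("5 x 3 x 2") leaves all three values without a type, as described above.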
- Formatting measurement values:
- In all cases we can convert fractions (1/2) into decimals (.5). If we do so, we will have to delete the space before the decimal, unless the fraction is the first value in the list of 2-3 lengths listed in the "dimensions" field.
- Many decimal values are displayed in increments smaller than .25 (e.g., .125, .375), and many values include fractions in increments smaller than 1/4 (e.g., 3/8 or 5/8). Should we round these up or down?
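A minimal sketch of the fraction-to-decimal conversion, assuming fractions are written as "n/d" and mixed numbers as "whole n/d" with a single space. The function name fractions_to_decimals is hypothetical. It handles the "delete the space" step by converting the whole number and the fraction together, and it does not round, since the rounding question above is still open. Unlike the ".5" style mentioned above, it writes a leading zero ("0.5") for a fraction that stands alone.

```python
import re
from fractions import Fraction

# An optional whole number, a space, then n/d. The lookarounds avoid
# matching inside decimals (e.g. "5.5") or longer slash runs.
MIXED = re.compile(r"(?<![\d./])(?:(\d+)\s+)?(\d+)/(\d+)(?![\d/])")


def fractions_to_decimals(raw):
    """Rewrite every fraction or mixed number in raw as a decimal string."""
    def convert(match):
        whole, num, den = match.groups()
        if int(den) == 0:
            return match.group(0)  # leave malformed input for manual review
        value = int(whole or 0) + Fraction(int(num), int(den))
        return str(float(value))
    return MIXED.sub(convert, raw)
```

With these assumptions, fractions_to_decimals("7 1/2 x 3 3/8 x 1/4") returns "7.5 x 3.375 x 0.25". A field where a space separates two distinct values (rather than a whole number and its fraction) would be misread, so a spot check of the output would be prudent.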
- REGISTRAR STATUS: Delete this field entirely.
- PHYS_REMARKS: This is a hidden data field on OC -- it appears in the database and in reporting, but cannot be edited through the OC user interface. Over 5000 records contain information in this field. See report.
- ALL information in this field could be moved to ATTRIBUTES, except for "Markings." This information could be moved to the field that is currently, in OC, called "Content Remarks."
- In reporting, Phys_remarks info appears in the same cell. The attribute value appears after the attribute type. This is the complete list of attribute types:
- Format Gauge:
- ModelNumber:
- Material:
- Markings:
- Serial Number:
- Weight:
- If one of these attribute fields contains multiple values, these are recorded in the report as separated by semicolons. The fields themselves are separated from other data in the field by a single space. Example:
- Material: Tin; porcelain; wood ModelNumber: 1912
- ONLY within the attribute field "Material," there are additional cases in which values are separated by commas. Example:
- Material: wood, metal, glass
- "Markings" often contains a comma as a part of the value, rather than as a symbol that denotes two separate values. Example:
- Markings: 'Nite Lite' Exclusively Distributed by LECO Electric Manufacturing Co., Florida, NY. Copyright by Kagran Corporation. Made in Japan.
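The PHYS_REMARKS layout described above could be parsed along these lines. This is a sketch under assumptions: the six attribute types are taken from the list above and are assumed to appear exactly as spelled there, each followed by a colon, and the function name parse_phys_remarks is hypothetical. Semicolons split values for every attribute; commas split values only for "Material," so commas inside "Markings" stay part of the value.

```python
import re

ATTRIBUTE_TYPES = ["Format Gauge", "ModelNumber", "Material",
                   "Markings", "Serial Number", "Weight"]
LABEL = re.compile(r"(" + "|".join(map(re.escape, ATTRIBUTE_TYPES)) + r"):")


def parse_phys_remarks(raw):
    """Split one Phys_remarks cell into {attribute type: [values]}."""
    matches = list(LABEL.finditer(raw))
    parsed = {}
    for i, m in enumerate(matches):
        # A value runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
        text = raw[m.end():end].strip()
        # Commas separate values only within "Material".
        separators = r"[;,]" if m.group(1) == "Material" else r";"
        values = [v.strip() for v in re.split(separators, text) if v.strip()]
        parsed.setdefault(m.group(1), []).extend(values)
    return parsed
```

For the example above, parse_phys_remarks("Material: Tin; porcelain; wood ModelNumber: 1912") gives three "Material" values and one "ModelNumber" value. A "Markings" value that happens to contain one of the label words followed by a colon would be split incorrectly, so those records would need a manual check.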
- CONDITION and ARTIFACT NEEDS:
- Adding value to the "Condition" field:
- Sometimes data that should be in "Condition" appears in the "Artifact Needs" field. See the artifact needs report (download/attachments/67862544/artifact+needs.xls?version=1&modificationDate=1303153563000).
- Where the exact phrase "Exhibitable/Needs Work" appears in "Artifact Needs," delete the phrase and change the value in "Condition" to 1.
- Where the exact phrase "Needs No Work" appears, delete the phrase and change the value in "Condition" to 0.
- Where the exact phrase "In Jeopardy/Unstable" appears, delete the phrase and change the value in "Condition" to 3.
- Where the exact phrase "Not Exhibitable/Stable" appears, delete the phrase and change the value in "Condition" to 2.
- These phrases occasionally appear in the field "Artifact Needs" with other text. The other text should not be deleted.
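The four phrase rules above amount to a lookup table. A minimal sketch, assuming "Artifact Needs" and "Condition" are available as plain values per record; the function name move_condition is hypothetical. Matching is exact and case-sensitive, other text is kept, and a record containing more than one of the phrases is left unchanged for manual review (a choice made for this sketch, not something specified above).

```python
import re

# Exact phrases from "Artifact Needs" and the "Condition" value each maps to.
CONDITION_CODES = {
    "Needs No Work": 0,
    "Exhibitable/Needs Work": 1,
    "Not Exhibitable/Stable": 2,
    "In Jeopardy/Unstable": 3,
}


def move_condition(artifact_needs, condition):
    """Return (new_artifact_needs, new_condition) for one record."""
    found = [p for p in CONDITION_CODES if p in artifact_needs]
    if len(found) != 1:
        # No phrase, or conflicting phrases: leave the record as it is.
        return artifact_needs, condition
    phrase = found[0]
    remaining = artifact_needs.replace(phrase, "")
    # Tidy leftover spacing and separators; all other text is preserved.
    remaining = re.sub(r"\s{2,}", " ", remaining).strip(" ;,")
    return remaining, CONDITION_CODES[phrase]
```

For example, move_condition("Needs tagging; In Jeopardy/Unstable", 0) returns ("Needs tagging", 3).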
- Adding values for "Marking" and "Tagging"
- "Marking," "Tagging," "Needs marking," "Needs tagging," "Marking and tagging," "Marking & tagging," "Needs marking and tagging," "Needs marking & tagging."
- Keep remaining values in "Artifact needs" and keep field.
- ACCESSION DATE: Delete this field entirely.
- ATTRIBUTES: (Data below comes from looking at each field on this report of all objects with all attribute fields.)
- CATEGORY: Delete this attribute field after we finish the classification walkover.
- COMPONENTS: If we are using Excel to store data at any point in migrating this field: Be aware that Excel's auto-formatting will convert all page number values up to 12 p. (1 p. - 12 p.) automatically into times (1:00 PM - 12:00 PM). Simply reformatting the cells after exporting the data will not restore the data to its original form (instead, 12:00 PM becomes "0.5," and so on).
- CREDIT LINE: Delete this attribute field (contains no data).
- DIMENSIONS: Move this data over to the "Dimensions" field in "Cataloging" on CS. Since "Dimensions" is repeatable on CS, we do not need to replace the data that is already in the field. (Note: Some of the data contained in this field repeats data in the "Dimensions" field in the Objects > Basic tab. Some of it does not, and instead provides different measurements for an additional component of the object--e.g., the packaging for a toy, where the "Dimensions" field in Basic contains the dimensions of the toy itself.)
- DISPLAY DATE, COPYRIGHT DATE, CREATION DATE, and PUBLICATION DATE: Many object records contain multiple dates.
- EXTENT: Delete this attribute field (contains no data).
- OTHER PHYSICAL DETAILS: Some data remains that could be moved automatically:
- Alternate dimensions (e.g., "Bowl: Diameter 5.5 x 2; Cup: 4.25 x 2.25 x 2.25," or "Box: 1 x 7.25 x 7.25")
- Clothing sizes (e.g., "Adult size medium (38-40)," "Children's large.")
- PUBLISHER: Delete this attribute field (contains no data).
- SUBCATEGORY: Delete this field after completing the classification walkover.
- SUBJECT: In all but a few cases this field is blank or contains irrelevant info. I have already moved any relevant data into other fields in OC, save 1989.26 (in progress). After 1989.26 is completed, delete this attribute field and all the data in it.
- TECHNIQUE: Can replace all instances of "lithograph" with "lithography."
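One caution for the TECHNIQUE replacement: a plain find-and-replace would also hit records that already say "lithography" and turn them into "lithographyy." A small sketch that matches only the whole word; the function name normalize_technique is hypothetical, and whether plurals such as "lithographs" should also change is left open here.

```python
import re

# \b on both sides so "lithography" and "lithographs" are not matched.
LITHOGRAPH = re.compile(r"\blithograph\b", re.IGNORECASE)


def normalize_technique(value):
    """Append "y" to the whole word "lithograph", keeping its capitalization."""
    return LITHOGRAPH.sub(r"\g<0>y", value)
```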
- History.
- LOANS: In order to NOT migrate the "Loans" history data, we'll have to delete only the "history" records with a type_id of 1 (Loans).
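As an illustration of the LOANS rule only, the following uses an in-memory SQLite database as a stand-in. The table name history and the columns other than type_id are assumptions; the actual OC schema and database engine may differ.

```python
import sqlite3

LOANS_TYPE_ID = 1  # type_id for "Loans", as stated above


def delete_loan_history(conn):
    """Delete history rows of type Loans; return how many were removed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM history WHERE type_id = ?", (LOANS_TYPE_ID,)
        )
    return cur.rowcount


# Hypothetical sample data standing in for the real "history" records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, type_id INTEGER, note TEXT)"
)
conn.executemany(
    "INSERT INTO history (type_id, note) VALUES (?, ?)",
    [(1, "loan out"), (2, "exhibition"), (1, "loan return"), (3, "move")],
)
removed = delete_loan_history(conn)
remaining = [row[0] for row in conn.execute("SELECT type_id FROM history ORDER BY id")]
```

Running a SELECT with the same WHERE clause first, and comparing the count, would be a reasonable safeguard before deleting anything from the real data.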
- LOCATION: Standardize format for "location."
- Mis-formatting 1: "MST 7:5:2." There are 1,827 entries that are improperly formatted in this way. Delete the space between MST and the number string and add a colon in its place.
- Mis-formatting 2: "MST: 7:5:2." There are nearly 200 entries that are improperly formatted in this way. Delete the space between MST: and the number string.
- Another formatting change: Notes in parentheses after the number string should be taken out of parentheses and put as a note or specification to the location code.
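The three location fixes above could be handled by one pattern. A sketch, assuming the number string is made of digits separated by colons and that only the "MST" prefix needs this treatment; the function name normalize_location is hypothetical. Values that do not match are returned unchanged.

```python
import re

# "MST", an optional colon, optional spaces, the colon-separated numbers,
# then an optional note in parentheses.
LOCATION = re.compile(r"^MST:?\s*(\d+(?::\d+)*)\s*(?:\((.*?)\))?\s*$")


def normalize_location(raw):
    """Return (location_code, note_or_None) for one location value."""
    m = LOCATION.match(raw.strip())
    if m is None:
        return raw, None  # not an MST code: leave it alone
    numbers, note = m.groups()
    return "MST:" + numbers, (note.strip() if note else None)
```

Both "MST 7:5:2" and "MST: 7:5:2" come out as "MST:7:5:2", and a trailing note in parentheses is returned separately without the parentheses.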