Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

...

Code Block
borderstylesolid
titleobjnames.jsp
SELECT    LTRIM(ObjectName), COUNT(LTRIM(ObjectName))
FROM      Objects
WHERE     ObjectName IS NOT NULL AND RTRIM(ObjectName) !=''

GROUP BY  LTRIM(ObjectName)
ORDER BY  LTRIM(ObjectName)

At the beginning of this exercise (January 13, 2011) there were 34,187 distinct object names among approximately 640,000 records (the initial query did not exclude virtual and template records).

Excluding virtual and template records, there were 33,148 distinct, non-null object names used for 536,372 objects.

As of January 31, 2011, there were  27,999 distinct, non-null object names used for 535,967 objects (named object count was reduced as meaningless names such as "fragment" or "object" were eliminated).

The strategy we decided to follow was this:

  1. Create a new field (MTB_ObjectName) in which the revised object names can be staged and manipulated (complete)
  2. Create a new field (MTB_ObjectNameState) to hold certain specific information contained in the current object names (details below) (complete)
  3. Populate MTB_ObjectName with the contents of ObjectName WHERE IsVirtual=0 AND IsTemplate=0 (complete for 631,930 rows)
  4. Apply naming rules to the data in MTB_ObjectName:
    • Remove all capitalization (except from proper nouns) — not sure how to avoid the proper nouns (complete for 535,969 rows, incl. proper nouns)
      Code Block
      UPDATE    Objects
      SET       MTB_ObjectName = LOWER(MTB_ObjectName)
      WHERE     MTB_ObjectName IS NOT NULL
      
    • Remove leading and/or trailing spaces and newlines (complete)
      Code Block
      UPDATE    Objects
      SET       MTB_ObjectName = RTRIM(LTRIM(MTB_ObjectName))
      WHERE     MTB_ObjectName IS NOT NULL
      
    • Remove the following from object name (unless part of a compound noun):
      • Number/quantity
      • Color
      • Culture
      • Material
      • Dimensions
    • Move modifiers (e.g., "?", "frag."/"fragment", "incomplete", "broken") to MTB_ObjectNameState (which can be concatenated with MTB_ObjectName to make the displayed object name)
      Code Block
      UPDATE    Objects
      SET       MTB_ObjectNameState = 'fragments'
      WHERE     MTB_ObjectName LIKE '%frags.' and MTB_ObjectNameState IS NULL
      
      UPDATE    Objects
      SET       MTB_ObjectName = REPLACE(MTB_ObjectName, ' frags.', '')
      WHERE     MTB_ObjectName LIKE '%frags.'
      
    • Limit object name to 30 characters or less (unless a translation is involved; see next bullet point)
      Code Block
      SELECT    ObjectName, LEN(objectname) AS 'LENGTH', COUNT(ObjectName) AS 'COUNT'
      FROM      Objects
      WHERE     LEN(ObjectName)>30 
      GROUP BY  ObjectName
      ORDER BY  LEN(ObjectName) DESC
      
    • For non-English object names (e.g., "huipil"), provide a parenthetical English translation after the preferred name (e.g., "huipil (blouse)"). The 30 character limit applies to a single object name, so a name with a translation would be allowed up to 30 + 30 = 60 characters
      • Question: could such non-English object names be italicized?
      • Answer: no, not without great difficulty.
  5. Once this is done and the Registrars are happy with the results:
    • ObjectName will be pre-pended to Description (in which cases?)
    • ObjectName will be replaced with contents of MTB_ObjectName
    • MTB_ObjectNameState will remain as-is

The following strings were removed from MTB_ObjectName:

  • very
  • small
  • fragment[.][,]
  • fragments[.][,]
  • fragment[s] of [a]
  • [;][,][:] frags
  • yellow, purple, pink, blue, orange, brown
  • 1, 2, 3, 4, 5, 6, 7, 8, 9 (beginning of string only, followed by space)
  • two, three, four, five

The following strings still need to be removed from MTB_ObjectName:

  • Ragged fragment of
  • Wooden fragment of
  • Torn fragment of
  • Two fragments of
  • Three fragments of
  • Tray of fragments of
  • Bundle of fragments of
  • Partially repaired fragments of
  • broken, 
  • brk.
  • Object:
  • Gray
  • Grey
  • Red