PAHMA Organization & Person Authority data mapping

This page contains information about PAHMA's data migration of Organizations and Persons, and related info.
See parent page on PAHMA data migration ETL work.

Organization extract

field mapping

TMS table name

TMS field name

CSpace field name

notes

Constituents

Active

termStatus

transform, as per query below

Constituents

BeginDate

foundingDate

where > 0

Constituents

Biography

historyNote

 

Constituents

ConstituentID

shortIdentifier

 

Constituents

CultureGroup

group

multi-valued field; can be cleaned manually after import?

Constituents

DisplayName

displayName

where firstname is null and lastname is null and institution is null

Constituents

EndDate

dissolutionDate

where > 0

Constituents

Institution

displayName

where firstname is not null and lastname is not null and institution is not null

Constituents

Remarks

NEW: orgNote

 

ConTypes

ConstituentType

NEW: orgType

join to Constituents on ConstituentTypeID

sql query

view: cs_organizations (refreshed 2012-04-19)

SELECT
  c.constituentID orgID,
  (case when c.active = 1 then 'accepted' else 'under review' end) termStatus,
  (case when c.firstname is not null and c.lastname is not null and c.institution is not null then c.institution else c.displayname end) displayName,
  (case when c.institution is null then c.displayname else c.institution end) longName,
  (case when c.begindate > 0 then c.begindate end) foundingDate,
  (case when c.enddate > 0 then c.enddate end) dissolutionDate,
  c.culturegroup "group",
  c.biography historyNote,
  c.remarks orgNote,
  ct.constituenttype orgType
FROM
  constituents c, contypes ct
WHERE
  c.constituentTypeID = ct.constituentTypeID
  AND c.constituentTypeID > 1
UNION ALL
SELECT
  max(c.constituentID) orgID,
  'accepted' termStatus,
  c.institution displayName,
  c.institution longName,
  null foundingDate,
  null dissolutionDate,
  null "group",
  null historyNote,
  null orgNote,
  'Institution (unspecified)' orgType
FROM
  constituents c
WHERE
  c.constituentTypeID = 1
  AND c.institution is not null
  AND c.active = 1
  AND c.institution NOT IN (SELECT (case when c2.firstname is not null and c2.lastname is not null and c2.institution is not null then c2.institution else c2.displayname end) FROM constituents c2 WHERE c2.constituentTypeID > 1)
GROUP BY c.institution
;

notes

  • we need 2 new custom fields for the Organization schema: orgNote (Remarks), orgType (constituentType).
  • in the query above, the first part selects regular organization constituents and the second part selects person constituents that have an organization listed in the institution field (which cannot be found in a regular organization constituent).
  • all fields in Constituents table not listed in query above can be ignored.
  • ConstituentID (orgID in query above) should be imported into 'shortIdentifier'.
  • constituentTypeID = 0 will be cleaned up before extract and should be ignored by query above, per Michael (11/10/2011).
  • for import into 'displayName', constituents.displayname is generally preferable to constituents.institution, but does not work in case where displayname is the person's (not the organization's) name (which seems to be case where firstname and lastname are both not null).
  • ConXRefs table includes some relationships between Constituents, but not very many and they can be manually entered later, per Michael (11/10/2011).
  • Family (constituentTypeID=16) and Estate (constituentTypeID=12) records can be treated just like other organization records, per Michael (11/10/2011). (since they generally have both an institution and a first/last name, we will create separate and related organization and person records from them.)
  • Tribes (constituentTypeID of 3, 7, 13) should be imported into a new Tribe schema. Should/must this schema be created before import, or can it be created later? check with Chris?

Person extract

field mapping

TMS table name

TMS field name

CSpace field name

notes

Constituents

Active

termStatus

 

Constituents

ConstituentID

shortIdentifier

 

Constituents

DisplayName

displayName

 

Constituents

NameTitle

title

data is messy (130 distinct entries); what about cases like "Friend of ", "Mother of ", etc.?

Constituents

FirstName

foreName

 

Constituents

MiddleName

middleName

 

Constituents

LastName

surName

 

Constituents

Suffix

nameAdditions

 

Constituents

BeginDate

birthDate

where > 0

Constituents

EndDate

deathDate

where > 0

Constituents

MTB_Gender

gender

where in ('Male', 'Female')

Constituents

Nationality

nationality

 

Constituents

Position

occupation

 

Constituents

Biography

bioNote

 

Constituents

CultureGroup

group

 

Constituents

Remarks

nameNote

 

sql query

view: cs_persons (refreshed 2012-04-19)

SELECT
  c.constituentID personID,
  (case when c.active = 1 then 'accepted' else 'under review' end) termStatus,
  c.displayname displayName,
  (case when c.nametitle like '%of%of%' or c.nametitle like '%of' then null when charindex(' of', c.nametitle) > 0 then substring(c.nametitle, charindex('of', c.nametitle) + 3, len(c.nametitle)) else c.nametitle end) title,
  c.firstname foreName,
  c.middlename middleName,
  c.lastname lastName,
  c.suffix nameAdditions,
  (case when c.begindate > 0 then c.begindate end) birthDate,
  (case when c.enddate > 0 then c.enddate end) deathDate,
  (case when lower(c.mtb_gender) in ('male','female') then lower(c.mtb_gender) end) gender,
  c.nationality nationality,
  c.position occupation,
  c.biography bioNote,
  c.culturegroup "group",
  c.remarks nameNote
FROM
  constituents c
WHERE
  c.constituentTypeID = 1
  AND c.displayname is not null
UNION ALL
SELECT
  c.constituentID personID,
  (case when c.active = 1 then 'accepted' else 'under review' end) termStatus,
  coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end) displayName,
  (case when c.nametitle like '%of%of%' or c.nametitle like '%of' then null when charindex(' of', c.nametitle) > 0 then substring(c.nametitle, charindex('of', c.nametitle) + 3, len(c.nametitle)) else c.nametitle end) title,
  c.firstname foreName,
  c.middlename middleName,
  c.lastname lastName,
  c.suffix nameAdditions,
  null birthDate,
  null deathDate,
  null gender,
  c.nationality nationality,
  c.position occupation,
  c.biography bioNote,
  c.culturegroup "group",
  null nameNote
FROM
  constituents c
WHERE
  c.constituentTypeID > 1
  AND c.lastname is not null
  AND c.firstname is not null
  AND coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end)
     NOT IN (SELECT coalesce(c2.firstname, '') + ' ' + (case when c2.middlename is not null then c2.middlename + ' ' else '' end) + coalesce(c2.lastname, '') + (case when c2.suffix is not null then ' ' + c2.suffix else '' end)  FROM constituents c2 WHERE constituentTypeID = 1)
;

notes

  • in the query above, the first part selects regular person constituents and the second part selects organization constituents that have a person listed in the name fields (which cannot be found in a regular person constituent).
  • there will be a few duplicate person records created by the second part of the query, but not enough to worry about, IMHO.
  • all fields in Constituents table not listed above can be ignored.
  • ConstituentID (personID in query above) should be imported into 'shortIdentifier'.
  • constituentTypeID = 0 will be cleaned up before extract and should be ignored by query above, per Michael (11/10/2011).
  • nameTitle values like "%of" (e.g., "Sister of", "Mother of", etc.) should not be imported into the 'title' field, as they are already included in the 'displayName' field, per Michael (11/17/2011). After culling these out, there are about 50 distinct entries in this field.
  • 'group' will eventually be changed to reference new Culture Authority (subclass of Concept Authority), per Michael and the schema documentation, but for now we can import string values into this field. ditto for 'nationality' and 'occupation'.

Org-Person Relationship extract

field mapping

TMS table name

TMS field name

CSpace field name

notes

Constituents

ConstituentID

shortIdentifier

identifies Organization

Constituents

ConstituentID

shortIdentifier

identifies Person

sql query

view: cs_personorgrelationships

SELECT
  max(coalesce(cOrg.constituentID, c.constituentID)) orgID,
    c.constituentID personID
FROM
  constituents c LEFT OUTER JOIN (SELECT c2.constituentID, (case when c2.firstname is not null and c2.lastname is not null and c2.institution is not null then c2.institution else c2.displayname end) orgname FROM constituents c2 WHERE c2.constituentTypeID > 1) cOrg ON c.institution = cOrg.orgname
WHERE
  c.constituentTypeID = 1
  AND c.institution is not null
  AND c.active = 1
GROUP BY c.constituentID
UNION ALL
SELECT
  c.constituentID orgID,
  max(coalesce(cPerson.constituentID, c.constituentID)) personID
FROM
  constituents c LEFT OUTER JOIN
  (SELECT c2.constituentID, coalesce(c2.firstname, '') + ' ' + (case when c2.middlename is not null then c2.middlename + ' ' else '' end) + coalesce(c2.lastname, '') + (case when c2.suffix is not null then ' ' + c2.suffix else '' end)  personName FROM constituents c2 WHERE constituentTypeID = 1) cPerson
  ON coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end) = cPerson.personName
WHERE
  c.constituentTypeID > 1
  AND c.lastname is not null
  AND c.firstname is not null
GROUP BY c.constituentID
;

notes

  • in the query above, the first part selects person constituents that include a related organization (in the 'institution' field) and the second part selects organization constituents that included a related person (in the 'lastname' and 'firstname' fields). both parts of the query attempt to find a matching constituent record ID for the related organization or person, before falling back on using the same ID as the main constituent record. in either case, the ID should identify an Organization or Person that has been imported via the Organization or Person extracts above.
  • true first-class relationships between constituents are not yet possible; they are on the roadmap (see CSPACE-3938) but not likely to be implemented soon. MMI is adding customized fields for this purpose (see http://wiki.collectionspace.org/display/deploy/MMI+Name+Authority+schema+extension+notes+v1.12).
  • we can import related Person records into 'contactName' (repeatable) field for an Organization, but this field should probably be re-labeled to clarify that it stores not just "contacts" but any related persons.
  • there is no existing field in the Person schema for storing related Organizations.
  • the full refName (not just shortIdentifier) should be used to reference the Person record in the import, per Aron (11/3/2011).

Person-Person Relationship extract

field mapping

TMS table name

TMS field name

CSpace field name

notes

Associations

ID1

reference to one Person

 

Associations

ID2

reference to other Person

 

Relationships

relation1

?

 

Relationships

relation2

?

 

sql query

view:  cs_personpersonrelationships (refreshed 2012-04-20)

SELECT
 oa.ID1 constituentID_1,
 oa.ID2 constituentID_2,
 r.relation1 + '-' + r.relation2 relationshipType
FROM
 Associations oa,
 Constituents con1,
 Constituents con2,
 Relationships r
WHERE
 oa.tableID = 23
 AND oa.ID1 = con1.constituentID
 AND oa.ID2 = con2.constituentID
 AND oa.relationshipID = r.relationshipID
;

notes

  • relationship type is included in this query, but we don't have a good place to put it in CSpace (sad)

Contact Information extracts

sql query - Address

view: cs_contactaddress

SELECT
  a.constituentID,
  (case when a.active = 0 or a.addresstypeid in (4,5,7,9) then 'previous' else 'current' end) addressType,
  a.streetLine1 addressPlace1,
  a.streetLine2 + (case when a.streetLine3 is null then '' else ', ' + a.streetLine3 end) addressPlace2,
  a.city addressMunicipality,
  a.state addressStateOrProvince,
  a.zipCode addressPostCode,
  (case when a.countryID = 0 then '' else c.country end) addressCountry
FROM
  ConAddress a LEFT OUTER JOIN Countries c ON a.countryID = c.countryID
;

sql query - Email

view: cs_contactemail

SELECT
  e.constituentID,
  e.emailAddress email
FROM
  ConEMail e
;

sql query - Telephone Number

view: cs_contacttelephone

SELECT
  p.constituentID,
  p.phoneNumber telephoneNumber
FROM
  ConPhones p
WHERE
  lower(coalesce(p.description, 'phone')) not like 'fax%'
;

sql query - Fax Number

view: cs_contactfax

SELECT
  p.constituentID,
  p.phoneNumber faxNumber
FROM
  ConPhones p
WHERE
  lower(coalesce(p.description, 'phone')) like 'fax%'
;

sql query - Contact Note

(for manual update only)

SELECT
  a.constituentID,
  (case when a.remarks is null then '' else remarks + '; ' end) + (case when coalesce(a.begindate,'0') = '0' then '' else a.begindate end) + (case when coalesce(a.begindate,'0') = '0' and coalesce(a.enddate,'0') = '0' then '' else ' - ' end) + (case when coalesce(a.enddate,'0') = '0' then '' else a.enddate end) contactNote
FROM
  ConAddress a
WHERE
  a.remarks is not null or (a.begindate is not null and a.begindate != '0') or (a.enddate is not null and a.enddate != '0')
;

notes

  • put 'previous' and 'current' indicators in the existing addressType field, per Chris (12/7/11)
  • okay to ignore TMS.addressTypeID field (all but a few address records have addressTypeID = 0, and none have address type of 6 or 7 (web address); do we need to retain info on 'physical address' vs. 'mailing address), per Michael (11/17/11)
  • okay to ignore email description field; very little data here, usually indicating name of person associated with email address for organization, and it can be handled manually after import, per Michael (11/17/11)
  • we need new custom field for contactNote (concatenation of remarks, beginDate, endDate); this data will be updated manually, per Michael (12/9/11)

Alt Names extract

sql query - Org Alternate names

view: cs_organizationaltnames

SELECT
  c.constituentID orgID,
  (case when n.institution is not null then n.institution else (case when n.nameTitle is null then '' else n.nameTitle + ' ' end) + (case when n.firstName is null then '' else n.firstName + ' ' end) + (case when n.middleName is null then '' else n.middleName + ' ' end) + coalesce(n.lastName,'') end) longName,
  coalesce(n.remarks,'') + (case when n.cultureGroup is null then '' else '; ' + n.cultureGroup end) nameAdditions
FROM
  Constituents c, ConAltNames n
WHERE
  c.constituentID = n.constituentID
  AND c.constituentTypeID > 1
;

sql query - Person Alternate names

view: cs_personaltnames

SELECT
  c.constituentID personID,
  (case when n.nameType is null then 'alternate name: ' else lower(n.nameType) + ': ' end) + (case when n.firstName is null and n.lastName is null then n.institution else (case when n.nameTitle is null then '' else n.nameTitle + ' ' end) + (case when n.firstName is null then '' else n.firstName + ' ' end) + (case when n.middleName is null then '' else n.middleName + ' ' end) + coalesce(n.lastName,'') + (case when n.suffix is null then '' else ' ' + n.suffix end) + (case when n.position is null then '' else ', ' + n.position end) + (case when n.cultureGroup is null then '' else ', ' + n.cultureGroup end) +  (case when n.remarks is null then '' else ' (' + n.remarks + ')' end) end) altNameNote
FROM
  Constituents c, ConAltNames n
WHERE
  c.constituentID = n.constituentID
  AND c.constituentTypeID = 1
;

notes

  • import Org alternate names into the 'mainBodyGroup' repeating field group
  • import Person alternate names into a new custom repeating field altNameNote, per Michael (11/17/11).