PAHMA Organization & Person Authority data mapping
This page contains information about PAHMA's data migration of Organizations and Persons, and related info.
See parent page on PAHMA data migration ETL work.
Organization extract
field mapping
TMS table name |
TMS field name |
CSpace field name |
notes |
---|---|---|---|
Constituents |
Active |
termStatus |
transform, as per query below |
Constituents |
BeginDate |
foundingDate |
where > 0 |
Constituents |
Biography |
historyNote |
|
Constituents |
ConstituentID |
shortIdentifier |
|
Constituents |
CultureGroup |
group |
multi-valued field; can be cleaned manually after import? |
Constituents |
DisplayName |
displayName |
where firstname is null and lastname is null and institution is null |
Constituents |
EndDate |
dissolutionDate |
where > 0 |
Constituents |
Institution |
displayName |
where firstname is not null and lastname is not null and institution is not null |
Constituents |
Remarks |
NEW: orgNote |
|
ConTypes |
ConstituentType |
NEW: orgType |
join to Constituents on ConstituentTypeID |
sql query
view: cs_organizations (refreshed 2012-04-19)
SELECT c.constituentID orgID, (case when c.active = 1 then 'accepted' else 'under review' end) termStatus, (case when c.firstname is not null and c.lastname is not null and c.institution is not null then c.institution else c.displayname end) displayName, (case when c.institution is null then c.displayname else c.institution end) longName, (case when c.begindate > 0 then c.begindate end) foundingDate, (case when c.enddate > 0 then c.enddate end) dissolutionDate, c.culturegroup "group", c.biography historyNote, c.remarks orgNote, ct.constituenttype orgType FROM constituents c, contypes ct WHERE c.constituentTypeID = ct.constituentTypeID AND c.constituentTypeID > 1 UNION ALL SELECT max(c.constituentID) orgID, 'accepted' termStatus, c.institution displayName, c.institution longName, null foundingDate, null dissolutionDate, null "group", null historyNote, null orgNote, 'Institution (unspecified)' orgType FROM constituents c WHERE c.constituentTypeID = 1 AND c.institution is not null AND c.active = 1 AND c.institution NOT IN (SELECT (case when c2.firstname is not null and c2.lastname is not null and c2.institution is not null then c2.institution else c2.displayname end) FROM constituents c2 WHERE c2.constituentTypeID > 1) GROUP BY c.institution ;
notes
- we need 2 new custom fields for the Organization schema: orgNote (Remarks), orgType (constituentType).
- in the query above, the first part selects regular organization constituents and the second part selects person constituents that have an organization listed in the institution field (which cannot be found in a regular organization constituent).
- all fields in Constituents table not listed in query above can be ignored.
- ConstituentID (orgID in query above) should be imported into 'shortIdentifier'.
- constituentTypeID = 0 will be cleaned up before extract and should be ignored by query above, per Michael (11/10/2011).
- for import into 'displayName', constituents.displayname is generally preferable to constituents.institution, but does not work in case where displayname is the person's (not the organization's) name (which seems to be case where firstname and lastname are both not null).
- ConXRefs table includes some relationships between Constituents, but not very many and they can be manually entered later, per Michael (11/10/2011).
- Family (constituentTypeID=16) and Estate (constituentTypeID=12) records can be treated just like other organization records, per Michael (11/10/2011). (since they generally have both an institution and a first/last name, we will create separate and related organization and person records from them.)
- Tribes (constituentTypeID of 3, 7, 13) should be imported into a new Tribe schema. Should/must this schema be created before import, or can it be created later? check with Chris?
Person extract
field mapping
TMS table name |
TMS field name |
CSpace field name |
notes |
---|---|---|---|
Constituents |
Active |
termStatus |
|
Constituents |
ConstituentID |
shortIdentifier |
|
Constituents |
DisplayName |
displayName |
|
Constituents |
NameTitle |
title |
data is messy (130 distinct entries); what about cases like "Friend of ", "Mother of ", etc.? |
Constituents |
FirstName |
foreName |
|
Constituents |
MiddleName |
middleName |
|
Constituents |
LastName |
surName |
|
Constituents |
Suffix |
nameAdditions |
|
Constituents |
BeginDate |
birthDate |
where > 0 |
Constituents |
EndDate |
deathDate |
where > 0 |
Constituents |
MTB_Gender |
gender |
where in ('Male', 'Female') |
Constituents |
Nationality |
nationality |
|
Constituents |
Position |
occupation |
|
Constituents |
Biography |
bioNote |
|
Constituents |
CultureGroup |
group |
|
Constituents |
Remarks |
nameNote |
|
sql query
view: cs_persons (refreshed 2012-04-19)
SELECT c.constituentID personID, (case when c.active = 1 then 'accepted' else 'under review' end) termStatus, c.displayname displayName, (case when c.nametitle like '%of%of%' or c.nametitle like '%of' then null when charindex(' of', c.nametitle) > 0 then substring(c.nametitle, charindex('of', c.nametitle) + 3, len(c.nametitle)) else c.nametitle end) title, c.firstname foreName, c.middlename middleName, c.lastname lastName, c.suffix nameAdditions, (case when c.begindate > 0 then c.begindate end) birthDate, (case when c.enddate > 0 then c.enddate end) deathDate, (case when lower(c.mtb_gender) in ('male','female') then lower(c.mtb_gender) end) gender, c.nationality nationality, c.position occupation, c.biography bioNote, c.culturegroup "group", c.remarks nameNote FROM constituents c WHERE c.constituentTypeID = 1 AND c.displayname is not null UNION ALL SELECT c.constituentID personID, (case when c.active = 1 then 'accepted' else 'under review' end) termStatus, coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end) displayName, (case when c.nametitle like '%of%of%' or c.nametitle like '%of' then null when charindex(' of', c.nametitle) > 0 then substring(c.nametitle, charindex('of', c.nametitle) + 3, len(c.nametitle)) else c.nametitle end) title, c.firstname foreName, c.middlename middleName, c.lastname lastName, c.suffix nameAdditions, null birthDate, null deathDate, null gender, c.nationality nationality, c.position occupation, c.biography bioNote, c.culturegroup "group", null nameNote FROM constituents c WHERE c.constituentTypeID > 1 AND c.lastname is not null AND c.firstname is not null AND coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end) NOT IN (SELECT coalesce(c2.firstname, '') + ' ' + (case when c2.middlename is not null then c2.middlename + ' ' else '' end) + coalesce(c2.lastname, '') + (case when c2.suffix is not null then ' ' + c2.suffix else '' end) FROM constituents c2 WHERE constituentTypeID = 1) ;
notes
- in the query above, the first part selects regular person constituents and the second part selects organization constituents that have a person listed in the name fields (which cannot be found in a regular person constituent).
- there will be a few duplicate person records created by the second part of the query, but not enough to worry about, IMHO.
- all fields in Constituents table not listed above can be ignored.
- ConstituentID (personID in query above) should be imported into 'shortIdentifier'.
- constituentTypeID = 0 will be cleaned up before extract and should be ignored by query above, per Michael (11/10/2011).
- nameTitle values like "%of" (e.g., "Sister of", "Mother of", etc.) should not be imported into the 'title' field, as they are already included in the 'displayName' field, per Michael (11/17/2011). After culling these out, there are about 50 distinct entries in this field.
- 'group' will eventually be changed to reference new Culture Authority (subclass of Concept Authority), per Michael and the schema documentation, but for now we can import string values into this field. ditto for 'nationality' and 'occupation'.
Org-Person Relationship extract
field mapping
TMS table name |
TMS field name |
CSpace field name |
notes |
---|---|---|---|
Constituents |
ConstituentID |
shortIdentifier |
identifies Organization |
Constituents |
ConstituentID |
shortIdentifier |
identifies Person |
sql query
view: cs_personorgrelationships
SELECT max(coalesce(cOrg.constituentID, c.constituentID)) orgID, c.constituentID personID FROM constituents c LEFT OUTER JOIN (SELECT c2.constituentID, (case when c2.firstname is not null and c2.lastname is not null and c2.institution is not null then c2.institution else c2.displayname end) orgname FROM constituents c2 WHERE c2.constituentTypeID > 1) cOrg ON c.institution = cOrg.orgname WHERE c.constituentTypeID = 1 AND c.institution is not null AND c.active = 1 GROUP BY c.constituentID UNION ALL SELECT c.constituentID orgID, max(coalesce(cPerson.constituentID, c.constituentID)) personID FROM constituents c LEFT OUTER JOIN (SELECT c2.constituentID, coalesce(c2.firstname, '') + ' ' + (case when c2.middlename is not null then c2.middlename + ' ' else '' end) + coalesce(c2.lastname, '') + (case when c2.suffix is not null then ' ' + c2.suffix else '' end) personName FROM constituents c2 WHERE constituentTypeID = 1) cPerson ON coalesce(c.firstname, '') + ' ' + (case when c.middlename is not null then c.middlename + ' ' else '' end) + coalesce(c.lastname, '') + (case when c.suffix is not null then ' ' + c.suffix else '' end) = cPerson.personName WHERE c.constituentTypeID > 1 AND c.lastname is not null AND c.firstname is not null GROUP BY c.constituentID ;
notes
- in the query above, the first part selects person constituents that include a related organization (in the 'institution' field) and the second part selects organization constituents that included a related person (in the 'lastname' and 'firstname' fields). both parts of the query attempt to find a matching constituent record ID for the related organization or person, before falling back on using the same ID as the main constituent record. in either case, the ID should identify an Organization or Person that has been imported via the Organization or Person extracts above.
- true first-class relationships between constituents are not yet possible; they are on the roadmap (see CSPACE-3938) but not likely to be implemented soon. MMI is adding customized fields for this purpose (see http://wiki.collectionspace.org/display/deploy/MMI+Name+Authority+schema+extension+notes+v1.12).
- we can import related Person records into 'contactName' (repeatable) field for an Organization, but this field should probably be re-labeled to clarify that it stores not just "contacts" but any related persons.
- there is no existing field in the Person schema for storing related Organizations.
- the full refName (not just shortIdentifier) should be used to reference the Person record in the import, per Aron (11/3/2011).
Person-Person Relationship extract
field mapping
TMS table name |
TMS field name |
CSpace field name |
notes |
---|---|---|---|
Associations |
ID1 |
reference to one Person |
|
Associations |
ID2 |
reference to other Person |
|
Relationships |
relation1 |
? |
|
Relationships |
relation2 |
? |
|
sql query
view: cs_personpersonrelationships (refreshed 2012-04-20)
SELECT oa.ID1 constituentID_1, oa.ID2 constituentID_2, r.relation1 + '-' + r.relation2 relationshipType FROM Associations oa, Constituents con1, Constituents con2, Relationships r WHERE oa.tableID = 23 AND oa.ID1 = con1.constituentID AND oa.ID2 = con2.constituentID AND oa.relationshipID = r.relationshipID ;
notes
- relationship type is included in this query, but we don't have a good place to put it in CSpace
Contact Information extracts
sql query - Address
view: cs_contactaddress
SELECT a.constituentID, (case when a.active = 0 or a.addresstypeid in (4,5,7,9) then 'previous' else 'current' end) addressType, a.streetLine1 addressPlace1, a.streetLine2 + (case when a.streetLine3 is null then '' else ', ' + a.streetLine3 end) addressPlace2, a.city addressMunicipality, a.state addressStateOrProvince, a.zipCode addressPostCode, (case when a.countryID = 0 then '' else c.country end) addressCountry FROM ConAddress a LEFT OUTER JOIN Countries c ON a.countryID = c.countryID ;
sql query - Email
view: cs_contactemail
SELECT e.constituentID, e.emailAddress email FROM ConEMail e ;
sql query - Telephone Number
view: cs_contacttelephone
SELECT p.constituentID, p.phoneNumber telephoneNumber FROM ConPhones p WHERE lower(coalesce(p.description, 'phone')) not like 'fax%' ;
sql query - Fax Number
view: cs_contactfax
SELECT p.constituentID, p.phoneNumber faxNumber FROM ConPhones p WHERE lower(coalesce(p.description, 'phone')) like 'fax%' ;
sql query - Contact Note
(for manual update only)
SELECT a.constituentID, (case when a.remarks is null then '' else remarks + '; ' end) + (case when coalesce(a.begindate,'0') = '0' then '' else a.begindate end) + (case when coalesce(a.begindate,'0') = '0' and coalesce(a.enddate,'0') = '0' then '' else ' - ' end) + (case when coalesce(a.enddate,'0') = '0' then '' else a.enddate end) contactNote FROM ConAddress a WHERE a.remarks is not null or (a.begindate is not null and a.begindate != '0') or (a.enddate is not null and a.enddate != '0') ;
notes
- put 'previous' and 'current' indicators in the existing addressType field, per Chris (12/7/11)
- okay to ignore TMS.addressTypeID field (all but a few address records have addressTypeID = 0, and none have address type of 6 or 7 (web address); do we need to retain info on 'physical address' vs. 'mailing address), per Michael (11/17/11)
- okay to ignore email description field; very little data here, usually indicating name of person associated with email address for organization, and it can be handled manually after import, per Michael (11/17/11)
- we need new custom field for contactNote (concatenation of remarks, beginDate, endDate); this data will be updated manually, per Michael (12/9/11)
Alt Names extract
sql query - Org Alternate names
view: cs_organizationaltnames
SELECT c.constituentID orgID, (case when n.institution is not null then n.institution else (case when n.nameTitle is null then '' else n.nameTitle + ' ' end) + (case when n.firstName is null then '' else n.firstName + ' ' end) + (case when n.middleName is null then '' else n.middleName + ' ' end) + coalesce(n.lastName,'') end) longName, coalesce(n.remarks,'') + (case when n.cultureGroup is null then '' else '; ' + n.cultureGroup end) nameAdditions FROM Constituents c, ConAltNames n WHERE c.constituentID = n.constituentID AND c.constituentTypeID > 1 ;
sql query - Person Alternate names
view: cs_personaltnames
SELECT c.constituentID personID, (case when n.nameType is null then 'alternate name: ' else lower(n.nameType) + ': ' end) + (case when n.firstName is null and n.lastName is null then n.institution else (case when n.nameTitle is null then '' else n.nameTitle + ' ' end) + (case when n.firstName is null then '' else n.firstName + ' ' end) + (case when n.middleName is null then '' else n.middleName + ' ' end) + coalesce(n.lastName,'') + (case when n.suffix is null then '' else ' ' + n.suffix end) + (case when n.position is null then '' else ', ' + n.position end) + (case when n.cultureGroup is null then '' else ', ' + n.cultureGroup end) + (case when n.remarks is null then '' else ' (' + n.remarks + ')' end) end) altNameNote FROM Constituents c, ConAltNames n WHERE c.constituentID = n.constituentID AND c.constituentTypeID = 1 ;
notes
- import Org alternate names into the 'mainBodyGroup' repeating field group
- import Person alternate names into a new custom repeating field altNameNote, per Michael (11/17/11).