Troubleshooting Installation Problems

If things don't look like they are working, here are some steps to try:

Memory errors when running Ant (ant) or Maven (mvn)

As a system administrator, implementer or developer, when installing CollectionSpace or building or deploying its source code, you will likely be running the Ant (ant) and Maven (mvn) command-line tools.

If you should encounter an "out of memory" error (for example, a java.lang.OutOfMemoryError) when running Ant or Maven, you can increase the sizes of the Java heap and "permanent generation" (PermGen) space used by those tools by setting the Java Virtual Machine (JVM) Xmx and MaxPermSize values, respectively. You might start by experimenting with the settings given in this example:

Set the following environment variables in your Windows command prompt:

set ANT_OPTS=-Xmx768m -XX:MaxPermSize=512m
set MAVEN_OPTS=-Xmx768m -XX:MaxPermSize=512m

If you are using the Unix 'bash' shell, set the environment variables as follows:

export ANT_OPTS="-Xmx768m -XX:MaxPermSize=512m"
export MAVEN_OPTS="-Xmx768m -XX:MaxPermSize=512m"

You can also set a JVM minimum heap size Xms, alongside the maximum size specified via Xmx. This might help if you are having 'heap' errors; e.g.

-Xmx768m -Xms512m -XX:MaxPermSize=512m 

In addition, if you should encounter a java.lang.OutOfMemoryError: GC overhead limit exceeded error when running Maven or Ant, you can add -XX:-UseGCOverheadLimit at the end of any of the statements above; e.g.

-Xmx768m -Xms512m -XX:MaxPermSize=512m -XX:-UseGCOverheadLimit

If making these changes doesn't help, you might check for errors in your environment variables as described in 6 Common Errors in Setting Java Heap Size.
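Typing mistakes in these option strings (a space after -Xmx, or a suffix such as "mb") are easy to make. The following is a minimal sketch of a sanity check, not part of CollectionSpace; the function name check_jvm_opts is hypothetical, and the check only covers the three size options discussed above.

```shell
#!/bin/sh
# check_jvm_opts (hypothetical helper): returns 0 if every -Xmx, -Xms and
# -XX:MaxPermSize token in the given string is a number with an optional
# k, m or g suffix; prints the first suspicious token and returns 1 otherwise.
check_jvm_opts() {
    for tok in $1; do
        case $tok in
            -Xmx*|-Xms*|-XX:MaxPermSize=*)
                printf '%s\n' "$tok" |
                    grep -Eq '^(-Xm[sx]|-XX:MaxPermSize=)[0-9]+[kKmMgG]?$' || {
                    echo "suspicious JVM option: $tok" >&2
                    return 1
                }
                ;;
        esac
    done
    return 0
}

# Example use, once the variables from this section are set:
#   check_jvm_opts "$ANT_OPTS" && check_jvm_opts "$MAVEN_OPTS"
```

Other options, such as -XX:-UseGCOverheadLimit, are passed over without being checked.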

Errors while running Ant, where "build.xml" appears in any relevant error message

If an error occurs while installing the Services or Application layers, and any relevant error message mentions "build.xml", this is an error occurring while running an Ant task; that is, an installation routine that is executed by the Ant build tool. An example:

[INFO] /tmp/source/app/collectionspace-application/war-entry/build.xml:44:
Problem: failed to create task or type hostinfo
[INFO] Cause: The name is undefined.

Some of these errors may occur if you're using an outdated version of Ant. This is particularly likely if the error message mentions failed to create task or type.

You can check your version of Ant by running ant -version or a similar command at your system's command or terminal prompt. CollectionSpace requires Ant version 1.8.2 or later. However, many operating systems, as well as package managers for installing additional software on your system, may only come with or install an earlier version of Ant, such as Ant 1.7.x.
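As a rough sketch, the version check can be scripted. The helper names parse_ant_version and version_ge are hypothetical, and the parsing assumes that the output of ant -version contains the word "version" followed by a dotted number, which may not hold for every Ant release.

```shell
#!/bin/sh
# parse_ant_version (hypothetical helper): extract the dotted version number
# from a line of `ant -version` output passed as the first argument.
parse_ant_version() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# version_ge (hypothetical helper): succeed if dotted version $1 >= $2,
# comparing each numeric component in turn.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example use:
#   v=$(parse_ant_version "$(ant -version)")
#   version_ge "$v" 1.8.2 || echo "Ant $v is older than 1.8.2"
```

Comparing component by component avoids the string-comparison trap in which "1.10" sorts before "1.8".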

For instructions on upgrading to a newer version of Ant, see the relevant installation guide for your operating system. If you do not find the necessary instructions in that guide, you can also manually download and install Ant from the Ant website, by clicking the "Binary Distributions" (or similar) link on that site, and following the instructions there.

Verifying that the installation was successful

Try entering the following URLs into your browser, where <hostname-or-ip-address> is the actual hostname or IP address of your server running CollectionSpace (without the angle brackets):

http://<hostname-or-ip-address>:8180/cspace-services/acquisitions
http://<hostname-or-ip-address>:8180/cspace-services/intakes
http://<hostname-or-ip-address>:8180/cspace-services/vocabularies

Each should return a valid XML document, which usually starts with numeric values for pageNum, pageSize, and itemsInPage.

If any response says Index failed, you have a permissions issue. A response containing something like GET index also indicates a problem.

Permission issues are usually related to a failure of the ant import step.
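The same check can be sketched from the command line. The function names below are hypothetical, the classification relies only on the strings mentioned in this section, and the use of curl with a username and password is an assumption about your setup.

```shell
#!/bin/sh
# cspace_urls (hypothetical helper): print the three test URLs for a host.
cspace_urls() {
    host=$1
    for svc in acquisitions intakes vocabularies; do
        printf 'http://%s:8180/cspace-services/%s\n' "$host" "$svc"
    done
}

# classify_response (hypothetical helper): label a response body using the
# markers described in this section.
classify_response() {
    case $1 in
        *"Index failed"*|*"GET index"*) echo "permissions-problem" ;;
        *pageNum*) echo "ok" ;;
        *) echo "unknown" ;;
    esac
}

# Example use (USER and PASSWORD are placeholders for a valid login):
#   for url in $(cspace_urls localhost); do
#       echo "$url: $(classify_response "$(curl -s -u USER:PASSWORD "$url")")"
#   done
```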

Issues with user account permissions, even though all the steps of the install instructions were successful

Most permission-related settings are set up by the ant import script. When ant import has run successfully, you should not see a BUILD FAILED message or any stack traces printed to the screen. If you do see errors, some frequently encountered issues are:

  • You don't have permission to write to the file /tmp/aclCache.data. To resolve this problem, you can either delete that file or run ant create_db -Drecreate_db=true import as root.
  • You are attempting to import duplicate roles or permissions. To resolve this problem, make sure that you run ant create_db -Drecreate_db=true before running ant import. Better yet, run them together in that order: ant create_db -Drecreate_db=true import
  • When you ran ant import, it failed with an error message which may look something like this: Access denied for user 'cspace'@'localhost' (using password: NO). To resolve this problem, make sure that all of the required environment variables listed in the relevant installation guide for your operating system, including DB_PASSWORD_CSPACE, are present in your shell or command environment. See the section in that guide relating to environment variables for details.

'Directory does not exist' error during ant undeploy

Symptom(s):

  • When running ant undeploy from the services layer source code tree, you encounter an error similar to this:

    Directory does not exist:
    /tmp/cspace-2.0-tag/services/vocabulary/3rdparty/nuxeo-platform-cs-vocabulary/${env.CSPACE_JEESERVER_HOME}/nuxeo-server/plugins
    

Fix:

  • Make sure that all of the required environment variables listed in the relevant installation guide for your operating system, including CSPACE_JEESERVER_HOME, are present in your shell or command environment. See the section in that guide relating to environment variables for details.
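Missing environment variables cause both this error and the "Access denied" failure of ant import described earlier, so it can help to check them before running Ant. This is a minimal sketch; require_env is a hypothetical helper, and it only tests that each variable is set and non-empty in the current shell (the variables must also be exported for Ant to see them).

```shell
#!/bin/sh
# require_env (hypothetical helper): report every named variable that is
# unset or empty in the current shell; return 1 if any is missing.
require_env() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return $missing
}

# Example use, with the two variables named in this guide:
#   require_env CSPACE_JEESERVER_HOME DB_PASSWORD_CSPACE && ant undeploy
```

The full list of required variables is in the installation guide for your operating system; only two of them are shown here.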

Deployment errors due to restrictive filesystem permissions

In addition, depending on the permissions that your account has on the computer where you are installing CollectionSpace, operating system permissions could cause ant undeploy and/or ant deploy to fail. For example, on a Linux/Unix/Mac OS X system, if you used sudo to unpack the CollectionSpace tarballs in /usr/local/share, then the root user will own the $CSPACE_JEESERVER_HOME directory and its contents. If you then use sudo to run the mvn clean install -DskipTests command, root will own files in the source tree (even in /tmp) and in your Maven repository (even if it is in your home directory, e.g., in your .m2 repository). This causes the ant undeploy step to fail. Running ant with sudo is not a workaround, because your environment variables will not be in effect under sudo.

The solution is to use chown to change ownership of your $CSPACE_JEESERVER_HOME directory (e.g. sudo chown -R yourusername:groupname $CSPACE_JEESERVER_HOME). If you have already run mvn with sudo, you might also need to chown your local Maven repository folder (typically in $HOME/.m2) and the source in /tmp. Read the output from ant to see what has failed, check the permissions on the file it complains about, and chown accordingly.
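To find which files need a chown, a short sketch such as the following may help. The function name not_owned_by_me is hypothetical; it simply lists everything under a directory that the current user does not own.

```shell
#!/bin/sh
# not_owned_by_me (hypothetical helper): list files and directories under $1
# whose owner is not the user running this script.
not_owned_by_me() {
    find "$1" ! -user "$(id -un)" -print
}

# Example use on the locations mentioned above:
#   not_owned_by_me "$CSPACE_JEESERVER_HOME"
#   not_owned_by_me "$HOME/.m2"
```

An empty result means ownership is not the cause of the failure in that directory.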

Port conflicts starting up the CollectionSpace servers

By default, the process associated with the CollectionSpace servers listens for incoming connections on port 8180. If you try to start the servers when either:

  • another instance of the servers is already running; or
  • any other process on the server computer is listening on that same port

you will see error messages referring to this port conflict in Tomcat's main log file, $CSPACE_JEESERVER_HOME/logs/catalina.out.

In addition, requests sent to the Services layer via its REST APIs will fail with an HTTP status code of 500 (Internal Server Error). You may also see an org.jboss.resteasy.spi.UnhandledException or similar Exception in the body of the response.
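Before starting the servers, you can check whether something is already listening on the port. This is a sketch that assumes the netstat tool is installed and that its output marks listening sockets with the word LISTEN; port_is_listening is a hypothetical helper that reads netstat output from standard input.

```shell
#!/bin/sh
# port_is_listening (hypothetical helper): succeed if the netstat output on
# standard input contains a listening socket on port $1. Matches both the
# "address:port" and "address.port" styles.
port_is_listening() {
    grep -Eq "[.:]$1[[:space:]].*LISTEN"
}

# Example use:
#   netstat -an | port_is_listening 8180 && echo "port 8180 is already in use"
```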

Verifying that the databases were created successfully

PostgreSQL

At a command prompt, enter:

psql -U {dbusername}

where {dbusername} is the actual name of a PostgreSQL database user, such as postgres.

Then, enter the password for that user when prompted.

Next, at the postgres=# prompt, enter:

postgres=# \l

(This is a 'backslash' character, followed by a lowercase 'L' character)

Can you see databases named nuxeo and cspace?

If no:
Make sure you have run ant create_db -Drecreate_db=true and that it succeeded.

If that doesn't seem to be working, try either

mvn clean install

or find the init_db.sql script in the top-level /src directory of the services source code tree and run it from the command line.

If yes:

postgres=# \c cspace
postgres=# \d

Can you see any tables?

If no:
Make sure you have run ant create_db -Drecreate_db=true followed by ant import, and that both succeeded. (For more troubleshooting on "ant import", see above.)

If yes:

postgres=# select * from permissions;

Are there any rows in the table?
If yes:
Try shutting down and restarting the CollectionSpace servers. This might help the server notice that the permissions are set.
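The first step of this check can also be run non-interactively. The sketch below assumes the pipe-separated output of psql -lqt (database list, quiet, tuples only), with the database name in the first column; missing_databases is a hypothetical helper.

```shell
#!/bin/sh
# missing_databases (hypothetical helper): read `psql -lqt` output on
# standard input and print whichever of nuxeo and cspace is absent.
missing_databases() {
    names=$(cut -d '|' -f 1 | tr -d ' ')
    for db in nuxeo cspace; do
        printf '%s\n' "$names" | grep -qx "$db" || echo "$db"
    done
}

# Example use:
#   psql -U postgres -lqt | missing_databases
```

No output means both databases exist, and you can move on to checking the tables and the permissions rows as described above.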

MySQL

At a command prompt, enter:

mysql -u {dbusername} -p

where {dbusername} is the actual name of a MySQL database user.

Then, enter the password for that user when prompted.

Next, at the mysql> prompt, enter:

mysql> show databases;

Can you see databases named nuxeo and cspace?

If no:
Make sure you have run ant create_db -Drecreate_db=true and that it succeeded.

If that doesn't seem to be working, try either

mvn clean install

or find the init_db.sql script in the top-level /src directory of the services source code tree and run it from the command line.

If yes:

mysql> use cspace;
mysql> show tables;

Can you see any tables?

If no:
Make sure you have run ant create_db -Drecreate_db=true followed by ant import, and that both succeeded. (For more troubleshooting on "ant import", see above.)

If yes:

mysql> select * from permissions;

Are there any rows in the table?
If yes:
Try shutting down and restarting the CollectionSpace servers. This might help the server notice that the permissions are set.

Options missing from some dropdown menus

Symptom(s):

  • When clicking on some dropdown menus, you see only the default option (Please select a value or the equivalent). None of the other items in that menu appear.

Fix:

  • This is usually because the default vocabularies and the values of the terms in those vocabularies, used in various dropdown menus, are missing. To fix this, initialize the default vocabularies and term lists for your tenant. You can do this by following the steps in Initializing data. (These steps also appear under "Initializing default authorities and term lists" in the relevant installation guide.)

Error creating authority term records (Person, Organization, Storage Location ...)

Symptom(s):

  • Most functionality works, but you encounter errors when attempting to save terms in an authority, such as a Person, Organization or Storage Location term.
  • In the Services layer log file, $CSPACE_JEESERVER_HOME/logs/cspace-services.log, you may see messages similar to these (in the example of an error on attempting to save an Organization term record):

    2012-01-13 00:14:30,235 DEBUG [http-8180-8] [org.collectionspace.services.common.security.SecurityInterceptor:104] received POST on orgauthorities/urn:cspace:name(organization)/items
    ...
    2012-01-13 00:14:30,339 DEBUG [http-8180-8] [org.collectionspace.services.organization.OrgAuthorityResource:387] org.collectionspace.services.organization.OrgAuthorityResource
    org.collectionspace.services.common.document.DocumentNotFoundException: No document found matching filter params.
    

Fix:

  • Initialize the default authorities for your tenant, such as the default Person, Organization and Storage Location authorities.
    You can do this by following the steps in Initializing data. (These steps also appear under "Initializing default authorities and term lists" in the relevant installation guide.)

Troubleshooting issues in setting up CollectionSpace to use the PostgreSQL database

Selected services functionality is failing: requesting lists, searching, sorting (ordering)

Symptom(s):

  • Most services functionality works, but you encounter errors when performing certain tasks: requesting lists of records, searching and sorting (ordering) records. In the CollectionSpace user interface, you may observe that lists of records and search results always come back empty (i.e. with no records).
  • Automated tests in the services source tree that exercise that functionality are failing; e.g. readList, readPaginatedList, and any tests that begin with 'search' or 'sort'.
  • Error messages similar to org.postgresql.util.PSQLException. message: ERROR: operator does not exist: character varying = bigint appear in the relevant log file(s) in $CSPACE_JEESERVER_HOME/logs.

Fix:

  • Execute commands to add functions that cast integers and bigints to text. You will need to do this in the template1 database, and in any databases already created from that template (e.g. nuxeo, cspace). For example, you might run these from a .sql script file, as follows:
% psql -U postgres -f create-casts.sql

Where the file create-casts.sql looks like this:

\c template1
CREATE OR REPLACE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(integer) IS 'convert integer to text';

CREATE OR REPLACE FUNCTION pg_catalog.text(bigint) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int8out($1));';
CREATE CAST (bigint AS text) WITH FUNCTION pg_catalog.text(bigint) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(bigint) IS 'convert bigint to text';

\c nuxeo
CREATE OR REPLACE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(integer) IS 'convert integer to text';

CREATE OR REPLACE FUNCTION pg_catalog.text(bigint) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int8out($1));';
CREATE CAST (bigint AS text) WITH FUNCTION pg_catalog.text(bigint) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(bigint) IS 'convert bigint to text';

\c cspace
CREATE OR REPLACE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(integer) IS 'convert integer to text';

CREATE OR REPLACE FUNCTION pg_catalog.text(bigint) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int8out($1));';
CREATE CAST (bigint AS text) WITH FUNCTION pg_catalog.text(bigint) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(bigint) IS 'convert bigint to text';

If you encounter an error when running this command under Windows:

% psql -U postgres -f create-casts.sql

you can try copying and pasting in the contents of that SQL script, at the psql command prompt.
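Because the same six statements are repeated for each database, the script file can be generated rather than typed. This is a sketch only; emit_casts is a hypothetical helper, and the SQL it prints is copied from the listing above.

```shell
#!/bin/sh
# emit_casts (hypothetical helper): for each database name given, print a
# \c line followed by the cast definitions shown in this section.
emit_casts() {
    for db in "$@"; do
        printf '\\c %s\n' "$db"
        # The quoted delimiter keeps $1 in the SQL from being expanded.
        cat <<'SQL'
CREATE OR REPLACE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(integer) IS 'convert integer to text';

CREATE OR REPLACE FUNCTION pg_catalog.text(bigint) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int8out($1));';
CREATE CAST (bigint AS text) WITH FUNCTION pg_catalog.text(bigint) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(bigint) IS 'convert bigint to text';

SQL
    done
}

# Example use:
#   emit_casts template1 nuxeo cspace > create-casts.sql
#   psql -U postgres -f create-casts.sql
```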

'new encoding (UTF8) is incompatible' error

Symptom(s):

  • The error message new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII) appears when you run ant create_db -Drecreate_db=true to set up databases. This may occur on Unix-like operating systems - such as some recent Ubuntu Linux releases - where the locale value for the postgres system user is set to a value that does not end in UTF-8 or utf-8.

Fix:
Although this is time-consuming, the following process may help avoid errors you may encounter by attempting other approaches:

  • Uninstall your PostgreSQL packages, possibly by using your package manager's uninstall (or equivalent) command.
  • Set the locale value for the postgres system user to any appropriate value ending in UTF-8 or utf-8:
    • At your shell prompt, type locale -a to view a list of available values for locale settings.
    • Use your operating system-specific method to set the locale value for the postgres user to an appropriate item in that list, representing your preferred language with UTF-8 support. For example, in Ubuntu, execute update-locale localename as the postgres user: sudo -u postgres update-locale LANG=en_US.UTF-8, which sets the locale value to US English with UTF-8 support.
  • Reinstall PostgreSQL.

Counterpoint: The above procedure did not work on Ubuntu 10.04 LTS. There, it was necessary to update template1 by following the instructions at http://wiki.postgresql.org/wiki/Adventures_in_PostgreSQL%2C_Episode_1, with the following change:

create database template1 with template = template0 encoding = 'UTF8';
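Before choosing either approach, it may be worth confirming whether the locale value really is the problem. The sketch below uses a hypothetical helper, is_utf8_locale. In addition to values ending in UTF-8 or utf-8, it accepts the spelling utf8, which locale -a prints on some Linux systems; that extra case is an assumption, not something this guide states.

```shell
#!/bin/sh
# is_utf8_locale (hypothetical helper): succeed if the locale name in $1
# ends in .UTF-8, .utf-8 or .utf8 (compared case-insensitively).
is_utf8_locale() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        *.utf-8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use, run as the postgres system user:
#   is_utf8_locale "$LANG" || echo "LANG=$LANG is not a UTF-8 locale"
```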

'no pg_hba.conf entry for host "::1"' errors

Symptom(s):

  • The error message psql: FATAL: no pg_hba.conf entry for host "::1", user "postgres", database "template1", SSL off appears when you start the PostgreSQL database server.

Fix:

We believe you may encounter this issue on systems on which IPv6 is enabled. On such systems, the pg_hba.conf configuration file needs one or more uncommented entries for IPv6. Currently, this pertains to the default configurations of Windows 7, Windows Vista, and Windows Server 2008 (including R2). As IPv6 is more widely adopted over time, more systems will have IPv6 enabled by default.
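The IPv6 loopback entry in pg_hba.conf conventionally takes the form host all all ::1/128 md5, where the md5 method is an assumption chosen to match the rest of this guide. As a sketch, the following hypothetical helper reports whether a given pg_hba.conf already has an uncommented entry of that kind.

```shell
#!/bin/sh
# has_ipv6_loopback_entry (hypothetical helper): succeed if the file in $1
# contains an uncommented "host" line for the IPv6 loopback address ::1/128.
has_ipv6_loopback_entry() {
    grep -Eq '^[[:space:]]*host[[:space:]].*::1/128' "$1"
}

# Example use (the path to pg_hba.conf varies by system):
#   has_ipv6_loopback_entry /path/to/pg_hba.conf || echo "no IPv6 entry found"
```

After adding or uncommenting an entry, restart the PostgreSQL server so that the change takes effect.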

'language "plpgsql" does not exist' errors

Symptom(s):

  • The error message org.postgresql.util.PSQLException: ERROR: language "plpgsql" does not exist appears when you run ant create_db -Drecreate_db=true to set up databases.
  • This error message also appears when you start the Nuxeo server.
  • This error is most likely to be seen with PostgreSQL versions prior to 9.0. Newer versions of PostgreSQL include the 'plpgsql' language by default.

Fix:

  • Execute commands in the template1 database, and in any databases already created from that template (e.g. nuxeo, cspace), to add the plpgsql procedural language. For example, you might run these from a .sql script file, as follows:
% psql -U postgres -f add-plpgsql.sql

Where the file add-plpgsql.sql looks like this:

\c template1
CREATE LANGUAGE plpgsql;

\c nuxeo
CREATE LANGUAGE plpgsql;

\c cspace
CREATE LANGUAGE plpgsql;

'HeuristicMixedException' errors

Symptom(s):

  • Error messages containing javax.transaction.HeuristicMixedException appear when you start the CollectionSpace servers. (Other startup error messages containing the word heuristic may also appear.)

Fix:

  • Edit a PostgreSQL configuration file and set max_prepared_transactions to a non-zero value. Specifically:
    • Edit the file postgresql.conf. Depending on your version of PostgreSQL and operating system, you can find this file in a PostgreSQL configuration directory or data directory; e.g. in /var/lib/pgsql/data on some Linux and Unix-like systems, in /etc/postgresql/{version}/main on Debian/Ubuntu systems; and on some Mac OS X systems, in /Library/PostgreSQL/{version} where {version} is the PostgreSQL version, such as 9.1.
    • Uncomment the following line (if commented out), and set its value as recommended by Nuxeo:

      max_prepared_transactions = 64
      
    • Restart the PostgreSQL server. E.g. on a RedHat/Fedora/CentOS Linux system via sudo service postgresql restart.
    • If the server fails to start up after making this change (see PostgreSQL server fails to start or restart), reduce the value to max_prepared_transactions = 32, then try restarting again.
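To see what value is currently in effect in the file, a sketch like the following can be used. The function name current_max_prepared is hypothetical; it prints the value from the last uncommented max_prepared_transactions line and prints nothing if the setting is commented out or absent.

```shell
#!/bin/sh
# current_max_prepared (hypothetical helper): print the numeric value of the
# last uncommented max_prepared_transactions line in the file given as $1.
current_max_prepared() {
    sed -n 's/^[[:space:]]*max_prepared_transactions[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        tail -n 1
}

# Example use (the path to postgresql.conf varies by system):
#   current_max_prepared /path/to/postgresql.conf
```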

psql client 'Cannot read termcap database; using dumb terminal settings' errors

Symptom(s):

  • The error message Cannot read termcap database; using dumb terminal settings appears after login when using the psql command-line client for PostgreSQL. In some cases, you may also see an Aborted message, indicating that psql has unexpectedly quit.

Fix:

  • As a workaround, start psql via psql -n. This will disable readline mode, and allow you to use psql. (As a side effect, running psql via this workaround will disable command auto-completion and some special keystrokes; e.g. Control-C to cancel the current query.)
  • As a more comprehensive fix, identify and install - or appropriately link psql to - the requisite system library or libraries. (You may need to do some investigation about this in PostgreSQL's documentation or forums, or on other Internet sites.)

'Console code page (...) differs from Windows code page (...)' error

Symptom(s):

  • The error message WARNING: Console code page (...) differs from Windows code page (...) ... 8-bit characters might not work correctly appears, possibly while running the psql command-line client for PostgreSQL. (The numbers that appear in one or both of the parenthetical groups in the error message may vary, depending on language. One example: Console code page (437) differs from Windows code page (1252).)

Fix:

  • This issue is of limited scope and does not directly affect the use of PostgreSQL with a CollectionSpace system. It affects only the use of 8-bit characters within psql.
  • For steps to resolve it, see the "Notes for Windows Users" section of the psql manual page, such as this psql page for PostgreSQL 9.1.

'role "reader" already exists' error

Symptom(s):

  • The error message Failed to execute: CREATE ROLE reader WITH PASSWORD 'read' LOGIN ... org.postgresql.util.PSQLException: ERROR: role "reader" already exists appears when you run ant create_db -Drecreate_db=true to set up databases.

Fix:

  • Delete the existing role. (It will be created again the next time you run ant create_db -Drecreate_db=true)

    % psql -U postgres -c "DROP ROLE reader"

'Ident authentication failed for user "postgres"' error

Symptom(s):

  • The error message org.postgresql.util.PSQLException: FATAL: Ident authentication failed for user "postgres" appears when you run ant create_db -Drecreate_db=true to set up databases.

Fix:

  • Edit a PostgreSQL configuration file to change how the postgres user authenticates to the database, from ident (which requires that you are logged in or acting as your operating system's postgres user) to md5, which requires a password. Specifically:
    • Edit the file pg_hba.conf. Depending on your version of PostgreSQL and operating system, you can find this file in a PostgreSQL configuration directory or data directory; e.g. in /var/lib/pgsql/data for PostgreSQL 8.4 on some Linux and Unix-like systems, and in /etc/postgresql/{version}/main on Debian/Ubuntu systems, where {version} is the PostgreSQL version, such as 9.1.
    • In this file, make sure that the postgres user, or all users, if the postgres user is not individually listed, have an authentication METHOD of md5, rather than ident.

      # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
      
      # "local" is for Unix domain socket connections only
      local   all         all                               md5
      # IPv4 local connections:
      host    all         all         127.0.0.1/32          md5
      
    • As an alternative, you can specify these privileges individually for each user, which provides granular control over who can connect, from where, and to what database. This is particularly advisable for connections over IPv4 and IPv6 networks; e.g.:

      # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
      
      # "local" is for Unix domain socket connections only
      local   all         all                               md5
      # IPv4 local connections:
      host    all         postgres    samehost              md5
      host    jbossdb     jboss       samehost              md5
      host    nuxeo       nuxeo       samehost              md5
      host    cspace      cspace      samehost              md5
      

      (In PostgreSQL versions prior to 9.0, you can instead specify 127.0.0.1/32 for the CIDR-ADDRESS, in place of samehost.)

    • Restart the PostgreSQL server. E.g. on a RedHat/Fedora/CentOS Linux system via sudo service postgresql restart.
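To locate the entries that still need changing, the following sketch lists the uncommented lines of pg_hba.conf whose last field is ident. The function name ident_entries is hypothetical, and lines that carry extra options after the method will not be matched.

```shell
#!/bin/sh
# ident_entries (hypothetical helper): print the uncommented, non-empty
# lines of the file in $1 whose authentication METHOD (last field) is ident.
ident_entries() {
    awk 'NF > 0 && $1 !~ /^#/ && $NF == "ident"' "$1"
}

# Example use (the path to pg_hba.conf varies by system):
#   ident_entries /path/to/pg_hba.conf
```

If the command prints nothing, no plain ident entries remain in the file.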

'Class Not Found: JDBC driver org.postgresql.Driver could not be loaded' error

Symptom(s):

  • This error message appears when you run ant create_db -Drecreate_db=true to set up databases.

Fix:

  • Verify that the value of the CSPACE_JEESERVER_HOME environment variable is set to the filesystem path to your Tomcat folder. This value is used by the ant create_db target to find the directory where the PostgreSQL driver file is located. On most Unix/Linux systems, you can verify this by typing echo $CSPACE_JEESERVER_HOME at a Terminal prompt.
  • For newer CollectionSpace systems with a PostgreSQL 9.1 driver:
    • Verify that the PostgreSQL 9.1-901 JDBC 4 driver file, postgresql-9.1-901.jdbc4.jar, is present in CSPACE_JEESERVER_HOME/cspace/services/db/jdbc_drivers.
    • If it is not:
  • For older CollectionSpace systems with a PostgreSQL 8.4 driver:
    • Verify that the PostgreSQL 8.4-702 JDBC 4 driver file, postgresql-8.4-702.jdbc4.jar, is present in CSPACE_JEESERVER_HOME/lib.
    • If it is not:

PostgreSQL server fails to start or restart

Symptom(s):

  • The PostgreSQL server is not running. Attempts to start or restart the server are failing.

Fix:

  • Check the PostgreSQL startup log file for more details. Specifically:
  • View the file pgstartup.log to identify the cause of the startup failure. Depending on your version of PostgreSQL and operating system, you can find this file in a PostgreSQL directory; e.g. in /var/lib/pgsql for PostgreSQL 8.4 on some Linux and Unix-like systems.
  • Try temporarily starting PostgreSQL directly via the pg_ctl command, rather than via your operating system's mechanism for starting services (such as via service postgresql start on some Linux systems). Temporarily starting PostgreSQL via pg_ctl may potentially provide better startup error messages.

Some common issues that may prevent PostgreSQL from starting up include:

  • An out of memory error, due to settings (such as the value of max_prepared_transactions) in its configuration file, postgresql.conf, or due to a lack of free system memory overall.
  • A misconfiguration in any of PostgreSQL's other configuration files. (In general, you might first check for PostgreSQL configuration files that have been recently modified.)
  • Incorrect filesystem ownership or access privileges for various files in PostgreSQL's directories.
  • An incomplete or corrupted installation of the PostgreSQL server.

For further reference

Several of these errors and their solutions are described in the Nuxeo document Configuring PostgreSQL. That document also provides additional details on how to "tune" PostgreSQL for better performance on your system.

Other issues not listed here

If you encounter issues not listed here, you can ask for assistance on the CollectionSpace Talk mailing list or in the Forums on the CollectionSpace website.

You may also find that your issue has been discussed in the archives of the CollectionSpace Talk list or in an existing bug report.