PostgreSQL Installation under Linux

Table of Content


Installing the PostgreSQL Server

We recommend that you install the latest available, stable version of PostgreSQL 12. See also the PostgreSQL-related recommendations from Nuxeo, the vendor of the content management system used by CollectionSpace.

Apt Package Manager (Debian and Ubuntu)

Ubuntu 22.04

Instructions based on the Postgresql wiki

echo # Set up the postgres repository
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

echo # Install the postgres repository certificates
sudo apt install curl ca-certificates gnupg
curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

echo # Install postgres
sudo apt update
sudo apt install postgresql-12 postgresql-contrib-12


Setting up the PostgreSQL Server

Changes to PostgreSQL Configuration Files

After installing the PostgreSQL server, a system user named postgres should have automatically been created.

In your Linux shell, change/switch to this user:

sudo su - postgres

Changes to pg_hba.conf File

Next, edit the PostgreSQL pg_hba.conf configuration file using a tool like vi or vim. The location differs between Linux flavors, e.g.:

  • Debian/Ubuntu: /etc/postgresql/12/main/pg_hba.conf

So, in Ubuntu 22.04, for example, open it up with your preferred text editor:

vim /etc/postgresql/12/main/pg_hba.conf

Towards the end of the file, add (or modify) the following lines of text to the section starting with something like "# IPv4 local connections:":

# IPv4 local connections:
host    all             all         samehost                md5

Next, make sure entries under "IPv6 local connections" are disabled by adding the '#' character to beginning of this line:

# IPv6 local connections:
#host    all             all             ::1/128                md5

Save the changes and quit your editor.

Changes to postgresql.conf File

Finally, set the max_prepared_transactions property in the main PostgreSQL config file named postgresql.conf. The file's location differs between Linux flavors, e.g.:

  • Debian/Ubuntu: /etc/postgresql/12/main/postgresql.conf

So, in Ubuntu 22.04, for example, open it with your favorite text editor

vim /etc/postgresql/12/main/postgresql.conf

then make the following changes to that file:

max_prepared_transactions = 64

The max_prepared_transactions property might be commented out with the '#' character. If you see the '#' character at the beginning of this line, remove it.

A max_prepared_transactions variable setting of 64 might be too memory consuming for your system. If this value is indeed too high for your system, you'll get an error when restarting the PostgreSQL server. Something like:

Restarting PostgreSQL 12 database server: main The PostgreSQL server failed to start. Please check the log output ... FATAL: could not create shared memory segment: Invalid argument 
...
failed!

Lowering this setting to 32 should also be fine, but you might need to adjust based on your system.

Save your changes and quit.

Log out of the 'postgres' account (use the shell 'exit' command to return to your normal linux account) and restart your PostgreSQL service:

exit
sudo systemctl restart postgresql

Set up datatype 'casts'

Now we need to create some datatype casts that CollectionSpace uses. While logged in as the postgres user, enter the psql shell:

sudo su - postgres
psql -U postgres -d template1

then enter the following commands into the psql shell:

CREATE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(integer) IS 'convert integer to text';
CREATE FUNCTION pg_catalog.text(bigint) RETURNS text STRICT IMMUTABLE LANGUAGE SQL AS 'SELECT textin(int8out($1));';
CREATE CAST (bigint AS text) WITH FUNCTION pg_catalog.text(bigint) AS IMPLICIT;
COMMENT ON FUNCTION pg_catalog.text(bigint) IS 'convert bigint to text';

Remain in the psql shell and continue to the next section.

Create the csadmin User

At psql's command prompt (ending in #), switch to the postgres database from the template1 database by entering the following (starting with a backslash character as shown):

\c postgres

In response, you should see a message similar to this:  You are now connected to database "postgres" as user "postgres".

 Enter the following command to create a csadmin user/role with appropriate privileges.  (Be sure to substitute a password of your choosing for replacemewithyourpassword below:

CREATE ROLE csadmin LOGIN PASSWORD 'replacemewithyourpassword' SUPERUSER INHERIT NOCREATEDB NOCREATEROLE NOREPLICATION;

In response, you should see a message similar to this: CREATE ROLE 

Quit the psql program by typing \q (or press ctrl+d).

Log out of the 'postgres' Linux account (the 'exit' command returns you to your normal linux account) and then restart the PostgreSQL server:

exit
sudo systemctl restart postgresql

Verify that you are able to connect to the PostgreSQL server using the csadmin account username and password you just added:

psql -U csadmin -h localhost -d postgres

If successful, you should be in the psql shell and see something like this:

psql (12.18 (Ubuntu 12.18-1.pgdg22.04+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=# 

If you think you've encountered an error, send an email to the CollectionSpace "Talk" list at talk@collectionspace.org.  Include any error messages you've seen and give a detailed description of what steps you've already performed.

Next, exit the psql shell by typing \q (or press ctrl+d).

You're finished with the PostgreSQL setup. 



Database Tuning

Tuning the database is OPTIONAL! CollectionSpace will run even if none of the steps below are carried out. They are merely steps to tweak the database and improve performance when running with CollectionSpace (Nuxeo)

The below is a condensed version of the instructions on Nuxeo's Configuring PostgreSQL page. If you want more details than are given below, please refer to that page.

More generically, the PostgreSQL wiki page on Tuning Your PostgreSQL Server may also be helpful.


Tuning takes place in postgresql.conf - the location differ among OS' flavours. Some examples are:

  • Debian/Ubuntu: /etc/postgresql/12/main/postgresql.conf

For example, for Debian/Ubuntu:

vim /etc/postgresql/12/main/postgresql.conf


Edit the following value in that file:

shared_buffers = 512MB

Now on the command line, set the kernel.shmmax value.. This value should at least be shared_buffer + 128MB

sysctl kernel.shmmax=650000000

If you get an error, you need to up the value of kernel.shmax

You need to register the final value of shmmax:

Open /etc/sysctl.conf

vim /etc/sysctl.conf


and add the following line - replacing the value with the value you set on for shmmax:

kernel.shmmax = 650000000

finally set effective cache_size:

Do this by running (and printing stats in megabytes):

free -m

This will give you an output like so:

free -m
             total       used       free     shared    buffers     cached
Mem:          4096       3302        793          0       253        1226
-/+ buffers/cache:       1822       2273
Swap:          511        216        295


find your effective cache_size by adding the number under buffers and cached. In the above example:
253+1226=1479

To this number you add the shared_buffers value .. so we get a total of:
1479+512=1991 ~ 2GB

again edit postgres conf

vim /etc/postgresql/12/main/postgresql.conf

and set effective cache size to the value you found in the previous steps.

effective_cache_size = 2GB


Also set the following values:

work_mem = 16MB                         # min 64kB
maintenance_work_mem = 32MB             # min 1MB
random_page_cost = 2.0


restart for changes to take effect

Debian/Ubuntu:

sudo systemctl restart postgresql

Report Authoring

If you want to run reports on you CollectionSpace instance, you will need to make changes to your "pg_hba.conf".

For example in Ubuntu you would open it in your favorite text editor:

vim /etc/postgresql/12/main/pg_hba.conf


If you are going to allow report authoring on the local machine, then you need to add this entry as well:

host      nuxeo      reader     samehost      md5


If you are going to allow remote report authoring - authoring reports while connecting to the database from remote hosts - then you need to add an entry much like the following, but with your own subnet address entered in place of the example address below. (The example below would allow remote report authoring from hosts on a specific subnet of the UC Berkeley network. See also the documentation for pg_hba.conf for your PostgreSQL release for more details on how to specify a subnet address.) Don't use '169.229.0.0'! This is just an example IP address.

host 	nuxeo 	reader 	169.229.0.0/16 	md5


Note that for remote report authoring to work, you will also have to make another change within a second configuration file, to the {{listen_addresses}} value in the postgresql.conf file. If you can control access to PostgreSQL via a firewall, you can use a value of "*" for simplicity,

listen_addresses = '*' # what IP address(es) to listen on; 

or for more security, you can list the individual IP addresses from which report authors will connect.