Tuning CollectionSpace for Performance

This document discusses some of the configuration changes ("tuning") you can make to a CollectionSpace system that have the potential to enhance its performance.

Changes of this type are best made by an experienced system administrator or implementer, or by a software developer. Typically, you change a single configuration option or a closely related set of options, then carefully measure performance for selected operations, to determine whether the change had the desired impact without introducing unacceptable tradeoffs.
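One way to approach the "change one thing, then measure" cycle is with a small timing harness. The following is a minimal sketch in Python, not a CollectionSpace tool: the function names are hypothetical, and sample_operation is a stand-in that you would replace with a real operation, such as a search request sent to your own server.

```python
import statistics
import time


def time_operation(operation, repeats=5):
    """Run `operation` several times and return the elapsed seconds per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce raw timings to a few numbers that are easy to compare."""
    return {
        "runs": len(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


def sample_operation():
    # Hypothetical stand-in for a real operation (an assumption for this
    # sketch); replace it with a call that exercises your own system.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    # Run once before a configuration change and once after, then compare.
    print(summarize(time_operation(sample_operation)))
```

Comparing the median before and after a change, rather than a single run, reduces the influence of one unusually slow or fast request.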

Changes that can improve performance in the browser

The following changes, made via build or configuration changes on the server, have the potential to significantly improve CollectionSpace's performance in users' web browsers.

Minifying and concatenating UI artifacts

This document describes how to minify (reduce in size by stripping whitespace, etc.) and concatenate (combine into fewer files) JavaScript scripts and CSS stylesheets within CollectionSpace's User Interface layer, so these can be downloaded faster by browsers:

/wiki/spaces/collectionspace/pages/666273358

To follow those instructions, you'll first need to check out the UI layer source code.
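To illustrate what minifying and concatenating mean in practice, here is a deliberately naive sketch in Python. It is not the CollectionSpace build process described in the linked instructions, and the function names are hypothetical. It only strips comments and whitespace from CSS and joins stylesheets into one string; a real minifier handles many cases this sketch ignores, such as punctuation inside quoted strings.

```python
import re


def minify_css(css):
    """Naively shrink CSS: drop comments and collapse whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                        # collapse whitespace runs
    css = re.sub(r"\s*([{};])\s*", r"\1", css)            # trim around { } ;
    return css.strip()


def concatenate(sources):
    """Combine several stylesheets into one string, in the order given."""
    return "\n".join(sources)


# Two small, made-up stylesheets standing in for separate files.
css_a = "/* header */\nbody {\n  margin: 0;\n}\n"
css_b = "p {\n  color: red;\n}\n"

# One bundle means one download instead of two, and fewer bytes overall.
bundle = minify_css(concatenate([css_a, css_b]))

if __name__ == "__main__":
    print(len(css_a) + len(css_b), "bytes before;", len(bundle), "bytes after")
```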

Caching UI artifacts

These documents describe two complementary techniques through which you can improve CollectionSpace performance by configuring browsers to locally store ("cache") copies of frequently retrieved items, such as HTML pages, JavaScript scripts, CSS stylesheets, and images:

Enabling GZIP compression

This JIRA issue describes how you can improve CollectionSpace performance by configuring Tomcat to perform GZIP compression of data transmitted between the server and browser:

CSPACE-4998: implement gzip compression to static files served from tomcat
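To get a rough sense of how much GZIP compression can save on text-based files, the following Python sketch compresses a sample payload and reports the sizes. It does not configure Tomcat and is not taken from the JIRA issue; the payload is a made-up, repetitive snippet standing in for a JavaScript file, and real savings depend on your actual files.

```python
import gzip


def compression_report(payload: bytes) -> dict:
    """Compress `payload` with gzip and report sizes before and after."""
    compressed = gzip.compress(payload)
    ratio = len(compressed) / len(payload) if payload else 0.0
    return {
        "original_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "ratio": ratio,
    }


# Hypothetical stand-in for a static text file served to browsers.
sample = b"function setLabel(node, text) { node.textContent = text; }\n" * 200

if __name__ == "__main__":
    print(compression_report(sample))
```

Text formats such as JavaScript, CSS, and HTML usually compress well, while already-compressed formats such as JPEG images gain little.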

Changes that can improve performance in the server

The following changes, also made via build or configuration changes on the server, have the potential to significantly improve the responsiveness of CollectionSpace's Services layer: the speed at which that layer can return data in response to requests.

Configuring the database to support your large dataset

If your CollectionSpace system contains many tens of thousands of records, or even hundreds of thousands or millions, you may find it advantageous to "tune" your database to support your large dataset. When estimating the size of your dataset, count all record types in your system, including relationships between records, which are themselves stored as records.

These documents include a section on tuning PostgreSQL:

This document includes a section on how to configure MySQL to support large datasets:

Installing and configuring MySQL

Designating fields as "prefetched"

Fields whose values are routinely returned in lists and in search results should be designated as "prefetched," to enhance retrieval speed. These are the fields whose values appear in places such as:

  • Search results on the Find and Edit page
  • The right sidebar on Cataloging and procedural record pages
  • Search results within the dialog for relating records

In general, a CollectionSpace system comes with most or all of these fields already designated as "prefetched." In case you might need to selectively configure these settings, this document describes how to do so:

How to designate fields as "prefetched"

Currently, due to a technical limitation, multivalued fields and fields in multivalued groups (e.g. the Title field in Cataloging records) can't be designated as prefetched. This may no longer be true as of CollectionSpace version 2.0.

Designating selected fields as indexed

Fields on which CollectionSpace will frequently be performing database lookups - retrieving specific records by their values - may be candidates for indexing. In some selected cases, adding indexes will improve the performance of those database lookups, and thus overall system performance. (There are also tradeoffs when adding indexes, relating to memory use and availability, speed of creating and updating records, and so on; and in some instances, lookups using indexes can actually perform more slowly than sequential lookups.)

In general, a CollectionSpace system comes with a minimal, yet sensible, set of fields already designated as indexed. Beyond this, a software developer, database administrator, or experienced implementer will typically need to look at the types of queries performed against the database, and at the performance of those queries both with and without indexes, to identify which additional fields, if any, might benefit from being indexed. (There are also database-specific configuration settings that you may need to tune, to ensure that the database uses those indexes appropriately when performing queries.)
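The idea of comparing a query with and without an index can be sketched as follows. This example uses Python's built-in sqlite3 module purely as a stand-in, because it runs without any setup; CollectionSpace itself uses PostgreSQL or MySQL, whose EXPLAIN output differs. The table, column, and index names are hypothetical and do not reflect the CollectionSpace schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (id INTEGER PRIMARY KEY, object_number TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO objects (object_number, title) VALUES (?, ?)",
    [(f"2024.{i}", f"Object {i}") for i in range(1000)],
)

lookup = "SELECT title FROM objects WHERE object_number = ?"

# Without an index, the database must scan the whole table.
plan_before = query_plan(conn, lookup, ("2024.500",))

# With an index on the lookup column, it can search the index instead.
conn.execute("CREATE INDEX idx_objects_number ON objects (object_number)")
plan_after = query_plan(conn, lookup, ("2024.500",))

if __name__ == "__main__":
    print("before:", plan_before)
    print("after: ", plan_after)
```

On a production database, the equivalent step is to run the database's own EXPLAIN facility on representative queries, then measure actual query times before deciding to keep an index.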

CollectionSpace provides two mechanisms, independent of the configuration settings for any database type or instance, that you can use to ensure that your designated fields are indexed:

Designating database fields to be indexed