Django web application data source considerations

Question: What data source(s) should be used for the Django-based web applications?

Accessing the CollectionSpace database directly (whether through the services layer, direct Postgres queries, NXQL, or CMIS)

Advantages

  • The web applications would be accessing real-time data.

Disadvantages

  • Possible impact on performance of CollectionSpace
  • Performance of public-facing web applications might be too slow

Accessing a snapshot or index of the needed data (whether on the same Postgres server or another one)

Advantages

  • Performance can be tuned for the web applications' own queries, independently of CollectionSpace
  • We have the option of hosting these data on a different server

Disadvantages

  • Data would not be real-time. This can be mitigated by frequent (daily) updates, and we could build the capability to refresh the cached data on demand.
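
To make the snapshot option more concrete, here is a minimal sketch of a cache that reloads itself once it is older than a set age and can also be refreshed on demand. The class name `SnapshotCache`, the `loader` callable, and the 24-hour default are assumptions made for illustration; nothing here is an existing CollectionSpace or Django API, and a real deployment might instead rebuild a Postgres table on a schedule.

```python
import time
from typing import Any, Callable, Optional


class SnapshotCache:
    """Holds a copy of the data produced by `loader` (hypothetical) and reloads it when stale."""

    def __init__(
        self,
        loader: Callable[[], Any],
        max_age_seconds: float = 24 * 60 * 60,  # assumed daily refresh
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._data: Any = None
        self._loaded_at: Optional[float] = None

    def refresh(self) -> None:
        """On-demand update: reload the snapshot now, regardless of its age."""
        self._data = self._loader()
        self._loaded_at = self._clock()

    def get(self) -> Any:
        """Return the snapshot, reloading first if it is missing or too old."""
        if self._loaded_at is None or self._clock() - self._loaded_at >= self._max_age:
            self.refresh()
        return self._data
```

In this sketch the web application would call `get()` on every request and only pay the cost of the slow source query when the snapshot has expired; an admin action could call `refresh()` after a large batch of edits in CollectionSpace.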

Process

First, let's determine whether the performance of a typical query against the live database is sufficient.  One caveat: we know that database performance on UCJEPS needs to be addressed; queries generally take far too long to run.  Even so, it seems unlikely that direct access will be fast enough.  A query that gets the Darwin Core fields for a set of 53 objects (where label header like '%REEF POINT%') takes 18 seconds uncached.  ... where fieldlocstate='MA' takes 274 seconds.
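
Timings like these are easier to compare if they are collected the same way each time. The following is a rough sketch of a timing helper that works with any Python DB-API connection. It uses an in-memory SQLite database as a stand-in so that it runs anywhere; the table name `specimens`, the column name `labelheader`, and the sample rows are hypothetical and do not reflect the real UCJEPS schema. Against Postgres, a driver such as psycopg2 would use `%s` placeholders instead of `?`.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run one query on a DB-API connection; return (elapsed_seconds, row_count)."""
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute(sql, params)
    rows = cur.fetchall()  # include fetch time: the web app needs the rows themselves
    elapsed = time.perf_counter() - start
    cur.close()
    return elapsed, len(rows)


# Stand-in data; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimens (labelheader TEXT, fieldlocstate TEXT)")
conn.executemany(
    "INSERT INTO specimens VALUES (?, ?)",
    [
        ("FLORA OF REEF POINT", "CA"),
        ("PLANTS OF CAPE COD", "MA"),
        ("PLANTS OF NANTUCKET", "MA"),
    ],
)

queries = [
    ("label header", "SELECT * FROM specimens WHERE labelheader LIKE ?", ("%REEF POINT%",)),
    ("state", "SELECT * FROM specimens WHERE fieldlocstate = ?", ("MA",)),
]
for label, sql, params in queries:
    elapsed, count = time_query(conn, sql, params)
    print(f"{label}: {count} rows in {elapsed:.4f} s")
```

Running each query twice and recording both results would separate the uncached time from the cached time, which is the distinction the figures above depend on.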