Document Repository SPI

The CollectionSpace services layer uses a 3rd party document management repository to manage documents for its services, as shown in Deploying the CollectionSpace Service Layer. For example, Nuxeo is used for document management in CollectionSpace.

This page describes a client-side framework for document management in the CollectionSpace services layer. The framework hides vendor-specific implementations and encapsulates the boilerplate code required to interact with the 3rd party repository services. It is modeled as a service provider interface (SPI), as shown in the diagram below, into which repository providers can be plugged based on configuration.

The framework has the following major interfaces:

  1. Repository Client
  2. Repository Client Factory
  3. Document Handler
  4. Document Handler Factory
  5. Document Wrapper

The following sections describe the purpose of these interfaces. They are followed by the configuration required to initialize the repository framework and an explanation of how each service can use a repository client. The page ends with a description of how the framework is used at runtime.

The source code is organized into the common repository layer, the Nuxeo-specific client code, and the CollectionObject service-specific handlers.

Repository Client

The Repository Client is the interface used by CollectionSpace services to interact with a 3rd party document repository such as Nuxeo. It hides details about the protocols and APIs used to connect with the repository. It has methods for various document management operations including the CRUD operations (create, get, update, delete) as well as index (getall) and search (TBD) operations.

A repository client uses the service context to retrieve service-specific metadata (e.g. domain, workspace) in order to complete the request.

CollectionSpace provides two types of client for interfacing with the Nuxeo repository:

  1. RepositoryJavaClient uses Nuxeo Java Remote APIs
  2. RepositoryRESTClient uses Nuxeo REST APIs

The source can be found in RepositoryClient.java. Subsequent sections show how the repository client is used by a CollectionSpace service.

The sequence diagram at the end of this page shows how a RepositoryClient is used in conjunction with a DocumentHandler and a DocumentWrapper.
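To make the shape of such a client concrete, here is a minimal sketch of a repository client contract with an in-memory provider. All names here (RepositoryClientSketch, Client, InMemoryClient) and the method signatures are assumptions made for illustration. The real RepositoryClient takes a service context and a document handler rather than plain strings, and a real provider would call Nuxeo instead of a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RepositoryClientSketch {

    /** Hypothetical, simplified contract covering the CRUD and index operations. */
    interface Client {
        String create(String workspace, String document);   // returns the new document id
        String get(String workspace, String id);
        void update(String workspace, String id, String document);
        void delete(String workspace, String id);
        List<String> getAll(String workspace);               // index: ids of all documents
    }

    /** In-memory provider standing in for a vendor-specific client. */
    static final class InMemoryClient implements Client {
        private final Map<String, Map<String, String>> store = new LinkedHashMap<>();
        private int nextId = 1;

        private Map<String, String> docs(String workspace) {
            return store.computeIfAbsent(workspace, w -> new LinkedHashMap<>());
        }

        @Override public String create(String workspace, String document) {
            String id = "doc-" + nextId++;
            docs(workspace).put(id, document);
            return id;
        }

        @Override public String get(String workspace, String id) {
            String document = docs(workspace).get(id);
            if (document == null) {
                throw new IllegalArgumentException("No such document: " + id);
            }
            return document;
        }

        @Override public void update(String workspace, String id, String document) {
            get(workspace, id);                 // fails if the document does not exist
            docs(workspace).put(id, document);
        }

        @Override public void delete(String workspace, String id) {
            if (docs(workspace).remove(id) == null) {
                throw new IllegalArgumentException("No such document: " + id);
            }
        }

        @Override public List<String> getAll(String workspace) {
            return new ArrayList<>(docs(workspace).keySet());
        }
    }
}
```

Because callers depend only on the Client interface, a different provider can be substituted without changing service code, which is the point of modeling the framework as an SPI.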

RepositoryClientFactory

The repository client factory creates a repository client based on the /wiki/spaces/collectionspace/pages/666274459. It is assumed that a CollectionSpace service uses one and only one type of repository.

Limitation: The repository client factory is a singleton in CollectionSpace (v 0.3), which means that all services use the same repository. The repository used by the service layer is configured in the /wiki/spaces/collectionspace/pages/666274459. This limitation is expected to be removed in the future (TBD), when different services use different types of repositories.
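The following is a minimal sketch of a singleton factory that instantiates a configured client class by reflection and then reuses it. The names (RepositoryClientFactorySketch, Client, DemoClient, Factory) and the getClient signature are hypothetical and are not taken from the CollectionSpace source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RepositoryClientFactorySketch {

    /** Hypothetical contract for pluggable providers. */
    interface Client {
        String name();
    }

    /** Hypothetical provider; stands in for a class such as RepositoryJavaClient. */
    static final class DemoClient implements Client {
        public DemoClient() { }
        @Override public String name() { return "demo"; }
    }

    /** Singleton factory: one instance is shared by the whole service layer. */
    static final class Factory {
        private static final Factory INSTANCE = new Factory();
        private final Map<String, Client> clients = new ConcurrentHashMap<>();

        private Factory() { }

        static Factory getInstance() {
            return INSTANCE;
        }

        /** Creates the client for the given name on first use, then returns the cached one. */
        Client getClient(String clientName, String className) {
            return clients.computeIfAbsent(clientName, n -> instantiate(className));
        }

        private static Client instantiate(String className) {
            try {
                // The class name would come from the client-class configuration value.
                return (Client) Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot create repository client " + className, e);
            }
        }
    }
}
```

Loading the class by name keeps the factory free of compile-time dependencies on any particular provider.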

DocumentHandler

The purpose of the Document Handler is to handle the document processing tasks for the operations performed on the repository. It is used to process documents extracted (exported) from the repository as well as to fill (import) documents into the repository from a CollectionSpace service. Most of its methods operate on all the parts of a service object. A document handler uses the service context to retrieve the service metadata it needs to operate on the request and response.

The following diagram shows the DocumentHandler hierarchy.

In Nuxeo, documents are represented using the following two abstractions.

  1. DocumentModel is used to represent a document when Java APIs are used to interface with the Nuxeo repository.
  2. Restlet Representation is used to represent a document when RESTful APIs are used to interface with the Nuxeo repository.

Therefore, two types of document handler are available for interfacing with the Nuxeo document repository: DocumentModelHandler and RepresentationHandler. Each type is specific to how a document is represented by a given API and protocol, so the handler type must match the client API in use. DocumentModelHandler operates on the DocumentModel of the Nuxeo Java Remote APIs, while RepresentationHandler operates on the representations of the Nuxeo RESTful APIs.

A CollectionSpace service receiving a service request creates the necessary document handler with the help of a service-specific document handler factory. A service-specific RemoteDocumentModelHandler is created when a CollectionSpace service is invoked by a remote service consumer.

The source can be found in DocumentHandler.java.

The sequence diagram at the end of this page shows how a DocumentHandler is used in conjunction with a RepositoryClient and a DocumentWrapper.
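The fill and extract roles of a handler can be sketched as follows. The callback names prepare, handle and complete are the ones used in the call sequence on this page. Everything else (DocumentHandlerSketch, Action, Handler, MapHandler, and modeling a repository document as a property map) is an assumption made for illustration and does not reflect the real DocumentHandler signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DocumentHandlerSketch {

    enum Action { CREATE, GET, UPDATE, DELETE }

    /** Hypothetical handler contract, generic in the repository's document type T. */
    interface Handler<T> {
        void prepare(Action action);
        void handle(Action action, T document);
        void complete(Action action, T document);
    }

    /** Handler for a hypothetical repository that represents documents as property maps. */
    static final class MapHandler implements Handler<Map<String, String>> {
        private final Map<String, String> serviceObject = new LinkedHashMap<>();
        private boolean prepared;

        MapHandler(Map<String, String> input) {
            serviceObject.putAll(input);
        }

        Map<String, String> serviceObject() {
            return serviceObject;
        }

        @Override public void prepare(Action action) {
            // Validate input before the repository is touched.
            if ((action == Action.CREATE || action == Action.UPDATE) && serviceObject.isEmpty()) {
                throw new IllegalStateException("Nothing to store for " + action);
            }
            prepared = true;
        }

        @Override public void handle(Action action, Map<String, String> document) {
            if (!prepared) {
                throw new IllegalStateException("prepare must be called first");
            }
            switch (action) {
                case CREATE:
                case UPDATE:
                    document.putAll(serviceObject);   // fill: service object -> repository document
                    break;
                case GET:
                    serviceObject.clear();
                    serviceObject.putAll(document);   // extract: repository document -> service object
                    break;
                default:
                    break;                            // DELETE needs no document processing
            }
        }

        @Override public void complete(Action action, Map<String, String> document) {
            if (action == Action.UPDATE) {
                serviceObject.clear();
                serviceObject.putAll(document);       // output reflects the stored state
            }
            prepared = false;                         // ready for the next operation
        }
    }
}
```

Making the contract generic in the document type is one way to let a single lifecycle serve both a DocumentModel-style and a Representation-style handler.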

DocumentHandlerFactory

A CollectionSpace service receiving a service request creates the necessary document handler with the help of a DocumentHandlerFactory. A service-specific document handler factory creates a document handler based on the context in which it is used. That is, it creates a remote document handler for requests received remotely and a local document handler for requests received locally.

The source can be found in DocumentHandlerFactory.java.

DocumentWrapper

DocumentWrapper wraps a document representation. For Nuxeo, two kinds of document wrappers are available.

  1. DocumentModelWrapper for documents retrieved using Nuxeo Java APIs
  2. RepresentationWrapper for documents retrieved using Nuxeo RESTful APIs

The source can be found in DocumentWrapper.java.

The sequence diagram at the end of this page shows how a DocumentWrapper is used in conjunction with a RepositoryClient and a DocumentHandler.

Configuration

The configuration for the document repository SPI involves two things:

  1. Repository client
  2. Service binding referring to a repository client

Repository client

The repository client is defined in the service layer configuration. This configuration specifies the repository client type and the class to use. The repository client factory uses the class information to instantiate the repository client. The repository client name (e.g. nuxeo-java) is used to refer to the repository client elsewhere in the configuration.

Note: As of v0.3, the service layer supports only one type of repository client. This is expected to change in the future, when services using more than one type of repository are developed.

The following is an example of a repository client configuration.

repository client
    <repository-client name="nuxeo-java" default="true">
        
        <host>127.0.0.1</host>
        <port>62474</port> 
         ...
        <client-type>java</client-type>
        <client-class>org.collectionspace.services.nuxeo.client.java.RepositoryJavaClient</client-class>
    </repository-client>
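As an illustration of how such a configuration could be read, the sketch below parses a repository-client element with the JDK's DOM parser. The class names (RepositoryClientConfigSketch, Config) are hypothetical, and the sketch reads only the elements visible in the example above. It does not reflect how the CollectionSpace service layer actually loads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class RepositoryClientConfigSketch {

    /** Values read from one repository-client element. */
    static final class Config {
        final String name;
        final boolean isDefault;
        final String host;
        final int port;
        final String clientType;
        final String clientClass;

        Config(String name, boolean isDefault, String host, int port,
               String clientType, String clientClass) {
            this.name = name;
            this.isDefault = isDefault;
            this.host = host;
            this.port = port;
            this.clientType = clientType;
            this.clientClass = clientClass;
        }
    }

    /** Parses a document whose root element is repository-client. */
    static Config parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return new Config(
                    root.getAttribute("name"),
                    Boolean.parseBoolean(root.getAttribute("default")),
                    text(root, "host"),
                    Integer.parseInt(text(root, "port")),
                    text(root, "client-type"),
                    text(root, "client-class"));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Invalid repository client configuration", e);
        }
    }

    /** Returns the trimmed text of the first descendant element with the given tag name. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("Missing <" + tag + "> element");
        }
        return nodes.item(0).getTextContent().trim();
    }
}
```

The name attribute is what a service binding would use to refer to this client, and the client-class value is what a factory would instantiate.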

Service binding

Each /wiki/spaces/collectionspace/pages/666274339 refers, by name, to the repository client to use. For example, in the following CollectionObjects service binding for the tenant movingimages.us, the repository client named nuxeo-java is used. The metadata for the repository client nuxeo-java is described in the previous section.

service binding
    <tenant:tenantBinding
        id="1" name="movingimages.us" displayName="Museum of Moving Images" version="0.1" repositoryDomain="default-domain">
        <tenant:serviceBindings name="CollectionObjects" version="0.1">
            <!-- begin collectionobject service meta-data -->
            <service:repositoryClient xmlns:service='http://collectionspace.org/services/common/service'>nuxeo-java</service:repositoryClient>
...
        </tenant:serviceBindings>
        <!-- end collectionobject service meta-data -->
    </tenant:tenantBinding>
    <!-- end movingimages.us tenant meta-data -->

Sequence Diagram

The typical call sequence for a CREATE operation is as follows:

  1. A service resource receives the CREATE call.
  2. The resource creates a remote service context. The service metadata is retrieved from the binding using the tenant context (available after successful authentication).
  3. The resource then creates a remote document handler using the service-specific document handler factory.
  4. It then gets or creates a repository client using the repository client factory, based on the service metadata.
  5. The resource calls the CREATE operation on the repository client, passing the context and the handler.
  6. The repository client calls prepare on the document handler before calling the repository service, in order to prepare the input data. Prepare is mainly useful for operations (CREATE, UPDATE) that change the state of the document in the repository.
  7. The repository client opens a repository session and calls the repository service to create an empty document.
  8. The repository client then calls handle on the document handler to fill the received service object parts into the empty repository document.
  9. The repository client saves the document and closes the repository session.
  10. The repository client finally calls complete on the handler. Output is prepared at this point if required (for UPDATE only).
  11. The resource returns the output of the operation as the response.
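Steps 6 to 10 above can be sketched as a small, self-contained simulation that records the order of calls. The names (CreateSequenceSketch, Handler, Client, TracingHandler, run) are hypothetical, and an in-memory map stands in for the repository session and service. This illustrates only the ordering, not the actual client code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CreateSequenceSketch {

    /** Hypothetical handler contract with the three callbacks named in the steps. */
    interface Handler {
        void prepare();
        void handle(Map<String, String> emptyDocument);
        void complete(Map<String, String> savedDocument);
    }

    /** Hypothetical repository client that drives the handler around an in-memory session. */
    static final class Client {
        private final Map<String, Map<String, String>> repository = new LinkedHashMap<>();
        private final List<String> trace;

        Client(List<String> trace) {
            this.trace = trace;
        }

        String create(Handler handler) {
            handler.prepare();                                   // step 6
            trace.add("open session");                           // step 7
            String id;
            Map<String, String> document = new LinkedHashMap<>();
            try {
                handler.handle(document);                        // step 8
                id = "doc-" + (repository.size() + 1);
                repository.put(id, document);                    // step 9: save
                trace.add("save");
            } finally {
                trace.add("close session");                      // step 9: always close
            }
            handler.complete(document);                          // step 10
            return id;
        }
    }

    /** Handler that fills the document from the given parts and records each callback. */
    static final class TracingHandler implements Handler {
        private final Map<String, String> parts;
        private final List<String> trace;

        TracingHandler(Map<String, String> parts, List<String> trace) {
            this.parts = parts;
            this.trace = trace;
        }

        @Override public void prepare() {
            trace.add("prepare");
        }

        @Override public void handle(Map<String, String> emptyDocument) {
            emptyDocument.putAll(parts);
            trace.add("handle");
        }

        @Override public void complete(Map<String, String> savedDocument) {
            trace.add("complete");
        }
    }

    /** Runs one CREATE and returns the recorded call order. */
    static List<String> run() {
        List<String> trace = new ArrayList<>();
        Client client = new Client(trace);
        client.create(new TracingHandler(Collections.singletonMap("objectNumber", "2009.1"), trace));
        return trace;
    }
}
```

In this sketch the session is closed in a finally block so that a failure in handle does not leak the session. Whether the real client does the same is not stated on this page.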

The detailed call sequence is shown in the sequence diagram below.

Example

The source code for the interaction shown above, for POST/create in CollectionObjectResource, looks as follows. Walking through the code is recommended for more detail. You can start from here.

create
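    @POST
    @Consumes("multipart/mixed")
    public Response createCollectionObject(MultipartInput input) {

...
            // Build the remote service context from the multipart request.
            RemoteServiceContext ctx = createServiceContext(input);
            // Create the service-specific document handler for this context.
            DocumentHandler handler = createDocumentHandler(ctx);
            // The repository client drives the handler and returns the new document's CSID.
            String csid = getRepositoryClient(ctx).create(ctx, handler);
            // Respond with 201 Created and a Location header that points at the new resource.
            UriBuilder path = UriBuilder.fromResource(CollectionObjectResource.class);
            path.path("" + csid);
            Response response = Response.created(path.build()).build();
            return response;
...
    }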