Blob Service RESTful API



The Blob Service offers a RESTful API to perform CRUD (create, read, update, and delete) operations on file/document-related information. This information includes the type and encoding of a file and a reference to the location of the actual file/document bits.

When possible, these operations follow the Common model for CollectionSpace REST services.

Assumptions and Definitions of Terms

A "blob record" refers to an instance of the Blob Info Schema: it contains information about a specific document/file's type, encoding, etc. and a reference to the actual file/document bits. For example, if the actual file is a JPEG image then the blob record would have its file type (MIME type) field set to image/jpeg.

Blob Derivatives

Blob derivatives will be modeled as child blobs of the original file/document, which is how Nuxeo currently models them. From the context of the original/parent blob, derivatives can be referenced with terms from a controlled set of terms. For example, for image blobs the terms might be 'thumbnail', 'small', 'medium', and 'large'.

Eventually, derivative terms and the properties of these terms will be part of the CollectionSpace configuration mechanism. However, this is currently out of scope for the 1.x version of the service. Derivative terms and their corresponding properties will be specific to the file/document type. For example, for an image file, a term set definition might look something like:

Image Terms = {'thumbnail' = 64 x 64, 'small' = 128 x 128, 'medium' = 256 x 256 }

As of the v1.7 release, only the following image-related derivative terms exist:

  • 'Thumbnail' (a JPEG-format derivative, 64 x 64 pixels)
  • 'Medium' (a JPEG-format derivative, 256 x 256 pixels)
  • 'OriginalJpeg' (a JPEG-format derivative, with the same pixel resolution as the original image)
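
As an illustration only, the v1.7 derivative terms listed above could be held on the client side as a small lookup table. This is a minimal sketch: the names `DERIVATIVE_TERMS` and `describe_derivative` are hypothetical and not part of the service, and whether the service treats terms as case-sensitive is not stated here.

```python
# Derivative terms listed for the v1.7 release.
# A value of None means "same pixel resolution as the original image".
DERIVATIVE_TERMS = {
    "Thumbnail": (64, 64),
    "Medium": (256, 256),
    "OriginalJpeg": None,
}


def describe_derivative(term):
    """Return a short description of a known derivative term (hypothetical helper)."""
    if term not in DERIVATIVE_TERMS:
        raise KeyError(f"unknown derivative term: {term!r}")
    size = DERIVATIVE_TERMS[term]
    if size is None:
        return f"{term}: JPEG, same resolution as the original"
    width, height = size
    return f"{term}: JPEG, {width} x {height} pixels"
```

A client could use such a table to reject unknown terms before building a derivative URL.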

Related Links:

Blob Info Schema

Blob Service CREATE requests

1. Create a new blob record and persist a related document/file (Option #1)

This creates a blob record and also stores and relates a copy of the posted file. The information in the blob record (see Blob Info Schema) is automatically populated by the Blob service based on the file/document that is posted with the request.

POST to "/blobs" with MIME=multipart/form-data

Notes: Returns the csid of the blob record. The csid is part of the fully qualified URL that is returned in the location header of the HTTP response.
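
Since the csid arrives as part of the Location header URL, a client has to extract it. The sketch below assumes the csid is the last path segment of that URL, which this page does not state explicitly. The function name `csid_from_location` is hypothetical.

```python
from urllib.parse import urlparse


def csid_from_location(location):
    """Return the last path segment of a Location URL.

    Assumption: the csid is the final path segment of the returned URL.
    """
    path = urlparse(location).path
    return path.rstrip("/").rsplit("/", 1)[-1]
```

For example, given a Location value ending in `/blobs/` followed by an identifier, the function returns only that identifier.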

The blob service expects a POST similar to the one web browsers use to upload files. See the following HTML snippet for how you can use a web browser to post a new blob.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
        <title>Blob Service POST Example</title>
    </head>
    <body>
        <h1>Blob Service POST Example</h1>
        <form method="POST" action="http://localhost:8180/cspace-services/blobs?blobUri=" enctype="multipart/form-data">
            <label>Please select a file to post to the Blob service:</label>
            <br />
            <input type="file" size="50" name="file" />
            <input type="submit" value="Upload now!" />
        </form>
    </body>
</html>

You can also use a tool like cURL to POST an image.
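
The same browser-style upload can also be sketched in Python with only the standard library. This is a minimal sketch under assumptions: the form field name `file` and the base URL are taken from the HTML form above, the function name `build_upload_request` is hypothetical, and any authentication your deployment requires is not shown.

```python
import uuid
import urllib.request


def build_upload_request(base_url, filename, data, content_type="application/octet-stream"):
    """Build (but do not send) a multipart/form-data POST to /blobs."""
    boundary = uuid.uuid4().hex
    # One form part named "file", mirroring <input type="file" name="file" />.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/blobs", data=head + data + tail, method="POST"
    )
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request
```

To send the request, pass the result to `urllib.request.urlopen` and read the `Location` header from the response. The sketch does not escape special characters in `filename`, so it suits simple file names only.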

 


2. Create a new blob record where the POST request contains no embedded document/file payload (Option #2)

POST to "/blobs?blobUri=" with MIME=application/xml

Notes: Using the 'blobUri' query parameter, you can include a URI to a FILE resource or an HTTP resource that you want the service to use when creating the blob record. If the URI is set and valid, then the Blob service will try to use the URI to create a blob record as well as store and relate a copy of the document/file. The information in the blob record (see Blob Info Schema) is automatically populated by the Blob service and derived from the resource pointed to by the URI.

If the 'blobUri' query parameter is empty or missing, then the Blob service will instead expect a blob record payload (see Blob Info Schema).

Supported URL protocols

  • If the 'blobUri' refers to an HTTP resource, the service assumes the resource is not protected (i.e., it does not require authentication credentials).
  • If the 'blobUri' refers to a FILE resource, the file must exist on (or be reachable from) the same machine on which the Blob service is running. The file-system permissions on the resource must also allow the service to access and read it.

# the blobUri references a file named "cow.jpg" located in the Apache Tomcat folder where the CollectionSpace server is running
POST to blobs?blobUri=file:cow.jpg

# the blobUri references a file by its absolute path on the machine where the CollectionSpace server is running
POST to blobs?blobUri=file:///Users/remillet/dev/temp/cow.jpg
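
When building these request URLs in code, it may be safer to percent-encode the 'blobUri' value so that characters such as ':' and '/' cannot be misread as part of the outer URL. The following is a minimal sketch: the function names are hypothetical, and it assumes the service decodes query parameters in the standard way.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def blob_create_url(base_url, blob_uri):
    """Return the /blobs URL with blobUri percent-encoded as a query parameter."""
    query = urlencode({"blobUri": blob_uri})
    return f"{base_url}/blobs?{query}"


def blob_uri_from_url(url):
    """Decode the blobUri query parameter back out of a URL (round-trip check)."""
    return parse_qs(urlparse(url).query)["blobUri"][0]
```

Decoding the parameter with `blob_uri_from_url` returns the original URI, which is a quick way to confirm that the encoding is lossless.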

Blob Service GET requests

3. Get an existing blob record.

GET to "/blobs/{csid}"

Notes: Returns the blob record corresponding to the csid (see Blob Info Schema). If the blob record was created with a document/file, then a URL to the blob's content (the document/file) is included in the resulting payload.

4. Get an existing blob's data content.

GET to "/blobs/{csid}/content"

Notes: If the blob record was created with a document/file, then this returns the blob data content (a file/document) corresponding to the blob record's csid.

5. Get the full set of raw metadata about an existing blob record's content.

GET to "/blobs/{csid}/content/metadata"

Notes: Returns the full set of raw metadata that exists for the content of the blob record with the given csid. This metadata will essentially be a property bag specific to the media type (MIME type) of the content.

6. Get a list of derivatives for an existing blob record.

GET to "/blobs/{csid}/derivatives"

Notes: Returns a list of blob derivatives corresponding to the blob record's csid. This will include a URL to each derivative.

7. Get a specific derivative record of an existing blob record.

GET to "/blobs/{csid}/derivatives/{derivative_term}"

Notes: Returns what is essentially a "blob record" for the specified blob derivative. Initially, the derivative term will be one of a small set of controlled terms (e.g., 'thumbnail', 'small', 'medium'). Derivative terms will be defined in a CollectionSpace configuration file.

8. Get a specific blob derivative's content.

GET to "/blobs/{csid}/derivatives/{derivative_term}/content"

Notes: Returns the data content of the derivative.

9. Get the full set of raw metadata about an existing blob derivative.

GET to "/blobs/{csid}/derivatives/{derivative_term}/content/metadata"

Notes: Returns the full set of raw metadata that exists for the derivative's content.
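
The GET routes above follow a regular pattern, so a client can build them with one small helper. This is a minimal sketch covering only the paths listed in this section. The function names and the `resource` parameter are hypothetical, and the paths are relative to the services base URL.

```python
from urllib.parse import quote

# Suffix appended for each kind of resource requested.
_SUFFIXES = {"record": "", "content": "/content", "metadata": "/content/metadata"}


def blob_path(csid, resource="record", derivative_term=None):
    """Build the path for a blob, or for one of its derivatives, by csid."""
    if resource not in _SUFFIXES:
        raise ValueError(f"resource must be one of {sorted(_SUFFIXES)}")
    path = f"/blobs/{quote(csid, safe='')}"
    if derivative_term is not None:
        path += f"/derivatives/{quote(derivative_term, safe='')}"
    return path + _SUFFIXES[resource]


def derivatives_list_path(csid):
    """Build the path that lists all derivatives of a blob."""
    return f"/blobs/{quote(csid, safe='')}/derivatives"
```

For instance, `blob_path(csid, "metadata", "Thumbnail")` yields the path for request 9, and `blob_path(csid)` yields the path for request 3.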


Blob Service UPDATE requests

1. Update an existing blob record.

PUT to "/blobs/{csid}" with MIME=multipart/form-data

Notes: Returns the csid of the blob record.

This PUT request to /blobs/{csid} is not currently implemented. Please see CSPACE-6633 for updates.

2. Update an existing blob record.

PUT to "/blobs/{csid}" with MIME=multipart/xml

Notes: The payload must include a 'srcUri' parameter that is set to a valid and accessible URI.