API Documentation
Giles provides REST endpoints for specific functionality such as uploading or retrieving images. The following sections describe the endpoints available so far, their parameters, and what they return.
Authentication
Token Authentication
For all REST endpoints, a user has to send a valid API token (see here for a description of how to get a token) to authenticate. Some endpoints can be used without a token when requesting public files; the descriptions below indicate when this is the case. There are two ways to send a token: as a request parameter or in the Authorization header (the recommended way).
When sending a token in the header, add the following header field:
Authorization: token your-api-token
When sending a token as a request parameter, add the following to your URL:
accessToken=your-api-token
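The two ways of sending a token described above can be sketched as follows. This is a minimal illustration, not part of Giles itself: the helper names auth_header and with_access_token are made up for this example, and the host name is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def auth_header(token):
    """Return the recommended Authorization header for a Giles API token."""
    return {"Authorization": f"token {token}"}


def with_access_token(url, token):
    """Append accessToken=<token> to a URL, keeping existing query parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("accessToken", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host and token, for illustration only.
print(auth_header("your-api-token"))
print(with_access_token("http://your-giles-host.net/giles/rest/files/uploads", "your-api-token"))
```

The header variant keeps the token out of URLs, which tend to end up in server logs and browser histories; that is one common reason to prefer it.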
Adding Data
Add files to Giles
POST
In theory, you can add any file type to Giles. In practice, only file types supported by Digilib, or file types that Giles knows how to convert, are useful. Currently, Giles knows how to convert PDFs to images. The image type and DPI used for this conversion are configured globally (the default is TIFF at 600 DPI). Giles creates one image per PDF page and puts all images together into one folder so that you can use Digilib's paginator feature. The original PDF is stored separately, outside of Digilib's image folder.
You can add files to Giles by making a POST request to:
/rest/files/upload
Your request should be made with a content-type header of:
multipart/form-data
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
- access (optional): access policy for uploaded files; possible values are PRIVATE or PUBLIC; default is PRIVATE.
- document_type (optional): specifies if uploaded files are several pages of the same document (MULTI_PAGE) or if they should be uploaded as separate documents (SINGLE_PAGE); default is SINGLE_PAGE.
- files: the files to be uploaded
A valid request would look something like this:
POST /giles/rest/files/upload HTTP/1.1
Host: giles-host
Authorization: token your-giles-token
Cache-Control: no-cache
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW

------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="files"; filename="test.png"
Content-Type: image/png

[content of file]
------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="files"; filename="test.pdf"
Content-Type: application/pdf

[content of file]
------WebKitFormBoundary7MA4YWxkTrZu0gW--
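A request like the one above could be assembled with Python's standard library roughly as follows. This is a sketch under assumptions: build_upload_body is a hypothetical helper, the host and token are placeholders, and the boundary string is generated freshly for each request. The request object is only constructed here, not sent.

```python
import urllib.request
import uuid


def build_upload_body(files, access="PRIVATE", document_type="SINGLE_PAGE"):
    """Encode the form fields and files as multipart/form-data.

    files is a list of (filename, content_type, data_bytes) tuples.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = "----GilesBoundary" + uuid.uuid4().hex
    lines = []
    # Plain form fields documented for the upload endpoint.
    for name, value in (("access", access), ("document_type", document_type)):
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    # Every file is sent under the same field name, "files".
    for filename, content_type, data in files:
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="files"; filename="{filename}"'.encode(),
            f"Content-Type: {content_type}".encode(),
            b"",
            data,
        ]
    # Closing delimiter, followed by a final line break.
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


body, content_type = build_upload_body(
    [("test.pdf", "application/pdf", b"%PDF-1.4 placeholder bytes")],
    document_type="MULTI_PAGE",
)
request = urllib.request.Request(
    "http://giles-host/giles/rest/files/upload",  # placeholder host
    data=body,
    headers={"Authorization": "token your-giles-token", "Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(request) would send the upload.
```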
Giles will respond with a progress ID and a URL to check the progress of the upload:
{ "id":"PROGQ3Fm2J", "checkUrl":"http://your-giles-org.net/giles/rest/files/upload/check/PROGQ3Fm2J" }
You can now poll the URL provided as 'checkUrl'. As long as the upload hasn't finished, you will get a 202 Accepted response with the following response body:
{ "msg":"Upload in progress. Please check back later.", "msgCode":"010" }
V0.5 As of version v0.5, Giles will also supply the URL of the upload itself (see "Get info about upload" below). Keep in mind, however, that requests to this URL will return incomplete results as long as processing of a file is ongoing. Only the poll URL will indicate when processing has finished.
Once uploading has finished, you will receive the complete information as listed below.
[ { "documentId" : "DOC123edf", "uploadId" : "UPxx456", "uploadedDate" : "2016-09-20T14:03:00.152Z", "access" : "PRIVATE", "uploadedFile" : { "filename" : "uploadedFile.pdf", "id" : "FILE466tgh", "url" : "http://your-giles-host.net/giles/rest/files/FILE466tgh/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf", "content-type" : "application/pdf", "size" : 3852180 }, "extractedText" : { "filename" : "uploadedFile.pdf.txt", "id" : "FILE123cvb", "url" : "http://your-giles-host.net/giles/rest/files/FILE123cvb/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.txt", "content-type" : "text/plain", "size" : 39773 }, "pages" : [ { "nr" : 0, "image" : { "filename" : "uploadedFile.pdf.0.tiff", "id" : "FILEYUI678", "url" : "http://your-giles-host.net/giles/rest/digilib?fn=username%2FFILEYUI678%2FDOC123edf0%2FuploadedFile.pdf.0.tiff", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.tiff", "content-type" : "image/tiff", "size" : 2032405 }, "text" : { "filename" : "uploadedFile.pdf.0.txt", "id" : "FILE789UIO", "url" : "http://your-giles-host.net/giles/rest/files/FILE789UIO/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.txt", "content-type" : "text/plain", "size" : 4658 }, "ocr" : { "filename" : "uploadedFile.pdf.0.tiff.txt", "id" : "FILE789U12", "url" : "http://your-giles-host.net/giles/rest/files/FILE789U12/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.tiff.txt", "content-type" : "text/plain", "size" : 4658 } }, { "nr" : 1, "image" : { "filename" : "uploadedFile.pdf.1.tiff", "id" : "FILE045tyhG", "url" : "http://your-giles-host.net/giles/rest/digilib?fn=username%2FFILE045tyhG%2FDOC123edf0%2FuploadedFile.pdf.1.tiff", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.tiff", "content-type" : "image/tiff", "size" : 2512354 }, "text" : { "filename" : "uploadedFile.pdf.1.txt", "id" : "FILEMDSPfeVm", "url" : "http://your-giles-host.net/giles/rest/files/FILEMDSPfeVm/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.txt", "content-type" : "text/plain", "size" : 5799 }, "ocr" : { "filename" : "uploadedFile.pdf.1.tiff.txt", "id" : "FILEMDSPfe12", "url" : "http://your-giles-host.net/giles/rest/files/FILEMDSPfe12/content", "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.tiff.txt", "content-type" : "text/plain", "size" : 5799 } } ] } ]
Uploads expire after 24 hours or after a server restart. If you request an expired upload or an upload that doesn't exist, you will get a 404 response with the following response body:
{ "errorCode" : "404", "errorMsg" : "Upload does not exist." }
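The polling flow described above (202 while processing, the full result once finished, 404 for expired or unknown uploads) might be wrapped like this. It is a sketch: http_get_json and wait_for_upload are hypothetical names, the polling interval and attempt limit are arbitrary, and it assumes the finished result arrives with a 2xx status other than 202, which the text above does not state explicitly.

```python
import json
import time
import urllib.error
import urllib.request


def http_get_json(url, token):
    """GET a URL with the token header and return (status_code, parsed JSON body)."""
    request = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the documented 404 still carries a JSON body.
        return error.code, json.load(error)


def wait_for_upload(check_url, token, fetch=http_get_json, interval=2.0, max_attempts=150):
    """Poll check_url until processing has finished and return the final result."""
    for _ in range(max_attempts):
        status, body = fetch(check_url, token)
        if status == 202:
            # Still processing: wait and ask again.
            time.sleep(interval)
        elif status == 404:
            # Expired or unknown upload.
            raise LookupError(body.get("errorMsg", "Upload does not exist."))
        elif 200 <= status < 300:
            # Assumption: any other 2xx status carries the complete result.
            return body
        else:
            raise RuntimeError(f"unexpected status {status} from {check_url}")
    raise TimeoutError(f"upload still in progress after {max_attempts} attempts")
```

Passing the fetch function as a parameter keeps the loop testable without a running Giles instance; in normal use the default is enough, for example wait_for_upload(response["checkUrl"], "your-api-token").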
Retrieving Data
Get all uploads of user
GET
You can get the details of all uploads of a user by making a GET request to:
/rest/files/uploads
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
Giles will respond with a list of maps from upload IDs to the IDs and filenames of the uploaded files:
[ { "UPMDG2ddX4bDKk": [ { "id": "FILE0fPS2iO6Ev7g", "filename": "myfirstfile.pdf" } ] }, { "UPVrMKIv": [ { "id": "FILEkUcHBh", "filename": "file2.0.tiff" } ] }, { "UP7R6GOs": [ { "id": "FILEkUcHBh", "filename": "myfile2.tiff" } ] } ]
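Because the response is a list of single-key maps, it can be convenient to flatten it into one dictionary keyed by upload ID. The following sketch does that for a shortened copy of the example response; files_by_upload is a hypothetical helper name.

```python
import json


def files_by_upload(uploads):
    """Flatten the list of single-key maps into {upload_id: [(file_id, filename), ...]}."""
    result = {}
    for entry in uploads:
        for upload_id, files in entry.items():
            result.setdefault(upload_id, []).extend(
                (item["id"], item["filename"]) for item in files
            )
    return result


# Shortened copy of the example response above.
sample = json.loads("""
[
  {"UPMDG2ddX4bDKk": [{"id": "FILE0fPS2iO6Ev7g", "filename": "myfirstfile.pdf"}]},
  {"UPVrMKIv": [{"id": "FILEkUcHBh", "filename": "file2.0.tiff"}]}
]
""")

for upload_id, files in files_by_upload(sample).items():
    print(upload_id, files)
```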
Get image from Digilib
GET
V0.4.2 PUBLIC API Starting with version v0.4.2 this endpoint can be used without an access token for public images. Note that for private images an access token is required.
You can get images from Digilib through Giles by making a GET request to:
/rest/digilib
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
- fn: path to image in digilib
- dw or dh: you need at least one size parameter, either width (dw) or height (dh) or both
- any other digilib parameter (optional)
Get public image from Digilib
GET
You can get public images from Digilib without a GitHub access token by making a GET request to:
/rest/digilib/public
Giles expects the following parameters:
- fn: path to image in digilib
- dw or dh: you need at least one size parameter, either width (dw) or height (dh) or both
- any other digilib parameter (optional)
If the requested image is set to public, Giles will return the image from Digilib. Otherwise, you will receive an HTTP status 403 Forbidden.
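Both Digilib endpoints above take the same parameters, so one helper can build URLs for either. This is a minimal sketch: digilib_url is a hypothetical name, the base URL is a placeholder, and any extra keyword arguments are passed through unchanged as Digilib parameters.

```python
from urllib.parse import urlencode


def digilib_url(base_url, fn, dw=None, dh=None, public=False, **digilib_params):
    """Build a URL for /rest/digilib or, with public=True, /rest/digilib/public."""
    if dw is None and dh is None:
        raise ValueError("Giles needs at least one size parameter: dw or dh")
    params = {"fn": fn}
    if dw is not None:
        params["dw"] = dw
    if dh is not None:
        params["dh"] = dh
    params.update(digilib_params)
    endpoint = "/rest/digilib/public" if public else "/rest/digilib"
    # urlencode percent-encodes the slashes in fn as %2F.
    return f"{base_url.rstrip('/')}{endpoint}?{urlencode(params)}"


print(digilib_url(
    "http://your-giles-host.net/giles",
    "username/UPxx456/DOC123edf/uploadedFile.pdf.0.tiff",
    dw=600,
))
```

For the non-public endpoint, a request for a private image would still need a token, sent preferably in the Authorization header.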
Get info about upload
GET
You can get information about an upload by making a GET request to:
/rest/files/upload/{uploadId}
where {uploadId} refers to the ID of a previous upload.
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
A user only has access to uploads they initiated themselves. A response looks similar to this:
[ { "documentId" : "DOCOhcqLGMXL8dC", "uploadId" : "UPMDG2ddX4bDKk", "uploadedDate" : "2016-10-04T17:40:15.254Z", "access" : "PUBLIC", "uploadedFile" : { "filename" : "your-file.pdf", "id" : "FILE0fPS2iO6Ev7g", "url" : "https://your.host/giles/rest/files/FILE0fPS2iO6Ev7g/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf", "content-type" : "application/pdf", "size" : 1453836 }, "extractedText" : { "filename" : "your-file.pdf.txt", "id" : "FILEjXRK3MKDjcqx", "url" : "https://your.host/giles/rest/files/FILEjXRK3MKDjcqx/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.txt", "content-type" : "text/plain", "size" : 84313 }, "pages" : [ { "nr" : 0, "image" : { "filename" : "your-file.pdf.0.tiff", "id" : "FILEgwyK2KjEiniN", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.0.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.tiff", "content-type" : "image/tiff", "size" : 1938832 }, "text" : { "filename" : "your-file.pdf.0.txt", "id" : "FILEu3zp4FHaNBEz", "url" : "https://your.host/giles/rest/files/FILEu3zp4FHaNBEz/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.txt", "content-type" : "text/plain", "size" : 3461 } }, { "nr" : 1, "image" : { "filename" : "your-file.pdf.1.tiff", "id" : "FILE1vgFj8feXHtG", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.1.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.tiff", "content-type" : "image/tiff", "size" : 1938382 }, "text" : { "filename" : "your-file.pdf.1.txt", "id" : "FILER0t8JQ1WuU94", "url" : "https://your.host/giles/rest/files/FILER0t8JQ1WuU94/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.txt", "content-type" : "text/plain", "size" : 3930 } }, { "nr" : 2, "image" : { "filename" : "your-file.pdf.2.tiff", "id" : "FILEzQaVarnXZy52", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.2.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.tiff", "content-type" : "image/tiff", "size" : 1809905 }, "text" : { "filename" : "your-file.pdf.2.txt", "id" : "FILEFlTXtknorFua", "url" : "https://your.host/giles/rest/files/FILEFlTXtknorFua/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.txt", "content-type" : "text/plain", "size" : 3563 } } ] } ]
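The upload information is a list of documents, each with a list of pages, and each page can carry an image, a text, and sometimes an ocr entry. A small generator can walk that structure and pick out one kind of page file. This is a sketch with a hypothetical helper name, page_files, run against a heavily shortened copy of the example response.

```python
def page_files(upload_info, kind="image"):
    """Yield (document_id, page_nr, url) for one kind of page file: image, text or ocr."""
    for document in upload_info:
        for page in document.get("pages", []):
            entry = page.get(kind)
            if entry is not None:  # not every page has every kind of file
                yield document["documentId"], page["nr"], entry["url"]


# Heavily shortened copy of the example response above.
upload_info = [{
    "documentId": "DOCOhcqLGMXL8dC",
    "pages": [
        {"nr": 0,
         "image": {"url": "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.0.tiff"},
         "text": {"url": "https://your.host/giles/rest/files/FILEu3zp4FHaNBEz/content"}},
        {"nr": 1,
         "text": {"url": "https://your.host/giles/rest/files/FILER0t8JQ1WuU94/content"}},
    ],
}]

for document_id, page_nr, url in page_files(upload_info, kind="text"):
    print(document_id, page_nr, url)
```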
Get info about document
GET
PUBLIC API This endpoint can be used without an API token when requesting public files.
V0.8 Starting with version v0.8, the returned JSON contains lists of additional files for the uploaded document and for each page (as shown in the example below). Before version v0.8, the additionalFiles sections are not included.
You can get information about a document by making a GET request to:
/rest/documents/{documentId}
where {documentId} is the ID of the document you are requesting information about.
A response looks similar to this:
{ "documentId" : "DOCOhcqLGMXL8dC", "uploadId" : "UPMDG2ddX4bDKk", "uploadedDate" : "2016-10-04T17:40:15.254Z", "access" : "PUBLIC", "uploadedFile" : { "filename" : "your-file.pdf", "id" : "FILE0fPS2iO6Ev7g", "url" : "https://your.host/giles/rest/files/FILE0fPS2iO6Ev7g/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf", "content-type" : "application/pdf", "size" : 1453836 }, "extractedText" : { "filename" : "your-file.pdf.txt", "id" : "FILEjXRK3MKDjcqx", "url" : "https://your.host/giles/rest/files/FILEjXRK3MKDjcqx/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.txt", "content-type" : "text/plain", "size" : 84313 }, "additionalFiles": [ { "filename": "your-file.pdf.txt.species.csv", "id": "FILEZGpnr7Keocfh", "url": "http://your.host/giles/rest/files/FILEZGpnr7Keocfh/content", "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.txt.species.csv", "content-type": "text/csv", "size": 237, "processor": "carolus" } ], "pages" : [ { "nr" : 0, "image" : { "filename" : "your-file.pdf.0.tiff", "id" : "FILEgwyK2KjEiniN", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.0.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.tiff", "content-type" : "image/tiff", "size" : 1938832 }, "text" : { "filename" : "your-file.pdf.0.txt", "id" : "FILEu3zp4FHaNBEz", "url" : "https://your.host/giles/rest/files/FILEu3zp4FHaNBEz/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.txt", "content-type" : "text/plain", "size" : 3461 }, "ocr" : { "filename" : "your-file.pdf.0.tiff.txt", "id" : "FILEu3zp4FHaN567", "url" : "https://your.host/giles/rest/files/FILEu3zp4FHaN567/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.tiff.txt", "content-type" : "text/plain", "size" : 3461 }, "additionalFiles": [ { "filename": "your-file.pdf.0.tiff.txt.species.csv", "id": "FILE9K9XJuIrN28X", "url": "http://your.host/giles/rest/files/FILE9K9XJuIrN28X/content", "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.0.tiff.txt.species.csv", "content-type": "text/csv", "size": 25, "processor": "carolus" } ] }, { "nr" : 1, "image" : { "filename" : "your-file.pdf.1.tiff", "id" : "FILE1vgFj8feXHtG", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.1.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.tiff", "content-type" : "image/tiff", "size" : 1938382 }, "text" : { "filename" : "your-file.pdf.1.txt", "id" : "FILER0t8JQ1WuU94", "url" : "https://your.host/giles/rest/files/FILER0t8JQ1WuU94/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.txt", "content-type" : "text/plain", "size" : 3930 }, "ocr" : { "filename" : "your-file.pdf.1.tiff.txt", "id" : "FILER123JQ1WuU94", "url" : "https://your.host/giles/rest/files/FILER123JQ1WuU94/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.tiff.txt", "content-type" : "text/plain", "size" : 3930 }, "additionalFiles": [ { "filename": "your-file.pdf.1.tiff.txt.species.csv", "id": "FILE9K9XJuIrN890", "url": "http://your.host/giles/rest/files/FILE9K9XJuIrN890/content", "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.1.tiff.txt.species.csv", "content-type": "text/csv", "size": 23, "processor": "carolus" } ] }, { "nr" : 2, "image" : { "filename" : "your-file.pdf.2.tiff", "id" : "FILEzQaVarnXZy52", "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.2.tiff", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.tiff", "content-type" : "image/tiff", "size" : 1809905 }, "text" : { "filename" : "your-file.pdf.2.txt", "id" : "FILEFlTXtknorFua", "url" : "https://your.host/giles/rest/files/FILEFlTXtknorFua/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.txt", "content-type" : "text/plain", "size" : 3563 }, "ocr" : { "filename" : "your-file.pdf.2.tiff.txt", "id" : "FILEFlTXtkn345ua", "url" : "https://your.host/giles/rest/files/FILEFlTXtkn345ua/content", "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.tiff.txt", "content-type" : "text/plain", "size" : 3563 }, "additionalFiles": [ { "filename": "your-file.pdf.2.tiff.txt.species.csv", "id": "FILE9K9XJuI78YUR", "url": "http://your.host/giles/rest/files/FILE9K9XJuI78YUR/content", "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.2.tiff.txt.species.csv", "content-type": "text/csv", "size": 30, "processor": "carolus" } ] } ] }
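Since the additionalFiles sections only exist from v0.8 on, client code reading them should tolerate their absence. The sketch below collects additional files at document and page level and treats a missing section as an empty list; additional_files is a hypothetical helper name, and the sample data is a shortened copy of the example above.

```python
def additional_files(document, processor=None):
    """Return (scope, filename, url) for all additional files of a document.

    Missing additionalFiles sections (responses from before v0.8) count as empty.
    If processor is given, only files created by that processor are returned.
    """
    found = [("document", item) for item in document.get("additionalFiles", [])]
    for page in document.get("pages", []):
        for item in page.get("additionalFiles", []):
            found.append((f"page {page['nr']}", item))
    return [
        (scope, item["filename"], item["url"])
        for scope, item in found
        if processor is None or item.get("processor") == processor
    ]


# Shortened copy of the example response above.
document = {
    "documentId": "DOCOhcqLGMXL8dC",
    "additionalFiles": [{
        "filename": "your-file.pdf.txt.species.csv",
        "url": "http://your.host/giles/rest/files/FILEZGpnr7Keocfh/content",
        "processor": "carolus",
    }],
    "pages": [
        {"nr": 0, "additionalFiles": [{
            "filename": "your-file.pdf.0.tiff.txt.species.csv",
            "url": "http://your.host/giles/rest/files/FILE9K9XJuIrN28X/content",
            "processor": "carolus",
        }]},
        {"nr": 1},  # a page without additional files
    ],
}

for scope, filename, url in additional_files(document, processor="carolus"):
    print(scope, filename, url)
```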
Get full image from Giles
GET
PUBLIC API This endpoint can be used without an API token when requesting public files.
You can get the original version of a file that you have uploaded through Giles by making a GET request to:
/rest/files/{fileId}/content
where {fileId} is the ID of the file you are trying to download.
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
Note: when requesting information about an upload, or after uploading a file, the url property of the JSON response points to this endpoint for PDF files.
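Downloading a file through this endpoint could look like the following sketch. The helper names content_request and download are hypothetical, the base URL is a placeholder, and the token is optional because public files can be fetched without one.

```python
import shutil
import urllib.request


def content_request(base_url, file_id, token=None):
    """Build a GET request for /rest/files/{fileId}/content."""
    url = f"{base_url.rstrip('/')}/rest/files/{file_id}/content"
    # Public files need no token; private files do.
    headers = {"Authorization": f"token {token}"} if token else {}
    return urllib.request.Request(url, headers=headers)


def download(request, target_path):
    """Send the request and stream the response body into a local file."""
    with urllib.request.urlopen(request) as response, open(target_path, "wb") as out:
        shutil.copyfileobj(response, out)


request = content_request("https://your.host/giles", "FILE0fPS2iO6Ev7g", token="your-api-token")
print(request.full_url)
# download(request, "your-file.pdf") would fetch and store the file.
```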
Modifying Data
Change Document Access
POST
V0.4.2 This feature is available starting with version v0.4.2.
You can change the access type of a document (private or public) by making a POST request to:
/rest/documents/{documentId}/access/change
where {documentId} is the ID of the document whose access type you want to change.
Giles expects the following parameters:
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
- access: the new type of access for the specified document: private or public
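A request to change a document's access type might be built as shown below. This is a sketch under assumptions: change_access_request is a hypothetical helper, the base URL is a placeholder, and the access parameter is sent as a form-encoded POST body, which the description above does not specify (it may equally be accepted as a query parameter).

```python
import urllib.request
from urllib.parse import urlencode


def change_access_request(base_url, document_id, access, token):
    """Build a POST request for /rest/documents/{documentId}/access/change."""
    if access not in ("private", "public"):
        raise ValueError("access must be 'private' or 'public'")
    url = f"{base_url.rstrip('/')}/rest/documents/{document_id}/access/change"
    # Assumption: the parameter travels in a form-encoded request body.
    data = urlencode({"access": access}).encode()
    return urllib.request.Request(
        url,
        data=data,
        headers={"Authorization": f"token {token}"},
        method="POST",
    )


request = change_access_request("https://your.host/giles", "DOCOhcqLGMXL8dC", "public", "your-api-token")
print(request.get_method(), request.full_url, request.data)
# urllib.request.urlopen(request) would send the change.
```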
Search
Search with Freddie
GET
V0.5 Starting with version v0.5, documents submitted to Giles can be searched if Freddie has been added to the ecosystem.
You can search all text documents of a user by making a GET request to:
/rest/search?q={querystring}
where {querystring} is a Solr query string.
Giles expects the following parameters:
- q: the query string
- accessToken: an API token that is used to authenticate the uploading user (if possible use the Authorization header instead of this parameter)
Giles will respond to a search request with a list of results, similar to:
[{ "id": "FILEqvXix777A6er", "uploadId": "UPsQY6W7CbBsl1", "filename": "GraceHopper.pdf.0.txt", "documentId": "DOC8uY40VmywRMe", "uploadDate": "2017-04-27T16:33:39.428Z", "access": "PRIVATE", "contentType": "text/plain", "size": 0, "url": "http://your.giles.host/giles/rest/files/FILEqvXix777A6er/content", "documentUrl": "http://your.giles.host/giles/rest/documents/DOC8uY40VmywRMe", "page": 0 } ]
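Search results are returned per text file, so several hits can belong to the same document. The sketch below URL-encodes a query and groups hits by document; search_url and pages_by_document are hypothetical helper names, the base URL is a placeholder, and the sample is a shortened copy of the result above.

```python
from urllib.parse import urlencode


def search_url(base_url, query):
    """Build the search URL; urlencode escapes spaces and Solr syntax characters."""
    return f"{base_url.rstrip('/')}/rest/search?{urlencode({'q': query})}"


def pages_by_document(results):
    """Group search hits into {documentUrl: sorted list of matching page numbers}."""
    grouped = {}
    for hit in results:
        grouped.setdefault(hit["documentUrl"], []).append(hit["page"])
    return {url: sorted(pages) for url, pages in grouped.items()}


# Shortened copy of the example result above.
results = [{
    "id": "FILEqvXix777A6er",
    "documentId": "DOC8uY40VmywRMe",
    "documentUrl": "http://your.giles.host/giles/rest/documents/DOC8uY40VmywRMe",
    "page": 0,
}]

print(search_url("http://your.giles.host/giles", "grace hopper"))
print(pages_by_document(results))
```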