Version 2 of the Giles API no longer requires Giles tokens; Citesphere tokens can be used instead. The following describes the endpoints that have been implemented so far.

Authentication

To authenticate, send a Citesphere user token to Giles as a Bearer token in the Authorization header:

Authorization: Bearer your-citesphere-token

Giles checks with Citesphere whether the user to whom the token belongs has access to the requested document.

Adding Data 

Add files to Giles 

POST

In theory, you can add any file type to Giles. In practice, only file types supported by Digilib, or file types that Giles knows how to convert, are useful. Currently, Giles can convert PDFs to images. The image type and DPI used for this conversion are configured globally for a Giles instance (the default is TIFF at 600 dpi). Giles creates one image per PDF page and puts all images into one folder so that you can use Digilib's paginator feature. The original PDF is stored separately, outside of Digilib's image folder.

You can add files to Giles by making a POST request to:

/api/v2/files/upload

Your request should be made with a Content-Type header of:

multipart/form-data

Giles expects the following parameters:

  • access (optional): access policy for uploaded files; possible values are PRIVATE or PUBLIC; default is PRIVATE.

  • document_type (optional): specifies if uploaded files are several pages of the same document (MULTI_PAGE) or if they should be uploaded as separate documents (SINGLE_PAGE); default is SINGLE_PAGE.

  • files: the files to be uploaded

A valid request would look something like this:

POST /giles/api/v2/files/upload HTTP/1.1
Host: giles-host
Authorization: Bearer your-citesphere-token
Cache-Control: no-cache
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW
 
------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="files"; filename="test.png"
Content-Type: image/png
 
[content of file]
 
------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="files"; filename="test.pdf"
Content-Type: application/pdf
 
[content of file]
 
------WebKitFormBoundary7MA4YWxkTrZu0gW--
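The request above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library, not an official client. The function name `build_upload_request`, the `base_url` argument (assumed to include the `/giles` context path), and the choice to send `access` and `document_type` as plain form fields in the multipart body are assumptions made for illustration.

```python
import uuid
import urllib.request


def build_upload_request(base_url, token, files, access="PRIVATE",
                         document_type="SINGLE_PAGE"):
    """Build (but do not send) a multipart/form-data upload request.

    files is a list of (filename, content_type, data_bytes) tuples.
    base_url is assumed to include the context path, e.g. http://giles-host/giles
    """
    if access not in ("PRIVATE", "PUBLIC"):
        raise ValueError("access must be PRIVATE or PUBLIC")
    if document_type not in ("SINGLE_PAGE", "MULTI_PAGE"):
        raise ValueError("document_type must be SINGLE_PAGE or MULTI_PAGE")

    boundary = "----GilesUpload" + uuid.uuid4().hex
    parts = []
    # Optional parameters, sent here as plain form fields (an assumption).
    for name, value in (("access", access), ("document_type", document_type)):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # One part named "files" per uploaded file.
    for filename, content_type, data in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    # The closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/files/upload",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to send the upload. Building the request separately from sending it makes the headers and body easy to inspect before anything goes over the network.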

Giles will respond with a progress ID and a URL to check the progress of the upload:

{
    "id":"PROGQ3Fm2J",
    "checkUrl":"http://your-giles-org.net/giles/api/v2/files/upload/check/PROGQ3Fm2J"
}

You can now poll the URL provided as 'checkUrl'. As long as the upload has not finished, you will get a 202 Accepted response with the following response body:

{
    "msg":"Upload in progress. Please check back later.",
    "msgCode":"010"
}
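The polling step might be sketched as follows. This is an illustrative helper, not part of Giles: the names `http_get` and `wait_for_upload`, the polling interval, and the attempt limit are assumptions. It also assumes that any status other than 202 means processing has finished, since the status code of the final response is not stated here.

```python
import json
import time
import urllib.request


def http_get(url, token):
    """Return (status_code, parsed_json_body) for an authenticated GET."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


def wait_for_upload(check_url, token, fetch=http_get, interval=2.0,
                    max_attempts=150):
    """Poll check_url until Giles stops answering 202 Accepted.

    Returns the parsed body of the first non-202 response.
    """
    for _ in range(max_attempts):
        status, body = fetch(check_url, token)
        if status != 202:
            # Assumption: a non-202 answer carries the final upload information.
            return body
        time.sleep(interval)
    raise TimeoutError(f"upload still in progress after {max_attempts} polls")
```

The `fetch` parameter exists so the loop can be exercised with a stub instead of a live Giles instance.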

As of version v0.5, Giles will also supply the URL of the upload itself. Keep in mind, however, that requests to this URL will return incomplete results as long as processing of a file is ongoing. Only the poll URL will indicate when processing has finished.

Once uploading has finished, you will receive the complete information as listed below.

Upload Image Sample Response from Giles

[ {
  "documentId" : "DOC123edf",
  "uploadId" : "UPxx456",
  "uploadedDate" : "2016-09-20T14:03:00.152Z",
  "access" : "PRIVATE",
  "uploadedFile" : {
    "filename" : "uploadedFile.pdf",
    "id" : "FILE466tgh",
    "url" : "http://your-giles-host.net/giles/rest/files/FILE466tgh/content",
    "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf",
    "content-type" : "application/pdf",
    "size" : 3852180
  },
  "extractedText" : {
    "filename" : "uploadedFile.pdf.txt",
    "id" : "FILE123cvb",
    "url" : "http://your-giles-host.net/giles/rest/files/FILE123cvb/content",
    "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.txt",
    "content-type" : "text/plain",
    "size" : 39773
  },
  "pages" : [ {
    "nr" : 0,
    "image" : {
      "filename" : "uploadedFile.pdf.0.tiff",
      "id" : "FILEYUI678",
      "url" : "http://your-giles-host.net/giles/rest/digilib?fn=username%2FFILEYUI678%2FDOC123edf0%2FuploadedFile.pdf.0.tiff",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.tiff",
      "content-type" : "image/tiff",
      "size" : 2032405
    },
    "text" : {
      "filename" : "uploadedFile.pdf.0.txt",
      "id" : "FILE789UIO",
      "url" : "http://your-giles-host.net/giles/rest/files/FILE789UIO/content",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.txt",
      "content-type" : "text/plain",
      "size" : 4658
    },
    "ocr" : {
      "filename" : "uploadedFile.pdf.0.tiff.txt",
      "id" : "FILE789U12",
      "url" : "http://your-giles-host.net/giles/rest/files/FILE789U12/content",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.0.tiff.txt",
      "content-type" : "text/plain",
      "size" : 4658
    }
  }, {
    "nr" : 1,
    "image" : {
      "filename" : "uploadedFile.pdf.1.tiff",
      "id" : "FILE045tyhG",
      "url" : "http://your-giles-host.net/giles/rest/digilib?fn=username%2FFILE045tyhG%2FDOC123edf0%2FuploadedFile.pdf.1.tiff",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.tiff",
      "content-type" : "image/tiff",
      "size" : 2512354
    },
    "text" : {
      "filename" : "uploadedFile.pdf.1.txt",
      "id" : "FILEMDSPfeVm",
      "url" : "http://your-giles-host.net/giles/rest/files/FILEMDSPfeVm/content",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.txt",
      "content-type" : "text/plain",
      "size" : 5799
    },
    "ocr" : {
      "filename" : "uploadedFile.pdf.1.tiff.txt",
      "id" : "FILEMDSPfe12",
      "url" : "http://your-giles-host.net/giles/rest/files/FILEMDSPfe12/content",
      "path" : "username/UPxx456/DOC123edf/uploadedFile.pdf.1.tiff.txt",
      "content-type" : "text/plain",
      "size" : 5799
    }
  } ]
} ]

Retrieving Data

Get info about document

GET

This endpoint can be used without an API token when requesting public files.

Starting with version v0.8, the returned JSON contains lists of additional files for the uploaded document and for each page (as shown in the example below). Before version v0.8, the additionalFiles sections are not included.

You can get information about a document by making a GET request to:

/api/v2/resources/documents/{documentId}

where {documentId} is the ID of the document you are requesting information about.
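As a rough sketch, such a request could be built with the Python standard library as shown below. The function name `document_info_request` and the `base_url` argument (assumed to include the `/giles` context path) are illustrative. The token is optional because public files can be requested without one.

```python
import urllib.parse
import urllib.request


def document_info_request(base_url, document_id, token=None):
    """Build a GET request for /api/v2/resources/documents/{documentId}."""
    url = "{}/api/v2/resources/documents/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(document_id, safe=""),  # escape the id for use in a path
    )
    headers = {}
    if token:
        # Only needed for documents that are not public.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` and decoding the body with `json.loads` yields a structure like the sample response.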

A response looks similar to this:

{
  "documentId" : "DOCOhcqLGMXL8dC",
  "uploadId" : "UPMDG2ddX4bDKk",
  "uploadedDate" : "2016-10-04T17:40:15.254Z",
  "access" : "PUBLIC",
  "uploadedFile" : {
    "filename" : "your-file.pdf",
    "id" : "FILE0fPS2iO6Ev7g",
    "url" : "https://your.host/giles/rest/files/FILE0fPS2iO6Ev7g/content",
    "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf",
    "content-type" : "application/pdf",
    "size" : 1453836
  },
  "extractedText" : {
    "filename" : "your-file.pdf.txt",
    "id" : "FILEjXRK3MKDjcqx",
    "url" : "https://your.host/giles/rest/files/FILEjXRK3MKDjcqx/content",
    "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.txt",
    "content-type" : "text/plain",
    "size" : 84313
  },
  "additionalFiles": [
     {
        "filename": "your-file.pdf.txt.species.csv",
        "id": "FILEZGpnr7Keocfh",
        "url": "http://your.host/giles/rest/files/FILEZGpnr7Keocfh/content",
        "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.txt.species.csv",
        "content-type": "text/csv",
        "size": 237,
        "processor": "carolus"
     }
    ],
  "pages" : [ {
    "nr" : 0,
    "image" : {
      "filename" : "your-file.pdf.0.tiff",
      "id" : "FILEgwyK2KjEiniN",
      "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.0.tiff",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.tiff",
      "content-type" : "image/tiff",
      "size" : 1938832
    },
    "text" : {
      "filename" : "your-file.pdf.0.txt",
      "id" : "FILEu3zp4FHaNBEz",
      "url" : "https://your.host/giles/rest/files/FILEu3zp4FHaNBEz/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.txt",
      "content-type" : "text/plain",
      "size" : 3461
    },
    "ocr" : {
      "filename" : "your-file.pdf.0.tiff.txt",
      "id" : "FILEu3zp4FHaN567",
      "url" : "https://your.host/giles/rest/files/FILEu3zp4FHaN567/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.0.tiff.txt",
      "content-type" : "text/plain",
      "size" : 3461
    },
    "additionalFiles": [
      {
         "filename": "your-file.pdf.0.tiff.txt.species.csv",
         "id": "FILE9K9XJuIrN28X",
         "url": "http://your.host/giles/rest/files/FILE9K9XJuIrN28X/content",
         "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.0.tiff.txt.species.csv",
         "content-type": "text/csv",
         "size": 25,
         "processor": "carolus"
      }
    ]
  }, {
    "nr" : 1,
    "image" : {
      "filename" : "your-file.pdf.1.tiff",
      "id" : "FILE1vgFj8feXHtG",
      "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.1.tiff",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.tiff",
      "content-type" : "image/tiff",
      "size" : 1938382
    },
    "text" : {
      "filename" : "your-file.pdf.1.txt",
      "id" : "FILER0t8JQ1WuU94",
      "url" : "https://your.host/giles/rest/files/FILER0t8JQ1WuU94/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.txt",
      "content-type" : "text/plain",
      "size" : 3930
    },
    "ocr" : {
      "filename" : "your-file.pdf.1.tiff.txt",
      "id" : "FILER123JQ1WuU94",
      "url" : "https://your.host/giles/rest/files/FILER123JQ1WuU94/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.1.tiff.txt",
      "content-type" : "text/plain",
      "size" : 3930
    },
    "additionalFiles": [
      {
         "filename": "your-file.pdf.1.tiff.txt.species.csv",
         "id": "FILE9K9XJuIrN890",
         "url": "http://your.host/giles/rest/files/FILE9K9XJuIrN890/content",
         "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.1.tiff.txt.species.csv",
         "content-type": "text/csv",
         "size": 23,
         "processor": "carolus"
      }
    ]
  }, {
    "nr" : 2,
    "image" : {
      "filename" : "your-file.pdf.2.tiff",
      "id" : "FILEzQaVarnXZy52",
      "url" : "https://your.host/giles/rest/digilib?fn=youruser%2FUPMDG2ddX4bDKk%2FDOCOhcqLGMXL8dC%2Fyour-file.pdf.2.tiff",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.tiff",
      "content-type" : "image/tiff",
      "size" : 1809905
    },
    "text" : {
      "filename" : "your-file.pdf.2.txt",
      "id" : "FILEFlTXtknorFua",
      "url" : "https://your.host/giles/rest/files/FILEFlTXtknorFua/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.txt",
      "content-type" : "text/plain",
      "size" : 3563
    },
    "ocr" : {
      "filename" : "your-file.pdf.2.tiff.txt",
      "id" : "FILEFlTXtkn345ua",
      "url" : "https://your.host/giles/rest/files/FILEFlTXtkn345ua/content",
      "path" : "youruser/UPMDG2ddX4bDKk/DOCOhcqLGMXL8dC/your-file.pdf.2.tiff.txt",
      "content-type" : "text/plain",
      "size" : 3563
    },
    "additionalFiles": [
      {
         "filename": "your-file.pdf.2.tiff.txt.species.csv",
         "id": "FILE9K9XJuI78YUR",
         "url": "http://your.host/giles/rest/files/FILE9K9XJuI78YUR/content",
         "path": "other/youruser/UP0GCnEZg9l02y/DOCGS1PfODiKbcx/your-file.pdf.2.tiff.txt.species.csv",
         "content-type": "text/csv",
         "size": 30,
         "processor": "carolus"
      }
    ]
  } ]
}
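To work with a response like this, a client typically needs to walk the document-level files and the per-page files. The generator below is a minimal sketch of that traversal, based only on the field names visible in the sample. The name `iter_files` and the `(role, page_nr, file)` tuple shape are assumptions for illustration. Because additionalFiles is absent before v0.8, the code treats it as optional.

```python
def iter_files(document):
    """Yield (role, page_nr, file_dict) for every file in a document record.

    page_nr is None for document-level files.
    """
    for role in ("uploadedFile", "extractedText"):
        if role in document:
            yield role, None, document[role]
    # additionalFiles is missing before v0.8, so default to an empty list.
    for extra in document.get("additionalFiles", []):
        yield "additionalFiles", None, extra
    for page in document.get("pages", []):
        for role in ("image", "text", "ocr"):
            if role in page:
                yield role, page["nr"], page[role]
        for extra in page.get("additionalFiles", []):
            yield "additionalFiles", page["nr"], extra
```

For example, `[f["id"] for role, nr, f in iter_files(doc) if role == "ocr"]` would collect the file IDs of all OCR results in a parsed response `doc`.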

Get full image from Giles

GET  

This endpoint can be used without an API token when requesting public files.

You can get the original version of a file that you have uploaded through Giles by making a GET request to:

/api/v2/resources/files/{fileId}/content

where {fileId} is the id of the file you are trying to download.
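A download from this endpoint could look roughly like the sketch below, which streams the response to disk instead of loading it into memory. The names `file_content_url` and `download_file`, the `base_url` argument (assumed to include the `/giles` context path), and the injectable `opener` parameter are hypothetical and exist only for illustration.

```python
import shutil
import urllib.parse
import urllib.request


def file_content_url(base_url, file_id):
    """Return the URL of /api/v2/resources/files/{fileId}/content."""
    return "{}/api/v2/resources/files/{}/content".format(
        base_url.rstrip("/"), urllib.parse.quote(file_id, safe="")
    )


def download_file(base_url, file_id, target_path, token=None,
                  opener=urllib.request.urlopen):
    """Stream a file's content from Giles to target_path and return the path."""
    # The token may be omitted when the file is public.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    req = urllib.request.Request(file_content_url(base_url, file_id),
                                 headers=headers)
    with opener(req) as resp, open(target_path, "wb") as out:
        shutil.copyfileobj(resp, out)  # copies in chunks
    return target_path
```

Streaming matters here because page images and original PDFs can be several megabytes each, as the sizes in the sample responses suggest.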

Note: when requesting information about an upload, or after uploading a file, the url property of the JSON response will point here for PDF files.

Note: this will only work for files that have been uploaded using version 2 of the API. For previously uploaded documents, use the original API.

