...

Notes on Giles & Digilib

Giles assumes that it runs on the same machine as your Digilib installation. When a file is uploaded, Giles simply puts it into a folder accessible by Digilib. If your setup differs, Giles won't work for you out of the box. However, if you are comfortable programming in Java, you could implement your own FileStorageManager (edu.asu.giles.files.impl.FileStorageManager), the class that handles file storage in Giles.
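To make the shared-folder idea concrete, here is a minimal sketch of a class that writes an uploaded file into a directory Digilib can read. It is not Giles' actual FileStorageManager interface: the class name, constructor, and store method are hypothetical and only illustrate the kind of logic a custom storage implementation would need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, NOT the real Giles FileStorageManager API.
class SharedFolderStorageSketch {

    // Folder that both this code and Digilib can access (assumption).
    private final Path baseDir;

    SharedFolderStorageSketch(String baseDir) {
        this.baseDir = Paths.get(baseDir).toAbsolutePath().normalize();
    }

    // Writes the uploaded bytes below baseDir and returns the stored path.
    Path store(String filename, byte[] content) {
        Path target = baseDir.resolve(filename).normalize();
        // Reject names such as "../x" that would escape the shared folder.
        if (target.equals(baseDir) || !target.startsWith(baseDir)) {
            throw new IllegalArgumentException("Invalid filename: " + filename);
        }
        try {
            Files.createDirectories(target.getParent());
            return Files.write(target, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A storage class for a different setup (for example, Digilib on another machine) would replace the body of store with whatever transfer mechanism that setup requires.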

...

  • giles.base.url: the final URL Giles will run at (e.g. http://myserver.org/giles). Giles will use that URL to build links to Digilib content.

  • admin.password: the password for your admin user. It should be encoded with BCrypt (strength=4). Default is "admin".

  • admin.username (optional): if you want your admin user to have a different name than "admin", you can specify that with this property.

  • github.clientId: OAuth client id of your GitHub application registration.

  • github.secret: OAuth client secret of your GitHub application registration.

  • db_files: Path to a folder in your file system that will hold Giles' database files.

  • db.driver: if you don't use MySQL you can specify the appropriate driver here. Note that if you are not using MySQL you will also have to add the correct driver dependency to the pom.xml.

  • db.database.url: the URL of the database (most likely something like jdbc:mysql://localhost:3306/giles).

  • db.user: username of your database user.

  • db.password: password of your database user.

  • digilib.url: URL of your Digilib installation. The path should be the path to the Scaler servlet (e.g. http://myserver.org/digilib/servlet/Scaler).

  • digilibBaseDir: path to the Digilib directory that should hold your images. Digilib has to have access to this directory.

  • jars.url: URL of your Jars installation.

  • buildNumber (optional): if you want Giles to show a specific version number, you can specify that version number with this property.

  • pdfBaseDir: path to a folder in the file system where uploaded PDFs are stored.
  • pdf.conversion.dpi (optional): DPI used for converting PDFs to images. Default is 600.
  • pdf.conversion.type (optional): image type to use when converting PDFs to images. Options are:

    • RGB (Red, Green, Blue) [default] 

    • ARGB (Alpha, Red, Green, Blue)

    • GRAY (Shades of gray)

    • BINARY (Black or white)

  • pdf.conversion.format (optional): image format to use when converting PDFs to images. Default is tiff.

    • Should be one of the following: 
      JPG, jpg, tiff, bmp, BMP, pcx, PCX, gif, GIF, WBMP, png, PNG, raw, RAW, JPEG, pnm, PNM, tif, TIF, TIFF, wbmp, jpeg
  • [with v0.4] textBaseDir: path to a folder in the file system that holds extracted text files.
  • [with v0.4] tesseract.bin: path to the parent folder of the tesseract executable. For example, if your tesseract executable is /usr/bin/tesseract, then this property should be set to /usr/bin/.
  • [with v0.4] tesseract.data: path to the parent folder of your tesseract tessdata folder. For example, if your tessdata folder is at /usr/share/tesseract/tessdata/, then this property should be set to /usr/share/tesseract/.

  • [with v0.4] ocr.worker.count (optional): number of threads that are used to run tesseract. Default is 2.

  • [with v0.4] ocr.images.from.pdfs (optional): this property defines whether OCR is run on images that were created from PDFs. Should be true or false. Default is true.
  • [with v0.4] log.level (optional): if Maven is run using the mydev or test profile, this property sets the log level. Default is debug. The prod profile sets this property to info.
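The defaults and path rules listed above can be sketched in a few lines of Java. This is an illustration only, not code from Giles: the class GilesConfigSketch and its methods are hypothetical, the key=value text is just a convenient way to write the properties down (not necessarily where Giles reads them from), and pdf.conversion.dpi=300 is a made-up override. The sketch fills in the documented defaults for the optional properties and derives tesseract.bin and tesseract.data as the parent folders described in the list.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Giles.
class GilesConfigSketch {

    // Sample values reuse the examples from the property list.
    static final String SAMPLE =
        "giles.base.url=http://myserver.org/giles\n"
        + "digilib.url=http://myserver.org/digilib/servlet/Scaler\n"
        + "db.database.url=jdbc:mysql://localhost:3306/giles\n"
        + "pdf.conversion.dpi=300\n";

    // Parses key=value text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Applies the documented defaults for optional properties.
    static Properties withDefaults(Properties user) {
        Properties defaults = new Properties();
        defaults.setProperty("admin.username", "admin");
        defaults.setProperty("pdf.conversion.dpi", "600");
        defaults.setProperty("pdf.conversion.type", "RGB");
        defaults.setProperty("pdf.conversion.format", "tiff");
        defaults.setProperty("ocr.worker.count", "2");
        defaults.setProperty("ocr.images.from.pdfs", "true");
        Properties merged = new Properties(defaults);
        merged.putAll(user); // user-supplied values win over defaults
        return merged;
    }

    // Returns the parent folder of a path, with a trailing slash,
    // as in the tesseract.bin and tesseract.data examples.
    static String parentFolder(String path) {
        Path parent = Paths.get(path).getParent();
        if (parent == null) {
            throw new IllegalArgumentException("No parent folder: " + path);
        }
        String s = parent.toString();
        return s.endsWith("/") ? s : s + "/";
    }

    public static void main(String[] args) {
        Properties config = withDefaults(parse(SAMPLE));
        config.setProperty("tesseract.bin", parentFolder("/usr/bin/tesseract"));
        config.setProperty("tesseract.data",
            parentFolder("/usr/share/tesseract/tessdata/"));
        for (String key : config.stringPropertyNames()) {
            System.out.println(key + " = " + config.getProperty(key));
        }
    }
}
```

Whether Giles requires the trailing slash on the two tesseract paths is not stated here; the sketch simply mirrors the examples in the list.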


Step-by-step guide

You will probably not get around learning a little about Maven in order to build Giles, but this step-by-step guide hopefully makes it as easy as possible.

...