This page describes how to install Giles. Giles depends on two other web applications: Digilib and Jars. Digilib serves images and Jars manages their metadata. You can run Giles without Jars, but many links will not work in that case.
System Requirements
- Java 8
- Tomcat 7
- Digilib
- Jars
- Tesseract
Note: if you don't have Tesseract installed, you can disable the OCR feature (see properties below) to keep Giles from trying to run OCR on images. If the OCR feature is enabled but Tesseract is not installed or configured, Giles will still work but will silently fail when trying to OCR images.
Notes on Giles & Digilib
Giles works under the assumption that it is running on the same machine as your Digilib installation. When uploading a file, Giles simply puts that file into a folder accessible by Digilib. If your setup differs, Giles won't work as is. However, if you are comfortable programming in Java, you could implement your own FileStorageManager (edu.asu.giles.files.impl.FileStorageManager), the class that handles file storage in Giles.
...
Code Block | ||||
---|---|---|---|---|
| ||||
create table UserConnection (userId varchar(255) not null, providerId varchar(255) not null, providerUserId varchar(255), rank int not null, displayName varchar(255), profileUrl varchar(512), imageUrl varchar(512), accessToken text not null, secret varchar(512), refreshToken varchar(512), expireTime bigint, primary key (userId, providerId, providerUserId)); create unique index UserConnectionRank on UserConnection(userId, providerId, rank); |
...
- giles.base.url: the final URL Giles will run at (e.g. http://myserver.org/giles). Giles will use that URL to build links to Digilib content.
- admin.password: the password for your admin user. It should be encoded with BCrypt (strength=4). Default is "admin".
- admin.username (optional): if you want your admin user to have a different name than "admin", you can specify that with this property.
- github.clientId: OAuth client id of your GitHub application registration.
- github.secret: OAuth client secret of your GitHub application registration.
- [starting with v0.7] google.clientId: OAuth client id of your Google application registration.
- [starting with v0.7] google.secret: OAuth client secret of your Google application registration.
- [starting with v0.7] mitreid.clientId: OAuth client id of your MITREid Connect application registration.
- [starting with v0.7] mitreid.secret: OAuth client secret of your MITREid Connect application registration.
- [starting with v0.7] mitreid.server.url: URL of your MITREid Connect server instance.
- [starting with v0.7] github.show.login: flag to specify if the GitHub login button should be shown on login page.
- [starting with v0.7] google.show.login: flag to specify if the Google login button should be shown on login page.
- [starting with v0.7] mitreid.show.login: flag to specify if the MITREid Connect login button should be shown on login page.
- [starting with v0.7] jwt.signing.secret: a secure-random secret key used to sign Giles API tokens.
- [starting with v0.7] jwt.signing.secret.apps: a secure-random secret key used to sign app tokens.
- db_files: Path to a folder in your file system that will hold Giles' database files.
- db.driver: if you don't use MySQL you can specify the appropriate driver here. Note that if you are not using MySQL you will also have to add the correct driver dependency to the pom.xml.
- db.database.url: the URL to the database (most likely something like jdbc:mysql://localhost:3306/giles).
- db.user: username of your database user.
- db.password: password of your database user.
- digilib.url: URL to your Digilib installation. The path should be the path to the Scaler servlet (e.g. http://myserver.org/digilib/servlet/Scaler).
- digilibBaseDir: path to the Digilib directory that should hold your images. Digilib has to have access to this directory.
- jars.url: URL to your Jars installation.
- [starting with v0.6] jars.file.url: path to file metadata in metadata service (automatically prefixed with value of jars.url)
- [starting with v0.6] metadata.upload.add: path to page in metadata service for adding metadata after an upload (automatically prefixed with value of jars.url)
- [starting with v0.6] metadata.service.doc.url: path to document metadata in metadata service (automatically prefixed with value of jars.url)
- buildNumber (optional): if you want Giles to show a specific version number, you can specify that version number with this property.
- pdfBaseDir: path to a folder in the file system to store uploaded PDFs
- pdf.conversion.dpi (optional): dpi used for converting PDFs to images. Default is 600.
- pdf.conversion.type (optional): image type to use when converting PDFs to images. Options are:
  - RGB (Red, Green, Blue) [default]
  - ARGB (Alpha, Red, Green, Blue)
  - GRAY (Shades of gray)
  - BINARY (Black or white)
- pdf.conversion.format (optional): image format to use when converting PDFs to images. Default is tiff. Should be one of the following: JPG, jpg, tiff, bmp, BMP, pcx, PCX, gif, GIF, WBMP, png, PNG, raw, RAW, JPEG, pnm, PNM, tif, TIF, TIFF, wbmp, jpeg
- [starting with v0.4] textBaseDir: path to a folder in the file system that holds extracted text files.
- [starting with v0.4] tesseract.bin: path to the parent folder of the tesseract executable. For example, if your tesseract executable is /usr/bin/tesseract, then this property should be set to /usr/bin/.
- [starting with v0.4] tesseract.data: path to the parent folder of your tesseract tessdata folder. For example, if your tessdata folder is at /usr/share/tesseract/tessdata/, then this property should be set to /usr/share/tesseract/.
- [starting with v0.9] tesseract.create.hocr (optional): true or false; if set to true, Giles will instruct tesseract to create hocr instead of plain text. Default is false.
- [starting with v0.4] ocr.worker.count (optional): number of threads that are used to run tesseract. Default is 2.
- [starting with v0.4] ocr.images.from.pdfs (optional): this property defines whether OCR is run on images that were created from PDFs. Should be true or false. Default is true.
- [starting with v0.4] log.level (optional): if Maven is run using the mydev or test profile, this property sets the log level. Default is debug. The prod profile sets this property to info.
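Some of the values above have to be derived rather than looked up. As a rough sketch (the helper names parent_dir and random_secret are made up for this example and are not part of Giles), the following shell functions compute the parent folder in the form shown for tesseract.bin and tesseract.data, and produce a random value that could serve as one of the jwt.signing.secret keys. Whether Giles expects a particular key length or encoding is not stated here, so the 64 hex characters are an assumption.

```sh
#!/bin/sh
# Hypothetical helpers for filling in some Giles build properties.

# parent_dir PATH: print the parent folder of PATH with a trailing slash,
# matching the form used in the tesseract.bin and tesseract.data examples.
parent_dir() {
  dir=$(dirname "$1")
  case "$dir" in
    /) printf '/\n' ;;          # avoid printing "//" for top-level paths
    *) printf '%s/\n' "$dir" ;;
  esac
}

# random_secret: print 32 bytes read from /dev/urandom as 64 hex characters.
# The length and hex encoding are assumptions, not a documented requirement.
random_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  printf '\n'
}
```

With these definitions, parent_dir /usr/bin/tesseract prints /usr/bin/ and parent_dir /usr/share/tesseract/tessdata/ prints /usr/share/tesseract/, which are the values used in the examples above.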
Step-by-step guide
You probably won't get around learning a little about Maven in order to build Giles, but this step-by-step guide hopefully makes it as easy as possible.
1. Install Maven.
2. In a terminal, go into the folder giles-{version}/giles-spring.
3. You need only one command to build Giles: mvn clean package. This deletes all previously built files and generates a war file. However, you need to specify the properties listed above for Giles to work correctly.
4. To specify a property when running Maven, the easiest way is to append -D{property_name}={property_value}. For example, if your database user is called "giles", you would append -Ddb.user=giles to the Maven command. The complete command would look like this: mvn clean package -Ddb.user=giles.
5. Append each property to the command as described in step 4. Your complete command will look something like this:
   mvn clean package -Ddb_files=/path/to/db/files -Dadmin.password=GilesPassword -Dgithub.clientId=githubClientId -Dgithub.secret=githubClientSecret -Ddb.driver=com.mysql.jdbc.Driver -Ddb.database.url=jdbc:mysql://localhost:3306/giles -Ddb.user=giles -Ddb.password=GilesDbPassword -Ddigilib.url=http://myserver.org/digilib/servlet/Scaler -DdigilibBaseDir=/path/to/digilib/images -Djars.url=http://myserver.org/jars -Dgiles.base.url=http://myserver.org/giles -DpdfBaseDir=/path/to/pdfs/folder
6. Maven will create a new folder in giles-spring called target. If Maven ran successfully, you will find a file called giles.war inside this folder.
7. Simply put giles.war into your Tomcat's webapps directory and you should be good to go!
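Typing a dozen -D options by hand is error-prone. One possible shortcut, sketched here under assumptions, is to keep the properties in a plain key=value file and turn each line into a -D argument. The function name giles_mvn_args and the file name giles-build.properties are hypothetical; neither Giles nor Maven provides them.

```sh
#!/bin/sh
# Hypothetical helper: turn a key=value file into Maven -D arguments.

# giles_mvn_args FILE: print one -Dkey=value argument per line of FILE,
# skipping empty lines and lines that start with '#'.
giles_mvn_args() {
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      '' | '#'*) continue ;;
    esac
    printf '%s\n' "-D$line"
  done < "$1"
}
```

Given a file giles-build.properties that contains lines such as db.user=giles, the build could then be started with mvn clean package $(giles_mvn_args giles-build.properties). Because the shell splits the unquoted substitution on whitespace, this only works as long as no property value contains spaces or wildcard characters.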
Note: starting with version v0.5, many of the system properties can also be configured through the web app itself by any admin user.