The Giles Ecosystem is a distributed system that runs OCR on images and extracts images and text from PDF files.

Components

The core components of the Giles Ecosystem are located in the following repositories:

Dependencies

The system depends on the following software:

The Giles Ecosystem documentation (in progress) can be found here: Giles Ecosystem Home.

A Docker Compose file for running the Giles Ecosystem in several Docker containers is available at https://github.com/diging/giles-eco-docker