docuManager

Summary

Frontend for displaying and annotating digital texts.

docuManager imports collections of images from a folder in a filesystem. It allows transcriptions or OCR-generated texts to be added to the images and displays them side by side. Images and texts can then be annotated. Annotations can be stored in any open-annotation compatible file system. The image component is based on digilib.
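As an illustration of what an annotation on an image region might look like, here is a minimal sketch that builds a record following the W3C Web Annotation Data Model (the successor to Open Annotation). The exact format docuManager writes is not documented here, so the function name make_annotation, the example URL and the choice of percent-based coordinates are assumptions for illustration only. The "id" field is left out on the assumption that the annotation store assigns it.

```python
import json


def make_annotation(image_url, comment, x, y, w, h):
    """Build a Web Annotation dict for a rectangular image region.

    x, y, w, h are given in percent of the image size (an assumption
    made for this sketch, not a documented docuManager convention).
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                # Media Fragments syntax for a rectangular region
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": "xywh=percent:{},{},{},{}".format(x, y, w, h),
            },
        },
    }


# Hypothetical image URL, used only to show the resulting structure.
annotation = make_annotation(
    "http://example.org/images/page-001.jpg", "Marginal note", 10, 20, 30, 5
)
print(json.dumps(annotation, indent=2))
```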

docuManager is integrated into the giles-eco system and can be called from giles after uploading images.

Added texts are indexed using apache-solr. The frontend allows searching either within a single document or across a whole collection.
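The difference between the two search scopes can be sketched as a filter query on Solr's standard select handler. This is only a sketch: the field names text, document_id and collection_id, the core name docs and the helper functions are hypothetical and do not come from docuManager's actual schema.

```python
from urllib.parse import urlencode


def solr_quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_search_params(term, document_id=None, collection_id=None):
    """Return Solr query parameters, optionally limited to one scope.

    Field names are assumptions made for this example.
    """
    params = [("q", "text:" + solr_quote(term)), ("wt", "json")]
    if document_id is not None:
        # Search within a single document
        params.append(("fq", "document_id:" + solr_quote(document_id)))
    elif collection_id is not None:
        # Search across a whole collection
        params.append(("fq", "collection_id:" + solr_quote(collection_id)))
    return params


def build_search_url(core_url, params):
    """Append the select handler and the encoded parameters to a core URL."""
    return core_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URL and identifiers.
url = build_search_url(
    "http://localhost:8983/solr/docs",
    build_search_params("rosa", document_id="doc-1"),
)
print(url)
```

Using a filter query (fq) rather than extending q keeps the scope restriction separate from the search term, so the same term can be reused for both scopes.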

docuManager doesn't allow editing of metadata. It depends on metadata that is either provided by a database (exposed via REST) or stored in the filesystem in MPIWG's index.meta format.

Supported formats

Images

All image formats supported by digilib. PDFs can be imported; existing annotations in a PDF are extracted and stored as annotations on the image.

Texts

Texts have to be in TEI, ALTO-XML or hOCR. The display environment only displays a subset of TEI. If the position of the text is provided in ALTO-XML or hOCR, the text is displayed at this position to generate a side-by-side view of images and text.
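To show how text positions could be read from hOCR, the following sketch extracts each word and its bounding box from the "bbox x0 y0 x1 y1" entry of the title attribute, then converts the box to fractions of the page size so that an overlay can scale with the image. It is not docuManager's own parser; the class and function names are hypothetical, and only elements with the ocrx_word class are handled.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) pairs for hOCR word elements."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bounding box of the word currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = BBOX_RE.search(attrs.get("title") or "")
            self._bbox = tuple(int(v) for v in match.groups()) if match else None

    def handle_data(self, data):
        if self._bbox is not None and data.strip():
            self.words.append((data.strip(), self._bbox))
            self._bbox = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._bbox = None  # an empty word must not leak its box


def extract_words(hocr):
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


def to_relative(bbox, page_width, page_height):
    """Convert a pixel box to (x, y, width, height) as fractions of the page."""
    x0, y0, x1, y1 = bbox
    return (x0 / page_width, y0 / page_height,
            (x1 - x0) / page_width, (y1 - y0) / page_height)


# Small hand-written hOCR fragment used as sample input.
sample = (
    "<div class='ocr_page' title='bbox 0 0 1000 2000'>"
    "<span class='ocr_line' title='bbox 100 200 500 260'>"
    "<span class='ocrx_word' title='bbox 100 200 300 260; x_wconf 95'>Hello</span> "
    "<span class='ocrx_word' title='bbox 320 200 500 260'>world</span>"
    "</span></div>"
)
words = extract_words(sample)
for text, bbox in words:
    print(text, to_relative(bbox, 1000, 2000))
```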

Export

Annotations of texts can be exported as ODF, using the LibreOffice API to format the text.

Implementation Details

  • Django >1.10
  • Python 3.x

Dependencies

  • digilib
  • LibreOffice (only for export)
  • apache-solr