Software Requirements (Module 1)
Please install the following applications well in advance of our meeting.
- Zotero Firefox Plugin. Zotero is a bibliographic metadata management tool with some nifty features. We'll use it to build corpora from online databases. Be sure to install the plugin for Firefox!
- ImageMagick. This package provides all kinds of nifty tools for working with images. We'll need it to work with PDFs later on.
- OSX: From the terminal, do:
brew install imagemagick
- Ubuntu Linux: From the terminal, do:
sudo apt-get install imagemagick
- Xpdf. Command-line tools for working with PDFs.
- Tesseract OCR. Tesseract is an open source engine for Optical Character Recognition (OCR). We'll use this to extract plain text from PDFs.
- OSX: From the terminal, do:
brew install tesseract --all-languages
- Ubuntu Linux: see this page.
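Once everything is installed, you may want to confirm that the command-line tools can be found from your terminal. The following is a minimal sketch in Python, not part of the official course materials. The executable names are assumptions: ImageMagick usually installs `magick` (version 7) or `convert` (version 6), Xpdf provides `pdftotext`, and Tesseract provides `tesseract`. The names `REQUIRED_TOOLS` and `find_missing` are hypothetical and used only for illustration.

```python
import shutil

# Assumed executable names for each package (adjust if your install differs).
REQUIRED_TOOLS = {
    "ImageMagick": ["magick", "convert"],
    "Xpdf": ["pdftotext"],
    "Tesseract OCR": ["tesseract"],
}


def find_missing(tools):
    """Return the packages for which no candidate executable is on the PATH."""
    missing = []
    for package, candidates in tools.items():
        # shutil.which returns None when the executable cannot be found.
        if not any(shutil.which(exe) for exe in candidates):
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All command-line tools were found.")
```

If a package is reported as not found, re-run its install command above and open a new terminal before checking again. The Zotero plugin is a browser extension, so this check does not cover it.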