Preparing your computer for the course
This how-to walks you through setting up your computer for this course. If you are using a computer with Mac OSX or Linux (e.g. Ubuntu), you can proceed with the tutorial below. If you're stuck with Windows, please see Help! I'm stuck with Windows :-(.
If you run into trouble, see Getting Help.
Everybody
- Install Anaconda Python 2.7. Anaconda is a distribution of Python, a popular programming language for scientific computing. Anaconda comes pre-loaded with over 300 awesome Python packages (libraries), so it really cuts down on set-up time. Be sure to select Python 2.7, NOT Python 3.4.
- Install Atom. Atom is a slick text editor.
- Install Java 8.
- If you don't know what version of Java you have installed, see this page (https://java.com/en/download/help/version_manual.xml) for instructions.
- To download the latest version of Java, see this page: http://www.oracle.com/technetwork/java/javase/overview/java8-2100321.html
- Install the latest version of Firefox web browser.
- Install the Zotero Firefox Plugin. Zotero is a bibliographic metadata management tool with some nifty features. We'll use it to build corpora from online databases. Be sure to install the plugin for Firefox!
- Install ImageMagick. This package provides all kinds of nifty tools for working with images. We'll need it to work with PDFs later on.
- OSX: From the terminal (with Homebrew installed; see Mac OSX below), do:
brew install imagemagick
- Ubuntu Linux: From the terminal, do:
sudo apt-get install imagemagick
- Install Xpdf. Command-line tools for working with PDFs.
- Install Tesseract OCR. Tesseract is an open source engine for Optical Character Recognition (OCR). We'll use this to extract plain text from PDFs.
- OSX: From the terminal, do:
brew install tesseract --all-languages
- Ubuntu Linux: see this page.
- Install SmartGit. This is a great tool for working with Git repositories.
- Install Cytoscape (http://cytoscape.org/)
- Install NLTK Corpora. Follow the instructions at http://www.nltk.org/data.html#command-line-installation.
- Download Stanford NER.
- Install Tethne. Once you have installed Anaconda (or Python 2.7 + setuptools), you should be able to install Tethne from the terminal via pip:
$ pip install tethne
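Once you've worked through the list above, a quick sanity check can save you some head-scratching later. The script below is only a minimal sketch, not an official part of the course materials: it looks for a few command-line tools on your PATH and tries to import a couple of Python packages. The command names (java, convert, pdftotext, tesseract, pip) and module names (tethne, nltk) are assumptions about what each install provides, so adjust them if your system uses different names.

```python
from __future__ import print_function

import importlib
import os
import sys

# ASSUMPTION: these are the command names each install puts on your PATH.
REQUIRED_COMMANDS = {
    "java": "Java 8",
    "convert": "ImageMagick",
    "pdftotext": "Xpdf",
    "tesseract": "Tesseract OCR",
    "pip": "pip (comes with Anaconda)",
}

# ASSUMPTION: these are the importable module names for the Python packages.
REQUIRED_MODULES = ["tethne", "nltk"]


def find_on_path(command, path=None):
    """Return the full path of `command` if it is on the search path, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_commands(commands, path=None):
    """Return a sorted list of the commands that cannot be found."""
    return sorted(c for c in commands if find_on_path(c, path) is None)


def missing_modules(modules):
    """Return a sorted list of the modules that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    # The course expects Python 2.7, so print what is actually running.
    print("Python version: %d.%d" % sys.version_info[:2])
    for name in missing_commands(REQUIRED_COMMANDS):
        print("Command not found: %s (%s)" % (name, REQUIRED_COMMANDS[name]))
    for name in missing_modules(REQUIRED_MODULES):
        print("Python package not importable: %s" % name)
```

To try it, save the script under any name you like (check_setup.py is just a suggestion) and run it from the terminal with python check_setup.py. If it prints only the Python version, the tools it checks for were found. It does not check Atom, Firefox, Zotero, SmartGit, Cytoscape, Stanford NER or the NLTK corpora.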
Mac OSX
- Install Homebrew. Homebrew is a package manager that makes it easy to install software from the command line.