...

This course focuses on ways of identifying theoretically relevant features in (or around) texts using computational methods, and on using those features as the basis for quantitative analysis. Given the limited time available, you will not be an expert in any of these methods by the end of the course. Instead, my goal is to give you a thorough enough introduction to a range of methods that, as you develop your research project, you will know roughly where to start.

Programming Required

It is extremely challenging to design a course that introduces truly useful computational methods while avoiding advanced programming techniques. The fact of the matter is that a rigorous computational analysis of texts requires some degree of computer programming. Luckily, some programming languages are abstract enough (i.e., close enough to the semantics of everyday language) that the determined scholar can get up and running fairly quickly. Interpreted languages like Python, Ruby, and R are fairly easy to learn, and each has a rich ecosystem of packages and extensions for quantitative analysis, including text-based analyses.
My strategy in designing this course was therefore to provide an introduction to quantitative text analysis in Python, without requiring you to know any Python ahead of time. To do this, I have created a series of “IPython Notebooks”: interactive notebooks that run in your web browser, pre-loaded with blocks of code surrounded by expository text. You can run the analyses in these notebooks without changing more than one or two lines of code, using text collections that I will provide. If you want to use your own text collections, or tweak the methods, you can alter the code to your heart’s content.
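
To give a rough sense of what a code block in such a notebook might look like, here is a minimal sketch in Python. It is not taken from the course notebooks: the two sample sentences, the `TOP_N` parameter, and the `tokenize` helper are all hypothetical, chosen only to illustrate the idea of running an analysis and changing a line or two.

```python
import re
from collections import Counter

# Hypothetical stand-in for a text collection (not a course dataset).
texts = [
    "The quick brown fox jumps over the lazy dog.",
    "The dog barks; the fox runs away.",
]

# The kind of line you might change: how many of the most frequent words to report.
TOP_N = 3


def tokenize(text):
    """Lowercase the text and return its runs of the letters a-z as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Count how often each word occurs across the whole collection.
word_counts = Counter()
for text in texts:
    word_counts.update(tokenize(text))

# The TOP_N most frequent words, as (word, count) pairs.
top_words = word_counts.most_common(TOP_N)
print(top_words)
```

Replacing the contents of `texts` with your own documents, or changing `TOP_N`, is the sort of one- or two-line edit described above. Word counts are only one very simple example of a feature that can be extracted from a text.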

Format

This course will mix short lectures with hands-on coding exercises. The course is divided into bite-size "modules". For each module, we will start with 15-20 minutes of lecture on the core concepts of the module. We will then (usually) work through some code samples together, with further exposition on Python coding techniques. You will then have time to play with the code, either by tinkering with the parameters or applying it to other datasets.

...