Module 2: Introduction to Networks
This module will provide an introduction to networks and graph theory, as well as tools for generating, analyzing, and visualizing network models. This will involve some very light-touch, guided use of Python. By the end of the module, we will be using the collections that we compiled and curated in Module 1, as well as bibliographic metadata from other sources, to create a variety of corpus-based network models. We will move through this module in parts, and post material for each part as we go along.
Before You Proceed!
If you haven't finished installing all of the software in Preparing your computer for the course, please do so now.
Goals
In this module, we will use network analysis and graph theory as an entry point to analyzing bibliographic metadata. Networks are seemingly natural models for bibliographic collections, because they can be interpreted relationally: we are often interested in latent relationships among documents (e.g. shared topical orientation), or in relationships among features of those documents (e.g. authors, or words). Networks are also attractive because there is a rich ecosystem of computational tools and quantitative statistics for analyzing and describing them, and these are relatively easy to grasp quickly. It is important to keep in mind, however, that networks are not always the most suitable models. We should always think carefully about how we theorize bibliographic metadata, and what that entails for our analytic approach.
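To make the relational reading concrete, here is a minimal sketch of how a coauthorship network might be derived from bibliographic records. It uses only the Python standard library, not tethne. The `records` list and the `coauthor_edges` function are hypothetical, invented for illustration; real metadata would come from a parser rather than being typed in by hand.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (an assumption for illustration only).
records = [
    {"title": "Paper A", "authors": ["Smith", "Jones", "Lee"]},
    {"title": "Paper B", "authors": ["Smith", "Lee"]},
    {"title": "Paper C", "authors": ["Garcia"]},
]

def coauthor_edges(records):
    """Count how many papers each pair of authors shares."""
    edges = Counter()
    for record in records:
        # Sort the names so (a, b) and (b, a) count as one undirected edge.
        for pair in combinations(sorted(set(record["authors"])), 2):
            edges[pair] += 1
    return edges

edges = coauthor_edges(records)
for (a, b), weight in sorted(edges.items()):
    print(a, "--", b, "weight:", weight)
```

Note that the sole author of "Paper C" produces no edges at all. Whether such isolated authors belong in the model is exactly the kind of theoretical decision that the choice of a network representation forces on us.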
By the end of this module, you should:
- Be able to use tethne (a Python library) to parse bibliographic metadata from the Web of Science, JSTOR Data-for-Research, and your Zotero collections.
- Be able to use tethne to generate several different kinds of network models using those metadata.
- Be able to use Cytoscape and/or Gephi to visualize your network models.
- Possess a basic vocabulary and understanding of graph theory, and of some common statistics used in network analysis, including:
  - Components, nodes, edges, and degree;
  - Directed and undirected graphs;
  - Centrality statistics (betweenness, closeness);
  - Concepts of flow, conductivity, and connectivity;
  - Clustering.
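As a preview of that vocabulary, the following sketch computes a few of these quantities on a small undirected graph, using only the Python standard library. The graph and all function names are hypothetical, chosen for illustration. The closeness formula is one common definition, computed within a node's own component; network libraries may normalize it differently. Betweenness is omitted to keep the sketch short.

```python
from collections import deque

# Hypothetical undirected graph: each node maps to the set of its neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"F"},
    "F": {"E"},
}

def degree(graph, node):
    """Number of edges attached to a node."""
    return len(graph[node])

def distances_from(graph, source):
    """Breadth-first search: shortest path length to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def components(graph):
    """Group nodes into connected components."""
    seen, result = set(), []
    for node in sorted(graph):
        if node not in seen:
            component = set(distances_from(graph, node))
            seen |= component
            result.append(component)
    return result

def closeness(graph, node):
    """Reachable nodes divided by the sum of distances to them."""
    dist = distances_from(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def local_clustering(graph, node):
    """Fraction of possible links among a node's neighbors that exist."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Each neighbor-to-neighbor link is seen from both ends, so halve the count.
    links = sum(1 for u in neighbors for v in graph[u] if v in neighbors) / 2
    return links / (k * (k - 1) / 2)

print("degree of C:", degree(graph, "C"))
print("components:", [sorted(c) for c in components(graph)])
print("closeness of C:", closeness(graph, "C"))
print("clustering of A:", local_clustering(graph, "A"))
```

Here the graph has two components, and node C is both the highest-degree node and the most central by closeness in its component. In practice we will rely on existing tools for these statistics rather than writing them by hand.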