Kelle Dhein
2/15/16
After reading a few of Erick's tethne notebooks and the ScottBot Irregular series on network models, I began working with the dfr Python package. Using a sample data set of academic articles that I downloaded from Erick's tethne tutorials, I have begun writing a program that finds the normalized pointwise mutual information (npmi) between two words. So far I have written the part that returns the number of documents containing at least one instance of a given word. Next I will write a second part that returns the number of documents containing both of two input words. I will then combine those two parts to compute the pointwise mutual information between the words, and finally extend that program so it returns the normalized pointwise mutual information.
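The plan above can be sketched roughly as follows. This is only a minimal illustration, not the program described in these notes: it assumes each document has already been reduced to a set of words, and the function names (`document_frequency`, `joint_document_frequency`, `npmi`) and the sample documents are made up for the example. Returning -1.0 for pairs that never co-occur and 0.0 when both words appear in every document are conventions chosen here, not something the notes specify.

```python
import math


def document_frequency(docs, word):
    """Count the documents that contain `word` at least once."""
    return sum(1 for doc in docs if word in doc)


def joint_document_frequency(docs, word_a, word_b):
    """Count the documents that contain both `word_a` and `word_b`."""
    return sum(1 for doc in docs if word_a in doc and word_b in doc)


def npmi(docs, word_a, word_b):
    """Normalized pointwise mutual information, in the range [-1, 1]."""
    n = len(docs)
    if n == 0:
        raise ValueError("need at least one document")

    # Document-level probabilities (true division gives floats in Python 3).
    p_a = document_frequency(docs, word_a) / n
    p_b = document_frequency(docs, word_b) / n
    p_ab = joint_document_frequency(docs, word_a, word_b) / n

    if p_ab == 0:
        # Assumed convention: the words never co-occur, so use the lower bound.
        return -1.0
    if p_ab == 1:
        # Assumed convention: both words are in every document, so the
        # normalizer -log(p_ab) is zero and the ratio is undefined.
        return 0.0

    pmi = math.log(p_ab / (p_a * p_b))
    return pmi / -math.log(p_ab)


# Hypothetical toy corpus: each document is a set of its words.
sample_docs = [
    {"gene", "network"},
    {"gene", "network"},
    {"protein"},
    {"cell"},
]
print(npmi(sample_docs, "gene", "network"))
```

Dividing the pointwise mutual information by -log p(a, b) is what bounds the score, so that pairs of rare words and pairs of common words can be compared on the same scale.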
2/29/16
I finished writing the program to find the normalized pointwise mutual information between two words. Along the way I got a refresher on logarithms, on making sure integers are converted to floats, on using "try/except" instead of "if/then", and on defining variables within a program. Now I just need to have the program run a set number of times, using random words from the corpus as inputs in each iteration. Thanks to what I have learned in this course, I was also able to turn a network graph I was working on for another class into a shiny, orange Cytoscape network graph.
3/16/16
Because of spring break, I have been in class only twice since my last update. I am now trying to create a network graph of all the word pairs in the corpus I am working with that have an npmi value greater than 0.48014461624471066. Erick and I decided to use that value as a threshold because all pairs of words with an npmi value greater than or equal to it are in the top 1% of npmi values.
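One way a top-1% cutoff like this could be found is sketched below. It is an illustration under assumptions rather than the method actually used: the scores are assumed to sit in a dictionary keyed by word pair, the names `top_fraction_threshold` and `pairs_at_or_above` are hypothetical, and the comparison is taken as "greater than or equal to".

```python
def top_fraction_threshold(scores, fraction=0.01):
    """Return the smallest score that still falls in the top `fraction`."""
    ranked = sorted(scores, reverse=True)
    if not ranked:
        raise ValueError("need at least one score")
    # Keep at least one score even when the corpus is very small.
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


def pairs_at_or_above(pair_scores, threshold):
    """Keep only the word pairs whose score reaches the threshold."""
    return {
        pair: score
        for pair, score in pair_scores.items()
        if score >= threshold
    }


# Hypothetical scores for 200 word pairs, evenly spread between 0 and 1.
example_scores = {("a%d" % i, "b%d" % i): i / 200 for i in range(1, 201)}
cutoff = top_fraction_threshold(example_scores.values())
strongest = pairs_at_or_above(example_scores, cutoff)
print(cutoff, len(strongest))
```

If several pairs tie exactly at the cutoff, slightly more than 1% of the pairs will pass the filter.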
3/28/16
Since my last update, I have used networkx to finish creating a network graph of the word pairs with an npmi value over the 0.48 threshold. I then viewed the network in Cytoscape and learned how to organize and view the graph in different ways. Now I am working on a series of program files that use networkx to create a network graph of a gene regulatory network.
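Building such a graph with networkx might look roughly like the sketch below. The function name `build_npmi_graph`, the `weight` edge attribute, and the sample scores are assumptions made for illustration; the 0.48 cutoff is a rounded stand-in for the threshold recorded in the earlier entry.

```python
import networkx as nx


def build_npmi_graph(pair_scores, threshold):
    """Build an undirected graph with one edge per word pair at or above
    the threshold, storing the npmi score as the edge weight."""
    graph = nx.Graph()
    for (word_a, word_b), score in pair_scores.items():
        if score >= threshold:
            graph.add_edge(word_a, word_b, weight=score)
    return graph


# Hypothetical npmi scores for a handful of word pairs.
example_scores = {
    ("gene", "network"): 0.72,
    ("gene", "protein"): 0.51,
    ("cell", "network"): 0.12,
}
graph = build_npmi_graph(example_scores, threshold=0.48)
print(graph.number_of_nodes(), graph.number_of_edges())
```

To look at the result in Cytoscape, the graph can be saved with `nx.write_graphml(graph, "npmi_pairs.graphml")` (the file name is arbitrary) and then imported as a network file.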