Verb Paraphrasing Experiment

Tuesday, March 11th, 2008

addressed ~ toasted (sometimes)

I’m taking an NLP class this semester, and it has been interesting. We just completed our first problem set: find verb pairs such that you can replace one with the other in at least one sentence (without changing the meaning of the sentence too much). Example: “President Bush addressed/toasted the crowd.”

For my part, I implemented an algorithm by Glickman and Dagan that takes a probabilistic, unsupervised approach to the problem. I'm posting it here because my code will just rot on my machine unless I do something with it. The code works on the AQUAINT corpus, parsed with Minipar. The algorithm finds some legitimate paraphrases and also some bogus ones. The top 5 ranked verb pairs, drawn from a New York Times corpus:

take approached (good)
become defined (not so good)
abandon put (bad)
planned mounted (good)
addressed toasted (good)
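
To give a feel for how pairs like these can fall out of parsed text, here is a rough sketch. It is not the problem-set code and not the Glickman and Dagan scoring: it swaps in a plain Jaccard overlap over shared (subject, object) contexts as a stand-in for their probabilistic ranking. The function name `candidate_paraphrases`, the `min_shared` threshold, and the triple format are all assumptions made for illustration; a real run would get its triples from a dependency parser.

```python
from collections import defaultdict
from itertools import combinations


def candidate_paraphrases(triples, min_shared=2):
    """Rank verb pairs by how much their (subject, object) contexts overlap.

    `triples` is an iterable of (subject, verb, object) tuples, assumed to
    come from a dependency parse. This is a simplified stand-in, not the
    probabilistic score from the paper.
    """
    # Collect the distinct argument contexts each verb appears with.
    contexts = defaultdict(set)
    for subject, verb, obj in triples:
        contexts[verb].add((subject, obj))

    scored = []
    for v1, v2 in combinations(sorted(contexts), 2):
        shared = contexts[v1] & contexts[v2]
        if len(shared) >= min_shared:
            # Jaccard overlap: shared contexts over all contexts of either verb.
            score = len(shared) / len(contexts[v1] | contexts[v2])
            scored.append((score, v1, v2))

    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(v1, v2, score) for score, v1, v2 in scored]


# Toy triples, invented for this example.
toy_triples = [
    ("bush", "address", "crowd"),
    ("bush", "toast", "crowd"),
    ("mayor", "address", "guests"),
    ("mayor", "toast", "guests"),
    ("bush", "sign", "bill"),
]
pairs = candidate_paraphrases(toy_triples)
```

On the toy data only "address" and "toast" share enough contexts to be paired. On a real corpus, overlap this crude would also surface plenty of bogus pairs, which is the failure mode the ranking above shows.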

Debate Diagrams: Primaries Visualization

Sunday, December 16th, 2007

debate diagrams

I built Debate Diagrams to make sense of the Democratic primary debates. In such a crowded field, the candidates need to distinguish themselves: one strategy is direct comparison. Debate Diagrams parses the transcripts of 5 officially sanctioned Democratic debates and places an arc between two candidates whenever one mentions the other by name. The arcs grow denser as the mentions accumulate.
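
The counting behind the arcs can be sketched in a few lines. This is only an illustration in Python, not the Flex code behind the piece, and it assumes a transcript format where each line starts with the speaker's surname and a colon (for example `CLINTON: ...`). The function name `count_mentions` and the sample lines are made up for the example.

```python
import re
from collections import Counter


def count_mentions(transcript_lines, candidates):
    """Count how often each candidate mentions each other candidate by name.

    Assumes lines shaped like 'SPEAKER: text'. Lines from anyone who is not
    in `candidates` (moderators, audience) are skipped.
    """
    names = {name.upper(): name for name in candidates}
    arcs = Counter()
    for line in transcript_lines:
        speaker, sep, text = line.partition(":")
        speaker = speaker.strip().upper()
        if not sep or speaker not in names:
            continue
        for key, name in names.items():
            if key == speaker:
                continue  # ignore candidates naming themselves
            # Whole-word, case-insensitive match on the surname.
            pattern = r"\b" + re.escape(name) + r"\b"
            hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
            if hits:
                arcs[(names[speaker], name)] += hits
    return arcs


# Invented lines, just to exercise the function.
sample = [
    "CLINTON: I agree with Senator Obama, but Obama is wrong on this.",
    "OBAMA: Senator Clinton and Senator Edwards both voted for it.",
    "MODERATOR: Senator Obama, your response?",
    "EDWARDS: Let me talk about poverty.",
]
arcs = count_mentions(sample, ["Clinton", "Obama", "Edwards"])
```

Each key in `arcs` is a (speaker, mentioned) pair and each count is how dense that arc would be drawn.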

My visualization draws substantial inspiration from Martin Wattenberg’s fantastic piece, The Shape of Song. It also follows on the heels of a similar visualization produced by the NY Times for last Sunday’s paper. It’s my first project in Flex.

Try the interactive version!