Visit my Georgia Tech site

I recently graduated and started a new research lab at Georgia Tech. For my latest work please see the comp.social research lab site.

Georgia Tech!!

I’m joining the faculty at Georgia Tech!! In the fall, I’ll be an Assistant Professor of Interactive Computing, in the College of Computing. I’m deliriously excited to join their faculty. It’s like getting picked to open for the Stones, in 1967.

I may help out with a class in the fall, then I’ll start teaching properly in the spring. Until then, I’m madly dissertating, distracting myself with Link Different, attending ICWSM, perhaps ASA (say hi!) and looking for somewhere to live in Atlanta.

p.s. I know most academics don’t like (or don’t admit to liking) sports. But I love college basketball. *gasp* I come from a school with a big I for a mascot; the yellow jacket is a big improvement. And now I get to heckle Duke anytime I want.

Widespread Worry and the Stock Market

UPDATE: I have released the classifiers, R scripts and aggregate data from this paper. The archive has a README to get you started and some example Java showing how to use the classifiers. Get it here.

I have a new paper at ICWSM 2010. I’m really looking forward to all the great work in the program. The central thesis of my paper: estimating anxiety, worry and fear from blogs provides some novel information about future stock market prices.

ABSTRACT: Our emotional state influences our choices. Research on how it happens usually comes from the lab. We know relatively little about how real world emotions affect real world settings, like financial markets. Here, we demonstrate that estimating emotions from weblogs provides novel information about future stock market prices. That is, it provides information not already apparent from market data. Specifically, we estimate anxiety, worry and fear from a dataset of over 20 million posts made on the site LiveJournal. Using a Granger-causal framework, we find that increases in expressions of anxiety, evidenced by computationally-identified linguistic features, predict downward pressure on the S&P 500 index. We also present a confirmation of this result via Monte Carlo simulation. The findings show how the mood of millions in a large online community, even one that primarily discusses daily life, can anticipate changes in a seemingly unrelated system. Beyond this, the results suggest new ways to gauge public opinion and predict its impact.

pdf Widespread Worry and the Stock Market.
Proc. ICWSM, 2010.
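
For readers who haven’t met Granger causality: the question is whether past values of one series improve a forecast of another series beyond what that series’ own past already provides. Below is a minimal single-lag sketch in plain Python. It is not the code from the paper (the released archive has R scripts for that). The series names, coefficients and data here are synthetic and made up purely for illustration.

```python
import random

def ols_rss(rows, targets):
    """Least squares via the normal equations; returns the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    # Solve (X'X) beta = X'y with Gaussian elimination and partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = aug[i][k] - sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = s / aug[i][i]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' using one lag of each series."""
    targets = y[1:]
    # Restricted model: y[t] explained by its own previous value only.
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: additionally uses the previous value of x.
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)

# Synthetic data (an assumption for the demo): yesterday's "anxiety"
# pushes today's "returns" down, and nothing flows the other way.
random.seed(0)
n = 300
anxiety = [random.gauss(0, 1) for _ in range(n)]
returns = [0.0]
for t in range(1, n):
    returns.append(0.3 * returns[t - 1] - 0.8 * anxiety[t - 1] + random.gauss(0, 0.5))

f_forward = granger_f(anxiety, returns)  # does anxiety help predict returns?
f_reverse = granger_f(returns, anxiety)  # does the reverse hold?
print(f_forward, f_reverse)
```

On data built this way, the forward statistic should dwarf the reverse one. A real analysis would pick the number of lags carefully and turn the F statistic into a p-value.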

We Meddle Lists

Making lists is tedious. So it is on Twitter. We make lists to share with the world, but no one will make personal lists for us. You have to make the close friends list, the family list, the tech list, the coworker list. Phew.

That’s the problem We Meddle Lists wants to solve. It uses the history you’ve naturally built up in Twitter and turns it into some nifty (private) lists. Below is one of the lists We Meddle created for me. I track Inner Circle in Seesmic’s Twitter app. This list is especially useful when I’ve been away awhile and need to catch up. We Meddle also does some cool community detection. Try it here!

my inner circle

CSCW 2010: Understanding Deja Reviewers

I’m happy to announce a new paper, a departure from my thesis work. It’s going to appear at CSCW 2010, and it looks at people who write product reviews that really look like other reviews. I call them deja reviewers. I’m also happy to report that the note got the best of CSCW award. Very cool!

ABSTRACT: People who review products on the web invest considerable time and energy in what they write. So why would someone write a review that restates earlier reviews? Our work looks to answer this question. In this paper, we present a mixed-method study of deja reviewers, latecomers who echo what other people said. We analyze nearly 100,000 Amazon.com reviews for signs of repetition and find that roughly 10–15% of reviews substantially resemble previous ones. Using these algorithmically-identified reviews as centerpieces for discussion, we interviewed reviewers to understand their motives. An overwhelming number of reviews partially explains deja reviews, but deeper factors revolving around an individual’s status in the community are also at work. The paper concludes by introducing a new idea inspired by our findings: a self-aware community that nudges members toward community-wide goals. (espresso machine courtesy of jakeliefer.)

pdf Understanding Deja Reviewers.
Proc. CSCW, 2010.
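
The abstract doesn’t spell out how the repetition was detected, so here is one plausible way to flag a review that resembles an earlier one: compare overlapping word pairs (shingles) using Jaccard similarity. This is a sketch under my own assumptions, not the method from the paper. The 0.5 threshold and the sample reviews are invented for illustration.

```python
import re

def shingles(text, size=2):
    """Set of consecutive word tuples, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Shared shingles divided by all shingles; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def flag_deja_reviews(reviews, threshold=0.5):
    """Indices of reviews that resemble any review posted before them."""
    seen = []
    flagged = []
    for index, text in enumerate(reviews):
        current = shingles(text)
        if any(jaccard(current, earlier) >= threshold for earlier in seen):
            flagged.append(index)
        seen.append(current)
    return flagged

# Made-up reviews, listed in posting order.
reviews = [
    "This espresso machine makes great coffee and is easy to clean.",
    "Arrived late and the box was damaged.",
    "This espresso machine makes great coffee and it is easy to clean!",
]
flagged = flag_deja_reviews(reviews)
print(flagged)
```

Comparing every review against every earlier one is quadratic, which is fine for a toy but would need something smarter (hashing, for instance) at the scale of 100,000 reviews.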

Who Do You Gossip About?

I so should be doing other things. Like reviewing CHI papers. But I’m making fun little apps. I’ve always been fascinated with gossip. “Gossip” has a negative connotation, but it’s essential to social life. In a few bored hours last night, I wrote a little gossip app, and you can download it.

Before we go any further, there are two pretty tight requirements: you need to use Mail.app on a Mac and you need to use IMAP. The app is called “Bit of Gossip.” It crawls through your sent mail looking for people you mention in the message body but don’t include on the recipient list. Don’t worry, we all do it. And don’t worry, the app does everything locally; your mail never leaves your machine.

This isn’t a research project. Just a fun little hack. It’s also pretty bare bones. A little dialog just pops up as it processes, then TextEdit shows you the results. Like I said, not a research project, not a finished project. But I found it pretty fun and enlightening. Fair warning: the name extraction stage can take a while. Oh, and it also handles nicknames (e.g., Tom is short for Thomas; I couldn’t get anything good without it). The source is in there if you want it. Do with it what you will.
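
The core idea fits in a few lines. The sketch below is Python rather than the app’s Perl, and it swaps the named entity recognizer for a plain list of known first names, so treat it as an illustration of the logic only. The nickname table, the function names and the sample message are all hypothetical.

```python
import re

# Hypothetical nickname table; the app's real list is not reproduced here.
NICKNAMES = {"tom": "thomas", "liz": "elizabeth", "bob": "robert"}

def canonical(name):
    """Lowercase a first name and expand it if it is a known nickname."""
    lowered = name.lower()
    return NICKNAMES.get(lowered, lowered)

def gossiped_about(body, recipients, known_first_names):
    """Known first names mentioned in the body but missing from the recipients."""
    recipient_names = {canonical(r.split()[0]) for r in recipients}
    known = {canonical(n) for n in known_first_names}
    mentioned = {canonical(w) for w in re.findall(r"[A-Za-z]+", body)}
    return sorted((mentioned & known) - recipient_names)

# A made-up sent message.
message = {
    "to": ["Elizabeth Jones"],
    "body": "Did you hear what Tom said to Liz at lunch?",
}
result = gossiped_about(message["body"], message["to"],
                        ["Thomas", "Elizabeth", "Robert"])
print(result)
```

Here Liz is on the recipient list (as Elizabeth), so only Tom counts as gossip. Without the nickname step, both would slip through as unknown names.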

DOWNLOAD BIT OF GOSSIP
(drag to Applications)

Couldn’t have done it without the distinguished Stanford Named Entity Recognizer and Platypus. And of course Perl. Where would I be without you, darling?

Google Fellowship

I am very, very, very happy to tell you that last week Google gave me a fellowship, their Fellowship in Social Computing. They will fund me for the next two years, the rest of my PhD. It also comes with a bunch of goodies, including a new phone, which I need desperately. (I try not to show my phone at conferences.)

I pitched three new projects in my proposal to them, projects I hope to get out soon. Google only accepted nominations from universities. I was quite skeptical that our traditional CS department would nominate me, but they pleasantly surprised me. Thank you, Google!

A Love Letter to VIM

Despite its hideous logo, VIM is a fantastic editor. VIM gets a kiss of death from HCI: it’s modal, has a steep learning curve and requires mental & muscle memory. Whenever I praise it around HCI folk, they crinkle their noses. But it’s an expert interface. Although I learned it many years ago, way before I knew anything about HCI, here are three reasons I still love it:

  1. It minimizes programmer energy. VIM never makes you take your hands off the keyboard! I cannot say this enough. You never have to take your hands off the keyboard! An IDE like Eclipse will do lots of fancy stuff for you, but you have to click-click-click. It’s horrible. You should spend 95% of your time on the keyboard, not clicking. Time clicking is time not coding.
  2. It rocks on data files. I often have complex data files that I need to hack up in some structured way. Like, find the second ::, delete from there until the next !. Repeat 10,000 times. The only other way is often a script. VIM solves this problem quickly and indulges my inner laziness.
  3. It’s everywhere and needs only 2MB of memory. My app should need 300MB of memory, not my editor. Couple this with the fact that it’s standard on every *nix box (OS X too) and you’ve got a strong reason to give it a try.
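
For comparison, here is roughly what the script alternative from point 2 looks like. It assumes “delete from there until the next !” means dropping everything from the second :: up to, but not including, the following !. The function name and sample lines are made up.

```python
def trim_after_second_marker(line, marker="::", stop="!"):
    """Delete from the second `marker` up to (not including) the next `stop`.

    Lines missing either the second marker or a later stop are returned as-is.
    """
    first = line.find(marker)
    if first == -1:
        return line
    second = line.find(marker, first + len(marker))
    if second == -1:
        return line
    end = line.find(stop, second)
    if end == -1:
        return line
    return line[:second] + line[end:]

# Made-up data lines.
lines = ["a::b::c d!e", "no markers here", "x::y::z"]
cleaned = [trim_after_second_marker(line) for line in lines]
print(cleaned)
```

It works, but it is a dozen lines plus edge cases for something an editor macro handles in a handful of keystrokes.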

Tie Strength Talk

This was my favorite CHI to date. It was nearby — big plus. I met a lot of great people, and the conference moved primarily out of the paper sessions and into the hallways for the first time. I gave the predicting tie strength talk to a nearly full house, and was very happy with how warmly the work was received. Here are the slides in pdf and at slideshare.

My main argument is that we can model tie strength with quite high accuracy using only the data left behind in social media. In this case, I used Facebook data to predict relationship tie strengths provided by our participants. I then argue that we could use this relationship model to do smart things like provide good defaults for privacy controls (the illustration in the slides near the end).

CHI 2009: Predicting Tie Strength

Social media treats all users the same: trusted friend or total stranger, with little or nothing in between. In reality, relationships fall everywhere along this spectrum, a topic social science has investigated for decades under the theme of tie strength. Our work bridges this gap between theory and practice. In this paper, we present a predictive model that maps social media data to tie strength. The model builds on a dataset of over 2,000 social media ties and performs quite well, distinguishing between strong and weak ties with over 85% accuracy. We complement these quantitative findings with interviews that unpack the relationships we could not predict. The paper concludes by illustrating how modeling tie strength can improve social media design elements, including privacy controls, message routing, friend introductions and information prioritization.

We won best paper!

pdf Predicting Tie Strength With Social Media.
Proc. CHI, 2009.