Rob St. Amant

Bio
My roots are in San Francisco and later Baltimore, where I went to high school and college. I stayed on the move, living for a while in Texas, several years in a small town in Germany, and then several more in Massachusetts, working on a Ph.D. in computer science. I'm now a professor at North Carolina State University, in Raleigh. My book, Computing for Ordinary Mortals, will appear this fall from Oxford University Press. http://goo.gl/hQBHy

Editor’s Pick
MAY 23, 2012 8:35AM

I'll bet you like ice cream.



Do you like ice cream? I'll predict that if you care enough to mention ice cream via Twitter, you're probably in favor of it, even moderately enthusiastic.

Am I just guessing? Not entirely. I used a visualization system developed by my colleague Chris Healey to produce the result at the top of this post. When I typed in "ice cream", the system retrieved a few hundred recent tweets containing that term and generated a visualization of their emotional content, or sentiment. Try it yourself, with your own keywords, on Chris's tweet viz Web page.

The visualization integrates a lot of information (details here), but I'll concentrate on the basics: roughly speaking, the green circles are for tweets that express "positive" sentiment, and the blue circles are for tweets expressing "negative" sentiment. Sentiment is inferred from the words in the tweet. For example, "ice cream and good sex!" (the text of an actual tweet, with relevant words in bold) contains "good" and "sex". Very nice. But how could ice cream be bad? Well, some people who eat ice cream when they're feeling down might tweet about it. And one tweet says it's "dangerous even in an ice cream shop! Robbery yesterday..."

We'll need a bit of theory to understand how the circles are laid out. James Russell, a psychologist at Boston College, has proposed a conceptual framework for understanding emotion [PDF]. (There have been many attempts to formalize what we know about emotion.) In Russell's framework, one important factor is valence, which ranges from unpleasant to pleasant. (This is actually what I meant by "negative" and "positive" above.) The horizontal placement of a circle is an indication of the unpleasantness or pleasantness expressed in a tweet. And the vertical placement? That's another factor in Russell's framework: arousal, which ranges from being nearly comatose to being very, very excited. So the circles toward the top are excited, with happy tweets on the right and stressed-out tweets on the left, and at the bottom they're kind of... meh. We don't see a lot of the latter; tweeting takes some effort, after all.
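To make the layout concrete, here is a minimal sketch of dictionary-based scoring along Russell's two dimensions. It is not the tweet viz system's actual code: the word list, the 1-to-9 scale, the midpoint, and the function names are all assumptions invented for illustration. The idea is simply to average the scores of the words a tweet shares with the dictionary, then read off which region of the display the circle would land in.

```python
from statistics import mean

# Hypothetical word scores on an assumed 1-9 scale: (valence, arousal).
# These numbers are invented for illustration only.
SCORES = {
    "good": (7.5, 5.4),
    "sex": (8.0, 7.4),
    "robbery": (2.6, 6.2),
    "dangerous": (2.4, 6.8),
    "bored": (2.9, 2.8),
}
MIDPOINT = 5.0  # assumed neutral point on both axes


def tokenize(text):
    """Split on whitespace, strip simple punctuation, and lowercase."""
    return [w.strip(".,!?\"'").lower() for w in text.split()]


def estimate(text):
    """Average (valence, arousal) over dictionary words; None if no word matches."""
    hits = [SCORES[w] for w in tokenize(text) if w in SCORES]
    if not hits:
        return None
    valence = mean(v for v, _ in hits)
    arousal = mean(a for _, a in hits)
    return valence, arousal


def describe(point):
    """Name the region of the display where a circle would be drawn."""
    valence, arousal = point
    side = "right (pleasant)" if valence >= MIDPOINT else "left (unpleasant)"
    height = "upper (excited)" if arousal >= MIDPOINT else "lower (subdued)"
    return f"{height} {side}"


print(describe(estimate("ice cream and good sex!")))
print(describe(estimate("dangerous even in an ice cream shop! Robbery yesterday...")))
```

With these made-up scores, the first tweet lands in the upper right and the second in the upper left. A tweet with no dictionary words at all, such as plain "ice cream", gets no estimate.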

It's surprisingly hard to find topics where tweets contain uniformly pleasant or unpleasant sentiment. Not even for the keyword "funeral":


Some people apparently like funerals! But if you hovered your mouse pointer over different green circles to read the tweets, you'd find that some Twitter users have the word "funeral" in their names, and sometimes they tweet about happy things. (Why not filter out names? We'd lose information. For example, a query on "obama" might then ignore tweets containing only @BarackObama, which we probably do want to see.) Also, you'll occasionally find something along these lines: "Nice to see my loving family at an event other than a funeral." A loving family is a pleasant thing to have.

This last example suggests that an automated analysis isn't as smart as a person in extracting meaning or sentiment. A human being, for example, might reasonably judge that a tweet about "an event other than a funeral" isn't really about funerals. Processing natural language, beyond the level of individual words, remains a very hard problem.
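The funeral tweet shows the limitation in miniature. The sketch below is a toy word-level scorer, with a hypothetical four-word dictionary and hypothetical function names, not the system's real analysis. It retrieves the tweet because the keyword appears, then scores each word in isolation, so the phrase "other than" has no effect on the result.

```python
# Hypothetical dictionary: +1 for pleasant words, -1 for unpleasant ones.
POLARITY = {"nice": 1, "loving": 1, "family": 1, "funeral": -1}


def words_in(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [w.strip(".,!?\"'").lower() for w in text.split()]


def matches_query(query, text):
    """A tweet matches if the keyword appears anywhere among its words."""
    return query.lower() in words_in(text)


def word_level_score(text):
    """Sum per-word polarity; word order and phrasing are ignored."""
    return sum(POLARITY.get(w, 0) for w in words_in(text))


tweet = "Nice to see my loving family at an event other than a funeral."
print(matches_query("funeral", tweet))  # True
print(word_level_score(tweet))          # 2
```

The tweet comes back for a "funeral" query and scores as pleasant overall, even though a human reader would say it is not about a funeral at all.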

I'm writing about Chris's tweet viz system for a couple of reasons. First, it's cool. Half a billion people across the world use Twitter, and 340 million tweets are posted each day. (I'm quoting one of my students, Shishir Kakaraddi, who just completed an M.S. thesis in the general area of tweet summarization.) We need good tools for making sense of all this data. Second, the project is a nice example of how research can drive software development. It's not just about what people might like to see in a visualization of Twitter data; the design of the visualization draws on psychological models of visual perception (for example, in the color choices) and of emotion (in the analysis and display of sentiment information). By building and testing systems like these, we can learn new and interesting things.
 
Update: Taxes.

[Screenshot: tweet visualization for the query "taxes"]

Comments

Into the feed, for ice cream lovers.
I ran a "taxes" query, but no results popped up. Same result when I tried a couple other queries. Maybe the dear workplace hasn't upgraded to IE9. Maybe I'm incapable of correctly clicking on a query button. Nonetheless, this sounds like an interesting tool. It could be useful to companies that organize focus groups. Easy way to gather data on a topic or to see the results of a campaign. [What the hell is wrong with me these days? Seeing everything in terms of marketing. help me (whimper)]
Bummer, Stim. I've added a screen shot for taxes. It shows surprising range, but it turns out that this is mostly due to the limitations in dictionary-based sentiment analysis. For example, "free" "health" "care" "people" are all relatively pleasant terms.
Rob, this totally blows my mind. I ran it a couple of times. My name had five hits, Chris Matthews had thousands, and some random things were interesting as well. Don't I wish I had invested in Twitter technology? And yes I do like ice cream.
Cool! Thanks for checking it out, Amy. It can be a time sink, but a fun one.
Cod liver oil. None. Hmmm, guess I'm just not hip.
"sentiment analysis" is a definite emerging area of AI & worthwhile to follow.... companies are starting to use it to gauge effectiveness of their campaigns ... however for an example of "what to avoid" consider the McDonald's case where people tweeted lame stuff about what happened to them at McDonald's....
Unfortunately, Chicken Mãâàn, not all browsers or browser versions are supported. That's the bane of software development (especially research software). When I run a query for cod+liver+oil (on Safari), I find, surprisingly, that sentiment is mainly positive. There are a lot of false positives, though. (Chris Healey tells me that in his informal testing, sentiment seems to be accurately judged about 80% of the time.) Here's an example of a false positive: "Ravenous hunger with a cupboard stocked with only cod liver oil & biso, car miles away, food shows on the telly. WHY HAST THOU FORSAKEN ME GOD." None of the negatives you'd expect (ravenous, hunger, forsaken) are in the sentiment dictionary being used--just the positive ones (car, food, God).

vzn, that's my understanding as well; companies are really interested. And it can be manipulated, indirectly. Some companies do a lot of customer support via Twitter, I'm told, and the reps are all as chipper as can be.

Hey, Sandra no-longer-miller-for-a-very-long-time! It's nice to see you. I hope the writing is going well.
I hate ice cream. it's boring. it's cold. it tastes fake. it ain't even ice-cream. it's chemical mush
and sex sucks.
and the internet is full of vomitory vacuous verbose verbiage
it's a cybafogged unreality
Dude, you should tweet that. Well, the first 140 characters...
"boss is evil" has 104 out there
only 104?
That's surprising! The visualization is based on a set of tweets that comes directly from Twitter. Without double quotes around the query, tweets can have the query terms in any order. "boss is evil" returns only two tweets right now. And "boss is nice"? 38. Wow.

I guess people are happy today. But it's only 9 a.m.
Hmmm...so a little bird is telling me SEO may never be the same again! After my most recent semester that was ultimately saturated with Goleman, Lazarus and Snyder, your post was a definite excursion into valence... What great research and links!
That's an interesting thought, KC. I don't know very much about what search engine optimization companies do, or whether the major search engines pay attention to sentiment, but the latter wouldn't surprise me. I've read about specialized search engines that pay attention to sentiment, though; it would be really useful, for example, for categorizing companies and product reviews, if it were reliable enough. Are people positive or negative on X? Do they have strong or weak feelings about X? We still need to worry about the black hats, of course...
Interesting post. However, next time, please make sure the words in your graph are legible to the average reader. I'm wearing my glasses, but looking closely at the screen and squinting didn't help much.
Sorry, Malunsinka! I'm hampered by OS limitations--we get only 485 pixels of width to show an image, and I'm not able to fiddle the font size directly in the visualization.
Thanks for this, it was fascinating--I didn't know something like this existed. I tried it and it is amazingly fast. I can see how it could be manipulated by marketeers.
Fun nonetheless.
Hi, Anne! I'm glad you liked it. Chris Healey tells me that some marketers, such as Zappos, seem to recognize this already. They're not deceptive about it, though--their representatives tweet generally positive stuff.
I love ice cream. I hate the very idea of Twitter...