
Choosing the Right Words: Characterizing and Reducing Error of the Word Count Approach

  • H. Andrew Schwartz
  • Johannes Eichstaedt
  • Lukasz Dziurzynski
  • Eduardo Blanco
  • Margaret L. Kern
  • Stephanie Ramones
  • Martin Seligman
  • Lyle Ungar
  • University of Pennsylvania
  • Lymba Corporation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Scopus citations

Abstract

Social scientists are increasingly using the vast amount of text available on social media to measure variation in happiness and other psychological states. Such studies count words deemed to be indicators of happiness and track how the word frequencies change across locations or time. This word count approach is simple and scalable, yet often picks up false signals, as words can appear in different contexts and take on different meanings. We characterize the types of errors that occur using the word count approach, and find lexical ambiguity to be the most prevalent. We then show that one can reduce error with a simple refinement to such lexica by automatically eliminating highly ambiguous words. The resulting refined lexica improve precision as measured by human judgments of word occurrences in Facebook posts.
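As a rough illustration of the idea in the abstract, the sketch below counts lexicon words in a set of posts and then repeats the count after dropping words flagged as highly ambiguous. It is a minimal sketch, not the authors' implementation: the function names, the per-word `ambiguity` scores, the `max_ambiguity` threshold, and the toy posts are all hypothetical, and the abstract does not specify how ambiguity is actually estimated.

```python
import re


def tokenize(text):
    """Lowercase a post and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_rate(posts, lexicon):
    """Return the fraction of all tokens in `posts` that appear in `lexicon`."""
    total = 0
    hits = 0
    for post in posts:
        tokens = tokenize(post)
        total += len(tokens)
        hits += sum(1 for token in tokens if token in lexicon)
    return hits / total if total else 0.0


def refine_lexicon(lexicon, ambiguity, max_ambiguity=0.5):
    """Keep only words whose (assumed) ambiguity score is at or below the threshold.

    `ambiguity` maps each word to a hypothetical score in [0, 1];
    words missing from the map are treated as unambiguous.
    """
    return {word for word in lexicon if ambiguity.get(word, 0.0) <= max_ambiguity}


# Toy data, invented for illustration only.
posts = [
    "I feel great today",
    "That was a great deal of work",
    "happy birthday",
]
lexicon = {"great", "happy"}
ambiguity = {"great": 0.8, "happy": 0.1}  # assumed scores, not from the paper

refined = refine_lexicon(lexicon, ambiguity)
raw_rate = lexicon_rate(posts, lexicon)
refined_rate = lexicon_rate(posts, refined)
```

In this toy data, "great" matches both a genuine expression of positive feeling and the unrelated phrase "a great deal of", which is the kind of false signal the abstract attributes to lexical ambiguity. Removing it lowers the raw count but leaves only occurrences that are more likely to carry the intended meaning.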

Original language: English
Title of host publication: *SEM 2013 - 2nd Joint Conference on Lexical and Computational Semantics
Publisher: Association for Computational Linguistics (ACL)
Pages: 296-305
Number of pages: 10
ISBN (Electronic): 9781937284480
State: Published - 2013
Event: 2nd Joint Conference on Lexical and Computational Semantics, *SEM 2013 - Atlanta, United States
Duration: Jun 13 2013 to Jun 14 2013

Publication series

Name: *SEM 2013 - 2nd Joint Conference on Lexical and Computational Semantics
Volume: 1

Conference

Conference: 2nd Joint Conference on Lexical and Computational Semantics, *SEM 2013
Country/Territory: United States
City: Atlanta
Period: 06/13/13 to 06/14/13
