Abstract
This study tackles the problem of geolocating Reddit users without access to the author information API. Using subreddit data, we inferred each user's location from their interactions within location-specific subreddits. We then applied two unsupervised topic-modeling methods, Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), to conversations about COVID-19 and immunization across the U.S., focusing on COVID-19 vaccination. Our topic modeling identifies four themes: humor and sarcasm (e.g., jokes about microchips), conspiracy theories (e.g., tracking devices and microchips in the COVID-19 vaccine), public skepticism (e.g., debates over vaccine safety and freedom), and vaccine brand concerns (e.g., Pfizer, Moderna, and booster shots). Our geolocation analysis shows that regions with lower vaccination rates often exhibit a higher prevalence of misinformation-labeled comments. For example, Ada County (Idaho), Newton County (Missouri), and Flathead County (Montana) showed both low vaccine uptake and a high rate of false information. The study documents the varied forms of misinformation disseminated online and gives a better understanding of how people in different parts of the U.S. think about getting a COVID-19 vaccine.
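The abstract does not spell out how subreddit interactions are turned into a location. One plausible reading is a majority vote over the location-specific subreddits a user posts in; the sketch below illustrates that idea only. The mapping `SUBREDDIT_TO_REGION`, the function `infer_user_regions`, and the `min_posts` threshold are all hypothetical names and values chosen for illustration, not the study's actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from location-specific subreddits to a region label.
# These entries are illustrative assumptions, not the study's actual list.
SUBREDDIT_TO_REGION = {
    "boise": "Ada County, ID",
    "idaho": "Idaho",
    "montana": "Montana",
    "kalispell": "Flathead County, MT",
}


def infer_user_regions(activity, min_posts=2):
    """Assign each user the region in which they post most often.

    activity:  iterable of (username, subreddit) pairs.
    min_posts: minimum number of posts in the winning region before the
               assignment is accepted (an assumed threshold).
    Returns a dict username -> region. Users with too little activity or
    a tie between their top two regions are left out.
    """
    counts = defaultdict(Counter)
    for user, subreddit in activity:
        region = SUBREDDIT_TO_REGION.get(subreddit.lower())
        if region is not None:  # ignore subreddits with no location signal
            counts[user][region] += 1

    regions = {}
    for user, counter in counts.items():
        ranked = counter.most_common(2)
        top_region, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if top_count >= min_posts and not tied:
            regions[user] = top_region
    return regions
```

Under these assumptions, a user with two comments in a Boise subreddit and one in a Montana subreddit would be assigned to Ada County, while a user split evenly between two regions would stay unassigned. Dropping ambiguous users trades coverage for precision, which seems a reasonable default when the labels feed a county-level comparison.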
| Original language | English |
|---|---|
| Article number | 748 |
| Journal | Information (Switzerland) |
| Volume | 16 |
| Issue number | 9 |
| DOIs | |
| State | Published - Sep 2025 |
Keywords
- COVID-19
- fake news
- geolocation
- misinformation
- topic modeling
- unsupervised learning
Fingerprint
Dive into the research topics of 'Mapping the Infodemic: Geolocating Reddit Users and Unsupervised Topic Modeling of COVID-19-Related Misinformation'. Together they form a unique fingerprint.