Research

Please check out our research projects:

Affective Response towards AI Generated Multi-modal News

Sejin Paik, Sarah Bonna, Ekaterina Novozhilova, Ge Gao, Jongin Kim, Derry Tanti Wijaya, Margrit Betke

This study explores the affective responses and newsworthiness perceptions of generative AI for visual journalism. While generative AI offers advantages for newsrooms in terms of producing unique images and cutting costs, the potential misuse of AI-generated news images is a cause for concern. For our study, we designed a 3-part news image codebook for affect-labeling news images based on journalism ethics and photography guidelines. We collected 200 news headlines and images retrieved from a variety of U.S. news sources on the topics of gun violence and climate change, generated corresponding news images from DALL-E 2 and asked study participants to annotate their emotional responses to the human-selected and AI-generated news images following the codebook. We also examined the impact of modality on emotions by measuring the effects of visual and textual modalities on emotional responses. The findings of this study provide insights into the quality and emotional impact of generative news images produced by humans and AI. Further, results of this work can be useful in developing technical guidelines as well as policy measures for the ethical use of generative AI systems in journalistic production. The codebook, images and annotations are made publicly available to facilitate future research in affective computing, specifically tailored to civic and public-interest journalism.

Multi-modal Emotion prediction towards Gun Violence News

Ge Gao, Sejin Paik, Carley Reardon, Yanling Zhao, Lei Guo, Prakash Ishwar, Margrit Betke, Derry Tanti Wijaya

This study created a novel dataset BU-NEmo+ and provided a benchmark for predicting people’s emotional reactions towards multi-modal (images and headlines) news content related to gun violence. In curating the dataset, we developed methods to identify news items that will trigger similar versus divergent emotional responses. All prediction models outperformed our baselines by significant margins across several metrics. News consumers and social media platforms could use our models to safeguard against manipulative news content and predict whether a post is likely to be click-bait.

Detecting frames in news headlines and its application to analyzing news framing trends surrounding U.S. gun violence

Siyi Liu, Lei Guo, Kate Mays, Margrit Betke, Derry Tanti Wijaya

This Gun Violence Frame Corpus (GVFC) was curated and annotated by journalism and communication experts. Our proposed approach sets a new state-of-the-art performance for multiclass news frame detection, significantly outperforming a recent baseline by 35.9% absolute difference in accuracy. We apply our frame detection approach in a large scale study of 88k news headlines about the coverage of gun violence in the U.S. between 2016 and 2018.

Accurate, Fast, But Not Always Cheap: Evaluating “Crowdcoding” as an Alternative Approach to Analyze Social Media Data

Lei Guo, Kate Mays, Sha Lai, Mona Jalal, Prakash Ishwar, Margrit Betke

This study evaluated the validity and efficiency of crowdcoding based on the analysis of 4,000 tweets about the 2016 U.S. presidential election. The results show that compared with the traditional quantitative content analysis, crowdcoding yielded comparably valid results and was superior in efficiency, but was more expensive under most circumstances.

Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election

Mehrnoosh Sameki, Mattia Gentil, Kate K. Mays, Lei Guo, Margrit Betke

We explore two dynamic-allocation methods: (1) The number of workers queried to label a tweet is computed offline based on the predicted difficulty of discerning the sentiment of a particular tweet. (2) The number of crowd workers is determined online, during an iterative crowd sourcing process, based on inter-rater agreements between labels.We applied our approach to 1,000 twitter messages about the four U.S. presidential candidates Clinton, Cruz, Sanders, and Trump, collected during February 2016.

Big Social Data Analytics in Journalism and Mass Communication: Comparing Dictionary-Based Text Analysis and Unsupervised Topic Modeling

Lei Guo, Chris J. Vargo, Zixuan Pan, Weicong Ding, Prakash Ishwar

By applying two “big data” methods to make sense of the same dataset—77 million tweets about the 2012 U.S. presidential election—the study provides a starting point for scholars to evaluate the efficacy and validity of different computer-assisted methods for conducting journalism and mass communication research, especially in the area of political communication.

From Crowdsourcing to Crowdcoding: An Alternative Approach to Annotate Big Data in Communication Research

Lei Guo, Kate Mays, Sha Lai, Mona Jalal, Prakash Ishwar, Margrit Betke

Performance Comparison of Crowdworkers and NLP Tools on Named-Entity Recognition and Sentiment Analysis of Political Tweets

Mona Jalal, Kate K. Mays, Lei Guo, Margrit Betke

Our experiments show that, for our dataset of political tweets, the most accurate NER system, Google Cloud NL, performed almost on par with crowdworkers, but the most accurate ELS analysis system, TensiStrength, did not match the accuracy of crowdworkers by a large margin of more than 30 percent points.

These research have been sponsored to date by: