Analyzing Text Data Indexed by a Hidden Characteristic: Method and Application to the Study of Media Bias in the U.S.
-- Abstract: Partisan bias enters political news coverage through selective coverage and biased presentation, and understanding the contribution of each channel requires estimating the difference in ideological leanings conditional on covering the same events. Political events, however, are an underlying characteristic that is unavailable to researchers absent large-scale hand-coding. In this paper, I propose a novel method to estimate quantities of substantive interest conditional on a hidden characteristic of the text data. The method overcomes limitations of existing methods, which are sensitive to critical tuning parameters and produce biased estimates. Applying the proposed method to the study of media bias, I find that the ideological difference between media organizations is not always driven by how events are covered. Instead, the contribution of presentation bias to the total difference in ideological leanings between news articles published by media organizations varies in ways that are consistent with the structural differences between media types.
-- Presented at APSA 2020.