Natural Sentences as Valid Units for Coded Political Texts

Published in British Journal of Political Science, 2012

Recommended citation: Thomas Däubler, Kenneth Benoit, Slava Jankin Mikhaylov, and Michael Laver (2012). "Natural Sentences as Valid Units for Coded Political Texts." British Journal of Political Science, 42(4): 937-951.

All methods for analyzing text require the identification of a fundamental unit of analysis. In expert-coded content analysis schemes such as the Comparative Manifesto Project, this unit is the ‘quasi-sentence’: a natural sentence or a part of a sentence judged by the coder to have an independent component of meaning. Because they are subjective constructs identified by individual coders, however, quasi-sentences make text analysis fundamentally unreliable. The justification for quasi-sentences is a supposed gain in coding validity. We show that this justification is unfounded: using quasi-sentences does not produce valuable additional information in characterizing substantive political content. Using natural sentences as text units, by contrast, delivers perfectly reliable unitization with no measurable loss in content validity of the resulting estimates.
