Evaluating information-theoretic measures of word prediction in naturalistic sentence reading

Christoph Aurnhammer and Stefan L. Frank


We review information-theoretic measures of cognitive load during sentence processing that have been interpreted as approximations of word prediction effort. Two such measures, surprisal and next-word entropy, suffer from shortcomings when employed under a predictive processing view. We propose a novel metric, lookahead information gain, that can overcome these shortcomings. We put the different measures to the test by estimating them using probabilistic language models. Subsequently, we analyse how well the estimated measures predict human processing effort in three data sets of naturalistic sentence reading. Our results replicate the well-known effect of surprisal on word reading effort, but do not indicate a role for next-word entropy or lookahead information gain. Our computational results suggest that, in a predictive processing system, the cost of predicting may not outweigh the gains. This idea poses a potential limit to the value of a predictive mechanism for the processing of language.
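
For readers unfamiliar with the two established measures, the following minimal sketch shows their standard textbook definitions: surprisal is the negative log probability of a word given its context, and next-word entropy is the expected surprisal over the distribution of possible upcoming words. The toy distribution, the function names, and the choice of base-2 logarithms are assumptions made for illustration only; they are not taken from the paper's language models, and for brevity the same toy distribution is used for both measures. Lookahead information gain is not sketched because its definition is not given here.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal (in bits) of a word with conditional probability `prob`."""
    return -math.log2(prob)


def next_word_entropy(dist: dict[str, float]) -> float:
    """Entropy (in bits) of a probability distribution over upcoming words."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical conditional distribution P(word | context); illustrative only.
toy_dist = {"dog": 0.5, "cat": 0.25, "bird": 0.25}

cat_surprisal = surprisal(toy_dist["cat"])   # surprisal of one specific word
entropy = next_word_entropy(toy_dist)        # uncertainty over all candidates

print(f"surprisal('cat') = {cat_surprisal:.2f} bits")
print(f"next-word entropy = {entropy:.2f} bits")
```

In this toy case the less probable word "cat" carries 2 bits of surprisal, while the entropy of the whole distribution is 1.5 bits; surprisal depends on the word actually encountered, whereas entropy depends only on the distribution before the word is seen.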