“How many people do you know in prison?”: Using overdispersion
in count data to estimate social structure in networks∗
Tian Zheng†
Matthew J. Salganik‡
Andrew Gelman§
June 13, 2005
Abstract
Networks—sets of objects connected by relationships—are important in a number of fields. The study
of networks has long been central to sociology, where researchers have attempted to understand the causes
and consequences of the structure of relationships in large groups of people. Using insight from previous
network research, Killworth et al. (1998a,b) and McCarty et al. (2001) developed and evaluated a method
for estimating the sizes of hard-to-count populations using network data collected from a simple random
sample of Americans. In this paper we show how, using a multilevel overdispersed Poisson regression
model, these data can also be used to estimate aspects of social structure in the population. Our work
goes beyond most previous research on networks by using variation, as well as average responses, as a
source of information. We apply our method to the McCarty et al. data and find that Americans vary
greatly in their number of acquaintances. Further, Americans show great variation in propensity to form
ties to people in some groups (e.g., males in prison, the homeless, and American Indians), but little
variation for other groups (e.g., twins, people named Michael or Nicole). We also explore other features
of these data and consider ways in which survey data can be used to estimate network structure.
Keywords: negative binomial distribution, overdispersion, sampling, social networks, social structure
1 Introduction
Recently a survey was taken of Americans, asking, among other things, “How many males do you know
incarcerated in state or federal prison?” The mean of the responses to this question was 1.0. To readers of
this journal that number may seem shockingly high. We would guess that you probably don’t know anyone
in prison. In fact, we would guess that most of your friends don’t know anyone in prison either. This number
may seem totally incompatible with your social world.
∗ We thank Peter Killworth and Chris McCarty for the survey data on which this study was based, and Francis Tuerlinckx,
Tom Snijders, Peter Bearman, Michael Sobel, and Tom DiPrete for helpful discussions. We also thank three anonymous
reviewers for their constructive suggestions. This research was supported by the National Science Foundation, a Fulbright
Fellowship, and the Netherland-America Foundation.
† Department of Statistics, Columbia University, New York
‡ Department of Sociology, Columbia University, New York
§ Department of Statistics and Department of Political Science, Columbia University, New York