[Figure 10 plot grid. Rows: Censored at 1, Censored at 3, Censored at 5, No censoring. Columns: Pr(y=0), Pr(y=1), Pr(y=3), Pr(y=5), Pr(y=9), Pr(y=10), Pr(y>=13). Axis labels in each panel: simulated, data.]
Figure 10: Model checking graphs: observed vs. expected proportions of responses $y_{ik}$ of 0, 1, 3, 5, 9, 10, and
≥ 13. Each row of plots compares actual data to the estimate from one of four fitted models. The bottom row
shows our main model, and the top three rows show models fit with the data censored at 1, 3, and 5, as explained in
Section 5. In each plot, each dot represents a subpopulation, with names in gray, non-names in black, and 95%
posterior intervals indicated by horizontal lines.
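
The comparison behind these plots can be sketched in Python with NumPy. This is only an illustrative outline, not the authors' code. It assumes the negative binomial distribution for $y_{ik}$ has mean exp(α_i + β_k) and variance equal to ω_k times the mean (with ω_k > 1); the function names and the `draws` container of posterior simulations are hypothetical.

```python
import numpy as np


def simulate_counts(alpha, beta, omega, rng):
    """Draw one replicated data set y_rep[i, k].

    Assumed parameterization (not stated in this section):
    mean_ik = exp(alpha_i + beta_k), variance_ik = omega_k * mean_ik, omega_k > 1.
    """
    mu = np.exp(alpha[:, None] + beta[None, :])   # shape (respondents, subpopulations)
    # NumPy's negative binomial has mean n(1-p)/p and variance n(1-p)/p^2,
    # so variance/mean = 1/p = omega and n = mu / (omega - 1).
    n = mu / (omega[None, :] - 1.0)
    p = np.broadcast_to(1.0 / omega, mu.shape)
    return rng.negative_binomial(n, p)


def proportion_equal(y, value):
    """Proportion of respondents with y_ik == value, for each subpopulation k."""
    return (y == value).mean(axis=0)


def proportion_at_least(y, value):
    """Proportion of respondents with y_ik >= value, for each subpopulation k."""
    return (y >= value).mean(axis=0)


def predictive_check(y_obs, draws, stat, rng):
    """Compare an observed statistic with its posterior predictive distribution.

    draws: iterable of (alpha, beta, omega) posterior simulation draws.
    stat:  function mapping a data matrix to one value per subpopulation.
    Returns the observed value, the simulated mean, and a 95% interval.
    """
    sims = np.array([stat(simulate_counts(a, b, w, rng)) for a, b, w in draws])
    lo, hi = np.percentile(sims, [2.5, 97.5], axis=0)
    return stat(y_obs), sims.mean(axis=0), lo, hi
```

For instance, `predictive_check(y_obs, draws, lambda y: proportion_equal(y, 1), rng)` would return, for each subpopulation, the observed proportion of respondents with y = 1 alongside the simulated proportion and its 95% interval, which is the kind of pair each dot and horizontal line in a panel represents.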
are able to focus on the relative number of prisoners known, without being distracted by the total network
size of each respondent (which we have separately analyzed in Figure 3).
4.6 Posterior predictive checking
We can also check the quality of the overdispersed model by comparing posterior predictive simulations
from the fitted model to the data (see, e.g., Gelman et al., 2003, chapter 6). We create a set of predictive
simulations by sampling new data $y_{ik}$ independently from the negative binomial distributions given the
parameter vectors α, β, ω drawn from the posterior simulations already calculated. We can then examine
various aspects of the real and simulated data, as illustrated in Figure 10. For now, just look at the bottom
row of graphs in the figure; we return in Section 5 to the top three rows. For each subpopulation k, we
compute the proportion of the 1370 respondents for whom $y_{ik} = 0$, $y_{ik} = 1$, $y_{ik} = 3$, and so forth. These
values are then compared to posterior predictive simulations under the model. On the whole, the model fits
the aggregate counts fairly well but tends to under-predict the proportion of respondents who know exactly
one person in a category. In addition, the data and predicted values for y = 9 and y = 10 show the artifact
that persons are more likely to answer with round numbers (which can also be seen in the histograms in