$\varepsilon_i$ are assumed to be independent and identically distributed with mean 0 and common variance $\sigma^2$.¹⁵
Unlike the standard ordinary least squares (OLS) regression model, however, here
the vector of covariates $X_i$, and particularly the measure of income, is assumed to be
dependent on the $\varepsilon_i$'s. This is “simultaneity” or “endogeneity” bias, which may arise
because unmeasured factors such as, say, parental attitudes or peer-group networks may
influence both income and political attitudes. OLS estimates of $\beta$ will then be biased by
a quantity proportional to $E(\varepsilon \mid X)$.¹⁶
However, under the assumptions of the model, a
vector of independent variables $Z_i$ may be used to obtain consistent estimates of $\beta$, if
the variables in $Z_i$ are correlated with the endogenous regressor(s) in $X_i$ but are
independent of the error term $\varepsilon_i$. The technique of choice is called “instrumental
variables least squares” (IVLS), or equivalently, “two-stage least squares” (IISLS)
estimation (see Freedman 2005: 170-175 on the equivalence of these estimators).
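The logic of the estimator can be illustrated with a small simulation. The sketch below is not the analysis carried out in this paper or in Doherty et al. (2005); the data-generating process, the variable names (`winnings`, `income`, `u`) and the parameter values are all hypothetical, chosen only to show how an instrument that is independent of the error term can recover $\beta$ when OLS cannot.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients, (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ivls(X, Z, y):
    """IVLS computed in two stages.

    Stage 1: regress each column of X on Z and keep the fitted values.
    Stage 2: regress y on those fitted values.
    """
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    return np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)

def simulate(n=20000, beta=0.5, seed=0):
    """Hypothetical data: income is endogenous, winnings are randomized."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)         # unmeasured confounder (assumed)
    winnings = rng.normal(size=n)  # instrument, drawn independently of u
    income = winnings + u + rng.normal(size=n)  # depends on the confounder
    eps = u + rng.normal(size=n)   # error term shares u with income
    y = 1.0 + beta * income + eps  # outcome, e.g. an attitude scale
    const = np.ones(n)
    X = np.column_stack([const, income])
    Z = np.column_stack([const, winnings])
    return X, Z, y

X, Z, y = simulate()
beta_ols = ols(X, y)
beta_iv = ivls(X, Z, y)
print("OLS slope: ", beta_ols[1])   # pulled away from 0.5 by the confounder
print("IVLS slope:", beta_iv[1])    # should be close to 0.5 in large samples
```

In this assumed setup the covariance between `income` and `eps` is 1 and the variance of `income` is 3, so the OLS slope converges to about 0.5 + 1/3 rather than 0.5, while the IVLS slope stays near the true value because `winnings` is generated independently of `u`.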
The key exogenous variable in $Z_i$ will be lottery winnings. Thus, lottery winnings
will be used as an instrumental variable for total income. The idea is that while income
may be correlated with unobserved or unmeasured factors related to political attitudes,
levels of lottery winnings among survey respondents should be statistically independent
of these “omitted variables.” This is because levels of lottery winnings were randomly
assigned to respondents in the survey, and randomization should have taken care of the
confounders. Under the assumptions of the model, lottery winnings should thus be a
¹⁵ In some of the models in Doherty et al. (2005), the errors are assumed to be correlated, to take account of
possible clustering within groups of individuals who bought group tickets.
¹⁶ Namely, $E(\hat{\beta}_{OLS} \mid X) = \beta + (X'X)^{-1} X' E(\varepsilon \mid X) \neq \beta$.