this case, it comes from the factoring in of the number of cases in estimating the standard
errors. With cross-level data, not only do we have two components of variance to
account for, but (usually) those two sources of variance have different sample sizes. OLS
uses the total number of individual cases when we actually have a sample size for the
within-group variance and a sample size for the between-group variance. Since the
number of cases enters the denominator in estimating standard errors, this inflation of the
number of cases (the total number of cases must be greater than the number of
groups, because each group must contain more than 2 cases) leads to deflated standard errors.
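
As a rough illustration of this argument, the following Python sketch simulates data with a group-level predictor and compares two standard errors for its slope: one from OLS pooled over all individual cases, and one from a regression on group means, which uses the number of groups as its sample size. Everything here is an assumption made for illustration rather than part of the original analysis: the simulated data, the balanced design (30 groups of 50 cases), the true slope of zero, and the helper name `ols_slope_se`.

```python
import numpy as np


def ols_slope_se(x, y):
    """Return the OLS slope and its conventional standard error for y = a + b*x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]           # the number of cases enters here
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])


# Hypothetical balanced design: 30 groups with 50 individuals each.
rng = np.random.default_rng(0)
n_groups, n_per_group = 30, 50

x_group = rng.normal(size=n_groups)     # predictor measured at the group level
u_group = rng.normal(size=n_groups)     # between-group component of variance

x = np.repeat(x_group, n_per_group)     # each individual inherits the group value
u = np.repeat(u_group, n_per_group)
e = rng.normal(size=n_groups * n_per_group)  # within-group component of variance
y = u + e                               # the true slope on x is zero

# Pooled OLS treats all 1,500 individual cases as independent observations.
slope_pooled, se_pooled = ols_slope_se(x, y)

# Regression on group means uses the 30 groups as the sample.
y_means = y.reshape(n_groups, n_per_group).mean(axis=1)
slope_group, se_group = ols_slope_se(x_group, y_means)

print("pooled OLS: slope", slope_pooled, "SE", se_pooled)
print("group means: slope", slope_group, "SE", se_group)
```

In a balanced design like this one, the two regressions return the same slope, so the only difference lies in the standard error: the pooled version is smaller because it counts every individual as an independent case for what is a group-level effect.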
In sum, then, what would be the consequences of using OLS to analyze
hierarchical data? Our estimator is inefficient and will lead to both false positives and
false negatives, depending on the level of analysis and on cross-level correlations
between variables. Put simply, OLS estimates for hierarchical data are highly suspect.
33
There is actually one additional wrinkle here: if we use OLS to estimate group-level effects with
individual-level data, OLS will use the number of individual cases as its ‘n.’ For scholars interested
only in these aggregate/group-level effects, the answer is fairly simple: treat the group as the level of
analysis. If there are any correlations between individual- and group-level variables, however, this solution
suffers from omitted variable bias. For example, if the NES systematically under-sampled women in some
years and not in others, and women’s opinions on some issue differed from men’s, then any study of the
‘mean opinion’ on that issue over time using variables that vary over time would have an omitted variable
bias problem.