Political scientists have long known that "pocketbook
issues" strongly affect the fortunes of presidents and other
political leaders. Economists studying this relationship have
established that certain economic factors--such as growth in income,
inflation, and unemployment--directly affect the votes of the incumbent
party in the presidential elections. Other institutional factors, such
as the number of terms a party has occupied the Oval Office, also appear
to affect voting patterns. We present a set of simple "rational
voter" economic models, which includes economic and institutional
factors largely known in advance of the election. Such models typically
explain about 75 percent of the variation of the popular vote in
presidential elections since 1916. Such time series models, however, are
bedeviled by the lack of observations, since presidential elections are
only held every four years. To address this weakness, we specify and
estimate a model with two methodological improvements. First, we use
voting and economic data from the twenty largest states over elections
since 1980, estimating parameters using pooled estimation techniques.
Second, we improve the specification of the relevant variables, such as
unemployment and the incidence of "limited wars," to more
accurately reflect the motivations of contemporary voters. Our pooled
state model forecast for 2004 indicates that economic conditions favor a
narrow re-election for the incumbent President. We point out how some
elections cannot be entirely explained with a rational voter model.
Politicians have recognized that voters are interested in their
pocketbook at least since the days of the Roman Empire. (1) Indeed, in
one recent presidential campaign, the incumbent was unseated by a
challenger who made a mantra out of the phrase, "It's the
economy, stupid." Economists in recent decades have begun to
quantify the relationship between electoral support and economic
conditions. In some sense, this can be viewed as a rigorous response to
the famous question posed by another presidential challenger: "Are
you better off today than you were four years ago?"
In this paper, we start by briefly reviewing the standard approach
to forecasting the results of presidential elections on the basis of
pocketbook issues, as most comprehensively presented by Fair (2002). We
outline how a rational voter model is consistent with public choice
theory and other aspects of economic reasoning and describe a small
group of variables that has been shown to consistently explain voting
behavior in presidential elections. We note serious weaknesses in these
models, most of which can be traced to a small number of observations
over a very long time period. To address these weaknesses, we suggest
two methodological improvements, which allow for the use of
significantly more data, explicitly estimate differences in the
responses of voters from different states, and improve the specification
of variables such as unemployment and war. We then specify and estimate
a model with these innovations and present results from different
states. As a test, we compare the predicted response of voters in the 20
most populous states to the economic conditions in the year 2000 with
their actual voting behavior. Finally, we offer a conclusion about the
limits of economic rationalism in voting behavior and suggest that some
elections--such as the 1992 election, and potentially the 2004
election--are not fundamentally about economics.
The Standard Approach
Microeconomics is based on utility theory, which is founded on
axioms regarding consumer behavior. Consumers are assumed to be rational
in defined ways. For example, consumers are assumed to prefer more goods
to fewer goods and consumption now to consumption later. Upon these
axioms the whole structure of microeconomics is based, and the rest of
economics generally follows.
The advent of public choice theory has expanded the use of these
axioms into the realm of political behavior. (2) Public choice theory
provides a rigorous economic understanding for what philosophers and
peasants have known for thousands of years: politicians operate under
their own set of incentives, which are not always consistent with those
of the people they govern. It should be straightforward then to apply a
similar public choice theory to the voters. (3) Such a "rational
voter" model is simple to develop. We define an economic rational
voter model using axioms adapted from microeconomic theory, as well as a
simplified notion of voter incentives.
Voters hold the incumbent party responsible for economic
performance. This is a doubly simplifying assumption. First, in the
United States of America, a Republic with three branches of government
and a federal system, it is not entirely within the President's
power to set economic policy for all states. Second, voters could view
presidential elections as referendums on a number of issues--e.g.,
regional, personal, partisan. This assumption results in a model of
voting as a referendum on the incumbent party's record of economic
performance during the president's term in office. (4)
Voters prefer more income to less income. This straightforward
axiom leads to a complicated data question, as there are a number of
variables that can be used to measure current income, including real
disposable income, per capita real disposable income, and median
household income. (5)
Voters prefer a less risky stream of income. We consider both
inflation and unemployment as risks to income. (6) First, a spell of
unemployment can reduce the earnings of a family to below their
expenses. The recurring focus of political advertising on Social
Security benefits indicates that, at least from the point of view of
candidates and their media advisers, voters care deeply about long-term
income security. (7) Second, we note that inflation is associated with
both economic uncertainty and with loss of real income and wealth. (8)
Thus, the components of the "misery index" of political
fame, and the "Phillips curve" of macroeconomic interest, are
components in the voter's calculus. We discuss the proper
specification for unemployment and inflation below.
Major institutional factors also govern voting behavior.
The first is war. Wars have been threats to national survival and
have taken precedence in national policy for most of human history. This
includes American history; one does not need to go back too far to find
war bonds, de facto nationalization of industries during war time, and
wage and price controls. In this sense, the phenomenon of "limited
war" is a new and unusual concept. The ancients would have
scratched their heads in wonder at the United States, which, during the
first Persian Gulf War, actively debated reducing its defense budget.
Econometric models of the U.S. economy have typically used war-time
dummies to account for shifts in production during periods of war. (9)
The standard approach to predicting presidential voting behavior has
been similar. We will examine in a later section whether this needs to
be modified due to the advent of "limited war."
Testing the axioms
To test these axioms, we performed a series of exploratory data
analyses, comparing the economic and institutional factors the axioms
suggest will predict voting behavior with actual voting behavior.
Dependent variable. For the dependent variable, we use the
difference in the shares of the popular vote between the incumbent party
and the challenging party in the general election for President of the
United States. Thus, we are predicting the margin of victory or defeat,
expressed as a percentage, between the major party candidates. The
selection of this measure is complicated by two factors: the likely
nonlinear relationship over the entire spectrum of potential outcomes
and the presence of third-party candidates. We deal with the former by
assuming that, within the range of outcomes for most contemporary
elections (in which the incumbent party wins or loses by less than ten
percent, and often less than five percent), the response by voters is
close to linear. (10) The latter issue is dealt with by calculating only
the vote share gained by the major party candidates, and using an
explanatory variable for significant third-party candidates. (11) We
discuss third-party candidates further below.
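As an illustration of the dependent variable, the following minimal sketch computes the margin from the two major-party vote totals only. It assumes one reading of the definition above, in which shares are taken over the two-party vote; the function name and the example figures are hypothetical and are not taken from the paper's data.

```python
def vote_margin(incumbent_votes: float, challenger_votes: float) -> float:
    """Return the incumbent party's margin in percentage points.

    Shares are computed over the two major-party totals only, so votes
    for third-party candidates are ignored (an assumption of this sketch).
    """
    two_party_total = incumbent_votes + challenger_votes
    if two_party_total <= 0:
        raise ValueError("two-party vote total must be positive")
    incumbent_share = 100.0 * incumbent_votes / two_party_total
    challenger_share = 100.0 * challenger_votes / two_party_total
    return incumbent_share - challenger_share


# Hypothetical example: 52 votes for the incumbent party, 46 for the
# challenger, and 2 for other candidates (which are left out).
example_margin = vote_margin(52, 46)
```

A positive value indicates an incumbent-party victory in the two-party vote and a negative value a defeat.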
Income. A test of the income axiom, using the dependent variable
outlined above, is straightforward. In nine of the twenty-two presidential
election years since 1916, the U.S. economy realized annual nominal GDP
growth of more than ten percent. (12) Figure 1 shows that
of those nine years, the incumbent party won seven of the presidential
elections.
Income risk: inflation. The historical record indicates that if the
price level changes rapidly it is very difficult for an incumbent party
to win the presidential election. To better assess the effect of
inflation on income security, we use the absolute value of the inflation
rate as our measure. During the Great Depression, a sharp deflation was
associated with a dramatic electoral reversal for the incumbent party,
indicating the economic fundamentals about the price level--that
unexpected price level changes and changes that result in loss of income
or wealth--are more damaging to individuals than a steady inflation
Figure 2 shows the relationship between inflation and presidential
voting. Inflation rates were very high (in absolute value, during the
election year) in six of the 22 election years since 1916. In four of those six,
the incumbent party lost the seat. Even in the exception years, 1916 and
1948, high GDP growth helped incumbent parties to keep the seat.
Income risk: unemployment. Unemployment, according to the axioms,
is a threat to income, and therefore higher unemployment should directly
result in lower vote margins for the incumbent. It is straightforward to
show that during times of high unemployment (such as the beginning of
the Great Depression, and in 1980) the incumbent party is likely to lose
and that, conversely, during times of low unemployment (the Eisenhower
years, the Reagan landslide in 1984, the Clinton re-election in 1996),
the incumbent party is likely to win. The unemployment rate, however, is
strongly correlated with output and income growth rates, especially when
these are viewed on an annual basis. Therefore, using both an income and
an unemployment variable in the same equation will introduce
multicollinearity, which can result in estimation problems. Furthermore,
economic theory indicates that there is a "natural" rate of
unemployment and therefore implies that voters do not penalize a
president who keeps the unemployment rate near its natural rate. We
return to this issue below.
Record of rational voter models
The rational voter models that have been used in the past, such as
those comprehensively described in Fair (2002) and presented recently by
Anderson and Geckil (2004), have a fairly impressive ability to predict
voting behavior, given that they rely solely on economic and
Fair Model Specification. (13) The Fair model predicts the vote
share of the incumbent party, using a linear model with the following
* Per capita growth rate of real GDP during the entire term,
* Average inflation rate over the 15 quarters prior to the
* "Good news" quarters (number of quarters out of the 15
quarters before the election with per capita growth rate of real GDP of
3.2 percent or more),
* "President running" dummy (if the president is running
* A "duration" variable (the number of terms the
presidency has been held by the incumbent party),
* A party dummy variable, which varies with the party in office,
* A (full-scale) war dummy variable, which is one in years of World
War I, World War II, the Korean War, and the Vietnam War.
A model based on the Fair model was also developed by Macroeconomic
Advisers (2004). Its data refinements include using annualized percent
change of real disposable personal income over the three quarters
immediately prior to the election quarter, rather than the GDP growth
used by Fair. They also drop the "good news" quarters in favor
of housing starts, indicating that investment decisions by private
owners are a direct measure of economic confidence. Finally, they keep
the same party and duration dummy variables. The predictive results
improve slightly, even though the number of explanatory variables is
Anderson and Geckil National Model Specification. The authors
previously developed a national model to predict the difference in
vote share between the incumbent and the challenging party. This model,
like those of Fair and Macroeconomic Advisers, is based on national
election results, using a linear model with a smaller list of somewhat
different variables (Anderson and Geckil, 2004):
VOTEDIFF (the dependent variable): Difference between the percent
of votes received by the incumbent party and the percent of votes
received by the major challenging party. This is typically the margin of
victory, ignoring the share of votes received by other parties.
GDP_LAST: Annual GDP growth rate during the last year of the
presidential term (the election year).
U_LAST: Unemployment rate of the last year of the presidential term
INF_LAST: Absolute value of the change in the CPI during the last year of
the presidential term.
INCMBTHRD: Dummy variable that indicates a significant third party
candidate, sharing a similar philosophy with the incumbent party,
challenging the incumbent party (e.g., Ross Perot in 1992 and Ralph
Nader in 2000). (15)
CHLGTHRD: Dummy variable that indicates a significant third party
candidate, sharing a similar philosophy with the challenger party.
WAR: Dummy variable indicating a limited war (e.g., Vietnam and
Gulf Wars). (16)
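Written out, a linear model in the variables listed above would take roughly the following form. This is a sketch of the implied specification, not a reproduction of the estimated equation: the coefficient symbols, the error term, and the election subscript t are notation introduced here for illustration.

```latex
\[
\mathrm{VOTEDIFF}_t = \beta_0
  + \beta_1\,\mathrm{GDP\_LAST}_t
  + \beta_2\,\mathrm{U\_LAST}_t
  + \beta_3\,\mathrm{INF\_LAST}_t
  + \beta_4\,\mathrm{INCMBTHRD}_t
  + \beta_5\,\mathrm{CHLGTHRD}_t
  + \beta_6\,\mathrm{WAR}_t
  + \varepsilon_t
\]
```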
Results in 2000. The results for these two models are fairly
similar, so we consider each one from a different perspective. Table 1
shows the estimated Fair model, and its prediction for the 2000
The coefficients all have the expected signs, with positive
coefficients for growth rate and "good news" quarters
(reflecting the number of recent quarters with high growth); and
negative coefficients for inflation, and having the same party stay in
office beyond two terms. The war dummy variable is positive, reflecting
the traditional "rally around the commander" pattern of
Americans in pre-Vietnam conflicts. For the 2000 election, using actual
values, the Fair model predicts the incumbent party's vote share to within
about one percent of the actual result, which was a nearly 50-50 split.
We review the Anderson-Geckil national model next, but note that it also
predicts the 2000 election to within about one percent of the actual
Results over Time: Anderson-Geckil National Model. The estimate is
shown in Table 2, with the variables defined above.
The estimated coefficient on GDP growth is positive, as expected.
The coefficients on institutional factors such as third-party
challengers are also as expected. Inflation, shown as the absolute value
of the rate, also has a negative coefficient.
A surprise lurks in the unemployment rate, however. After the role
of income growth (represented by the GDP variable) is taken into
account, as well as inflation, the unemployment rate does not have a
statistically significant coefficient. Indeed, the estimated coefficient
is positive. Another surprise lurks in the limited war dummy. If we
analyze the impact of "limited" wars, such as the Persian Gulf
War in 1992 and the Vietnam War, the coefficient in this equation
indicates that such wars produce a reduction in electoral support for
the president's party rather than the increase found for full-scale
wars during most of the century.
The Anderson-Geckil model was estimated for the years 1860 through
2000, producing a fairly consistent result, shown in Figure 3 for
1916-2004. (17) For elections before 1920, the model does explain a
large part of the variation in voter behavior but can't be relied
upon to suggest how voters will decide elections that are reasonably
close. Furthermore, the swings upward and downward in voter preferences
are not well captured by the model. On the other hand, from 1920 through
2000, this model, like the Fair model, consistently predicts the
majority of the variation in electoral behavior.
Conclusion: standard models. We conclude that a standard rational
voter model, using data from 1920-2000, is a useful guide to electoral
behavior. In particular:
1) About three-quarters of voter behavior can be explained by these
models. To be more specific, both the Fair and A-G models have an
R-squared statistic from a standard OLS regression of between .70 to
2) This explanatory power is accomplished without using any
3) The approach is fairly robust to minor specification
differences, when the objective is the overall predictive capability,
rather than "predicting" specific individual elections.
4) For most elections, 16 out of 21, the model predicts a clear
winner, and the voters elect that candidate. A simple batting average
test, which gives points only for correctly picking the winner even if
the margin is grossly mis-estimated, produces the observation that in 18
out of the 21 past elections, these models predict correctly who will be
the winner. (18)
5) Both models typically capture the amplitude in voter swings as
well as the direction. For example, both of the models show large
electoral majorities for Franklin D. Roosevelt during the
Depression/World War II years, anticipate close elections in 2000, and
properly predict victories by small margins in years such as 1952 and
6) Neither model correctly predicts voter behavior in the years
1976 and 1992. In both cases, an incumbent president is not reelected,
even though institutional and economic factors suggest that he should
7) The estimation of parameters, and the use of these parameters
for predictions, are hampered by two structural problems: First, there
are simply too few observations to estimate all the coefficients; a
1920-2000 presidential election data set includes observations from only
21 elections, which may be needed to estimate 5 or more coefficients.
Second, we expect that voters' reaction to economic conditions
would change over time. This time-varying preference might be ignored
over one to three decades; it cannot be ignored over an entire century.
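The "batting average" test mentioned in item 4 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, margins are expressed from the incumbent party's point of view, and a predicted or actual margin of exactly zero is counted as a miss. The example margins are invented and are not the paper's data.

```python
def batting_average(predicted_margins, actual_margins):
    """Share of elections in which the predicted winner matches the actual one.

    A prediction counts as a hit when the predicted and actual margins
    have the same sign, however badly the size of the margin is estimated.
    """
    if len(predicted_margins) != len(actual_margins):
        raise ValueError("both sequences must have the same length")
    if not predicted_margins:
        raise ValueError("at least one election is required")
    hits = sum(
        1 for p, a in zip(predicted_margins, actual_margins) if p * a > 0
    )
    return hits / len(predicted_margins)


# Hypothetical margins (percentage points) for four elections.
predicted = [5.0, -2.0, 1.0, -8.0]
actual = [12.0, -0.5, -3.0, -1.0]
score = batting_average(predicted, actual)  # three hits out of four
```

Because only the sign matters, this measure rewards a model for picking winners and says nothing about how well it captures the amplitude of voter swings.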
There are other models incorporating economic information that we
do not consider comparable. These models include sentiment variables,
such as approval ratings from polls. Examples include Wlezien and
Erikson (1996), Wlezien (2000), and many of those summarized in Greene
(1993). While adding sentiment variables improves the forecasting
performance, it undermines the microeconomic motivation for the rational
economic voter model. Viewed from the perspective of determining what
effect pocketbook issues have on voters, such models have circular
reasoning: one set of sentiment indicators (poll results, political
support) predicts another expression of sentiment (the election
results), with the aid of economic information that also affects
sentiment. These models could be useful in at least two ways: first,
interpreting how new economic information translates into changes in
voter sentiment over time; and second, as pure predictive models.
Perhaps the most serious effort to develop a rigorous economic
model that included sentiment variables is described in Stambough and
Thorson (1999). Their model uses a principal-components approach to
isolate the underlying economic conditions and overcome
multicollinearity problems. They also use a form of pooled data, which
anticipates the innovation described in this paper for a contemporary
model. (19) The Stambough and Thorson model includes past electoral
behavior in a state, as well as indicators of political support for
candidates. It therefore is not a pocketbook model, but instead one in
which the pocketbook variables supplement information from polls and
other expressions of political support.
In addition, there are numerous back-of-the-envelope models based
on stock-market returns, consumer confidence ratings, or interpretations
of poll results. These, in turn, are supplemented by even more ersatz
models based on comparative candidate height and the league winning the
World Series. These run the gamut from the purely entertaining to the
modestly useful from the perspective of wagering on the outcome of the
election. For the same reasons we rejected the use of sentiment
variables in our pocketbook model, we did not include sunspot activity
and other events that correlate, one way or another, with electoral
Innovations in Pocketbook Models
Addressing problems in past models
To address some of the weaknesses of the past models, we suggest
two sets of innovations: the use of pooled data from multiple states and
an improved specification of the variables.
Use of pooled data. The key econometric challenge in national
pocketbook models is too few observations. This, combined with
structural changes in the relationship over the past century, has made
it difficult to properly estimate the parameters of an equation
specifying how economic conditions affect voter behavior. The best
method of attacking this problem is to bring more data to the subject.
This can be done by treating elections as generating a pooled set of
data on election returns for states. Thus, each election is a
cross-section of data from one time period. Multiple elections over a
handful of decades then produce a time series set of these cross
sections. By limiting the elections we consider to those occurring in a
specific era, we reduce (but do not eliminate) the effects of structural
changes over time. This econometric insight dovetails with the
observation that voters in individual states may react quite differently
to economic news. In particular, we use data from the six elections from
1980 through 2000, for the 20 largest states by population. This
produces 120 observations of the dependent vote share variable, rather
than the 22 we have when using time series observations from 1916
Improved specification. In addition, with insights from Anderson
and Geckil (2004) and Macroeconomic Advisers (2004), we attempt to
refine our specification. From our national model, we adopt as variables
the absolute value of the inflation rate to account for the negative
feedback about deflation and a "limited war" dummy variable.
For a national model, we believe the Macroeconomic Advisers selection of
an improved disposable income measure enhances the equation, but at the
time we estimated the model, we did not realize that similar data were
available by state. In addition, we use the deviation from the
"natural" unemployment rate as our indicator, rather than its
absolute value. Finally, although it is clear that adding sentiment
variables such as polling or recent election results would improve the
forecasting accuracy of the model, we restricted the type of variables
to economic and major institutional factors.
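The two variable transformations described above can be sketched in a few lines. The function and constant names are hypothetical; the default natural rate of 3.5 percent is the value the paper reports using for its state model, and the example inputs are invented.

```python
# Natural unemployment rate in percent; the paper states it uses 3.5
# for the pooled state model. Treat it as an adjustable assumption.
NATURAL_UNEMPLOYMENT_RATE = 3.5


def inflation_risk(inflation_rate_pct: float) -> float:
    """Absolute value of the inflation rate, so deflation counts as risk too."""
    return abs(inflation_rate_pct)


def unemployment_gap(
    unemployment_rate_pct: float,
    natural_rate_pct: float = NATURAL_UNEMPLOYMENT_RATE,
) -> float:
    """Deviation of the unemployment rate from the assumed natural rate."""
    return unemployment_rate_pct - natural_rate_pct


# Hypothetical inputs: a 4 percent deflation and a 5.5 percent jobless rate.
risk = inflation_risk(-4.0)
gap = unemployment_gap(5.5)
```

Under this treatment a sharp deflation and an equally sharp inflation enter the equation identically, and an unemployment rate at the natural rate contributes nothing to the predicted margin.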
Estimating a pooled state data election model
Data. An ideal pooled data set would contain individual and
different explanatory variables for each subject in the cross section,
for each time period. We approach this ideal, but do not reach it. We
assemble a data set that includes separate observations on unemployment,
as well as a state-level income variable, for all states. We also
include dummy variables for third-party candidates that can vary between
For income growth, we used the annual change in gross state product
during the election year. We note this is not precisely income, and will
discuss the point further in the econometric section, below. For
third-party candidates, the use of individual state data allows for a
more precise specification. States vary in the restrictions they place
on third-party ballot access, so we are better able to specify when a
significant third-party candidate participates by using state data than
national data. (20) For example, in three of the major states, a
third-party challenger to the incumbent did not gain ballot access
during the entire sample period; all states had third party candidates
on the challenger side. For state-level price data, we used the CPI for
large metropolitan areas in each state, if available. For example, for
California, we used the average CPI for Los Angeles, San Diego, and San
Francisco. For Indiana, no metropolitan CPI was available, so we used
the average of Cincinnati and Chicago. The CPI has well-known
limitations as a price level indicator, and city CPI data would only
partially reflect the actual inflation rate experienced by residents
across the state. (21) For unemployment, we had excellent state-level data
from the U.S. Bureau of Labor Statistics. In the equation, we
use the deviation from a natural unemployment rate of 3.5 percent. (22)
For the limited war dummy, we use the same variable for all states.
Although we did not include it in the general model, it is clear that
the "home state" advantage for a candidate is a powerful
factor, and one that could almost be considered an institutional one.
While this is clearly a richer data set than used in the national
time-series models, there is a systemic problem with much of these data.
For the income and price data in particular, the different observations
across states are strongly correlated. Even with the efforts of the
state and national statistical agencies to improve state-level data,
some components of these variables are shared across all states. (24)
Thus, state-level data bring more information to the subject but are
likely to have measurement errors that are strongly correlated across
Econometrics. Estimating models using pooled data is tricky. Such
models can potentially have additional constants for every subject in
the cross-section, as well as interaction variables for each of the
explanatory variables, to account for different responses in each
subject. (25) Attempting to estimate all the possible multiple
interaction variables and varying constants would quickly use up the
additional degrees-of-freedom obtained by assembling the pooled data
set. Given the strong correlation of data such as income and employment
across subject states, we are likely to face multicollinearity if we
attempt to estimate too many parameters. This is compounded by the
observation that the measurement errors in these data are also
correlated across states. Indeed, trial estimation runs with a large
number of state-varying parameters produced warnings of singularity,
signaling that multicollinearity in the data is a serious problem.
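The degrees-of-freedom arithmetic behind this concern can be made concrete with a small sketch. The function names are hypothetical, and the count of six explanatory variables is an assumption made here for illustration (income growth, the unemployment deviation, absolute inflation, two third-party dummies, and the limited war dummy); it is not taken from the estimated equation.

```python
def pooled_observations(n_states: int, n_elections: int) -> int:
    """Number of state-by-election observations in the pooled data set."""
    return n_states * n_elections


def parameter_count(
    n_states: int,
    n_slopes: int,
    varying_constants: bool,
    varying_slopes: bool,
) -> int:
    """Number of coefficients to estimate under a given set of restrictions."""
    constants = n_states if varying_constants else 1
    slopes = n_states * n_slopes if varying_slopes else n_slopes
    return constants + slopes


# 20 states over the six elections from 1980 through 2000.
n_obs = pooled_observations(n_states=20, n_elections=6)  # 20 * 6 = 120

# Assumed number of explanatory variables (see the lead-in).
N_SLOPES = 6

# Every constant and every slope allowed to differ by state.
unrestricted = parameter_count(20, N_SLOPES, True, True)  # 20 + 120 = 140

# State-varying constants with common slopes.
restricted = parameter_count(20, N_SLOPES, True, False)  # 20 + 6 = 26

unrestricted_is_estimable = unrestricted < n_obs
restricted_degrees_of_freedom = n_obs - restricted
```

Under these assumptions the fully unrestricted specification asks for more parameters than there are observations, while the restricted one leaves most of the pooled sample available for estimation.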
We, therefore, follow the classic approach of restricting the
number of coefficients we attempt to estimate. In particular, we allow
for varying constants across all states, but do not estimate varying
coefficients for each explanatory variable in our general pooled state
Estimation results. The results of our estimation of the pooled
state model are shown in Table 3. We allow the constant term to vary
among the various states. (26) This allows for some states to appear to
be more "incumbent friendly" than others. Because the equation
is estimated over periods in which both Republicans and Democrats were
in the White House, however, the model does not reflect the partisan
leanings of the various states.
The coefficients on the behavioral variables are restricted to be
the same across all states, and can be compared directly with the
related national model results shown in Table 2. (27) Note that all of
the behavioral coefficients in the national and state models have the
same sign and similar magnitude.
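A pooled specification with state-varying constants and common behavioral coefficients can be written along the following lines. This is a sketch under stated assumptions rather than the equation reported in Table 3: the symbols, the subscripts s (state) and t (election), and the exact list of regressors are introduced here for illustration, with GSP denoting gross state product growth, U* the natural unemployment rate, and pi the state inflation measure.

```latex
\[
\mathrm{VOTEDIFF}_{s,t} = \alpha_s
  + \beta_1\,\mathrm{GSP}_{s,t}
  + \beta_2\,\bigl(U_{s,t} - U^{*}\bigr)
  + \beta_3\,\lvert \pi_{s,t} \rvert
  + \beta_4\,\mathrm{INCMBTHRD}_{s,t}
  + \beta_5\,\mathrm{CHLGTHRD}_{s,t}
  + \beta_6\,\mathrm{WAR}_t
  + \varepsilon_{s,t}
\]
```

Only the constant carries a state subscript, which is what allows some states to appear more "incumbent friendly" while the response to economic conditions is held common across states.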
It should be noted that the EViews software we used for this pooled
data estimation does not produce standard errors or t-statistics for the
state-varying constants. We estimated them using an alternate technique
and found most of them to be statistically insignificant. This may be
due to the multicollinearity noted above, which is common in panel data.
(These statistics and other supplemental information are available from
the authors.) Regardless of the statistical significance of the
constants, the practical usefulness of the varying constants is
demonstrated in the test described below, using the 2000 election
Using additional data from multiple states over a much shorter
period produced an equation with almost the same explanatory power as
the national model. In both the long-term national model, and the
single-era pooled state model, the R-squared statistic indicates that
about 75 percent of the variation in voting can be explained by a small
number of economic and institutional variables. (28) The residuals from
the regression for all 20 states across the 1980-2000 period do not appear to
have any particular common patterns. (29)
Variation across states. These results suggest that some states are
historically more "incumbent friendly." The different
constants estimated for each state suggest that, for example, voters in
New York, New Jersey, and Massachusetts have given incumbents a leeway
of two to five percent more than voters in a typical large state.
We further tested the power of the model to capture how economic
conditions in separate states affect voting behavior. For the year 2000,
we tabulated the model's output for individual states. Note that
these are not 20 individual state estimates, but individual state
differences derived from the estimation of a national model. (30) These
"economic vote factors" should indicate how the state
incumbent advantage and the national patterns of voter response to
state-varying economic conditions affect the vote margin in each state.
We then compare these factors with the actual vote margins in these
states. If the model had no state-level predictive power, we would
expect the correlation between these data to be around zero. In the year
2000, the correlation between the economic vote factors by state and the
actual vote margins was 0.67.
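The comparison described above amounts to a correlation between two state-level series. A minimal sketch, assuming the standard Pearson coefficient is the measure intended (the text does not name one), is shown below; the function name and the example values are hypothetical and are not the paper's data.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical economic vote factors and actual margins for five states.
economic_vote_factors = [3.0, -1.5, 0.5, 4.0, -2.0]
actual_margins = [5.0, -4.0, -1.0, 6.5, 1.0]
correlation = pearson_correlation(economic_vote_factors, actual_margins)
```

A value near zero would indicate no state-level predictive power, and a value near one would indicate that the factors rank and scale the states much as the actual margins do.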
Unemployment, growth, and the incumbent advantage. We observe that
the estimated coefficient on the deviation from natural unemployment is
positive. This appears to run counter to our axioms, as a more risky
stream of income should encourage voters to reject the incumbent party.
Further review is required.
First, we include in the equation both an income growth variable
(in the pooled equation, gross state product growth) and an unemployment
variable. These are two explanatory variables with a strong correlation,
which leads to less precision in the estimates and makes the equation
sensitive to small differences in variable definitions. Second, repeated
trial regressions using slightly different specifications confirm that
once income is included in the pooled state equation, the additional
explanatory power of an unemployment variable becomes small, and in some
cases, statistically insignificant. This is the case with our national
model shown in Table 2. Perhaps for this reason, the Fair model shown in
Table 1, and closely related Macroeconomic Advisers model, include an
additional growth variable and omit unemployment entirely. If an
unemployment variable is included, its specification directly affects
the equation's estimated incumbent advantage. Equations that use
the absolute level often have a large constant term. For example, our
national model shown in Table 2 indicates a fairly large value for the
constant. If one transforms the unemployment rate by subtracting the
natural rate, the constant should shift closer to zero. This more
correctly indicates a small incumbent advantage.
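The effect of this transformation on the constant follows from a one-line rearrangement. The notation below is introduced here for illustration: beta_0 is the constant, beta_2 the unemployment coefficient, U_t the unemployment rate, and U* the natural rate.

```latex
\[
\beta_0 + \beta_2\,U_t
  = \underbrace{\bigl(\beta_0 + \beta_2\,U^{*}\bigr)}_{\text{new constant}}
  + \beta_2\,\bigl(U_t - U^{*}\bigr)
\]
```

The unemployment coefficient is unchanged, and the constant absorbs the term beta_2 U*. The new constant can then be read as the predicted margin when unemployment sits at its natural rate and the other variables are zero; the direction and size of the shift depend on the sign and magnitude of beta_2 U* relative to beta_0.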
Historic state preferences. It is important to note that these
results were obtained without any sentiment variables, and also without
taking into account the obvious fact that some states are more likely to
vote for one party or the other. The sample period here includes
elections in which both Republicans (1980, 1984 and 1988) and Democrats
(1992 and 1996) won repeated elections. The simple predictive power
would certainly be improved by including a state-level variable for the
historic partisan preferences of voters in each state. (31) This may be
a useful extension of the research presented here and could be
considered a primarily pocketbook model if there were only a single,
historical preference variable applied across all periods for each
state, rather than a group of sentiment variables.
Prediction for 2004
Factors missing in the models
In the discussion above, we reviewed the record of the national
time-series model described by Fair (2002), as well as the variant
published by Macroeconomic Advisers (2004) and a somewhat different
model presented by Anderson and Geckil (2004). These models, as well as
a number of variations analyzed by the authors in the research leading
to this article, do surprisingly well in predicting the variation in the
vote for the incumbent party, considering that they do not include any
sentiment variables such as poll results. As a confirmation of this, all
three models predict the very close 2000 election to within about one
percent of the actual vote margin.
These models typically fail to predict the election results in 1976
and 1992, in each case forecasting that the relatively positive economic
conditions would result in an incumbent victory. Repeated efforts to
improve the model by refining the economic variables have been reported
in Fair (2002), criticized by Greene (1993), and replicated by the
authors. Fair's use of "good news" variables, for
example, is a straightforward attempt to make economic conditions
explain what (to economists, at least) appears to be recalcitrant voter
behavior. A better explanation is necessary. A fairly obvious one, given
the amount of attention in political campaigns devoted to non-economic
issues, is that voters are not driven entirely by economics. There are
strong axioms of microeconomics which, adapted to this problem in a
manner of analysis familiar to Public Choice students, produce the
theoretical grounds for a rational economic voter model. These models,
in both time series and pooled cross-section-time series versions,
provide a very powerful explanation for well over half of the variation
in voting behavior.
For models that eschew sentiment variables, such predictive power
(often resulting in R-square statistics of 75 percent or better, and
"batting average" metrics of 18-out-of-21 correct predictions)
should be accepted as nearing the limit of what social scientists can do
with unpredictable human behavior. Furthermore, the "wrong"
elections provide us with strong clues as to what non-economic variables
are most important. In 1976, Watergate, the resignation of President
Nixon, and voter revulsion over the scandal were clearly dominant
themes. In 1992, the end of the Cold War changed the role of the U.S.
As pointed out by an eminent national pollster, once the Cold War
was ended, voters were no longer required to pick a President knowing
his conduct could be tested by Cold War confrontations like the Cuban
missile crisis. (33) Thus, the elections from 1992 through 2000 were
conducted during an interregnum that removed from many voters' minds the
imperative of selecting a person with a "commander in chief"
character that probably seemed vital to the generations that voted
during the eras of World War II and the Cold War.
Tragically, the terrorism of September 11, 2001 ended that
interregnum. Thus, future elections--particularly the 2004 election--are
likely to be decided on larger factors than economics.
National presidential election prediction for 2004
With this observation in mind, we nonetheless use the model to
forecast the 2004 election results, using both the national and pooled
state economic models. Using the national model and assumptions about
2004 economic conditions--but again ignoring partisanship and sentiment
indicators--the pocketbook prediction for 2004 is the following:
* The popular vote winner will be the Republican Party in 2004 by a
narrow 0.5 percent margin.
* The strongest influence on the results is the relatively strong
income and employment gains for the last two years, along with a low
* The painful limited war going on in Iraq is an institutional
factor that will reduce the support of the incumbent party.
* The only significant third party candidate is likely to be Ralph
Nader, who will be on the challenging party's side.
By comparison, Fair (2002, 2004) expects a stronger Bush victory,
as do Macroeconomic Advisers (2004). Of course, all these predictions
are based on assumptions about economic conditions that may prove to be
incorrect, and are based on models that explicitly include only economic
and institutional variables. Using the state pooled-data model, which
takes into account varying state economic conditions and accounts for
states with about 76 percent of the nation's population, the
pocketbook prediction is also for a narrow 0.1 percent margin of victory
for the incumbent candidate in the largest states. (34)
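The gross allocation described in note 34 amounts to a population-weighted average of state vote margins. A minimal sketch of that calculation follows; the three states, their populations, and their predicted margins are hypothetical placeholders, not output from the pooled state model.

```python
# Hypothetical inputs for illustration; not results from the pooled state model.
# Each entry: (population in millions, predicted incumbent margin as a fraction).
states = {
    "State A": (30.0, 0.040),
    "State B": (20.0, -0.030),
    "State C": (10.0, -0.050),
}


def weighted_margin(state_data):
    """Weight each state's predicted margin by its share of total population."""
    total_pop = sum(pop for pop, _ in state_data.values())
    return sum(pop / total_pop * margin for pop, margin in state_data.values())


margin = weighted_margin(states)
print(f"Population-weighted incumbent margin: {margin:.4f}")
```

Because actual voter participation varies across states, weighting by expected turnout rather than population would be a natural refinement of this sketch.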
Although these economic models, including our own, anticipate a
Bush re-election, we note the conclusion drawn above: voters do not only
care about economics. In particular, during times when the world order
becomes more dangerous, their dominant influences appear to be related
to national interest and national survival, rather than improving their
current economic condition. It seems apparent, even to economists who
insist on keeping poll data out of their pocketbook models, that 2004 is
such a year.
Our results indicate that voters do respond to economic conditions
in a manner consistent with a small set of axioms drawn from
microeconomics. Furthermore, rational voter models can be specified and
estimated in various forms that predict about 75 percent of the
variation in vote over elections from 1916 through 2000. Our results
show that a national model using pooled state data, estimated over a
shorter period of time, can avoid some of the econometric problems that
have hampered past time-series models. Using this model, we find that
voters in different states vary in their responses to both economic
factors and institutional factors and are significantly different in
"incumbent friendliness." As a test of our analysis, we
estimate the economic contribution to the state-by-state vote in the
2000 election and find a positive correlation of approximately 67
percent with the actual vote margins. Our pooled state model forecast
for 2004 indicates that economic conditions favor a narrow re-election
for the incumbent President. We point out how some elections--such as
the loss of George Bush to Bill Clinton in 1992, as well as the upcoming
2004 election--cannot be entirely explained with a rational voter model.
In dangerous times, voters appear to place more emphasis on national
survival and national interest than on current economic conditions.
Thus, our research demonstrates both the power of economics to predict
voting behavior and its limitations.
This paper won the Edmund A. Mennis Contributed Paper Award
presented at the NABE Annual Meeting, October 3, 2004. PNC Financial
Services Group sponsored the award.
(1) "The people that once bestowed commands, consulships,
legions, and all else, now concerns itself no more, and longs eagerly
for just two things--bread and circuses," complained the
first-century Roman writer Juvenal.
(2) Public choice theory is most associated with James Buchanan,
who won the Nobel Prize for Economic Science in 1986 for his pioneering
role in establishing contemporary public choice theory. He joined with
Gordon Tullock to author the classic Calculus of Consent in 1962.
(3) We avoid here the contentious mathematical analysis of whether
individual voters can expect to control policy through voting, a matter
dealt with by another Nobel Prize-winning economist, Kenneth Arrow, in
his "social choice" analysis. His famous "impossibility
theorem" implies that, in many cases, it is impossible for
straightforward voting to result in the best outcome for everyone. See
Shaw (1999) for an accessible discussion of both public choice and
social choice theory and full references in the endnotes.
(4) Much of social research on political preferences is based on a
broader, though similar, assumption that elections are a referendum on
the performance of the incumbents, especially executives such as
governors and presidents. For this reason, the "do you approve of
the job [elected official] is doing" and "do you think the
country is on the right track or wrong track" questions are a
fixture in political polling.
(5) We briefly consider below the role of asset returns in this
question. We reject the notion of including stock market trends in the
model, because the stock market reflects discounted expectations about
(6) The "risk to income" we consider here is the risk
that a household may not earn enough income to pay its basic living
expenses. The risk tolerance of workers toward such earnings is likely
much different than that shown by investors toward their portfolios.
(7) An interesting question is whether voters evaluate
"permanent income" in any quantifiable sense. The rise of the
"investor class" suggests that this should be the case and
that both returns on investment assets and long-term expected retirement
benefits would be factors in the voter's economic calculus.
(8) The uncertainty in income arises in both the policy changes
that often follow increases in the inflation rate--wage and price
controls, for example--and the negative effects on the real economy
associated with growing inflation. Additional factors are the implicit
taxation that occurs on assets whose nominal value has increased and the
higher nominal taxes due to "bracket creep" in many income tax
(9) The pioneering macroeconometric models illustrate this most
clearly. The original Klein model covered only the interwar years; the later
Klein-Goldberger model covered the years 1929-1952 and omitted the war
years (1942-1945) entirely. The Wharton model of the 1970s was a
descendant of the Klein-Goldberger model, and similarly featured a
wartime dummy variable. Later models, such as the Brookings and DRI
models of the 1960s and 1970s, featured over a hundred
variables, reducing the importance of any one variable. See Intriligator
(1978) and extensive references there.
(10) We experimented with truncating the data, which reduced the
residuals in landslide election years, and could have transformed the
variable using a log, logit, probit, or similar model. The models
described below, however, are pure linear models.
(11) The third-party candidate vote totals were especially strong
earlier in the 20th century, when Teddy Roosevelt's "Bull
Moose" party challenged the incumbent of his own party and when
socialist parties had especially strong followings.
(12) For the equations, we will use a more specific income measure.
GDP, however, is a handy and consistent series for use in the nearly
century-long analysis presented here.
(13) This is the specification in Fair (2002). Fair periodically
revises his model: see, e.g., Fair (2004).
(14) Using a short sample of 1952-2000, the equation has a smaller
standard error, and a similarly high (.97) R-squared statistic.
(15) There is considerable subjectivity in defining a significant
third-party candidate. One indicator we used was the number of states in
which the candidate appeared on the ballot. Of course, to some extent
the significance of the third-party candidate is only apparent after the
election. Furthermore, third-party candidates can take votes from both
major-party candidates; we chose the party from which the candidate was
predominantly taking votes.
(16) The definition of "limited" war is also subjective,
as is the timing. We considered wars with mobilization of the populace
of lower intensity than the Korean War to be "limited."
(17) Some variables were not available in the same series this far
back in history, so some splicing or substitution was necessary.
(18) Depending on how one counts the 2000 election, predicted to
within about one percent by both models, the result could be 19 out of
(19) The estimation algorithm used was not described in the paper,
but it appears some type of varying constants or coefficients were
allowed for individual states.
(20) Leip (2004) was the source for much of these data.
(21) In particular, the overweighting of housing price data would
tend to exaggerate the price changes faced by the residents that do not
purchase a new home; the use of a "consumer" basket will also
not match the price level effects across the entire economy. This latter
bias may, however, make the variable a better indicator of how most
voters perceive inflation's effect on their lives.
(22) A potential improvement to this approach would be to estimate
the natural unemployment rates of individual states. However, since we
are using the arithmetic difference from the natural unemployment rate
as the explanatory variable, such an enhancement would largely affect
the estimated constants. See the discussion of unemployment and the
incumbent advantage below.
(23) We do consider this when testing the model results against the
2000 election and for forecasting the 2004 election.
(24) For example, income at the state level must reflect import and
export activity, which must be apportioned across all states; survey
data on labor market activity will invariably reflect fewer observations
in individual states than in the nation as a whole; and price level data
from indicative markets are invariably used, in both the government and
private sectors, as national prices.
(25) In addition, the presence of serial correlation and
heteroskedasticity in the disturbances, which is often a problem even in
non-pooled data, can create an even more difficult estimating
(26) This is equivalent to having a dummy variable for each state;
see the estimation results reported in the table.
(27) There are some differences in the models. In particular, the
income variables are different, and the national model used the
unemployment rate rather than the difference from the natural rate; see
the discussion below under "unemployment rate and the incumbent
(28) The R-squared statistic should be interpreted with some
caution in the pooled regression, but given the relatively small number
of state-varying coefficients, the caution does not undermine the
observation that, within a reasonable margin, both models explain about
75 percent of the variation.
(29) Graphs of these residuals can be obtained from the authors on
(30) Recall that we estimated behavior coefficients nationally with
pooled data but allowed for differences in the incumbent advantages among
states and differences in economic conditions. Thus, the vote factors
calculated in the table are not individual state predictions. With
additional data, some restrictions on coefficients, and an adaptation of
the model that would allow for the estimation of different coefficients
for different states, it would be possible to produce individual state
(31) If we added multiple variables of this type and further
supplemented it with political support variables, we would be following
the approach in Stambough and Thorson (1999).
(32) Somewhat less powerful examples are the landslides of Reagan
in 1984 and Johnson in 1964. These victories were aided by economics,
but were obviously motivated by other issues.
(33) David Petts, of Petts & Blumenthal, Washington DC; private
(34) The vote-difference calculation here is based on a gross
allocation of voting based on share of total population. Actual voter
participation rates will vary across states.
Anderson, Patrick L. and Ilhan K. Geckil. 2004. "Pocketbook
Predictions for Presidential Elections." Presentation at the
Hauenstein Center for Presidential Studies, Grand Valley State
University. June. Excerpts found at: http://www.gvsu.edu/hauenstein.
Arrow, Kenneth. 1951. Social Choice and Individual Values. John
Wiley and Sons.
Buchanan, James and Gordon Tullock. 1962. Calculus of Consent.
University of Michigan Press.
Economic History Services. Found at: http://www.eh.net.
Fair, Ray C. 2002. Predicting Presidential Elections and Other
Things. Stanford University Press.
Fair, Ray C. 2004.
Greene, Jay P. 1993. "Forecasting Follies: Why Political
Scientists Can't Predict Presidential Elections." The American
Prospect. 15 (Fall). Also found at:
Intriligator, Michael D. 1978. Econometric Models, Techniques &
Applications. Englewood Cliffs, NJ: Prentice-Hall.
Leip, David. 2004. U.S. Election Atlas. Website with data on past
elections found at: http://www.uselectionatlas.org.
Macroeconomic Advisers. 2004. "What Does the MA Forecast Imply
for the Presidential Election?" Macroeconomic Advisers'
Economic Outlook. January 21.
New York Public Library Desk Reference. 3rd edition. Found at:
Shaw, Jane. 1999. "Public Choice Theory." Online article.
The Library of Economics and Liberty. Found at http://www.econlib.org.
Stambough, Stephen J. and Gregory R. Thorson. 1999. "Towards
Stability in Presidential Forecasting: The Development of a Multiple
Indicator Model." The International Journal of Forecasting.
15:143-152. Also found at: http://www.mrs.umn.edu/~gthorson/pubs.htm.
U.S. Department of Commerce, Bureau of Economic Analysis.
U.S. Department of Labor. Bureau of Labor Statistics.
Wlezien, Christopher and Robert Erikson. 1996. "Temporal
Horizons and Presidential Election Forecasts." Mimeo. University of
Houston. Also found at: http://depts.washington.edu/bushgore/week3.htm.
Wlezien, Christopher. 2000. "Forecasting the Presidential
Vote, 2000." Mimeo, University of Houston. Also found at:
Patrick Anderson is the founder and principal of the consulting
firm Anderson Economic Group LLC. He has served as the chief of staff of
the Michigan Department of State and as a deputy budget director for the
State of Michigan. He was an assistant vice president of Alexander
Hamilton Life Insurance Co. and a graduate fellow for the Central
Intelligence Agency. He holds a master's degree in public policy and a
bachelor's degree in political science, both from the University of
Michigan at Ann Arbor.
Ilhan Kubilay Geckil is an economist with Anderson Economic Group.
His work includes economic analysis, business valuation, strategy
development, and forecasting. He holds an MA in economics from Michigan
State University and a BA in economics from the KOC University in
TABLE 1 FAIR MODEL AND 2000 ELECTIONS
Variable               Coefficient   Value of Variable in 2000   Coefficient X Value
GDP Growth rate               0.70                         2.2                  1.54
Inflation                    -0.71                         1.7                 -1.21
Good news quarters            0.90                         7.0                  6.30
President running             4.00                         0.0                  0.00
Duration                     -3.30                         1.0                 -3.30
Party variable               -2.80                         1.0                 -2.80
War variable                  4.70                         0.0                  0.00
Intercept                    48.40                         1.0                 48.40
Incumbent Party's Vote Share                                                   48.93
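The arithmetic behind Table 1 can be reproduced directly: each coefficient is multiplied by the 2000 value of its variable, and the products are summed to give the predicted vote share of the incumbent party. The short sketch below uses only the figures printed in the table; the dictionary keys are our own labels, not names used by Fair.

```python
# (coefficient, value of variable in 2000) as printed in Table 1.
fair_2000 = {
    "gdp_growth": (0.70, 2.2),
    "inflation": (-0.71, 1.7),
    "good_news_quarters": (0.90, 7.0),
    "president_running": (4.00, 0.0),
    "duration": (-3.30, 1.0),
    "party": (-2.80, 1.0),
    "war": (4.70, 0.0),
    "intercept": (48.40, 1.0),
}

# Contribution of each variable to the predicted vote share.
contributions = {name: coef * value for name, (coef, value) in fair_2000.items()}
vote_share = sum(contributions.values())

for name, contribution in contributions.items():
    print(f"{name:<20}{contribution:>8.2f}")
print(f"{'vote_share':<20}{vote_share:>8.2f}")
```

The sum agrees, after rounding to two decimals, with the 48.93 percent vote share shown in the last row of the table.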
TABLE 2 ESTIMATION RESULTS OF THE ANDERSON-GECKIL NATIONAL MODEL
Variable       Coefficient    Std. Error   t-Statistic
Intercept            0.134         0.048         2.789
GDP_LAST             0.670         0.256         2.619
U_LAST               0.041         0.413         0.098
INF_LAST            -2.663         0.502        -5.303
INCMBTHRD           -0.089         0.049        -1.827
CHLGTHRD             0.095         0.044         2.127
WAR                 -0.204         0.063        -3.246
Dependent Variable: VOTEDIFF
Adjusted R-squared    0.715
S.E. of regression    0.079
Durbin-Watson stat    2.177
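The t-statistics in Table 2 are the ratio of each coefficient to its standard error. The sketch below recomputes them from the figures printed in the table. Because those inputs are rounded to three decimals, the recomputed ratios can differ slightly from the printed values.

```python
# (coefficient, standard error) as printed in Table 2.
national_model = {
    "Intercept": (0.134, 0.048),
    "GDP_LAST": (0.670, 0.256),
    "U_LAST": (0.041, 0.413),
    "INF_LAST": (-2.663, 0.502),
    "INCMBTHRD": (-0.089, 0.049),
    "CHLGTHRD": (0.095, 0.044),
    "WAR": (-0.204, 0.063),
}

# t-statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in national_model.items()}

# Variable whose coefficient is least distinguishable from zero.
weakest = min(t_stats, key=lambda name: abs(t_stats[name]))

for name, t in t_stats.items():
    print(f"{name:<10} t = {t:7.3f}")
print(f"Smallest |t|: {weakest}")
```

The unemployment term has by far the smallest ratio, consistent with the observation in the text that unemployment adds little explanatory power once income is included.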
TABLE 3 ESTIMATION RESULTS OF THE POOLED STATE MODEL
Variable       Coefficient    Std. Error   t-Statistic
C                    0.020         0.034         0.585
GSP_GR               1.195         0.297         4.029
UN                   1.526         0.537         2.842
CPI                 -2.049         0.220        -9.294
WAR                 -0.173         0.028        -6.241
INCTHRD              0.004         0.027         0.151
CHLGTHRD             0.056         0.024         2.318
Fixed Effects (Cross)
Estimation Method: Estimated Generalized Least-Squares
Total pool (balanced)
Adjusted R-squared    0.640
Durbin-Watson stat    1.497