Answering a Big Question with a Small Sample: Estimating the Size of the Overseas Citizen Population

The Defense Department’s Federal Voting Assistance Program (FVAP) is charged with ensuring that active duty military and U.S. citizens residing overseas have the opportunity to vote in U.S. elections. While the U.S. government keeps track of the size and location of active duty military, similar information is not available for overseas civilians. To address this gap, FVAP tasked FMG with developing estimates of the size of the overseas civilian population by country. In this blog post, I will briefly discuss the approach FMG used to develop these estimates as well as the results. It is FMG’s hope that the general approach described here can provide guidance to researchers who face similar problems in developing predictive models of complex phenomena with a limited number of observations.

Our Methodology

There are serious challenges in developing such estimates. Most information that currently exists about this population comes from estimates or counts produced by foreign governments, with a single estimate of the size of the American population for a given country in a given year. Many countries do not provide information about the size of their American populations, and the countries that do provide estimates are typically more highly developed than those that do not. These systematic differences between countries with and without estimates, together with the relatively small number of countries with estimates, prevented us from simply using the average of the foreign government estimates for countries without estimates, or from matching countries with and without estimates to impute missing values.

To address issues related to measurement error and a lack of comparability in foreign government estimates, as well as the dissimilarity between countries with and without estimates, FMG used a model-based approach to develop overseas citizen estimates. Specifically, the relationships between the foreign government estimates and various characteristics of the foreign countries were estimated using a regression model. These relationships were used to generate predictions of the size of the overseas citizen population for all countries in our sample, including those without a foreign government estimate. Rather than simply creating a single model incorporating all country characteristics that could potentially influence the size of the foreign government estimate, multiple models were estimated using every combination of the characteristics identified in the academic migration literature. For each country, a final estimate was generated by taking a weighted average of the predictions from these models. The weights reflected the ability of each model to explain variation in the size of overseas citizen populations that was not explained by other models.
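For readers who want to see the shape of this procedure, here is a minimal Python sketch of regression over every combination of predictors followed by a weighted average of the predictions. It is an illustration under stated assumptions, not FMG's implementation: the data are synthetic, the function and variable names are hypothetical, and the weights are Akaike weights, a common model-averaging choice used here as a stand-in for the weighting scheme described in the Technical Report.

```python
from itertools import combinations

import numpy as np


def model_average(X_obs, y_obs, X_all):
    """Average OLS predictions over every non-empty subset of predictors.

    X_obs, y_obs: characteristics and estimates for countries that have a
    foreign government estimate. X_all: characteristics for all countries.
    Weights are Akaike weights (an assumption, not the report's scheme).
    """
    n, k = X_obs.shape
    preds, aics = [], []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            cols = list(cols)
            # Design matrices with an intercept column.
            A_obs = np.column_stack([np.ones(n), X_obs[:, cols]])
            A_all = np.column_stack([np.ones(len(X_all)), X_all[:, cols]])
            beta, *_ = np.linalg.lstsq(A_obs, y_obs, rcond=None)
            rss = float(np.sum((y_obs - A_obs @ beta) ** 2))
            rss = max(rss, 1e-12)  # guard against log(0) on a perfect fit
            # Gaussian AIC up to an additive constant.
            aics.append(n * np.log(rss / n) + 2 * A_obs.shape[1])
            preds.append(A_all @ beta)
    aics = np.array(aics)
    weights = np.exp(-0.5 * (aics - aics.min()))
    weights /= weights.sum()
    # Weighted average of each model's prediction, country by country.
    return weights @ np.array(preds), weights


# Synthetic example: 40 countries, 3 characteristics, 15 with an estimate.
rng = np.random.default_rng(0)
X_all = rng.normal(size=(40, 3))
y_true = 1.0 + 2.0 * X_all[:, 0] - X_all[:, 1] + rng.normal(scale=0.1, size=40)
has_estimate = np.arange(40) < 15

estimates, weights = model_average(X_all[has_estimate], y_true[has_estimate], X_all)
print(len(weights), "models; weights sum to", round(float(weights.sum()), 6))
```

With three characteristics there are seven candidate models. The number of models doubles with each added characteristic, so an exhaustive search of this kind is only practical for a modest set of predictors.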

Our Results

Using this model-based approach, FMG was able to produce annual estimates for the overwhelming majority of independent countries for the period 2000-2010. The resulting estimates by year and State Department region are presented in Table 1. FMG estimates that the global overseas civilian population increased from approximately 2.7 million in 2000 to 4.3 million in 2010, an increase of approximately 60%. In 2010, approximately half of this population was located in the Western Hemisphere and a third in Europe. However, the Middle East, South/Central Asia and Africa, while containing a relatively small fraction of overseas citizens throughout the 2000-2010 period, saw faster rates of growth than the two leading regions.

Given the small number of countries with estimates that could be used to estimate the models on which these results are based, there is still a substantial degree of uncertainty in the estimates, which should be kept in mind when considering how they can be applied to policy development. The results could have been radically different if the sample of countries with estimates were different. The Technical Report released by FVAP (available here) provides more detail on the precise implementation of the model averaging methodology, on sources of error in the estimates, and on our validation exercises.

Despite these limitations, FMG believes that the model averaging methodology has produced estimates of the size, geographic distribution, and growth of the overseas U.S. civilian population that can serve as a good starting point for further research on the composition and voting behavior of this population. It also provides a case study in extracting useful information from small amounts of data using a careful model-based approach, which has potential application beyond the study of overseas Americans.

