Abstract

This paper undertakes a new strategy to estimate emigration rates among US immigrants by inferring the probability of emigration using longitudinal administrative earnings data. Two groups of emigrants are evaluated separately: those who emigrate from the United States and those who leave both the United States and the Social Security system. About 1.0 to 1.5 percent of the foreign-born population emigrate from the USA every year, and between about 0.8 and 1.2 percent of foreign-born workers emigrate from the Social Security system. Regression analysis suggests that immigrants with lower earnings are more likely to emigrate and that the likelihood of emigrating from the United States increases with age, but is unchanged for those leaving the US Social Security system.

1. Introduction

One of the more difficult challenges in research on immigration issues in the United States is how to measure the rate at which foreign-born people leave the USA. The flow of immigrants in and out of the country can have important consequences for labor markets, tax revenues, income support programs, and a variety of other public policies. This paper uses longitudinal administrative earnings data to infer rates of emigration among US immigrants with earnings. Four separate administrative data files, provided by the US Social Security Administration (SSA), are merged to provide information on individual earnings histories, Social Security beneficiary status, year of birth and death, and place of birth. When merged, these four files—the Summary Earnings Records, the Detailed Earnings Records, the Numerical Identification System (Numident), and the Master Beneficiary Record—contain information on nearly 325,000 immigrants between 1978 and 2003. The basic strategy is to first identify immigrants by using the birthplace and foreign-born variables from the Numident data file. Emigration among these foreign-born workers is then inferred from longitudinal earnings patterns using the Summary and Detailed Earnings Records; immigrants who have a stream of positive earnings followed by at least two years of zero earnings and no further years of positive earnings are then assumed to have left the country.

In addition to these estimates of overall emigration among foreign-born workers, a second categorization calculates the fraction of immigrants who “emigrate” from the Social Security system. Workers who leave the United States before qualifying for Social Security benefits are considered to have emigrated from the Social Security system. Workers who leave the United States and qualify for Social Security benefits, but move to a country that does not have an agreement with the United States such that the worker can receive Social Security benefits, are also considered to have emigrated from the Social Security system. Thus, the group of individuals who emigrate from the Social Security system is a subset of the group who emigrate from the United States.

This paper is believed to be the first to make such an explicit calculation of foreign-born emigration rates and tie those results to the Social Security system. Such direct connections may be important to more accurately estimate Social Security’s finances and to better understand the distribution of revenues and outlays in the current system. Hence the analysis focuses on the migration flows of workers and not on the more general question of how migration can shape the demographic distribution of the entire population. The differences between the two groups of foreign-born emigrants—those who leave the United States (hereafter called “US emigrants”) and those who leave the United States and the Social Security system (“Social Security emigrants”)—are nontrivial. In general, the probability of ever emigrating during the sample period declines more steeply by age for Social Security participants than for the entire foreign-born working population.

In contrast to the methodology introduced here, the existing literature almost exclusively uses what is known as the “residual” methodology, in which the foreign-born population is projected from one year to some year in the future by accounting for incoming immigrants and deaths in the intervening years. By calculating the difference between this expected foreign-born population in some year and the actual population in that year, the residual method yields an estimate of the number of emigrants in the intervening period. The results from the existing literature provide a useful benchmark with which to compare the rates of emigration calculated from the methodology introduced here.

The term “emigration” is used somewhat loosely in this paper to include those workers who have recorded Social Security earnings at some point during the sample period but who then, at some later point, do not have recorded earnings. Because the estimation method relies on the earnings records of those workers whose earnings are tracked by SSA, emigration flows for many unauthorized immigrants are not captured. This would suggest a downward bias in the overall emigration rate, since unauthorized immigrants are believed to have a higher rate of emigration than legal immigrants (see [1, 2]). However, because individuals may leave the labor force for reasons other than emigration, whether voluntary or involuntary (e.g., to care for a child or because of disability or retirement), this method might misclassify some workers as emigrants, suggesting an offsetting upward bias in the overall estimated emigration rate. Hence, the estimates of total emigration presented here most likely lie somewhere between existing estimates of total emigration of legal foreign-born residents and total emigration of both legal and unauthorized foreign-born residents.

Overall, the estimates in this paper suggest that between 1.0 percent and 1.5 percent of foreign-born workers emigrate from the United States every year, a range consistent with previous estimates that use the residual method. These estimates suggest that the number of foreign-born workers who emigrate from the United States in a given year doubled between the late 1970s and the late 1990s, from about 200,000 to about 400,000. A smaller portion—between about 0.8 percent and 1.2 percent of the foreign-born population—is estimated to emigrate from the country and from the Social Security system annually. These estimates suggest that the number of Social Security emigrants grew from about 150,000 to about 330,000 over the same period. The differences between the two groups arise both from whether the worker emigrates to a country that has an agreement with the United States such that the worker can receive Social Security benefits and from requirements regarding quarters of covered earnings.

2. Previous Literature

Previous estimates of foreign-born emigration rates have relied almost entirely on the “residual method,” perhaps first used in the context of estimating emigration rates by Warren and Peck [3]. Estimating rates of emigration by this method involves projecting the foreign-born population by adding estimates of new immigrants and subtracting estimated deaths to construct an expected population. The difference between that projection and the actual foreign-born population observed on the future date yields an estimate of the number of emigrants during that period. Thus, for example, Warren and Peck [3] estimate net emigration from the USA between 1960 and 1970 of foreign-born women ages 30 to 34 as the number of foreign-born emigrants implied by the foreign-born population in this age cohort in 1960 and 1970, the number of foreign-born deaths, the survival probability of the foreign-born, and the number of immigrants between 1960 and 1970. This approach is used to generate net emigration numbers and rates by demographic characteristics and, in some cases, length of time in the United States.1
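The projection-and-difference logic of the residual method can be illustrated with a minimal sketch. It follows only the verbal description above (expected population equals the starting population plus new immigrants minus deaths) and is not necessarily the exact expression used by Warren and Peck [3]; the function name, argument names, and cohort counts are hypothetical.

```python
def residual_emigrants(pop_start, immigrants, deaths, pop_end):
    """Residual estimate of emigrants between two dates.

    expected population = starting population + new immigrants - deaths
    emigrants           = expected population - observed ending population
    """
    expected_pop = pop_start + immigrants - deaths
    return expected_pop - pop_end


# Illustrative (made-up) cohort counts, not figures from any study.
estimate = residual_emigrants(pop_start=100_000, immigrants=30_000,
                              deaths=5_000, pop_end=115_000)
```

Because the estimate is a difference between large stocks, small errors in any input (for instance, coverage differences between the two population counts) pass directly into the emigration figure.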

The residual method has served as the main approach for estimating emigration rates used by the US Census Bureau in estimating population stocks and flows [4, 5]. Warren and Peck’s [3] original estimates suggested that annual emigration among the foreign-born was 114,000 during the decade of the 1960s. This figure was then increased to 133,000 based on Warren and Passel’s [6] analysis of the 1965–1980 period. Ahmed and Robinson [7] used the residual method for the 1980s, which resulted in an increase in the number of foreign-born emigrants the Census Bureau assumes in its calculations to 195,000. Mulder [8] and Mulder et al. [9] indicated an emigration estimate of 225,000 for the 1990s, but it is unclear whether this number is currently used by the Census Bureau in official estimates (see the discussion in [10]). By comparison, the US Social Security Trustees currently assume separate annual emigration rates for legal permanent residents (LPRs), who are individuals authorized to live, work, and study permanently in the USA, and for “others,” a category that includes some legal immigrants, such as students and those with temporary visas, in addition to unauthorized immigrants. Over the course of their 75-year projection of Social Security revenues and outlays, the Trustees assume that 250,000 LPRs emigrate annually and that about 560,000 “other” foreign-born persons emigrate in 2010, rising to about 725,000 by 2080 [11].

Although the residual method has a variety of shortcomings, results from the existing literature have provided a solid baseline against which to compare estimates of emigration. Shortcomings of the residual method include sensitivity to differences in survey coverage between the two years of data used in the calculations [7, 12], misreporting by survey respondents [13], and mismeasurement of survival rates, especially by specific country of origin [7]. Perhaps most importantly, the residual method does not allow researchers to examine changes in individual-level characteristics of the population, including those that correlate with the probability of emigrating. Combining estimates from the residual method and from the method presented below can help present a more accurate picture of the rate at which foreign-born persons emigrate from the United States.

There also exist a number of papers that examine rates of emigration from countries around the world, though most of those papers struggle with the same data constraints found in the literature that examines emigration from the United States. Hanson [14] finds that, during the 1990s, 9 percent of native Mexicans aged 16 to 25 emigrated to the United States. For Norway, Bratsberg et al. [15] estimate an average return emigration rate of about 50 percent for those having lived in the country for at least five years. For those who entered the Netherlands in 1997, Bijwaard [16] finds a return emigration rate of about 34 percent after five years of residency. Liang and Morooka [17] find that the number of people who emigrated from China more than tripled between 1995 and 2000 to over 750,000 people. Using population register and other census data, de Beer et al. [18] estimate country-to-country emigration and immigration rates for 19 European countries between 2002 and 2007. Dumont and Spielvogel [19] provide an overview of research in this area.

The rate at which foreign-born persons emigrate, and the characteristics of those emigrants, can yield information about the labor market and earnings assimilation of immigrants who choose not to emigrate. The average characteristics of immigrants who remain in the United States for long periods of time are directly affected by the fact that immigrants who fail to assimilate often return to their native countries. For instance, some of the earliest research on immigrant labor market assimilation found that the average immigrant’s earnings surpassed those of the average native-born worker around 10 to 15 years after immigration (see, e.g., [20]). That research relied on cross-sectional data, which Borjas [21] later suggested confounded the true impact of assimilation with changes in immigrant quality over time.2 In recent research, Lubotsky [22] and Hu [23] find that using cross-sectional data fails to account for the emigration of immigrants with low earnings, which would tend to bias the cross-sectional findings upward. Lubotsky [22], who uses administrative earnings data similar to those used here, shows that the earnings gap between the foreign-born and native-born closes by 10 to 15 percent during foreign-born workers’ first 20 years in the United States, which is about half as fast as the increase found when Lubotsky uses cross-sectional census data. Hu [23], using longitudinal data from the Health and Retirement Survey, finds slower earnings growth for Hispanic immigrants than cross-sectional data suggest. Hu also finds earnings declines for non-Hispanic white immigrants, as opposed to the earnings increases found in the cross-sectional data.3

3. Identifying Immigrants and Emigrants in the Administrative Data

For the most part, the existing literature on foreign-born emigration uses publicly available repeated cross-sectional data. Aside from Jasso and Rosenzweig [24], Van Hook et al. [10], and Reagan and Olsen [25], few researchers have used panel data to identify rates of emigration from the USA. In this paper, identifying immigrants and emigrants involves merging four administrative datasets provided by SSA.4 The files represent a 1-percent random sample of issued Social Security numbers; the full data files are those used by SSA to award Social Security benefits to eligible workers and their families. It is important to note, however, that the immigrants identified in these administrative files are only those workers who have earnings reported to SSA; thus, not included are workers outside the Social Security covered sector, workers receiving “under-the-table” earnings, or workers who do not have a valid Social Security number. Perhaps most importantly, many unauthorized immigrants are presumably excluded from the sample. These are necessary shortcomings of the data, but, as will be seen, the estimated emigration rates are similar to those found elsewhere in the literature that generally uses the residual method. Despite these shortcomings, the method used here also enables an econometric approach at the microlevel not available using the more aggregate approach.

The first data file, the Detailed Earnings Record (DER), contains longitudinal earnings information from 1978 to 2003. The total earnings variable used here is derived from the worker’s W-2 tax form and includes wage and salary earnings, tips, self-employment income, and some deferred compensation, such as an employee’s 401(k) contributions. For the years 1978–1980, there appear to be some errors in the data recording (in particular, in some records the decimal point appears to be in the wrong position) and thus some sample statistics appear to be incorrect. Following Kopczuk et al. [26], total earnings are compared with other entries from the worker’s W-2 tax form, which do not appear to have the same recording errors.5 In cases where those other entries are exactly 1/100th of total earnings, total earnings are divided by 100. For the three years under consideration (1978, 1979, and 1980), this procedure affects less than 0.2 percent of all workers and reduces average earnings by less than $1,000 overall.6 In any case, the strategy for identification used below requires only the existence of positive earnings; the actual measurement of earnings is a second-order concern.

The Summary Earnings Records (SERs) data file is similar to the DER and contains earnings records from 1951 to 2003. Prior to 1978, however, earnings records are capped at the Social Security taxable maximum. SER earnings are used to identify the number of years for which workers have earnings that qualify them for Social Security benefits. From 1978 to 2003, the greater of the two records (SER or DER) is used; the analysis is not sensitive to this selection.
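The decimal-point correction and the choice between the SER and DER records described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run on the SSA files: the function and argument names are hypothetical, and amounts are assumed to be whole dollars so that the "exactly 1/100th" comparison is exact.

```python
def correct_decimal_shift(total_earnings, other_w2_entry):
    """Undo a suspected misplaced decimal point in total earnings.

    Amounts are whole dollars (ints). If the comparison W-2 entry is exactly
    1/100th of total earnings, total earnings are divided by 100.
    """
    if other_w2_entry > 0 and other_w2_entry * 100 == total_earnings:
        return total_earnings // 100
    return total_earnings


def annual_earnings(ser_earnings, der_earnings):
    """Use the greater of the SER and DER records (applies to 1978-2003)."""
    return max(ser_earnings, der_earnings)


# Illustrative values only.
fixed = correct_decimal_shift(2_500_000, 25_000)   # shifted record
kept = correct_decimal_shift(30_000, 28_000)       # ordinary record
```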

The primary difficulty in using the SER and DER earnings data is that there is no distinction between earnings recorded as zero and earnings recorded as missing; all such records appear as zeros. Thus, for example, a person who retires at age 50 and has zero earnings will look the same as a 50-year-old who emigrated.

The third data file is the Numerical Identification System (Numident), which, for purposes of this study, contains information on the worker’s date of death, place of birth, and a variable indicating whether the person was born in the United States. For the US-born, the state and city of birth are recorded; country and city of birth are recorded for the foreign born.

The fourth file is the Master Beneficiary Record (MBR), which contains a variety of information on an individual’s Social Security beneficiary status, including time of beneficiary claim, beneficiary type, and whether a claim was denied. The beneficiary variable denotes whether the person is receiving worker or auxiliary benefits and the year in which those benefits began.
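As a rough illustration of how the four files fit together, the sketch below combines four per-person lookups and then keeps the foreign born. Everything about the layout is an assumption made for illustration: the person identifier, the dictionary structure, and all field names are hypothetical and do not reflect the actual SSA file formats.

```python
def merge_records(ser, der, numident, mbr):
    """Combine four per-person lookups (dicts keyed by a person identifier).

    Assumption: only people present in the Numident lookup are kept, since
    birthplace is needed to identify the foreign born.
    """
    merged = {}
    for person_id, ident in numident.items():
        merged[person_id] = {
            "foreign_born": ident["foreign_born"],
            "birth_country": ident.get("birth_country"),
            "death_year": ident.get("death_year"),
            "ser_earnings": ser.get(person_id, {}),   # year -> earnings
            "der_earnings": der.get(person_id, {}),   # year -> earnings
            "beneficiary": mbr.get(person_id),        # None if no MBR record
        }
    return merged


def foreign_born_only(merged):
    """Keep only persons flagged as born outside the United States."""
    return {pid: rec for pid, rec in merged.items() if rec["foreign_born"]}


# Tiny made-up example with two people.
numident = {
    "A": {"foreign_born": True, "birth_country": "Mexico"},
    "B": {"foreign_born": False},
}
der = {"A": {1998: 12_000}, "B": {1998: 30_000}}
sample = foreign_born_only(merge_records({}, der, numident, {}))
```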

Using administrative data introduces both advantages and disadvantages over publicly available survey data. The major advantage of administrative data is that earnings records are presumably more accurate than those found in survey data, because they are not subject to nonresponse bias, recall error, rounding, or any other sort of respondent error (see [27]). Furthermore, the earnings data are not imputed or top-coded in any way, thereby preserving information within the upper range of the distribution. For purposes of this study, more precise earnings values are important for the regression analysis, but less important for identifying emigrants. Finally, the data include a large number of observations and longitudinal earnings histories typically unavailable in survey data.

The data also have several disadvantages, however. First, information is available on a limited number of demographic variables. For example, educational attainment, marital status, and number of children are presumably important predictors of the probability of emigrating from the United States, but are absent from the administrative records. Further, the data used here do not include a measure of legal status, which is certainly a strong predictor of the probability of emigration (see [1, 2]). Second, the administrative files capture earnings only from workers in the covered sector, that is, workers who are actively contributing to Social Security. About 4 percent of current paid civilian workers are not in the covered sector, down from about 10 percent in 1980; for the most part, these workers are employees of state or local governments [28]. Finally, the administrative data do not include earnings from cash-based employment (such as under-the-table earnings) or earnings from workers who do not have, or do not report, a valid Social Security number. Both categories may be especially important for estimating emigration rates among immigrants, who are more likely than the native born to lack a Social Security number or to have an invalid one. This is perhaps especially true for unauthorized immigrants, and thus the estimates below likely fall between existing estimates for the legal immigrant population and those for the total (legal plus unauthorized) immigrant population.

3.1. Identifying Immigrants in the USA

The merged administrative dataset contains over 2 million total observations for individuals aged 16 to 62, representing a 1-percent sample of the US population. About 7 percent of the total population aged 16 to 62 are classified as immigrants in 1978 and that fraction rises steadily over the next two decades to about 10 percent in 1990 and to over 13 percent by 2003. These shares are roughly in line with those reported elsewhere and reflect the swift increase in the share of the US foreign-born population [29].

The sample used in this analysis is restricted to those individuals who are identified in the administrative files as born outside the United States; estimates of native-born emigration rates are not considered here (see [30]). The SSA provides a codebook of over 200 country codes to identify individual places of birth.7 A number of countries are not assigned codes, but in most cases the city names permit identification of the country of birth. Of the foreign born, only about 10 percent come from countries not listed in the SSA codebook. Those countries were identified by visual inspection and added to the existing identified list.

Table 1 presents, for immigrants in the merged administrative data with positive earnings in 1998 (regardless of emigration status), the share originating from each of the 10 largest countries of origin. Also included in the lower panel of the table is the distribution of immigrants aggregated to different regions of the world. By far the largest share of immigrants in the USA originates from Mexico, with about 22 percent of total immigrants. Large percentages originate from countries in Asia (27 percent), Europe (18 percent), and the Caribbean (11 percent). For the most part, these shares mirror those found in a person-weighted tabulation of the 1999 March Current Population Survey (CPS); however, the CPS data indicate that nearly 30 percent of immigrants originate from Mexico and only 13 percent from Europe. The significantly larger share of immigrants from Mexico in the CPS data may reflect the presence of unauthorized immigrants in the survey data but not in the administrative earnings data, or mislabeling in either dataset.

3.2. A Brief Description of the US Social Security System

The US Social Security system was created in 1935 and is the US government’s largest single program. Currently, 53 million people receive Social Security benefits: about 69 percent of beneficiaries are retired workers or retired workers’ spouses or children, another 12 percent are survivors of deceased workers, and the remaining 19 percent receive benefits through the Disability Insurance portion of the program.

Workers are eligible for retirement benefits if they are at least 62 years old and have earned a minimum amount of earnings for at least 10 years. That minimum amount is referred to as a “quarter of coverage” and is $1,120 in 2010; any worker who earns more than $4,480 in 2010 will therefore be credited with four quarters of coverage and, with similar earnings levels for at least 10 years, would earn 40 quarters of coverage and thereby be fully eligible for Social Security benefits.8 Retirement benefits are reduced for workers who claim benefits before they reach the full retirement age, which is currently 66 but, under a law passed in 1983, will ultimately rise to 67 for those born after 1959. Social Security revenues are collected through a 12.4-percent payroll tax, which is split evenly between workers and their employers. The payroll tax applies only to earnings up to a maximum annual amount, which is $106,800 in 2010. Social Security benefits are calculated from average lifetime earnings through a progressive benefit formula. For a more detailed overview of the US Social Security system, the interested reader should see Congressional Budget Office [31].
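The arithmetic behind quarters of coverage and the payroll tax can be made concrete with a short sketch. It uses the 2010 figures quoted above and, as a simplifying assumption, applies the 2010 quarter amount to every year of an earnings history (in practice the amount is set separately for each year). The age-62 requirement is not modeled, and all names are hypothetical.

```python
QUARTER_AMOUNT_2010 = 1_120     # earnings needed for one quarter of coverage in 2010
MAX_QUARTERS_PER_YEAR = 4
QUARTERS_FOR_ELIGIBILITY = 40
PAYROLL_TAX_RATE = 0.124        # combined worker and employer rate
TAXABLE_MAXIMUM_2010 = 106_800


def quarters_earned(annual_earnings, quarter_amount=QUARTER_AMOUNT_2010):
    """Quarters of coverage credited for one year, capped at four."""
    return min(MAX_QUARTERS_PER_YEAR, int(annual_earnings // quarter_amount))


def has_40_quarters(earnings_history, quarter_amount=QUARTER_AMOUNT_2010):
    """Simplification: one quarter amount is applied to every year."""
    total = sum(quarters_earned(e, quarter_amount) for e in earnings_history)
    return total >= QUARTERS_FOR_ELIGIBILITY


def payroll_tax(annual_earnings):
    """Combined payroll tax on earnings up to the 2010 taxable maximum."""
    return PAYROLL_TAX_RATE * min(annual_earnings, TAXABLE_MAXIMUM_2010)
```

For example, ten years of earnings at $4,480 yield 40 quarters, whereas nine such years yield only 36 and fall short of the eligibility threshold.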

3.3. Identifying Emigrants from the USA

The process of identifying foreign-born workers who subsequently emigrate relies on following the sequence of earnings over time. Two groups are identified in the analysis that follows. The first is the standard definition of emigrants and consists of those immigrant workers who leave the United States to live abroad. The second group includes those workers who “emigrate” from the Social Security system. Workers who leave the United States and do not qualify for Social Security benefits (i.e., are not current beneficiaries or have fewer than 40 quarters of covered earnings) are considered to have emigrated from the Social Security system.9 Workers who leave the United States and qualify for Social Security benefits, but move to a country that does not have an agreement with the United States such that the worker can receive Social Security benefits, are also considered to have emigrated from the Social Security system.10 Thus, the group of individuals who emigrate from the Social Security system (“Social Security emigrants”) is a subset of the group that emigrates from the United States (“US emigrants”).
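The relationship between the two emigrant groups can be written as a small decision rule. This is a sketch under stated assumptions: a worker is treated as qualifying for benefits if he or she is a current beneficiary or has at least 40 covered quarters, and the destination country's agreement status is passed in as a flag. The function and argument names are hypothetical.

```python
def qualifies_for_benefits(is_current_beneficiary, covered_quarters):
    # Assumption: qualifying means being a current beneficiary or having 40+ quarters.
    return is_current_beneficiary or covered_quarters >= 40


def is_social_security_emigrant(is_us_emigrant, is_current_beneficiary,
                                covered_quarters, destination_has_agreement):
    """Classify a worker as a Social Security emigrant.

    Social Security emigrants are a subset of US emigrants: those who left
    without qualifying for benefits, or who qualify but moved to a country
    without an agreement allowing benefits to be received there.
    """
    if not is_us_emigrant:
        return False
    if not qualifies_for_benefits(is_current_beneficiary, covered_quarters):
        return True
    return not destination_has_agreement
```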

Both sets of emigrants are identified as those who have at least one year of positive earnings followed by at least two years of zero earnings and no further years of positive earnings. Thus, each person can emigrate only once; individuals who move back and forth between the United States and another country are not necessarily counted as emigrants. For example, if an individual works in the USA in year one, moves back to her home country for years two and three, and then returns to the USA and appears in the data with positive earnings in year four, she would not be counted as an emigrant. Alternatively, if that same person works in the USA in year one, moves to her home country for years two and three, and then returns to the USA but does not work (or works in the uncovered sector) thereafter, she would be counted as an emigrant.
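The identification rule can be sketched in a few lines. The sketch assumes that each worker's record is a mapping from consecutive calendar years to earnings, with zeros for years without recorded earnings, and it dates emigration to the first zero-earnings year after the last positive year; that dating convention, like the function name, is an assumption made for illustration.

```python
def emigration_year(earnings_by_year, min_zero_years=2):
    """Return the inferred emigration year, or None if not an emigrant.

    A worker is flagged when the last year of positive earnings is followed
    by at least `min_zero_years` years of zero earnings through the end of
    the record (so no positive earnings ever reappear).
    """
    years = sorted(earnings_by_year)
    positive_years = [y for y in years if earnings_by_year[y] > 0]
    if not positive_years:
        return None
    last_positive = positive_years[-1]
    trailing_zero_years = [y for y in years if y > last_positive]
    if len(trailing_zero_years) >= min_zero_years:
        return last_positive + 1   # assumed dating convention
    return None


# The two cases discussed in the text (years are illustrative).
returns_to_work = {1990: 15_000, 1991: 0, 1992: 0, 1993: 9_000}
never_returns = {1990: 15_000, 1991: 0, 1992: 0, 1993: 0}
```

Here `emigration_year(returns_to_work)` is `None`, while `emigration_year(never_returns)` is 1991. A record that ends with a single zero-earnings year is left unclassified, which is the censoring issue that arises near the end of the sample period.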

Certain other groups of workers may also be misclassified as emigrants. Individuals who leave the workforce for more than two years for voluntary or involuntary reasons (e.g., because they care for a child, become disabled, retire, are unemployed for longer than two years, or move from covered to uncovered work) and do not return to paid (covered) work would be classified as emigrants.11 Alternatively, a worker’s Social Security number may become lost, which would generate erroneous occurrences of years with zero earnings, or the worker may not report his or her earnings to the SSA for reasons relating to work or immigration status.12 In sum, because the administrative earnings records do not allow the researcher to distinguish between zero earnings (out of the workforce) and missing years of earnings (out of the country), some people who simply left the workforce may instead be classified as emigrants; hence, the estimates of emigration rates may be slightly overstated. However, this upward bias may be offset because the administrative earnings records most likely underrepresent the numbers of unauthorized immigrants and immigrants who never had covered work.13

The decision to condition on at least two years of zero earnings (as opposed to one) following at least one year of positive earnings is intended to avoid one-year transitory earnings variability or unemployment.14 Since the administrative earnings records are based on the calendar year, having two years of zero earnings requires that the person be out of the labor force for at least 24 months and thus potentially attenuates the problem of year-to-year changes in earnings and employment (see, e.g., [32]).

To avoid some of the problems associated with retirement, the sample is restricted to persons aged 16 to 62. Thus, workers who retire at age 62 and have years of zero earnings thereafter are not considered to have emigrated under this definition. Workers who retire before age 62, however, may appear to have a sequence of zero-earnings years and would be considered to have emigrated.

Conditioning on at least two consecutive zero-earnings years introduces a censoring problem for workers who have positive earnings starting in 2001. For these workers there are not enough observations after 2001 to distinguish their emigration status. Further, for workers who have positive earnings in the years leading up to 2001, emigration rates may be overstated. This may occur because there are not enough years following a sequence of “positive-zero-zero” earnings years to observe additional years of positive earnings. Thus, many of the statistics presented below are truncated at 1998, but all workers are included in the regressions.

By identifying the first year of positive earnings, the amount of time spent working in the United States before emigration can be used as an approximation for the length of time spent in the United States, a common metric found in the emigration literature. This approximation will be biased downwards relative to the true length of time spent in the USA, especially for those persons who immigrate to the United States as children and do not initially have covered earnings. For example, suppose a person immigrates to the United States at age 5, attends school through college, begins work at age 22, and then emigrates to her country of birth at age 30. Although she has lived in the United States for 25 years, the procedure used here captures 8 years of positive earnings. The approximation is also biased for individuals who have multiple migration events.15
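The downward bias in this approximation can be shown with the example from the text. The sketch below uses hypothetical calendar years chosen to match the ages in that example (arrival at age 5, first covered earnings at age 22, emigration at age 30); the function name is also hypothetical.

```python
def years_between(start_year, end_year):
    """Elapsed years between two calendar years."""
    return end_year - start_year


# Hypothetical person born in 1970.
arrival_year = 1975          # immigrates at age 5
first_earnings_year = 1992   # begins covered work at age 22
emigration_year = 2000       # emigrates at age 30

years_worked = years_between(first_earnings_year, emigration_year)
years_resident = years_between(arrival_year, emigration_year)
understatement = years_resident - years_worked
```

In this case the earnings-based measure captures 8 of the 25 years of residence, understating time in the United States by 17 years.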

4. Identifying Rates of Emigration

The final dataset contains nearly 325,000 immigrants whose earnings were tracked by the Social Security Administration for at least one year during the sample period. Table 2 lists summary statistics for the main variables in 1998 for both definitions of emigration (US emigration and Social Security emigration). Immigrants who later left the country at some point earned, on average, about $11,600 in 1998; those who stayed earned about $27,700 on average.16 Average earnings of Social Security emigrants are about the same as those of US emigrants, at about $11,300. Social Security emigrants are, on average, the youngest of the four groups, with an average age of 36 years. Both emigrant groups include a smaller proportion of men than do the nonemigrant groups. Because the administrative records do not contain family identifiers, however, there may be interactions between men and women that are not captured in these estimates.

The annual emigration rate is calculated as the number of workers identified as emigrants in that year divided by the total number of persons aged 16 to 62 with nonmissing earnings in that year. For most years in the sample period, between 1.0 percent and 1.5 percent of immigrants with earnings tracked by the SSA emigrate from the United States each year (Figure 1).17 A slightly smaller fraction of immigrants leave the Social Security system each year, between 0.7 percent and 1.2 percent. For comparison, the figure also plots the share of immigrants with two years of consecutive zero earnings. That line is about 1.5 percentage points higher than the US emigration rate line and follows a slightly different pattern. Thus, the emigration methodology does not appear to be simply picking up earnings variability. Further, the trends appear to have a modest link to the business cycle: all three metrics decline during the economic slowdowns of 1981–1982 and 1990–1991, but rise during the shorter slowdown of the first half of 1980. The link between the business cycle and the rate of emigration might be made stronger if the administrative earnings sample used here contained a larger portion of unauthorized immigrants. Papademetriou and Terrazas [1] argue that excess demand for visas forces many migrants into illegal channels; it should then follow, they argue, that unauthorized immigrants are more responsive to changes in the economy than legal immigrants (see also [2, 14]).
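The rate calculation described at the start of this section amounts to a simple ratio, sketched below together with the age restriction. The function names are hypothetical and the counts in the example are made up for illustration; they are not taken from the administrative data.

```python
def in_age_range(birth_year, year, min_age=16, max_age=62):
    """True if the person is aged 16 to 62 in the given year."""
    return min_age <= year - birth_year <= max_age


def emigration_rate_percent(n_emigrants, n_persons_with_earnings_records):
    """Emigrants identified in a year as a percent of in-scope persons."""
    return 100.0 * n_emigrants / n_persons_with_earnings_records


# Made-up counts: 14 emigrants among 1,000 in-scope immigrants.
example_rate = emigration_rate_percent(14, 1_000)
```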

4.1. Comparisons with the Existing Literature

Figures 2 through 4 present estimated emigration rates from a number of previous studies alongside those estimated here.18 In Figure 2, emigration rates for all foreign-born persons estimated in the earlier literature are compared with emigration rates found in this study between 1978 and 1998 (repeated from Figure 1). In general, these emigration rates lie between about 0.9 percent and 1.5 percent, with a higher estimate from Van Hook et al. [10] at 2.9 percent.19 In their “middle” migration series, the Census Bureau uses an emigration rate of 1.2 percent ([4], following [7]).20

Even slight differences in emigration rates such as those shown in Figure 2 can lead to large differences in the number of estimated emigrants. Warren and Peck [3] estimated a total of 114,000 emigrants per year during the 1960s, which translates to a 1.18 percent average annual emigration rate. Over the course of the 1980s, Ahmed and Robinson [7] estimated that 195,000 people emigrated per year, for an emigration rate of 1.15 percent. The emigration assumptions used by the Census Bureau [4] yield estimates of legal permanent resident (LPR) emigration (using their middle series) of about 311,000 in 2003, up from about 250,000 in 1991.21 The Social Security Trustees estimate that 250,000 LPRs emigrate per year, while, on average, about 680,000 “other” immigrants emigrate per year [11].22 The 1 : 100 sampling framework used in this study yields emigration estimates of approximately 200,000 in 1978, rising to about 400,000 by 1998. Averages suggest that about 275,000 workers emigrated each year during the 1980s, an estimate that rises to about 380,000 during the 1990s. When even a slightly lower rate is applied to the sample (say, 1.2 percent instead of 1.4 percent in 1998), the number of emigrants falls from about 400,000 to about 336,000.
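The arithmetic behind these counts can be made explicit with a short sketch. The helper names and the sample count of 4,000 are hypothetical, and the at-risk population of 28 million is only what the rounded figures in the text imply (336,000 divided by 1.2 percent); it is not a number reported in the paper.

```python
SAMPLE_WEIGHT = 100  # each sampled worker represents 100 workers (1:100 sample)


def scale_to_population(sample_count):
    """Scale a count from the 1:100 sample up to the population."""
    return sample_count * SAMPLE_WEIGHT


def emigrants_at_rate(population, rate_percent):
    """Number of emigrants implied by an annual emigration rate in percent."""
    return round(population * rate_percent / 100)


# Hypothetical sample count chosen to match "about 400,000" in 1998.
emigrants_1998 = scale_to_population(4_000)

# Population implied by the text's rounded figures (an inference made here).
implied_population = round(336_000 / (1.2 / 100))

at_14_percent = emigrants_at_rate(implied_population, 1.4)
at_12_percent = emigrants_at_rate(implied_population, 1.2)
difference = at_14_percent - at_12_percent
```

Under these assumptions a change of 0.2 percentage points in the rate moves the estimate by 56,000 emigrants per year, which is the sense in which small differences in rates produce large differences in counts.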

Of course, since information on immigration status (LPR, “other”) is not available in the administrative data and it is unclear how many “other” immigrants are included in the earnings data, the emigration rates estimated here are some combination of the two. Further, because unauthorized immigrants (included in the “other” category) are presumably underrepresented in the administrative earnings data, it is perhaps not surprising that the total estimated number of emigrants lies between the number of LPRs and that of total emigrants estimated by the Social Security Trustees.

In Figure 3, emigration rates for immigrants who have worked in the United States for less than 10 years are compared with rates reported in the literature.23 Nearly all of these studies find the annual emigration rate for this group to be higher than the overall rate. The sole exception is Mulder [8], who finds an annual emigration rate in the first 10 years of 0.3 percent, about a third of the 0.9 percent that she found for all foreign born. The emigration rates in Figure 3 are a bit more variable than those in Figure 2, ranging from 0.3 percent to 4.4 percent. The annual emigration rate estimated from the longitudinal earnings data in this study declines from about 3 percent in 1980 to about 1.9 percent in 1998. The average emigration rate of that trend is 2.3 percent and is at about the midpoint of the estimates from the literature.

Finally, Figure 4 presents annual emigration rates by country or region from this and four earlier studies. Except for Europe and Canada, Van Hook et al. [10] report the highest annual emigration rates, with rates for Mexico (4.3 percent) more than double that of the next highest study ([33], at 1.8 percent). In general, the emigration rates estimated from the longitudinal data are in line with most of these studies, but the differences across the eight regions are smaller than those in most of the other studies. For example, estimates from the administrative files range from 1.0 percent (Central America and Africa) to 1.8 percent (Canada), whereas the estimates from Van Hook et al. [10] range from 1.8 percent (Europe and the Caribbean) to 4.3 percent (Mexico and Africa).

The fact that the annual emigration rate for Mexicans is not the highest of the eight countries and regions is perhaps surprising. The large number of Mexican immigrants currently living in the United States, coupled with Mexico’s close proximity, might suggest a higher emigration rate. Mexico’s close proximity might also suggest lower migration costs than those faced by immigrants from more distant countries (see [34]). Yet, as noted previously, the administrative earnings data used here presumably fail to capture many unauthorized immigrants who do not have a valid Social Security number or do not report their earnings to the SSA. Duleep and Dowhan [35] show that more than half of all unauthorized immigrants are from Mexico, another 10 percent come from the Caribbean, and roughly 16 percent immigrate from Central and South America (see also [36, 37]). The share of the unauthorized immigrant population from the Caribbean and Central and South America is about equal to the share of legal immigrants from those countries, suggesting that the estimated emigration rates for these countries especially are most likely smaller than the actual emigration rate.

4.2. Comparing US Emigrants with Social Security Emigrants

The existing literature on emigration from the United States is concerned with emigration from the country’s borders, but a separate policy question is as follows: at what rate do immigrants leave the Social Security system? Revenues from Social Security taxes paid by workers who then leave the Social Security system before claiming benefits are, from the perspective of the system’s finances, a pure gain. On the other hand, immigrants who enter the US workforce and qualify for benefits may, depending on their earnings, family composition, and mortality risk, impose a net cost on the system. This section compares rates of emigration from the United States with emigration rates from the Social Security system.24

Average annual emigration rates for both concepts by age, sex, time spent working, and country of origin are shown in Table 3 and Figure 5. US emigration rates rise relatively linearly across the four age groups, from 0.69 percent for the youngest workers (16 to 24 years old) to 2.28 percent for those aged 45 to 62. Social Security emigration rates also rise by age, but at a slower rate, from 0.65 percent for the youngest workers to 1.10 percent for the oldest. The large difference in rates for the two concepts among the oldest workers may be a signal that as workers near retirement, and hence eligibility to claim Social Security benefits, they are less likely to leave the Social Security system than to emigrate from the United States. The increase in emigration rates across age groups differs from much of the previous literature, which has shown flat or somewhat declining rates of emigration by age (e.g., [3, 4, 10]). The increase in emigration rates may reflect the lack of unauthorized immigrants in the administrative files. As Duleep and Dowhan [35] show, the age distribution of unauthorized immigrants is more heavily skewed toward younger ages than that of legal immigrants; the emigration rate among the youngest workers shown in Table 3 may therefore be underestimated.25 In addition, the increase in emigration rates by age may be due to higher rates of labor force exit among older workers and not necessarily to the rates at which older workers leave the country or the Social Security system.

Women are slightly more likely than men to emigrate from the United States in any year (1.37 percent versus 1.23 percent), but the two groups emigrate from the Social Security system at essentially the same rate (0.96 percent for women and 0.92 percent for men). This result differs from those of both Van Hook et al. [10] and Passel [33], who find that emigration rates for men are more than twice those for women. The method used here may bias estimates of women’s emigration rates upward, because women are more likely than men to exit the labor force for reasons related to childbirth and thus may appear to have emigrated rather than left the labor force.

Much as the rest of the literature has found, annual emigration rates for both concepts decline as individuals work in the United States longer. In any particular year, about 2.7 percent of immigrants who have worked in the United States less than 5 years emigrate from the country. This annual rate declines to 0.21 percent for those who have worked in the United States for more than 21 years. A similar pattern can be found in Van Hook et al. [10], but again, emigration rates in that study are higher than those found here.26

Determining emigration rates by country of origin may also be important for social policy, because immigrants and emigrants from different countries have different labor market and social behaviors. Table 3 also shows emigration rates from the 10 countries with the largest share of immigrants in the USA (according to the administrative data), as well as the emigration rate from the USA to Mexico, Canada, and eight different regions (see the map in Figure 5). Emigration rates differ between the two emigration concepts considered here with no apparent pattern across countries and regions. Immigrants from the Australia region have the highest US emigration rates (2.3 percent), with immigrants from England and Canada emigrating at a slightly lower rate of about 1.7 percent per year. Immigrants from Mexico emigrate from the USA at lower rates (1.3 percent for US emigrants, 0.8 percent for Social Security emigrants), as do those from Central America (0.9 percent) and South America (1.2 percent). The presumed lack of unauthorized immigrants in the administrative files may generate lower emigration rates for those countries or regions that have large concentrations of unauthorized immigrants in the USA.

5. Estimating Correlations with Emigration

Using the population of emigrants defined above, the analysis in this section identifies some of the basic correlations between emigration and individual characteristics. As noted, because the administrative data contain few additional variables, the regressions include a small set of regressors: age, age squared, sex, average earnings, and country of origin. Additional covariates, such as the difference in growth rates of GDP per capita between the United States and the emigrant’s country of birth or dummy variables for each age group or year, appear to have little impact on the main results.

Because the time series is censored on the right (later) end and earnings are observed annually, the probability of emigrating in any year is estimated as a discrete hazard model. Following Jenkins [38] and Allison [39], the discrete hazard model with a binary dependent variable reduces to a logit model. Thus, data are organized in a person-year format where, for each person, there are as many observations as years in the sample up to the year in which the person emigrates or his or her earnings stream ends (i.e., at age 62 or in 2003); these regressions include up to nearly 3.8 million person-year observations.
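The person-year layout described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: the function name, the field names, and the example years are hypothetical, and covariates such as age, sex, and average earnings are left out.

```python
def person_years(person_id, first_year, last_year, emigration_year=None):
    """Expand one worker into one row per year at risk.

    first_year: first year the worker is observed.
    last_year: censoring year (for example, the year the worker turns 62,
        or 2003) if no emigration is observed.
    emigration_year: inferred emigration year, or None if censored.
    The event flag is 1 only in the emigration year; censored workers
    contribute rows with the flag equal to 0 throughout.
    """
    end = emigration_year if emigration_year is not None else last_year
    rows = []
    for year in range(first_year, end + 1):
        rows.append({
            "id": person_id,
            "year": year,
            "duration": year - first_year + 1,  # years since first observed
            "emigrated": int(year == emigration_year),
        })
    return rows


# Hypothetical examples: one emigrant and one right-censored worker.
emigrant_rows = person_years("A", 1990, 2003, emigration_year=1992)
censored_rows = person_years("B", 2000, 2003)
```

A logit regression of the `emigrated` flag on the covariates, fitted over all such rows, is then the discrete hazard model the text refers to. The `duration` column is what allows duration dummy variables to be added.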

All but six of the coefficients in the 16 regressions reported in Table 4 are statistically significant at the 1-percent level; the other six coefficients are statistically significant at the 10-percent level. The odds ratio on sex suggests that men are more likely to emigrate than women (controlling for age and earnings). In the US emigration regressions, the estimates provide somewhat mixed evidence about the correlation between age and emigration, but seem to suggest that the probability of leaving the US accelerates with age, albeit only barely. In the Social Security emigration regressions, that relationship differs, with the probability of emigrating rising with age, but at a decreasing rate. Average earnings are negatively correlated with emigration, so that immigrants with lower earnings are more likely to emigrate; the point estimates are similar between the two sets of regressions.27 The odds ratio on the GDP per capita difference variable is greater than one in each regression, suggesting that for an increase in the difference in rates of 1 percentage point in favor of the United States, the odds of emigrating increase by around half a percentage point. Though this finding is perhaps counterintuitive, it may be partially explained by the fact that other support for retirees in the home country (for example, health care or pension benefits) is not included in the regressions.28 Thus, the difference in growth rates of GDP per capita does not capture the entire difference between the USA and the worker’s home country.

The discrete hazard model has the advantage of allowing for complete flexibility in both duration and age by including dummy variables for both. When these controls are included (see columns 6, 7, and 8 of Table 4), the point estimates are little changed.29 Odds ratios on separate age dummy variables (not reported) also suggest a linear rise in the probability of emigrating (columns 7 and 8). Dummy variables for each country in the sample—172 in all—are included in columns 4 through 8.

These correlations are summarized graphically in Figure 6, which, using the regression results in column 3 of Table 4, plots by age the probability of emigrating for men and women with earnings of $10,000 and $30,000. The trends in the top panel illustrate how the combination of the negative age coefficient and the positive age-squared coefficient results in a curvilinear upward trend in emigration rates. For those emigrating from the Social Security system at some point during the period, the computed probabilities generate a flatter age-emigration profile, with differences between men and women that are slightly larger than those in the left panel. Note also that the US emigration probabilities grow from about one-fourth as large as the Social Security emigration probabilities at age 40 to about two-and-a-half times as large by age 62.
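How a negative age coefficient and a positive age-squared coefficient combine into an upward-curving profile can be seen in a small sketch. The coefficients below are hypothetical values chosen only to show the shape; they are not the estimates in Table 4, and the sex and earnings terms are omitted.

```python
import math


def emigration_probability(age, b0, b_age, b_age2):
    """Logit probability with a quadratic in age."""
    z = b0 + b_age * age + b_age2 * age ** 2
    return 1.0 / (1.0 + math.exp(-z))


def turning_point(b_age, b_age2):
    """Age at which the quadratic index stops falling and starts rising."""
    return -b_age / (2.0 * b_age2)


# Hypothetical coefficients (NOT the paper's estimates).
B0, B_AGE, B_AGE2 = -4.5, -0.04, 0.001

profile = {age: emigration_probability(age, B0, B_AGE, B_AGE2)
           for age in (25, 40, 62)}
minimum_age = turning_point(B_AGE, B_AGE2)
```

With a negative linear term and a positive quadratic term, the index is lowest at `-b_age / (2 * b_age2)` and rises at an increasing rate beyond that age, which produces the curvilinear upward trend over the later working years.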

6. Conclusion

Estimating rates of emigration of the foreign born, and thus accurately tracking migrant flows, is hampered by the lack of data that follow people from one country to another. Researchers have therefore typically made use of the residual method, which measures the difference between the foreign-born population in one year and that in another year. Although this methodology is widely used, recent research (e.g., see Van Hook et al. [10]) has begun to explore the advantages of longitudinal data. The methodology presented in this paper uses longitudinal administrative earnings data to estimate rates of emigration among immigrants with earnings that are tracked by the Social Security Administration by observing the earnings patterns of US immigrants between 1978 and 2003.

The results are close to those found elsewhere in the literature and suggest an annual emigration rate of workers of about 1.3 percent, or about 315,000 people, over the 1978–1998 period. Averages suggest that about 280,000 workers emigrated each year during the 1980s, a figure that rises to about 380,000 during the 1990s. Analysis of the group of workers who emigrate from the US Social Security system generates a smaller annual emigration rate of about 1 percent. Logit regression models further support the differences between these groups; the probability of emigrating rises over the first part of an individual’s lifetime before declining after about age 30. However, because the data do not distinguish between zero and missing earnings, the methodology may slightly overstate the number of emigrants. On the other hand, because the administrative files do not capture those workers whose earnings are not tracked by the Social Security Administration, particularly unauthorized immigrants, these estimates are most likely smaller than actual total emigration.

Appendices

A. Exception Countries

About 80 countries have agreements with the United States whereby qualified US workers may continue to receive Social Security benefits. Such countries are known as “exception countries” and fall into three separate categories. This appendix lists the exceptions under which US workers living abroad can still receive Social Security benefits and the set of countries that have existing agreements with the United States. The information below is quoted directly from Table 5 and Appendix A of Congressional Research Service, “Social Security Benefits for Noncitizens: Current Policy and Legislation,” February 1, 2008.

Because the administrative data used in this analysis do not identify spouses or railroad workers or military service, only the first, sixth, and seventh exceptions listed above are relevant. This requires using the administrative earnings data to identify the list of “exception countries” where eligible individuals (current recipients of Social Security worker or auxiliary benefits and current workers with 40 or more quarters of coverage) can move and still receive Social Security benefits.

A.1. Exception Countries

The following country lists (see Table 6), which are subject to change periodically, are taken from the Code of Federal Regulations (C.F.R., revised through April 1, 2002) and the Social Security Administration’s International Program web page.

Social Insurance or Pension System Countries
The following countries meet the “social insurance or pension system” exception in Section 202(t)(2) of the Social Security Act: Antigua and Barbuda, Argentina, Austria, Bahamas, Barbados, Belgium, Belize, Bolivia, Brazil, Burkina Faso (formerly Upper Volta), Canada, Chile, Colombia, Costa Rica, Cyprus, Czechoslovakia, Denmark, Dominica, Dominican Republic, Ecuador, El Salvador, Finland, France, Gabon, Grenada, Guatemala, Guyana, Iceland, Ivory Coast, Jamaica, Liechtenstein, Luxembourg, Malta, Mexico, Monaco, Netherlands, Nicaragua, Norway, Panama, Peru, Philippines, Poland, Portugal, San Marino, Spain, St. Christopher and Nevis, St. Lucia, Sweden, Switzerland, Trinidad and Tobago, Trust Territory of the Pacific Islands (Micronesia), Turkey, United Kingdom, Western Samoa, Yugoslavia, Zaire (20 C.F.R. § 404.463).

Treaty Obligation Countries
The following countries meet the “treaty obligation” exception in Section 202(t)(3) of the Social Security Act: Germany, Greece, Ireland, Israel, Italy, Japan, Netherlands*.
*Treaties between the United States and the Netherlands preclude the application of residency requirements for noncitizens with respect to monthly survivor benefits only.

Totalization Agreement Countries
The following countries meet the “totalization agreement” exception in Section 202(t)(11)(E) of the Social Security Act. The effective date is shown for each agreement.

Because in this paper the individual’s year of emigration is calculated from the administrative earnings data, only those workers who emigrate after the year the totalization agreement was approved are considered as emigrants from the Social Security system. No restriction on the year of emigration is applied to the first two groups of countries (social insurance or treaty obligation countries).

Acknowledgments

The analysis and conclusions expressed in this paper are those of the author and should not be interpreted as those of the Congressional Budget Office. The author wishes to thank Maureen Costantino, Paul Cullinan, Thomas DeLeire, Robert Dennis, Harriet Orcutt Duleep, Joyce Manchester, Marie Mora, Jeffrey Passel, Pia Orrenius, and researchers at the Pew Research Center for comments and suggestions.

Endnotes

  1. See also Ahmed and Robinson [7], Borjas and Bratsberg [40], Mulder [8], Mulder et al. [9], Mulder et al. [5], Van Hook et al. [10], and Warren and Passel [6]. Reagan and Olsen [25] use longitudinal data from the 1979 youth cohort of the National Longitudinal Survey (NLSY79) in an attempt to estimate emigration rates directly; their sample, however, is limited to 571 observations, nearly half of whom are of Mexican origin. Jasso and Rosenzweig [24] find slightly higher emigration rates than most of the literature using longitudinal Immigration and Naturalization Service data for a single cohort of legal immigrants.
  2. Also see the summary of these issues in Borjas [41] and Duleep and Dowhan [42]. For additional examples of slower rates of assimilation, see LaLonde and Topel [43] and the discussion in Lubotsky [22].
  3. See also Baker and Benjamin [44], who find that the labor force participation of married immigrant women in Canada falls below that of their native-born counterparts after an initial period of work dedicated to financing their husbands’ human capital investment. Blau et al. [45] find rapid rates of assimilation in annual hours worked among married immigrants.
  4. Kraly [46] suggested using administrative data to estimate rates of emigration, but to my knowledge, this paper is the first to do so. For other research that uses administrative earnings data to examine immigrant earnings growth, see Lubotsky [22] and Duleep and Dowhan [47].
  5. Kopczuk et al. [26] had access to other administrative data, which allowed them to use several additional procedures to adjust the earnings for people whose records were deemed inaccurate.
  6. Average earnings are reduced by 8 percent in 1978, 6 percent in 1979, and 1 percent in 1980.
  7. See Social Security Administration Publication No. 42-007, MMREF-1 Tax Year 2005 (V.2) Appendix G-Country Codes.
  8. Prior to 1978, earnings were reported to SSA on a quarterly basis; beginning in 1978, the reporting period was modified to an annual basis, and thus the qualifying amount, though still referred to as a quarter of coverage, is actually an annual amount. It should also be noted that the eligibility and benefit rules are different for those workers who qualify for Disability Insurance benefits.
  9. Because the administrative earnings data are recorded on an annual basis, the quarter of coverage dollar amounts are converted to annual amounts.
  10. Countries where US workers can receive their Social Security benefits are known as “exception countries.” A full description of these countries can be found in the appendix.
  11. Workers who become disabled toward the end of the sample period, but do not start receiving Social Security disability benefits until after the sample period ends, might also appear to have zero-earnings years, and hence be considered emigrants, when in fact these workers are out of the labor force because of the disability.
  12. The worker’s Social Security number could also be stolen by another worker, which could introduce error in the earnings variable.
  13. The administrative data also do not allow the researcher to identify foreign persons living in the United States under temporary visa status. Emigration rates for these individuals are most likely higher than those for other immigrants, since these temporary visas have specific expiration dates. Temporary workers who overstay their visa—and are therefore unauthorized workers—may continue to have recorded earnings in the administrative data if they continue to use their Social Security number.
  14. Sensitivity analysis (not reported) that conditioned on three, four, or five years of zero earnings results in a slightly lower emigration rate, but the annual emigration patterns were not sensitive to this change.
  15. Lubotsky [22] uses survey data matched to longitudinal administrative earnings data to measure immigrant earnings growth. He uses three definitions of arrival in the United States: the reported year of arrival in the survey, the first year of covered earnings, and the earlier of the two. He notes that the second method “provides perhaps the most easily interpretable picture of immigrant earnings growth because it measures wage growth from the year of entry into the formal, or covered, US labor market” (page 842).
  16. These means are conditional on having positive earnings. Note that, in these calculations, workers emigrate in 1998 or later. Earnings are measured in 2007 CPI-U-RS dollars.
  17. One policy change that could have shifted the pattern of emigration rates is the Immigration Reform and Control Act of 1986 (IRCA), which extended legal status to immigrants who had been unlawfully living in the country continuously since before January 1, 1982, and to illegal immigrants employed in agriculture for a minimum of 90 days in the year preceding May 1986. IRCA may have led to a greater number of immigrants with earnings in the administrative data, but it is not possible to separate new immigrants from those who converted from illegal to legal status under IRCA. The introduction of large numbers of legal immigrants could bias estimates of emigration rates downward. For a description of IRCA, see Fix and Passel [48].
  18. Kraly [46] and Mulder [8] review the literature in detail.
  19. The estimates in Figures 2 through 4 from Van Hook et al., Mulder, Warren and Peck, and Ahmed and Robinson are reported in Van Hook et al. [10], Tables 2 and 3. The estimates for Passel [33] and Hollmann et al. [4] are taken directly from those studies.
  20. The average of the “lowest” emigration rates estimated by Hollmann et al. is 0.7 percent, and the average of the “highest” emigration rates is 1.4 percent.
  21. In 2005, the Census Bureau’s estimates of emigration for LPRs ranged from 229,000 to 430,000.
  22. This figure is the average between 2010 and 2080. The Trustees’ estimates rise from about 560,000 in 2010 to over 725,000 in 2080 [11].
  23. Recall that length of time spent working in the United States is measured as the number of years between the last year of positive earnings and the first year of positive earnings, similar to the third method used by Lubotsky [22]. Emigration rates reported in the literature are measured as time spent living in the United States, not necessarily working.
  24. Duleep [49] is one of the few papers to explore the relationship between the level and timing of immigrant emigration and projections of the financial status of the Social Security system.
  25. Duleep and Dowhan [35] compare stocks of legal and illegal immigrants using data from the Immigration and Naturalization Service and from Passel [36]. Nearly all of the illegal immigrant population is estimated to be under age 35, whereas about 65 percent of legal immigrants are estimated to be in the same age range.
  26. Emigration rates in Van Hook et al. [10] by length of time in the United States are 5.0 percent (0 to 4 years), 3.6 percent (5 to 9 years), and 2.0 percent (10 or more years).
  27. In the first year of each person’s earnings stream, average earnings in these regressions are set equal to the first year’s earnings. In the second and third years, average earnings are calculated as the average of earnings in years 1 and 2. Starting in year 4, average earnings are calculated as the average over the previous three years.
  28. Reagan and Olsen [25], for example, find that welfare generosity in the state of residence does not serve to deter migration to the immigrant’s home country. Dumont and Spielvogel [19] summarize the empirical literature that finds support for the idea that return migration is more highly correlated with the economic and political situation of the country of origin than with that of the country of destination (see also [1]).
  29. Note that GDP data do not exist for all of the countries in the sample; thus, about 200,000 person-year observations are dropped when the GDP variable is introduced in column 5. Further, because there are no emigrants in 2001, 2002, or 2003, these year dummy variables are perfectly collinear and drop out of the regression, which reduces the number of observations by about an additional 100,000 person-years in column 6. Finally, because of the two consecutive zero-earnings years restriction, the dummy variables for 61- and 62-year-olds are also perfectly collinear (all zero) and thus drop out from the regression in column 7, reducing the number of observations by almost another 50,000.
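The construction of the average earnings regressor described in endnote 27 can be sketched as follows. This is one literal reading of that endnote, not the author's code; the function name and the example earnings are hypothetical.

```python
def average_earnings(earnings, t):
    """Average earnings regressor for year t of a worker's earnings stream.

    earnings: list of annual earnings, where earnings[0] is year 1.
    t: 1-based index of the year within the earnings stream.
    Literal reading of endnote 27 (an interpretation made here):
      year 1        -> earnings in year 1
      years 2 and 3 -> average of earnings in years 1 and 2
      year 4 onward -> average over the previous three years
    """
    if t < 1 or t > len(earnings):
        raise ValueError("t must index a year within the earnings stream")
    if t == 1:
        return earnings[0]
    if t in (2, 3):
        return (earnings[0] + earnings[1]) / 2
    window = earnings[t - 4:t - 1]  # years t-3, t-2, and t-1
    return sum(window) / 3


# Hypothetical earnings stream, in thousands of dollars.
stream = [10, 20, 30, 40, 50]
averages = [average_earnings(stream, t) for t in range(1, 6)]
```

From year 4 onward the regressor depends only on earlier years, so a zero-earnings year that signals emigration does not enter the average for that same year.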