In today's data-driven policy landscape, accurate, high-quality population data are crucial, especially for addressing the ongoing challenges and long-term consequences of the COVID-19 pandemic. Sweden, renowned for its detailed population registers, has been actively engaged in discussions on over-coverage in its data systems.
Over-coverage occurs when individuals who have left their place of residence or who have died are not removed from the records, leading to population overestimates. For several reasons, such errors are disproportionately likely among international migrants, who now constitute one-fifth of Sweden's residents.
These inaccuracies can distort our understanding in three key areas: the size of migrant populations, their demographic composition, and various socio-demographic outcomes like mortality and fertility rates. Such misrepresentations can lead to misguided policies, skewed research results, and even fuel harmful public narratives about migrant communities. International research communities and policymakers must address over-coverage to ensure that accurate data informs their decisions, leading to evidence-driven policies that truly reflect diverse population needs.
Our study introduces a novel approach to this issue, using generalized linear models within the framework of multiple systems estimation (MSE). The method identifies variation in over-coverage across sub-populations, offering a more accurate picture. Specifically, the study examines the over-coverage of migrants in Sweden from 2003 to 2016 and finds that over-coverage was, in general, relatively low in this period. A noticeable spike occurred in 2010, likely linked to repercussions of the 2008 Global Financial Crisis. By the end of the observation period, the MSE method suggested an overestimation of about 3%.
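To give a feel for how a generalized linear model can be used for multiple systems estimation, here is a minimal sketch, not the study's actual specification. It assumes three hypothetical lists (A, B, C) that are independent of one another, uses invented counts, and fits a Poisson log-linear model by iteratively reweighted least squares. The names `observed`, `register_count`, and the helper functions are illustrative assumptions, as is the simple definition of over-coverage as the share of registered people not estimated to be present.

```python
import math

# Toy counts by membership (1 = listed, 0 = not listed) in three hypothetical
# lists A, B, C. The (0, 0, 0) cell, people on no list, is never observed.
# These numbers are invented; they satisfy independence for a total of 1000.
observed = {
    (1, 1, 1): 80, (1, 1, 0): 320, (1, 0, 1): 80, (1, 0, 0): 320,
    (0, 1, 1): 20, (0, 1, 0): 80, (0, 0, 1): 20,
}

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_poisson_loglinear(cells, counts, max_iter=50, tol=1e-10):
    """Fit log E[count] = b0 + bA*a + bB*b + bC*c by IRLS (main effects only)."""
    X = [[1.0] + [float(v) for v in cell] for cell in cells]
    p = len(X[0])
    mu = [max(y, 0.5) for y in counts]      # standard starting values
    eta = [math.log(m) for m in mu]
    beta = [0.0] * p
    for _ in range(max_iter):
        # Working response and weights for a Poisson model with log link.
        z = [e + (y - m) / m for e, y, m in zip(eta, counts, mu)]
        xtwx = [[sum(r[j] * w * r[k] for r, w in zip(X, mu)) for k in range(p)]
                for j in range(p)]
        xtwz = [sum(r[j] * w * zi for r, w, zi in zip(X, mu, z)) for j in range(p)]
        new_beta = solve(xtwx, xtwz)
        change = max(abs(n - o) for n, o in zip(new_beta, beta))
        beta = new_beta
        eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
        mu = [math.exp(e) for e in eta]
        if change < tol:
            break
    return beta

cells = list(observed)
counts = [observed[c] for c in cells]
beta = fit_poisson_loglinear(cells, counts)

# For the (0, 0, 0) cell all list indicators are zero, so the model's
# prediction is exp(intercept): the estimated number present but unlisted.
hidden = math.exp(beta[0])
total = sum(counts) + hidden

# Hypothetical register count, used only to illustrate the final step.
register_count = 1050
over_coverage = (register_count - total) / register_count
print(f"estimated present: {total:.1f}, over-coverage: {over_coverage:.1%}")
```

Because the toy counts were constructed to satisfy independence exactly, the fit recovers roughly 80 unobserved individuals. Real applications would typically add covariates (for example sex, age, or region of origin) and interaction terms between lists, which is what allows over-coverage to differ across sub-populations.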
However, over-coverage differed across socio-demographic groups. It was higher among men, young adults, recent migrants, and migrants from neighbouring countries (e.g., Norway and Denmark). Contrary to public perception, migrants from Middle East and North Africa (MENA) countries showed the lowest over-coverage. At the same time, over-coverage among Eastern European migrants has risen slowly since the European Union's enlargement, possibly due to increased circular and return migration in the context of free mobility.
Patterns varied based on how long migrants had been in Sweden. New arrivals showed the highest over-coverage, peaking in 2010, while those residing in Sweden for 6-10 years saw steadily increasing over-coverage. Long-term migrants (over 10 years) had consistently low over-coverage levels. These trends potentially reflect immigration patterns and the influence of the reformed Swedish introduction program in 2010.
In policy terms, this study highlights how over-coverage can distort society's understanding of its population. It underscores the need for a clear understanding of the administrative processes behind data collection and of the definition of 'usual resident'. Current EU and national definitions of resident populations contain ambiguous language, especially for border regions or mobile populations. This research shows the potential of population registers to carry out self-validation, highlighting their ability to reconcile discrepancies between de jure and de facto population definitions.