The influence of overcrowding and socioeconomy on the spatio-temporal spread of COVID-19 - a Swedish register study
Report, 2022
Household overcrowding, which primarily occurs in groups characterised by low socioeconomic status, is a known risk factor for the spread of infectious diseases. Still, it is not well established whether overcrowding, in itself or in combination with other disadvantageous sociodemographic factors, has affected the incidence of COVID-19 infection over time and across geographical areas. This register study investigated overcrowded housing, and its interaction with various markers of low socioeconomic status, as predictors of the spread of COVID-19 infection, using regression-based spatial and spatio-temporal statistical methods. Through the Swedish Tax Agency database, we identified all legal residents in Sweden who were alive on 1 January 2020 (baseline), and by using Sweden’s personal identification number system we linked this cohort to data from other national registers. Through Statistics Sweden, we gained access to several sociodemographic variables relevant to this study, including the variables needed to calculate overcrowding, as well as individual-level information on income, immigration background, education, occupation, and car ownership. To this, we added data from SmiNet, the Public Health Agency of Sweden’s register for communicable diseases, which gave us access to all positive PCR test results for COVID-19 in Sweden until 30 June 2021. Based on a definition of overcrowding proposed by Eurostat, and the data at hand, we defined overcrowding as more than one person per room in a household, with exceptions for adult couples in a relationship, children under a certain age, and anyone living in a villa or a detached or semi-detached house. The spatial (geographical) aggregation level was determined by Sweden’s so-called DeSO zones (“DEmografiska Statistik Områden”, translated: demographic statistical areas), which divide Sweden into 5984 subregions of varying geographical size, each inhabited by 700–2700 people.
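The overcrowding definition above can be illustrated with a minimal Python sketch. It is only one possible reading of the rule, not the study's implementation: we assume that a couple shares a room (and so counts as one person), that children below the age cutoff are not counted, and that households in houses are never classified as overcrowded. The names `Person`, `in_couple`, `is_house` and `child_age_cutoff` are hypothetical, and the cutoff is left as a required argument because the text does not state it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    age: int
    in_couple: bool = False  # hypothetical flag: lives with a partner in this household


def is_overcrowded(members: List[Person], rooms: int, is_house: bool,
                   child_age_cutoff: int) -> bool:
    """Return True if the household has more than one counted person per room.

    Assumed reading of the exceptions (not confirmed by the source):
    - households in a villa, detached or semi-detached house are never overcrowded;
    - children younger than `child_age_cutoff` are not counted;
    - the two partners of a couple are counted as one person.
    """
    if is_house:
        return False
    counted = [p for p in members if p.age >= child_age_cutoff]
    partnered = sum(1 for p in counted if p.in_couple)
    effective_persons = len(counted) - partnered // 2
    return effective_persons > rooms
```

The proportion of overcrowded households or residents per DeSO zone could then be computed by applying such a function to every household in the zone; the age cutoff has to be supplied by the caller.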
The temporal unit of the data was the calendar month. As each person appearing in Statistics Sweden’s registers has a DeSO identifier, we could generate monthly counts of infection for each DeSO zone. We started the statistical analyses by evaluating correlations among the spatial covariates and their interaction terms, as well as correlations between the spatial covariates and the log-counts per DeSO, both monthly and accumulated over the whole study period (18 months). This highlighted the dynamics of infection incidence over time and showed that certain groups with several highly dependent covariates had a pronounced impact on infection. Our spatial analyses were conducted using an elastic-net regularised Poisson regression approach in which the DeSOs’ population sizes were used as offsets and model selection was carried out by means of cross-validation. In the spatio-temporal analyses, a dummy variable was added for each month, while the rest of the covariates, including the interaction terms, were kept as in the spatial analyses. This approach allowed us to interpret the fitted models as models for the risk that a generic individual from any kind of DeSO zone tested positive (at a given time point), while also adjusting for collinearity and carrying out variable selection to achieve parsimony. The descriptive results, which we visualized by graphical illustration of the spatially aggregated data, showed a clear co-occurrence of overcrowded housing, low education, low income and a foreign background in several geographical zones, especially in some of the boroughs of Sweden’s largest cities. The analyses focusing on geographical areas’ vulnerability (spatial risks) revealed higher risks in areas with a high occurrence of overcrowding, especially in interaction with a high proportion of inhabitants with a foreign background, an income below the national median, or employment in health and social care work.
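The modelling approach described above can be written out as follows. This is a sketch under stated assumptions rather than the exact specification used in the study: the symbols are ours, the offset is taken to enter as the logarithm of the population size (the usual convention for a log-link Poisson model), the penalty is written in the common glmnet-style parameterisation, and the month effects are shown as purely additive.

```latex
% Spatial model: Y_i = accumulated case count in DeSO zone i,
% N_i = population size of zone i, x_i = covariates and interaction terms
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \log N_i + \beta_0 + \mathbf{x}_i^{\top} \boldsymbol{\beta}

% Elastic-net estimate; \ell is the Poisson log-likelihood, n the number of zones,
% and \lambda \ge 0, \alpha \in [0, 1] are tuning parameters chosen by cross-validation
(\hat{\beta}_0, \hat{\boldsymbol{\beta}}) =
  \arg\min_{\beta_0, \boldsymbol{\beta}}
  \Big\{ -\tfrac{1}{n}\, \ell(\beta_0, \boldsymbol{\beta})
  + \lambda \big[ \alpha \lVert \boldsymbol{\beta} \rVert_1
  + \tfrac{1 - \alpha}{2} \lVert \boldsymbol{\beta} \rVert_2^2 \big] \Big\}

% Spatio-temporal model: Y_{it} = case count in zone i in month t,
% \gamma_t = coefficient of the dummy variable for month t
Y_{it} \sim \mathrm{Poisson}(\mu_{it}), \qquad
\log \mu_{it} = \log N_i + \beta_0 + \gamma_t + \mathbf{x}_i^{\top} \boldsymbol{\beta}
```

With the offset in place, \mu_i / N_i = \exp(\beta_0 + \mathbf{x}_i^{\top} \boldsymbol{\beta}) is the expected number of cases per inhabitant, which is what permits reading the fitted model as a risk for a generic individual in a zone with covariates \mathbf{x}_i. The \ell_1 part of the penalty sets some coefficients exactly to zero (variable selection), while the \ell_2 part stabilises the estimates when covariates are strongly correlated.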
When time was incorporated in the models (spatio-temporal risk), overcrowding appeared as a predictor of COVID-19 infection, but only during April, May, August, and November 2020. Otherwise, overcrowding seemed primarily to constitute a risk factor in interplay with other disadvantageous socioeconomic variables, indicating that general socioeconomic vulnerability acted as a risk enhancer. In addition, being of foreign background or being employed in a low-income job were notable predictors of the risk of testing positive for the disease during the second wave of the pandemic. By studying overcrowding and socioeconomic factors, we identified vulnerable groups per geographical area and over the duration of the COVID-19 pandemic. The identified risk factors were clearly more prevalent in groups whose structural living conditions gave them fewer possibilities to protect themselves, and which already displayed markedly worse health. Targeted interventions towards vulnerable groups and geographical areas are therefore of importance in the still ongoing pandemic and in the event of future widespread diseases.