This repository takes stock of COVID-19 datasets for 26 European countries at the regional NUTS3 or NUTS2 level.
The Tracker is now published in Nature Scientific Data:
The data is released monthly on Zenodo:
This repository is updated every four weeks. All raw data and scripts are available here in case more frequent updates are required. Otherwise, please feel free to request one. It takes about 30-40 minutes to run all the scripts and upload them to GitHub. Please cite the above DOIs if you are using this dataset. The underlying data structures are constantly being updated and there might still be undetected issues in the final files. For comments, feedback, error reporting, or other queries, please e-mail email@example.com or firstname.lastname@example.org.
This project is supported by:
Challenges with data access
Almost all countries in Europe showcase COVID-19 data in the form of choropleth maps and trend graphs. Access to the data behind these visualizations varies from country to country. Responsibility for providing the data lies variously with official government websites, national statistical agencies, and health ministries. While some countries host the data on these websites, others just export it to third-party repositories, for example ArcGIS Hub and GitHub. As a result, each country has to be dealt with individually. While most countries allow access to regional data in some form, others do not release this information publicly. For the latter, data scraped from official websites can often be found on platforms like GitHub.
Information provided by countries is also not consistent. Not all countries release regional data on deaths, tests performed, hospitalization rates, vaccination rates, or information by gender and age groups. Therefore, this database currently only focuses on reported cases, even though for most countries other information exists in the raw files as well. These can be easily extracted from the scripts provided.
Combining data across countries
Countries in Europe define regions differently, which makes homogenizing the data a challenging task. For consistency, the European Commission and Eurostat use harmonized units called the Nomenclature of Territorial Units for Statistics, or NUTS. NUTS0 are countries, NUTS1 are typically provinces, NUTS2 are typically districts, and NUTS3 are typically sub-districts. Most countries release information at administrative units lower than NUTS3. These are referred to as Local Administrative Units or LAUs, where LAU1 (districts) and LAU2 (municipalities) were formerly NUTS4 and NUTS5 regions respectively. Currently only one level of LAU is documented by the European Commission. Several countries provide data at finer LAU levels.
The NUTS regions are redefined every few years (2013, 2016, 2021). The 2021 NUTS regions came into effect on 1 January 2021. But since most of the regional data on Eurostat is at the 2016 level, this tracker also homogenizes data records to the NUTS 2016 boundaries. The process of homogenization has its own set of challenges. While some countries just rename regions, others actually change, merge, and shift boundaries. Not all countries match the NUTS 2016 boundaries perfectly. For example, the data for Italy was always released at the 2021 NUTS boundary definitions. This causes problems with a couple of small regions on the islands. Similarly, data for Finland is released at the hospital district level, and these districts do not align perfectly with NUTS 3 boundaries. The rate of error is minimal since most of the regions affected by boundary shifts are very small. Additionally, the raw data is available, which allows the data to be used as it is or aggregated to other administrative levels.
Not all countries in Europe are in the European Union, and hence not all are subject to Eurostat reporting and data sharing requirements. While all countries have correspondence tables between their own region definitions and NUTS, providing NUTS-level information is not mandatory for non-EU countries. This list includes the UK (post-Brexit), Norway, and Switzerland. While some countries provide detailed regional information on COVID-19 (for example, Norway), they don’t have the latest LAU-NUTS correspondence tables available. The way around this problem is to spatially overlay LAU and NUTS boundaries and extract the information based on boundary overlaps. While in theory the overlaps should be perfect, in practice small errors might persist because of slight differences in boundaries, differences in the resolution of spatial files, and LAUs that simply cut across NUTS boundaries (the UK is a good example of this problem).
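As an illustration of aggregating to other administrative levels, the sketch below sums NUTS 3 records up to a coarser level. It is not taken from the repository's scripts. It assumes the standard NUTS code layout (a 2-letter country code followed by one character per level), the function names `nuts_parent` and `aggregate_cases` are hypothetical, and the case counts are made up.

```python
from collections import defaultdict


def nuts_parent(code: str, level: int) -> str:
    """Return the ancestor of a NUTS code at the requested level (0-3).

    Assumes the standard layout: 2-letter country code plus one character
    per NUTS level, so NUTS 3 codes have 5 characters.
    """
    if not 0 <= level <= 3:
        raise ValueError("level must be between 0 and 3")
    if len(code) < 2 + level:
        raise ValueError(f"{code!r} is coarser than NUTS {level}")
    return code[: 2 + level]


def aggregate_cases(rows, level):
    """Sum (nuts_code, date, cases) rows up to the requested NUTS level."""
    totals = defaultdict(int)
    for code, date, cases in rows:
        totals[(nuts_parent(code, level), date)] += cases
    return dict(totals)


# Hypothetical NUTS 3 records with made-up case counts.
rows = [
    ("AT111", "2021-01-01", 4),
    ("AT112", "2021-01-01", 6),
    ("AT130", "2021-01-01", 20),
]
nuts2 = aggregate_cases(rows, level=2)
print(nuts2)
```

Truncating codes only works within a single NUTS version. It does not handle regions whose boundaries were merged or shifted between the 2016 and 2021 definitions.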
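The overlay idea can be sketched with a toy example. The code below is only an illustration and not the workflow used in this repository: it stands in axis-aligned boxes `(xmin, ymin, xmax, ymax)` for real boundary polygons, and all identifiers and coordinates are made up. A real overlay would use a GIS library with the actual LAU and NUTS shapefiles. Each LAU is assigned to the NUTS region it overlaps most, and the overlap share shows how cleanly it fits.

```python
def overlap_area(a, b):
    """Area of intersection of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def assign_lau_to_nuts(lau_boxes, nuts_boxes):
    """Map each LAU to (best NUTS id, share of the LAU area inside it)."""
    result = {}
    for lau_id, lau in lau_boxes.items():
        lau_area = (lau[2] - lau[0]) * (lau[3] - lau[1])
        best_id, best_area = None, 0.0
        for nuts_id, nuts in nuts_boxes.items():
            area = overlap_area(lau, nuts)
            if area > best_area:
                best_id, best_area = nuts_id, area
        result[lau_id] = (best_id, best_area / lau_area)
    return result


# Toy geometries: two adjacent NUTS regions and two LAUs (hypothetical).
nuts_boxes = {"N1": (0, 0, 10, 10), "N2": (10, 0, 20, 10)}
lau_boxes = {
    "lau_a": (1, 1, 4, 4),    # lies entirely inside N1
    "lau_b": (8, 2, 13, 6),   # cuts across the N1/N2 border
}
assignment = assign_lau_to_nuts(lau_boxes, nuts_boxes)
print(assignment)
```

An overlap share well below 1, as for `lau_b`, flags an LAU that cuts across a NUTS boundary and may need manual checking.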
European regions and availability of COVID-19 data
The table below shows the national regional classifications that correspond to NUTS tables. The number of regions for each NUTS level is given in brackets, and the NUTS level at which the data is available is highlighted in bold. For smaller countries, NUTS 0 and NUTS 1 are the same administrative regions.

| Country (NUTS 0) | Code | NUTS 1 | NUTS 2 | NUTS 3 | LAU |
| --- | --- | --- | --- | --- | --- |
| Austria | AT | Gruppen von Bundesländern (3) | Bundesländer (9) | Bezirke (35) | Gemeinden (2096) |
| Belgium | BE | Gewesten / Régions (3) | Provincies / Provinces (11) | Arrondissementen / Arrondissements (44) | Gemeenten / Communes (581) |
| Croatia | HR | - | Regija (4) | Županija (21) | Gradovi i općine (556) |
| Czechia | CZ | Území (1) | Regiony soudržnosti (8) | Kraje (14) | Obce (6258) |
| Denmark | DK | - | Regioner (5) | Landsdele (11) | Kommuner (99) |
| Estonia | EE | - | - | Maakondade grupid (5) | Linn, vald (79) |
| Finland | FI | Manner-Suomi, Ahvenanmaa / Fasta Finland, Åland (2) | Suuralueet / Storområden (5) | Maakunnat / Landskap (19) | Kunnat / Kommuner (311) |
| France | FR | Zones d’études et d’aménagement du territoire (14) | Régions (27) | Départements (101) | Communes (34970) |
| Germany | DE | Länder (16) | Regierungsbezirke (38) | Kreise (401) | Gemeinden (11087) |
| Greece | EL | Geografikes Perioches (4) | Periferies (13) | Periferiakon Enotiton (52) | Topikes Koinotites (6134) |
| Hungary | HU | Statisztikai nagyrégiók (3) | Tervezési-statisztikai régiók (8) | Megyék + Budapest (20) | Települések (3155) |
| Ireland | IE | - | Regions (3) | Regional Authority Regions (8) | Local Election Areas (166) |
| Italy | IT | Gruppi di regioni (5) | Regioni (21) | Provincie (107) | Comuni (7926) |
| Latvia | LV | - | - | Statistiskie reģioni (6) | Republikas pilsētas, novadi (119) |
| Netherlands | NL | Landsdelen (4) | Provincies (12) | NUTS3 (40) | Gemeenten (355) |
| Norway | NO | - | Landsdeler (7) | Fylker (18) | Kommuner (356) |
| Poland | PL | Makroregiony (7) | Regiony (17) | Podregiony (73) | Gminy (2478) |
| Portugal | PT | Continente + Regiões Autónomas (3) | Grupos de Entidades Intermunicipais + Regiões Autónomas (7) | Entidades Intermunicipais + Regiões Autónomas (25) | Freguesias (3098) |
| Romania | RO | Macroregiuni (4) | Regiuni (8) | Judet + Bucuresti (42) | Comuni + Municipiu + Orase (3181) |
| Slovenia | SI | - | Kohezijske regije (2) | Statistične regije (12) | Občine (212) |
| Slovak Republic | SK | - | Oblasti (4) | Kraje (8) | Obce (2927) |
| Spain | ES | Agrupación de comunidades autónomas (7) | Comunidades y ciudades Autónomas (19) | Provincias + islas + Ceuta, Melilla (59) | Municipios (8131) |
| Sweden | SE | Grupper av riksområden (3) | Riksområden (8) | Län (21) | Kommuner (290) |
| Switzerland | CH | - | Grossregionen (7) | Kantone (26) | Gemeinden / Communes (2222) |
| United Kingdom | UK | Government Office Regions (12) | Counties (41) | Upper tier authorities (179) | Lower Authority Districts (LADs/LTLAs) (315) |
Source: Extended from Eurostat LAU page.
Sources of country datasets
The table below shows the links to the official institutions that are responsible for COVID-19 data in their respective countries, and links to the actual databases from which the data is pulled.
Note: The links are subject to change. If you find an error or a better data source, then please let me know.
The following workflow is used to compile the data at the NUTS3 or NUTS2 level:
The date range for countries:
The scatter plot of daily cases per 10k population at the NUTS level:
In order to validate the data, the regional-level information is aggregated up to country-level totals. These are compared with Our World in Data’s (OWID) COVID-19 tracker numbers. OWID is a key source for COVID-19-related information and is referenced frequently in scientific research and the media. OWID was using country-level information provided by the European Centre for Disease Prevention and Control (ECDC) until November 2020. In November 2020, ECDC announced that it would switch to weekly data releases under The European Surveillance System (TESSy), where countries submit NUTS2-level data. In response, OWID switched to Johns Hopkins University’s (JHU) data, which provides COVID-19 information at the global level.
For validation, this Tracker’s data and the OWID data are merged on the country-date combination and the difference between the daily cases is calculated. The figure above plots these differences by country. After October 2020, the mismatch in the totals goes up significantly and persists to the present. There are two explanations for these trends. First, before October 2020, daily data was provided by ECDC, which was taking information directly from European countries. Since this Tracker also pulls data from the countries directly, the match before October 2020 is very close, with the exception of a few outliers. Second, since the data source of this Tracker remains unchanged, while OWID switched to a broader (less-verified) database after October 2020, this Tracker provides a more accurate picture of country-level aggregates including regional variations. As of March 2021, ECDC is again releasing daily country-level data, but gaps exist between the latest data series and the pre-November 2020 updates.
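The validation step described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the repository's own scripts. The row layout, the function names `country_totals` and `compare_with_reference`, and all numbers are assumptions made for the example.

```python
from collections import defaultdict


def country_totals(regional_rows):
    """Sum (country, nuts_code, date, daily_cases) rows to (country, date) totals."""
    totals = defaultdict(int)
    for country, _nuts, date, cases in regional_rows:
        if cases is not None:  # skip missing observations
            totals[(country, date)] += cases
    return dict(totals)


def compare_with_reference(tracker_totals, reference_totals):
    """Return tracker minus reference for every (country, date) present in both."""
    shared = tracker_totals.keys() & reference_totals.keys()
    return {key: tracker_totals[key] - reference_totals[key] for key in sorted(shared)}


# Made-up regional records and made-up reference (country-level) values.
regional_rows = [
    ("AT", "AT130", "2020-09-01", 120),
    ("AT", "AT111", "2020-09-01", 5),
    ("AT", "AT130", "2020-09-02", None),  # missing regional observation
    ("AT", "AT111", "2020-09-02", 7),
]
reference = {
    ("AT", "2020-09-01"): 125,
    ("AT", "2020-09-02"): 10,
    ("DE", "2020-09-01"): 1000,  # no tracker data, so it drops out of the merge
}
differences = compare_with_reference(country_totals(regional_rows), reference)
print(differences)
```

Only country-date pairs present in both sources are compared, which mirrors a merge on the country-date combination. Missing regional values are skipped here, so a negative difference can also simply indicate an incomplete regional series.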
Share of cases and deaths in Tracker countries
The Tracker countries are shown in orange, while the rest of Europe and Central Asia is shown in yellow.
All data points (Jan 2020 to present):
Cases on the last data point for each NUTS region:
Change in cases in the past 14 days:
Individual country maps:
In the media
Articles in the press related to the Tracker:
- 22 Jul 2021: All countries updated. Changes to the map script to automatically drop regions which don’t have updates for the past 14 days. Since Portugal data is released every two weeks, it contains information on daily cases for the day of the data release. This allows us to calculate bi-weekly changes in cases. NUTS regions which fall off the maps will be checked. Either their data is not updated or they have issues in their names in the original data files.
- 01 Jul 2021: All countries updated. Ireland’s data is still not being updated so it has been dropped from Europe maps. New dedicated webpage created for this repository: https://asjadnaqvi.github.io/COVID19-European-Regional-Tracker/. In the maps the rise of the delta variant is also visible.
- 12 Jun 2021: England data source changed to the official ONS data at the LTLA level which maps to NUTS 3. Since the raw files are very large the original version has been removed from the directories. Please see the dofiles for the links. All countries updated. This will be the official May/June 2021 release version.
- 31 May 2021: All countries updated. Minor fixes to Ireland’s dofile. England’s data from ODI Leeds is no longer being updated regularly. For this update, England’s data is still the old version from April. One option is to replace England data with the official ONS information but its mapping to NUTS2/NUTS3 needs to be checked.
- 18 May 2021: Negative changes in daily cases were being dropped in the master file. These have been added back in. It is up to the users to decide how to deal with them. Negative changes in daily cases exist in the raw files and are most likely corrections to the data. A flag variable has been added to the master file which equals 1 if the daily_cases variable is negative. These are 0.18% of the data at the time of this update. Other minor fixes include dropping redundant variables and ordering the columns. The scatter plot for daily NUTS cases per 10k population now shows the complete data series.
- 13 May 2021: Fixed Poland’s (PL) repository. The underlying data structure changed, so the files were not compiling correctly. Switzerland’s (CH) data file had missing data points wrongly showing up as 0 cases. These have been fixed.
- 01 May 2021: All files updated for the May release. Minor errors fixed in dofiles. Population file has been updated to include 2020 regional population data. For the UK 2019 values are used since regional information no longer exists in the Eurostat database due to Brexit.
- 06 Apr 2021: All files updated for the April release. Maps switched back to Viridis color scheme.
- 22 Mar 2021: Scotland data is now from the official NHS website. The code has also been corrected. Other minor fixes to the remaining countries. I am taking out Portugal from the maps. Portugal’s data is bi-weekly and it is not possible to elicit daily information. The raw data files are still in the database. Region names from the maps have been removed. A new map has been added which shows percentage change in cases in the last 14 days. Note that for Europe maps, the last available data entry of each NUTS region is used. This is to ensure that maps are as complete as possible since some data points for the latest date are missing.
- 04 Mar 2021: All files checked and updated. Minor fixes to the code. Path to access raw data for Spain fixed. Folders cleaned up further.
- 13 Feb 2021: All files checked and updated. Major fixes to the code. Raw data is now in the 04_master folder in .csv and .dta format. Daily cases fixed for several countries. Previously they were calculated as the difference between consecutive observations rather than consecutive dates. Thus if a country had skipped several days, the daily cases would show a huge jump. These observations are now set to missing. As a result there are more gaps now. If a country has 0 cases for a given date, that date is now dropped from the homogenized dataset. For example, Portugal, which changed its reporting to weekly and bi-weekly frequency, now has large gaps. The original data still contains all the dates and the values. Estonia’s data fixed and it now reflects the correct values. A validation file added which aggregates country-level data for each date and compares it with Our World in Data (OWID) values. This update is released as v1.3 on Zenodo.
- 25 Jan 2021: All files checked and updated. Minor fixes to code. Some file paths changed. Country level graphs now show the last data point for a region. This is just for presentation. Please see the data files for actual information.
- 05 Jan 2021: All files checked and updated. Data for Poland, Greece, Switzerland fixed. Portugal is releasing data at weekly intervals only and therefore daily cases per capita need to be imputed correctly. The scripts have been updated to Stata 16. There should not be compatibility issues with earlier versions but please report if the files don’t compile. New maps added for cumulative cases and cumulative cases per capita for 2020.
- 07 Dec 2020: All files updated. Romania JSON has been scripted in Stata. Portugal status remains the same. The last regional update was on 26 Oct 2020. Portugal and Greece have been removed from the Europe map but individual files remain in the database.
- 25 Nov 2020: All files updated. Portugal has not updated the official dataset since 26 Oct, 2020. Greece data is also patchy. Maps are now organized in alphabetical order of the 2-letter country code rather than when they were added to this repository.
- 17 Nov 2020: All files added to the directory for public release. Zenodo badge created. Tables have been updated. All dofiles were checked and reworked for updates, new datasets, paths. Dofiles for country level maps will be added soon.
- 01 Nov 2020: Scotland and Romania added. All data files and scripts were rechecked. The maps were homogenized across countries. The data range of countries was fixed. Some countries only release data periodically at regional levels.
- 25 Oct 2020: Deprecated links fixed. Date ranges removed from the table and replaced with a figure. If data sources for missing countries are not found, they will be replaced by country-level data from ECDC to complete the map.
- 17 Oct 2020: Ireland repository fixed. New YouTube video uploaded. Maps now mix NUTS3 and NUTS2 regions, so populations are normalized accordingly.
- 04 Oct 2020: Countries with JSON datasets have now been automated. Ireland dataset is no longer being updated on GitHub but the official website now provides more accurate information. This will be added soon. Still looking for UK minus England data. Potentially also looking for Lithuania, Bulgaria, Romania and other countries between Croatia and Greece.
- 21 Sep 2020: Croatia and Denmark added to the maps. Ireland data is no longer updating since the GitHub repository is now dormant. NUTS2 population needs to be added to the cases per population map.
- 16 Sep 2020: Poland and Greece NUTS2 data has been merged with the main file and added to the map. Data for Croatia and Denmark will be integrated next. Next task is to find Lithuania and Ukraine data sets.
- 07 Sep 2020: Improved documentation of the maps. All maps are now displayed above. YouTube video of changes in NUTS-3 level cases added. Map of cases and cases per population added.
- 31 Aug 2020: Estonia, Latvia, Slovakia added to the database.
- Estonia only provides case ranges in bands of 10 (0-10, 11-20, etc.). NUTS 3 level data is approximated by taking the mid-point of each range for each date/region combination and then aggregating to the NUTS 3 level.
- 29 Aug 2020: Switzerland and Greece added to the database. Greece data is only available at the NUTS 2 level.
- 27 Aug 2020:
- Portugal: taken out for now for data checking since there are issues with the series continuity.
- France: Historical data before 13th May added. There is a huge jump in the number of tests and reported cases for the few observations that overlap. This is because before 13th May, data was only being collected from 3 labs before proper testing protocols were introduced. There is no way of back correcting this information but maybe some form of data interpolation might help.
- 26 Aug 2020: GitHub repository created with documentation of regions in European countries.
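The daily-case rules described in the 13 Feb 2021 and 18 May 2021 entries above can be sketched roughly as follows. This is a Python illustration, not the Stata dofiles used in this repository. Only the `daily_cases` name comes from the changelog. The function name, the `negative_flag` field, and the sample values are assumptions for the example.

```python
from datetime import date


def daily_from_cumulative(series):
    """Turn [(date, cumulative_cases), ...] for one region into daily records.

    A daily value is only computed when the previous observation is exactly
    one day earlier; otherwise it is left as None (missing).
    """
    records = []
    previous = None
    for day, cumulative in sorted(series):
        daily = None
        if previous is not None and (day - previous[0]).days == 1:
            daily = cumulative - previous[1]
        records.append({
            "date": day,
            "cumulative": cumulative,
            "daily_cases": daily,
            # 1 when the daily change is negative (likely a data correction)
            "negative_flag": int(daily is not None and daily < 0),
        })
        previous = (day, cumulative)
    return records


# Made-up cumulative series with a three-day gap and a downward correction.
series = [
    (date(2021, 2, 1), 100),
    (date(2021, 2, 2), 110),
    (date(2021, 2, 5), 150),  # gap: no daily value is computed
    (date(2021, 2, 6), 145),  # correction: negative daily change
]
records = daily_from_cumulative(series)
print([(r["daily_cases"], r["negative_flag"]) for r in records])
```

Leaving the value missing after a gap avoids attributing several days of cases to a single date, at the cost of more gaps in the series.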