Cultural heritage, tourism and local economies in Italy
This project explores how cultural heritage impacts tourism and economic performance across Italian regions by analyzing the relationship between cultural assets, tourism revenues, and employment indicators. Through interactive maps and data visualizations, the project provides insights to support more effective use of cultural heritage for sustainable economic growth and balanced regional development.


Exploring the Economic Impact of Cultural Heritage Across Italian Regions
This project aligns data from various sources to analyze how cultural heritage, tourism, and economic development interact across Italy. By examining regional GDP, employment, tourism revenues, and household income, we aim to understand the contribution of cultural assets to local economies and reveal regional disparities.
Key Research Areas
Our investigation focuses on two key questions:
- Correlation between cultural sites and tourism revenue
- Impact of cultural heritage on employment and income
The results offer practical insights for policymakers, local authorities, tourism boards, and cultural institutions. The findings support strategic decisions to strengthen regional economies, enhance tourism potential, and advocate for increased investment in cultural heritage, promoting both cultural preservation and regional prosperity.

Our Goal
To provide data-driven insights that promote sustainable growth and balanced regional development through the better utilization of Italy's cultural heritage.
Datasets Overview
This project integrates original datasets and mashup data sources to analyze the interplay between tourism, cultural heritage, and economic development across Italian regions. Below is a summary of the key datasets utilized.
D1: Accommodation Infrastructure
Tourist establishments, bedrooms, and bed-places across NUTS 2 regions (Eurostat).
D2: Peer-to-Peer Stays
Guest nights via collaborative platforms (NUTS 3 regions, Eurostat Experimental Data).
D3: Monthly Occupancy Trends
Occupancy rates by type of tourist accommodation with guest origin breakdown (ISTAT).
D4 & D5: Arrivals and Overnight Stays
Regional arrivals and nights spent by tourists at accommodation establishments (Eurostat).
D6: Occupancy Metrics
Net occupancy rates of hotels and similar accommodations by NUTS 2 regions (Eurostat).
D7: Cultural Sites Inventory
Luoghi della cultura dataset from MiC catalog, detailing cultural heritage locations in Italy.
D8: GDP Breakdown
Italy's GDP composition by economic sector (agriculture, industry, services) from ISTAT.
D9: Employment by Sector
Regional employment distribution by economic sector based on 2010 classification (ISTAT).
D10: Household Income
Household net income by source, analyzing regional disparities and income structure (ISTAT).
M1: Mashup 1
Combines all the tourism datasets (D1 to D6) with the cultural heritage institutions data (D7).
M2: Mashup 2
Combines all the economic datasets (D8 to D10) with the cultural heritage institutions data (D7).
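The source does not describe how the mashups were built. The following is only a minimal sketch of one plausible approach: counting cultural sites per region and attaching that count to each regional indicator row. The field names (`region`, `year`, `arrivals`, `cultural_sites`), the function `build_mashup`, and all values are hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical per-region tourism records (stand-ins for rows from D1 to D6).
tourism_rows = [
    {"region": "Toscana", "year": 2022, "arrivals": 100},
    {"region": "Veneto", "year": 2022, "arrivals": 150},
    {"region": "Molise", "year": 2022, "arrivals": 5},
]

# Hypothetical cultural site records (stand-ins for rows from D7), one per site.
site_rows = [
    {"name": "Site A", "region": "Toscana"},
    {"name": "Site B", "region": "Toscana"},
    {"name": "Site C", "region": "Veneto"},
]


def build_mashup(indicator_rows, site_rows):
    """Attach the number of cultural sites per region to each indicator row."""
    sites_per_region = Counter(row["region"] for row in site_rows)
    mashup = []
    for row in indicator_rows:
        merged = dict(row)  # copy so the input rows stay untouched
        # Regions with no listed site get 0 rather than being dropped.
        merged["cultural_sites"] = sites_per_region.get(row["region"], 0)
        mashup.append(merged)
    return mashup


mashup_1 = build_mashup(tourism_rows, site_rows)
for row in mashup_1:
    print(row)
```

Keeping regions without any listed site (with a count of 0) avoids silently shrinking the regional coverage when the two sources are combined. The same function could be applied to economic rows to obtain a structure like Mashup 2.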
Detailed Dataset Evaluation
Overview of completeness, consistency, and reliability indicators per dataset.
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Time-Series Completeness | 100% (2019-2023) | 100% (2019-2023) | 100% (2019-2023) | 100% (2019-2023) | 100% (2019-2023) | 100% (2019-2023) | 0.04% missing | 23.57% (2023) | 23.26% (2023) | 71% (2021/22) |
Geographical Coverage | 100% NUTS-2 | 100% NUTS-3 | 100% NUTS-2 | 100% NUTS-2 | 100% NUTS-2 | 100% NUTS-2 | All regions present | All NUTS levels | All NUTS levels + NACE Rev.2 | All NUTS levels |
Completeness Eval | β | β | β | β | β | β | β | β | β | β |
Accuracy | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic | β Syntactic |
Semantic Accuracy / Data Consistency | β Semantic | β Semantic | β Semantic | β Semantic | β Semantic | β Semantic | Province boundaries outdated | 3 bil diff 2022/23 | Cross-check Eurostat incomplete | 4 thousand diff 2021/22 |
Coherence | β No contradictions | β No contradictions | 12% violated rule | β No contradictions | β No contradictions | β No contradictions | β No contradictions | β No contradictions | β No contradictions | β No contradictions |
Timeliness | β | β | β | β | β | β | Daily | Annual | Annual | Annual |
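The source does not state how the completeness percentages were computed. As a minimal sketch, assuming completeness means the share of non-missing cells across the year columns of a dataset, the figure could be obtained as follows. The function name `completeness`, the sample rows, and the set of markers treated as missing (including `":"`, which Eurostat files use for unavailable values) are assumptions made for illustration.

```python
# Values treated as "missing"; this set is an assumption, adjust per source.
MISSING_MARKERS = (None, "", ":")


def completeness(rows, fields):
    """Return the percentage of non-missing cells over the given fields."""
    total = len(rows) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for row in rows
        for field in fields
        if row.get(field) not in MISSING_MARKERS
    )
    return 100.0 * filled / total


# Hypothetical sample: one of four cells is missing.
sample_rows = [
    {"region": "A", "2021": 1.0, "2022": None},
    {"region": "B", "2021": 2.0, "2022": 3.0},
]
score = completeness(sample_rows, ["2021", "2022"])
print(f"{score:.1f}% complete")
```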
Detailed Legal Compliance by Dataset
Assessment of privacy, intellectual property, licensing, and access limitations for each dataset.
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Is the dataset free of any personal data as defined in the Regulation (EU) 2016/679? | β | β | β | β | β | β | β | β | β | β |
Is the dataset free of any indirect personal data that could be used for identifying the natural person? | β | β | β | β | β | β | β | β | β | β |
Is the dataset free of any particular personal data (art. 9 GDPR)? | β | β | β | β | β | β | β | β | β | β |
Is the dataset free of any information that combined with common data available in the web, could identify the person? | β | β | β | β | β | β | β | β | β | β |
Is the dataset free of any information related to human rights (e.g., refugees, witness protection, etc.)? | β | β | β | β | β | β | β | β | β | β |
Did you use a tool for calculating the range of the risk of deanonymization? | β (NOT APPLICABLE) | β | β | β | β | β | β | β | β | β |
Are you using geolocalization capabilities? | β | β | β | β | β | β | β | β | β | β |
Did you check that the open data platform respect all the privacy regulations (registration of the end-user, profiling, cookies, analytics, etc.)? | β | β | β | β | β | β | β | β | β | β |
Do you know who, in your open data platform, is the Controller and Processor of the privacy data of the system? | β Eurostat | β Eurostat | β ISTAT | β Eurostat | β Eurostat | β Eurostat | β Eurostat | β ISTAT | β ISTAT | β ISTAT |
Have you checked the privacy regulation of the country where the datasets are physically stored? | β | β | β | β | β | β | β | β | β | β |
Do you have non-personal data? | β | β | β | β | β | β | β | β | β | β |
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Have you created and generated the dataset? | β | β | β | β | β | β | β | β | β | β |
Are you the owner of the dataset? | β Eurostat | β Eurostat | β ISTAT | β Eurostat | β Eurostat | β Eurostat | β MiC | β ISTAT | β ISTAT | β ISTAT |
Are you sure not to use third party data without the proper authorization and license? | β | β | β | β | β | β | β | β | β | β |
Have you checked if there are any limitations in your national legal system for releasing some kind of datasets with open license? | β | β | β | β | β | β | β | β | β | β |
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Did you release the dataset with an open data license? | β | β | β | β | β | β | β | β | β | β |
Did you include the clause: "In any case the dataset can't be used for re-identifying the person"? | β | β | β | β | β | β | β | β | β | β |
Did you release the API (in case you have it) with an open source license? | β | β | β | β | β | β | β | β | β | β |
Have you checked that the open data/API platform license regime is in compliance with your IPR policy? | β | β | β | β | β | β | β | β | β | β |
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Did you check that the dataset concerns your institutional competences, scope and finality? | β | β | β | β | β | β | β | β | β | β |
Did you check the limitations for the publication stated by your national legislation or by the EU directives? | β | β | β | β | β | β | β | β | β | β |
Did you check if there are some limitations connected to the international relations, public security or national defence? | β | β | β | β | β | β | β | β | β | β |
Did you check if there are some limitations concerning the public interest? | β | β | β | β | β | β | β | β | β | β |
Did you check the international law limitations? | β | β | β | β | β | β | β | β | β | β |
Did you check the INSPIRE law limitations for the spatial data? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Did you check that the dataset could be released for free? | β | β | β | β | β | β | β | β | β | β |
Did you check if there are some agreements with some other partners in order to release the dataset with a reasonable price? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Did you check if the open data platform terms of service include a clause of "non-liability agreement" regarding the dataset and API provided? | β | β | β | β | β | β | β | β | β | β |
In case you decide to release the dataset to a reasonable price did you check if the limitation imposed by the new directive 2019/1024/EU are respected? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
In case you decide to release the dataset to a reasonable price did you check the e-Commerce directive and regulation? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Question (Criterion) | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Do you have a temporary policy for updating the dataset? | β Annual | β Annual | β Monthly + Annual | β Annual | β Annual | β Annual | β Annual | β Annual | β Annual | β Annual |
Do you have some mechanism for informing the end-user that the dataset is updated at a given time to avoid mis-usage and so potential risk of damage? | β | β | β | β | β | β | β | β | β | β |
Did you check if the dataset for some reason cannot be indexed by the search engines (e.g., Google, Yahoo, etc.)? | β | β | β | β | β | β | β | β | β | β |
In case of personal data, do you have a reasonable technical mechanism for collecting request of deletion (e.g., right to be forgotten)? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Ethical Analysis by Dataset
Evaluation of human-centric principles, data control, and user empowerment aspects per dataset.
Question | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Is your data processing based on the fact that you borrow data from the users (not owner of their data)? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you ensure that the user's rights are prioritised, rather than commercial or institutional interests? | β | β | β | β | β | β | N/A | β | β | β |
Do you ensure that primarily users benefit from their own data, not just the organisation? | β | β | β | β | β | β | N/A | β | β | β |
Do you use privacy-by-design principles, and can you describe them clearly and transparently? | β | β | β | β | β | β | N/A | β | β | β |
Question | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Do you ensure that users' data, as far as possible, is processed directly on the users' own device(s)? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
When the processing of data is necessary other than on the user's own devices, such as your server or a cloud solution, is collected data not related to an identifiable person? | β | β | β | β | β | β | N/A | β | β | β |
Do you use profiling? If so, do you allow the user to influence and determine the values, rules and input that underlie the profiling? | β | β | β | β | β | β | β | β | β | β |
Do you use data to predict individual-level behaviour or only patterns? | β | β | β | β | β | β | β | β | β | β |
Question | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
In which country is your data stored? | Luxembourg | Luxembourg | Italy | Luxembourg | Luxembourg | Luxembourg | Italy | Italy, under ISTAT infrastructure | Italy, under ISTAT infrastructure | Italy, under ISTAT infrastructure |
Where is the storage solutions provider headquartered? | Brussels/Lux. | Brussels/Lux. | Rome | Brussels/Lux. | Brussels/Lux. | Brussels/Lux. | N/A | ISTAT | ISTAT | ISTAT |
Does the transmission of data go through countries outside of the EU? | β | β | β | β | β | β | β | β | β | β |
Do you use machine learning / artificial intelligence? If so, can you explain the algorithms (the criteria and parameters)? | β | β | β | β | β | β | β | β | β | β |
Do you use personal data to influence user behaviour? | β | β | β | β | β | β | β | β | β | β |
Do you ensure that it is transparent when the use of personal data may influence a user's behaviour? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you ensure that the design does not create addiction and thus influences the person's self-determination and empowerment? | β | β | β | β | β | β | β | β | β | β |
Do you operate with open source software, so others can use it and possibly develop it further? | β Eurostat publishes API code snippets | β Eurostat publishes API code snippets | β | β Eurostat publishes API code snippets | β Eurostat publishes API code snippets | β Eurostat publishes API code snippets | β | β the project uses open source tools (e.g., Python, Jupyter) and publishes code openly | β the project uses open source tools (e.g., Python, Jupyter) and publishes code openly | β the project uses open source tools (e.g., Python, Jupyter) and publishes code openly |
Question | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
When do you anonymise personal data? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you use end-to-end encryption of data? | β | β | β | β | β | β | N/A | β | β | β |
Do you minimise the use of metadata and explain how it is done? | β | β | β | β | β | β | β | β | β | β |
Do you use zero knowledge as a design principle? | N/A | N/A | N/A | N/A | N/A | N/A | β | N/A | N/A | N/A |
Do you sell data to third parties? | β | β | β | β | β | β | β | β | β | β |
Do you sell data as personal identifiable data? | β | β | β | β | β | β | β | β | β | β |
Do you sell data as patterns on an aggregated level? | β | β | β | β | β | β | β | β | β | β |
If you sell data, are you making sure that it is fully anonymised information only describing patterns, not individuals? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you use third-party cookies? | β | β | β | β | β | β | β | β | β | β |
Does this include SoMe (social media) cookies and SoMe logins? | β | β | β | β | β | β | β | β | β | β |
Do you use Google Analytics or similar tracking tools? | β | β | β | β | β | β | β | β | β | β |
If you use third-party cookies, are your users fully aware that your cookie use leads to sharing of data about your users with third parties and do they agree with it? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you enrich data with external data, such as social media data, bought data or web scraping? | β | β | β | β | β | β | β | β | β | β |
Does this enrichment occur in response to, or in cooperation with, your users? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Do you have an individual or a department responsible for the ethical managing of data? | β | β | β | β | β | β | β | β | β | β |
How is the work with data ethics embedded in the organisation? | Managed by Eurostat within the European Statistical System (ESS) | Identical framework to D1 | Legislative Decree 322/1989 (Italian statistical law) and ISTAT's Code of Conduct. | Identical framework to D1 | Identical framework to D1 | Identical framework to D1 | N/A | ISTAT integrates ethics through institutional data policies, transparency rules and compliance checks | ISTAT integrates ethics through institutional data policies, transparency rules and compliance checks | ISTAT integrates ethics through institutional data policies, transparency rules and compliance checks |
How do you ensure that your data ethics guidelines are respected? | automated disclosure-control software, dual sign-off workflows, audit trails, external peer reviews, and supervisory bodies | automated disclosure-control software, dual sign-off workflows, audit trails, external peer reviews, and supervisory bodies | version control & audit log | automated disclosure-control software, dual sign-off workflows, audit trails, external peer reviews, and supervisory bodies | automated disclosure-control software, dual sign-off workflows, audit trails, external peer reviews, and supervisory bodies | automated disclosure-control software, dual sign-off workflows, audit trails, external peer reviews, and supervisory bodies | N/A | Internal audits, legal compliance teams, and mandatory adherence to GDPR and open data principles | Internal audits, legal compliance teams, and mandatory adherence to GDPR and open data principles | Internal audits, legal compliance teams, and mandatory adherence to GDPR and open data principles |
Can the processing of data be audited by an independent third party? | β | β | β | β | β | β | N/A | β | β | β |
Do you require and control the data ethics of your subcontractors and partners? | β | β | β | β | β | β | N/A | β | β | β |
Question | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Do you engage in dialogue with your users on a public platform? | β | β | β | β | β | β | β | β | β | β |
Do you have guidelines for using the platform? | β | β | β | β | β | β | N/A | β | β | β |
Do you moderate the platform in order to remove sensitive personal data? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | β | β | β |
If your services are offered to children, do you ensure parental consent? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Is data used to develop or train an algorithm? | β | β | β | β | β | β | N/A | β | β | β |
Do you ensure that the use of data does not lead to discrimination? | β | β | β | β | β | β | N/A | β | β | β |
Do you ensure that the use of data does not expose the vulnerabilities of individuals? | β | β | β | β | β | β | N/A | β | β | β |
Do you ensure that the use of artificial intelligence / machine learning is to the benefit of the individual and does not cause physical, psychological, social or financial harm to the individual? | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Detailed Technical Analysis
Overview of formats, metadata, provenance, and accessibility details per dataset.
Dataset | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Format | XLSX, TSV, CSV, XML, JSON | XLSX, TSV, CSV, XML, JSON | XLS, CSV, JSON | XLSX, TSV, CSV, XML, JSON | XLSX, TSV, CSV, XML, JSON | XLSX, TSV, CSV, XML, JSON | JSON-LD, RDF/TTL, RDF/XML | CSV | CSV | CSV |
Dataset | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 |
---|---|---|---|---|---|---|---|---|---|---|
Metadata | Covers: scope, purpose, variables, definitions, coverage, legal basis, statistical units, data sources | Same as for D1 | SIQual, quality notes, scope, concepts, classifications | Same as for D1 | Same as for D1 | Same as for D1 | Cultural site categorization, province/region breakdown, MiC updates | Sectoral & regional breakdown, national accounts alignment | Employment by sector, territorial detail, NACE/ATECO classification | Household income breakdown, source categories, regional aggregation |
Provenance
- D1: Eurostat
- D2: Eurostat
- D3: ISTAT
- D4: Eurostat
- D5: Eurostat
- D6: Eurostat
- D7: MiC (Ministry of Culture)
- D8: ISTAT
- D9: ISTAT
- D10: ISTAT
RDF Serialization of Metadata
All datasets and the project catalogue have been described with metadata, following the DCAT Application profile for data portals in Europe (DCAT-AP).
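To illustrate what such a description can look like, here is a minimal sketch that builds a JSON-LD record (one of the RDF serializations) for a single dataset using DCAT and Dublin Core terms. It is not the project's actual metadata and is not a complete DCAT-AP record: the `example.org` identifier, the helper `dataset_record`, and the description text are hypothetical, and properties such as the distribution access URL are omitted because the source does not give them.

```python
import json

# Namespace prefixes for the vocabularies used by DCAT-AP.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def dataset_record(dataset_id, title, description, publisher, formats):
    """Build a simplified dcat:Dataset description as a JSON-LD dictionary."""
    return {
        "@context": CONTEXT,
        # Hypothetical identifier; a real catalogue would use its own URIs.
        "@id": f"https://example.org/dataset/{dataset_id}",
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dct:publisher": {"@type": "foaf:Agent", "foaf:name": publisher},
        # Formats are given as plain strings here for brevity.
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dct:format": fmt} for fmt in formats
        ],
    }


record = dataset_record(
    "D8",
    "GDP Breakdown",
    "GDP composition by economic sector (illustrative description).",
    "ISTAT",
    ["CSV"],
)
print(json.dumps(record, indent=2))
```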
Visualizations
Visualizations that show how the datasets relate to each other. Each one addresses one of the questions below.
Considerations | Mashup Used |
---|---|
What is the relationship between available tourist accommodations and occupancy rates? What can this imply about the presence of cultural heritage institutions? | Mashup 1 |
What is the relationship between tourist arrival numbers and the number of cultural heritage institutions present? | Mashup 1 |
How does the number of nights spent compare across regions? Is there any correlation between this and the number of cultural heritage institutions? | Mashup 1 |
What is the trend of total accommodation establishments versus cultural heritage institutions between 2021 and 2023? | Mashup 1 |
What is the overall correlation between all touristic data and the number of cultural heritage institutions? | Mashup 1 |
What is the overall average trend of the economic data between 2021 and 2023? | Mashup 2 |
What is the relationship between cultural heritage institutions and the economic information? | Mashup 2 |
What are the regional averages of economic data and are there any visible correlations with the number of cultural heritage institutions? | Mashup 2 |
What is the overall correlation between all economic data and the number of cultural heritage institutions? | Mashup 2 |

Discussion of Findings
For the first research topic, the relationship between the presence of cultural heritage institutions and tourism activity, we compared the number of available accommodations with their occupancy rates per region. Regions with more available accommodation do not necessarily have higher occupancy rates (e.g. Veneto and Toscana). This suggests that the occupancy rate is only weakly related to the total number of available places, which is consistent with the low correlation coefficient (r = 0.35).
We also compared the number of tourist arrivals with the number of cultural heritage institutions per region. The two appear to be related: regions with more arrivals tend to have more institutions, and regions with fewer arrivals tend to have fewer. There are exceptions, such as Veneto and Trentino-Alto Adige, but overall the pattern supports our initial hypothesis.
For nights spent, aggregated per region from collaborative platforms and accommodation establishments, we found a similar trend: regions with more cultural institutions tend to record more nights spent. The separate correlation coefficients, r = 0.61 for the collaborative platforms data and r = 0.53 for the accommodation establishments data, support the existence of a relationship.
For an overall view, we plotted the number of cultural institutions against the largest tourism activity dataset we have. The graph suggests some correlation between the two, although with more scatter than in the previous comparisons. The correlation coefficient is r = 0.5, which suggests that a relationship exists, though not a strong one.
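The r values quoted above are correlation coefficients. The source does not say which variant was used; the sketch below assumes the Pearson coefficient and shows how it is computed from two per-region series. The function name `pearson_r` and the numbers are made up for illustration and are not the project's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up values: institutions per region and tourist arrivals (in millions).
institutions = [10, 20, 30, 40]
arrivals = [1.0, 2.5, 2.0, 4.0]
r = pearson_r(institutions, arrivals)
print(f"r = {r:.2f}")
```

With only around twenty regions, a coefficient of this kind is sensitive to a few outlying regions, which is one reason to read the values as indicative rather than conclusive.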
For our second research topic, we investigated whether there is a relationship between the presence of cultural heritage institutions and economic indicators. We first looked at trends within the economic data alone before plotting it against the institution data. GDP and employment appear to be closely related to each other. The household net income data has a large share of missing values, and on the data available it shows no visible relationship with the other indicators.
According to our correlation matrix, the number of cultural heritage institutions has a moderate correlation with GDP (r = 0.59) and a stronger one with employment (r = 0.72).
Overall, there seems to be some association between tourism activity and the number of cultural heritage institutions, as well as between the presence of cultural heritage institutions and the performance of local economies. However, the data we currently have is not enough to draw a solid conclusion. More data and further analysis are needed to build on the findings of our project.
Dataset Update Sustainability
Ensuring long-term relevance and reliability of datasets through institutional sources and transparent update cycles.
The sustainability of updating the datasets over time is ensured through the use of official, institutional data sources with established update cycles.
For datasets D1 to D6, provided by Eurostat and ISTAT, updates follow a predictable schedule, typically annual or quarterly, depending on the indicator. These updates are transparently published on official portals, accompanied by metadata that informs users about version changes and data currency.
For D7, maintained by the Italian Ministry of Culture (MiC), the dataset is part of the national DBUnico system, which is continuously managed and updated as new cultural sites are added or existing records change.
For economic datasets D8 to D10, ISTAT guarantees regular revisions aligned with national accounting practices, labour statistics releases, and household income surveys, ensuring data remains relevant for long-term research and policy analysis.
The reliance on standardized data sources, open data platforms, and clear versioning policies contributes to the long-term sustainability and reliability of these datasets for research, policy-making, and open-access projects.