2.3 Issues related to the conceptual model for SU, PD and HH
2.3.1 Semantic harmonised data Population Distribution theme
Title
1 - Semantic harmonised data Population Distribution theme
Description
The Population Distribution theme describes a technical harmonisation only, not a semantic one. The INSPIRE IR SDSS contains no requirements on semantic harmonisation. For example, there is no common definition of population counts (e.g. which people are included in or excluded from a count).
Impact
Without a common semantic model, datasets from different Member States cannot be meaningfully compared.
Recommendations
Eurostat harvests semantically harmonised population information from the Member States in SDMX.
In the statistical community, harmonised datasets are covered by an international standard called SDMX (Statistical Data and Metadata eXchange). This information is structured and machine readable. The SDMX dataset structure is not very different from GML. It is strongly recommended to let the statistical world use this existing data instead of harmonising it into something new.
TC facilitator evaluation:
OK to pursue. SDMX is the only pan-European model for statistical data that is harmonized semantically. However, further research is needed to check whether data provision via SDMX can be mapped onto INSPIRE PD.
Kathi Schleidt: General note on HH: there is an old issue with the NoiseMeasure type, which has two levels:
1: the noise source has been added to the unit of measurement instead of to the NoiseMeasure, which is semantically wrong
2: in the encoding process, the noise source was dropped, as the UoM is just an attribute in the GML schema
Result: the noise source cannot be provided. I put together a document on this a while back and circulated it, but cannot find it now. It would be easy to fix by moving the attribute source (of type NoiseSourceTypeValue) from the type UomNoise to the type NoiseMeasure.
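The effect of the proposed shift can be illustrated with a minimal Python sketch. It is a simplified stand-in for the HH model, not the actual schema: only the names NoiseMeasure, UomNoise, NoiseSourceTypeValue and source come from the note above, while the fields value and uom, the encoding functions and the sample code "road" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UomNoiseCurrent:
    """Current layout: the noise source sits on the unit of measure."""
    uom: str      # hypothetical field holding the unit, e.g. "dB"
    source: str   # stands in for a NoiseSourceTypeValue code


@dataclass
class NoiseMeasureCurrent:
    value: float  # hypothetical field for the measured level
    uom: UomNoiseCurrent


def encode_current(measure: NoiseMeasureCurrent) -> dict:
    # Mimics an encoding in which the UoM is reduced to a plain attribute,
    # so anything else stored on UomNoise (here: the source) is lost.
    return {"value": measure.value, "uom": measure.uom.uom}


@dataclass
class NoiseMeasureProposed:
    """Proposed layout: the source is a property of the measure itself."""
    value: float
    uom: str
    source: str   # stands in for a NoiseSourceTypeValue code


def encode_proposed(measure: NoiseMeasureProposed) -> dict:
    # The source survives because it no longer depends on the UoM encoding.
    return {"value": measure.value, "uom": measure.uom, "source": measure.source}


# Illustrative values only.
current = encode_current(NoiseMeasureCurrent(55.0, UomNoiseCurrent("dB", "road")))
proposed = encode_proposed(NoiseMeasureProposed(55.0, "dB", "road"))
```

In this sketch the encoded form of the current layout has no place for the source, whereas the proposed layout carries it alongside the value and the unit.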
Please note that while the SDMX model is quite clear and usable for statisticians, it is not at all transparent for the spatial community. Transforming SDMX data into a form that can be ingested by spatial tools requires great effort.
If there is a desire to open up this (quite spatially based) data to the spatial community, provision via spatial data encodings (i.e. GML) would be of great benefit!
2.3.2 Population Distribution object and data types
Title
2 - Population Distribution object and data types
Description
The stereotype of the StatisticalDistribution object has been defined as a Feature Type to align it to the GML standard. This stereotype mainly acts as a container to store (meta-) data about StatisticalDistribution and contains common properties of each component of the StatisticalDistribution object.
Impact
Unnecessarily clutters the data model and may confuse data providers and consumers.
Recommendations
It is recommended to remove the object type StatisticalDistribution from IR SDSS, as it is deemed not useful.
Consider aligning the data types of INSPIRE in IR SDSS with those widely used in the statistical world and by Eurostat. Statistical offices in the EU Member States could then continue using their existing procedures for data delivery and publication, and would not be burdened with new harmonisation and data deliveries.
TC facilitator evaluation:
Agree that the StatisticalDistribution being the only FeatureType in the PD model causes a lot of confusion for data providers.
2.3.3 Geometry in Population Distribution and Human Health
Title
3 - Geometry in Population Distribution and Human Health
Description
The data model for Population Distribution and Human Health does not include the geometry of the statistical units used. It only contains the area of dissemination, which describes the area for which the statistical data is available and/or the geographical area selected by the user.
This part of the model is a kind of metadata description that is not covered by the ISO-19115/9 metadata standards and should be added as a header in the dataset.
Impact
To enable integration in geospatial applications (GIS), INSPIRE data is to be provided as GML, which should contain geometry. Population distribution objects do not include geometry and will primarily come from production processes that are far removed from the world of GIS. As a result, it is not possible to publish this information as GML in a useful way.
Recommendations
All statistical data are spatially referenced (indirectly, through a link to a statistical unit), which is expressed by a common identifier, called the geographical dimension in SDMX. In the geographical world (in GIS), these identifiers are used to identify the corresponding geometry and to join the tabulated data to it. These IDs form the bridge between the statistical and geographical worlds.
It is recommended to use Table Joining Services (TJS) to make this semantically harmonised statistical information usable in CAD/GIS applications. A TJS is an online web service that links statistical tables to map services. The geometry can originate from existing geospatial information services or from map services for INSPIRE-harmonised Statistical Units. The TJS performs online a task that is normally done by a GIS specialist. TJS is an OGC (Open Geospatial Consortium) implementation standard.
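The joining step described above can be sketched in a few lines of Python. This is not the TJS interface itself, only an illustration of the identifier-based join that such a service automates; the function name, the key name geo, the unit codes and all values are hypothetical.

```python
from typing import Any, Dict, List


def join_table_to_geometry(
    rows: List[Dict[str, Any]],
    geometries: Dict[str, str],
    key: str = "geo",
) -> List[Dict[str, Any]]:
    """Attach a geometry to each statistical row via a shared identifier.

    Rows whose identifier has no matching geometry are skipped.
    """
    joined = []
    for row in rows:
        geometry = geometries.get(row[key])
        if geometry is None:
            continue  # no statistical unit geometry known for this identifier
        joined.append({**row, "geometry": geometry})
    return joined


# Illustrative statistical table, keyed by a geographical identifier.
rows = [
    {"geo": "A01", "population": 1200},
    {"geo": "A02", "population": 800},
    {"geo": "A99", "population": 50},  # identifier without a geometry
]

# Illustrative statistical unit geometries (WKT), keyed by the same identifier.
geometries = {
    "A01": "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))",
    "A02": "POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))",
}

features = join_table_to_geometry(rows, geometries)
```

The statistical table and the geometries stay in separate sources and meet only through the identifier, which is why the statistical data itself does not need to carry geometry.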
Although INSPIRE compliance is the Member States' responsibility, disseminating data centrally by Eurostat (instead of by Member States) is recommended.
Support initiatives such as the one from the Task Force on the future EU censuses of population and housing (Luxembourg, 7 – 8 December 2016) on implementing INSPIRE for population grid statistics using the Census Hub.
TC facilitator evaluation:
OK to pursue. Further research is needed to find a feasible solution for PD data provision, whether through TJS or WCS. TJS research is ongoing as a project by Statistics Netherlands; WCS is an idea worth investigating.
2.3.4 Current Population distribution data model not feasible for data providers and not usable for data users
Title
5 - Current Population distribution data model not feasible for data providers and not usable for data users
Description
The current Population Distribution (demography) model is considered not feasible for data providers and users. There are on-going and planned research activities which aim at solving this issue. Those include: Web Coverage Services (WCS) and Table Joining Services (TJS) in combination with SDMX.
Impact
In both cases the output data model will probably not be compatible with the current Population Distribution (demography) model.
We need to research whether SDMX, TJS or WCS output models could be mapped onto the existing Implementing Rules for Population Distribution (demography). If this mapping proves unsuccessful, there will be a need for changes in the Population Distribution (demography) model. Specific changes cannot be determined at this time, and further research needs to be performed on the above-mentioned alternative dissemination channels.
Recommendations
Research could be conducted into an entirely new and usable Population Distribution (demography) model, taking into account the SDMX, TJS and WCS solutions.
TC facilitator evaluation:
OK to pursue. Further research is needed to find a feasible solution for PD data provision, whether through TJS or WCS. TJS research is ongoing as a project by Statistics Netherlands; WCS is an idea worth investigating.
2.3.5 Current Population distribution data model not feasible for data providers and not usable for data users
Country / Issue number:
PL
Affected article / annex:
Annex III.10
Theme(s):
Population distribution (demography)
Subject: Current Population distribution data model not feasible for data providers and not usable for data users
Observations / problem description:
The current Population Distribution (demography) model contains only one featureType (StatisticalDistribution), which forces data providers to store multiple values (for different statistical units and different classification elements) in a single feature. The geometry of this feature is a polygon covering the whole area of dissemination (e.g. a country), which causes even more confusion.
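The structure described above can be illustrated with a simplified Python sketch. The class StatisticalDistribution mirrors the feature type named in the text; the class StatisticalValue, all field names, the helper function and the sample values are hypothetical simplifications and do not reproduce the attribute names of the Implementing Rules.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StatisticalValue:
    """One value for one statistical unit and one classification element."""
    value: float
    statistical_unit_id: str               # reference to a unit, no geometry
    classification_items: Dict[str, str]   # e.g. {"sex": "F"}


@dataclass
class StatisticalDistribution:
    """The single feature: one geometry, many nested values."""
    area_of_dissemination: str             # WKT polygon for the whole area
    measure: str
    values: List[StatisticalValue] = field(default_factory=list)


def values_for_unit(dist: StatisticalDistribution, unit_id: str) -> List[StatisticalValue]:
    # A consumer has to dig into the nested values of the one feature
    # to recover what belongs to a single statistical unit.
    return [v for v in dist.values if v.statistical_unit_id == unit_id]


# Illustrative values only.
dist = StatisticalDistribution(
    area_of_dissemination="POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))",
    measure="population",
    values=[
        StatisticalValue(100, "U1", {"sex": "F"}),
        StatisticalValue(95, "U1", {"sex": "M"}),
        StatisticalValue(40, "U2", {"sex": "F"}),
    ],
)

unit_values = values_for_unit(dist, "U1")
```

In this sketch the only geometry available is the polygon of the whole dissemination area, so a typical GIS client would display one feature for the entire dataset rather than one per statistical unit.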
Proposed legislative change(s):
There are ongoing and planned research activities which aim at solving this issue. Those include: Table Joining Services (TJS) and Web Coverage Services (WCS). In both cases the output data model will not be compatible with the current Population Distribution (demography) model.
It is possible that TJS or WCS output models could be mapped onto the existing Implementing Rules for Population Distribution (demography), but if the mapping proves unsuccessful, there will be a need for changes in the Population Distribution (demography) model. Specific changes cannot be determined at this time, and further research needs to be performed on the above-mentioned alternative dissemination channels.
Research may be needed into an entirely new and usable Population Distribution (demography) model.
Rationale for change(s):
The model in its current state is not feasible for data providers and not useful for data users.
Expected impacts (including benefits):
The proposed change will improve the quality of output data for Population Distribution (demography) with benefit for data users, who will receive data in an intelligible form. The implementation burden for data providers will also be reduced.
TC facilitator evaluation:
OK to pursue. Further research is needed to find a feasible solution for PD data provision, whether through TJS or WCS. TJS research is ongoing as a project by Statistics Netherlands; WCS is an idea worth investigating.
2.3.6 The SU data model is missing a SU-type attribute
Title
6 - The SU data model is missing a SU-type attribute
Description
The SU data model is missing a SU-type attribute to filter upon. Examples of types are: communities, NUTS regions, neighbourhoods and districts valid for different years.
Impact
If a country wants to serve more than one type of SU, it is difficult to serve them in one service, since only one layer for SU is accepted. As a consequence, they are all put together in one layer, which makes it very difficult for a user to separate them again. In the Dutch case, there are 460 different types of SU. Creating one service per type is not an option, because it would become far too expensive.
Recommendations
Add a SU-type attribute to the general SU feature type to make filtering easier for users, possibly by means of predefined stored filters.
Another solution could be the use of group layers. But then we need to accept group layers as served by GeoServer and to be less rigid in the validation. At this moment, GeoServer group layers are not accepted by the existing INSPIRE validators.
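For the first option, the kind of filtering that a SU-type attribute would enable can be sketched as follows. The attribute names suType and referenceYear, the type codes and the sample features are hypothetical; the SU model currently has no such attribute, which is the point of the recommendation.

```python
from typing import Any, Dict, List, Optional


def filter_by_su_type(
    features: List[Dict[str, Any]],
    su_type: str,
    reference_year: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Select statistical units of one type, optionally for one year."""
    return [
        f
        for f in features
        if f.get("suType") == su_type
        and (reference_year is None or f.get("referenceYear") == reference_year)
    ]


# All SU types served together in one layer (illustrative values only).
layer = [
    {"id": "a", "suType": "municipality", "referenceYear": 2020},
    {"id": "b", "suType": "neighbourhood", "referenceYear": 2020},
    {"id": "c", "suType": "municipality", "referenceYear": 2019},
]

municipalities_2020 = filter_by_su_type(layer, "municipality", 2020)
all_municipalities = filter_by_su_type(layer, "municipality")
```

A predefined stored filter would apply the same kind of selection on the server side, so that one layer could still serve many SU types.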
TC facilitator evaluation:
TC link(s):
2.3.7 Use SDMX for Population distribution theme data provision
Country / Issue number:
PL
Affected article / annex:
Annex III.10
Theme(s):
Population distribution (demography)
Subject: Use SDMX for Population distribution theme data provision
Observations / problem description:
Member States already provide statistical data in a structured data and metadata model called SDMX (Statistical Data and Metadata eXchange). It is a semantically harmonized data model for disseminating structured, machine-readable data, and it is used by National Statistical Institutes for reporting to Eurostat.
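As a rough illustration of what "structured and machine readable" means in practice, the following Python sketch reads a small tabular extract and groups the observations by geographical code. The layout is only loosely modelled on the SDMX-CSV style; the dataflow identifier, the column names GEO, TIME_PERIOD and OBS_VALUE as used here, the codes and all values are invented for this example and are not taken from any real Eurostat dataset.

```python
import csv
import io
from typing import Dict, List, Tuple

# Invented sample extract; not real data.
SAMPLE = """DATAFLOW,GEO,TIME_PERIOD,OBS_VALUE
EXAMPLE:POP(1.0),XX01,2011,1200
EXAMPLE:POP(1.0),XX02,2011,800
"""


def observations_by_geo(
    csv_text: str, geo_column: str = "GEO"
) -> Dict[str, List[Tuple[str, float]]]:
    """Group (period, value) pairs by the geographical dimension."""
    result: Dict[str, List[Tuple[str, float]]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        observation = (row["TIME_PERIOD"], float(row["OBS_VALUE"]))
        result.setdefault(row[geo_column], []).append(observation)
    return result


obs = observations_by_geo(SAMPLE)
```

Each observation is addressed by its dimension values, including the geographical code, which can then be matched against statistical unit identifiers on the spatial side.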
Proposed legislative change(s):
Consider letting Member States publish Population Distribution data in SDMX instead of the current model.
Rationale for change(s):
Data in SDMX is harmonized semantically, allowing comparison of datasets from different Member States. Comparing datasets that are not harmonized semantically is not meaningful.
Expected impacts (including benefits):
The proposed change will reduce the implementation burden for Member States by allowing them to use an existing, well-established and functional data model.
TC facilitator evaluation:
OK to pursue. SDMX is the only pan-European model for statistical data that is harmonized semantically. However, further research is needed to check whether data provision via SDMX can be mapped onto INSPIRE PD.