
Overview

Data queries allow retrieving statistical data. Entire datasets, individual observations, or anything in between, can be retrieved using filters on dimensions (including time).

Data can be retrieved in a variety of formats (JSON, XML, CSV, etc.).

Depending on the request, a data query can produce a very large response, in which case the data is delivered asynchronously. For more information, see the page API - Detailed guidelines - Asynchronous API.

It is important to remember that only the latest version of each statistical observation is available in the system. When a statistical observation is updated, its previous value is lost and cannot be returned.

Query URL syntax

The generic SDMX 3.0 syntax is the following:

protocol://ws-entry-point/data/{context}/{agencyID}/{resourceID}/{version}/{key}?{c}&{firstNObservations}&{lastNObservations}&{attributes}&{measures}&{returnData}

| Parameter | Type | Description | Default | Multiple values? |
|---|---|---|---|---|
| context | Must be set to dataflow | Data can be reported against a data structure, a dataflow or a provision agreement. This parameter allows selecting the desired context for data retrieval. | * | No |
| agencyID | A string compliant with the SDMX common:NCNameIDType | The agency maintaining the artefact for which data have been reported. | * | No |
| resourceID | A string compliant with the SDMX common:IDType | The id of the artefact for which data have been reported. | * | No |
| version | A string compliant with the allowed SDMX versioning schemes | The version of the artefact for which data have been reported. | * | No |
| key | A string compliant with the KeyType defined in the SDMX Open API specification. | The combination of dimension values identifying the slice of the cube for which data should be returned. Wildcarding is supported via the * operator. For example, if the key D.USD.EUR.SP00.A identifies the daily exchange rate of the US dollar against the euro, then the key D.*.EUR.SP00.A can be used to retrieve the data for all currencies against the euro. Any dimension value omitted at the end of the key is treated as a wildcard, e.g. D.USD is equivalent to D.USD.*.*.* | * | No |
| c | Map | Filter data by component value. For example, if a structure defines a frequency dimension (FREQ) and the code A (Annual) is an allowed value for that dimension, the following can be used to retrieve annual data: c[FREQ]=A. Multiple values are supported, using a comma (,) as separator: c[FREQ]=A,M. The comma effectively acts as an OR statement (i.e. FREQ is A OR FREQ is M). The plus (+) can be used whenever an AND statement is required, for example for attributes with multiple values or for time ranges. Operators may be used too (see table with operators below). This parameter can be used in addition to, or instead of, the key path parameter. This parameter may be used multiple times (e.g. c[FREQ]=A,M&c[GEO]=LU), but only once per component. | | Yes |
| firstNObservations | Positive integer | The maximum number of observations to be returned for each of the matching series, starting from the first observation | | No |
| lastNObservations | Positive integer | The maximum number of observations to be returned for each of the matching series, counting back from the most recent observation | | No |
| attributes | String | This parameter specifies the attributes to be returned. Possible options are: dsd (all the attributes defined in the data structure definition), msd (all the reference metadata attributes), dataset (all the attributes attached to the dataset-level), series (all the attributes attached to the series-level), obs (all the attributes attached to the observation-level), all (all attributes), none (no attributes), {attribute_id}: the ID of one or more attributes the caller is interested in. | dsd | No |
| measures | String | This parameter specifies the measures to be returned. Possible options are: all (all measures), none (no measure), {measure_id}: the ID of one or more measures the caller is interested in. | all | No |
| returnData | | Only available for the TSV and SDMX-CSV formats. Supported values are: ALL (all time-series are returned in the output, including the ones having no data / flag) and DATA_ONLY (only the time-series for which data or flag exists are returned in the output). | DATA_ONLY | |
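As a minimal sketch of how the path and query parameters in the table fit together, the following Python helper assembles a data query URL as a string. The function name build_data_url is a hypothetical illustration and not part of the API. The square brackets of c[...] are left unencoded for readability; some HTTP clients percent-encode them.

```python
from urllib.parse import quote


def build_data_url(base, agency_id, resource_id, version="1.0", key=None,
                   filters=None, **params):
    """Assemble an SDMX 3.0 data query URL (illustrative helper, not part of the API).

    base    -- entry point up to and including the SDMX version segment
    key     -- optional dot-separated dimension key, e.g. "A.PC_GDP"
    filters -- mapping of component ID to a list of values, sent as c[ID]=v1,v2
    params  -- other query parameters (attributes, measures, lastNObservations, ...)
    """
    path = f"{base}/data/dataflow/{agency_id}/{resource_id}/{version}"
    if key:
        path += f"/{key}"
    query = []
    for component, values in (filters or {}).items():
        # Multiple values are comma-separated and act as an OR.
        joined = ",".join(quote(str(v), safe=":+*") for v in values)
        query.append(f"c[{component}]={joined}")
    for name, value in params.items():
        query.append(f"{name}={quote(str(value), safe=',')}")
    return path + ("?" + "&".join(query) if query else "")


url = build_data_url(
    "https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0",
    "ESTAT", "DEMO_GIND",
    filters={"FREQ": ["A"], "GEO": ["FR", "FX"]},
    attributes="none", lastNObservations=2,
)
print(url)
```

The helper only builds the string; sending the request and choosing the response format are left to the caller.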
   

The following rules apply:

  • Multiple values for a parameter must be separated using a comma (,).
  • Default values do not need to be supplied if they are the last element in the path.
  • Operators can be used to refine the applicability of the c query parameter:
| Operator | Meaning | Note |
|---|---|---|
| eq | Equals | Default if no operator is specified and there is only one value (e.g. c[FREQ]=M is equivalent to c[FREQ]=eq:M) |
| ne | Not equal to | |
| lt | Less than | |
| le | Less than or equal to | |
| gt | Greater than | |
| ge | Greater than or equal to | |
| co | Contains | |
| nc | Does not contain | |
| sw | Starts with | |
| ew | Ends with | |

Operators appear as a prefix to the component value(s) and are separated from them by a colon (:), e.g. c[TIME_PERIOD]=ge:2020-01+le:2020-12.
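To illustrate the semantics of these operators, the sketch below evaluates a c[...] expression against a single value on the client side. It is not how the service is implemented. It assumes that the plus (AND) binds more tightly than the comma (OR) and that values are compared as plain strings, which works for ISO-like periods such as 2020-01. The names parse_term and matches are hypothetical.

```python
import operator

# Operators from the table above, mapped to predicates taking (value, operand).
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
    "co": lambda value, operand: operand in value,
    "nc": lambda value, operand: operand not in value,
    "sw": lambda value, operand: value.startswith(operand),
    "ew": lambda value, operand: value.endswith(operand),
}


def parse_term(term):
    """Split 'ge:2020-01' into ('ge', '2020-01'); a bare value defaults to eq."""
    prefix, sep, rest = term.partition(":")
    if sep and prefix in OPERATORS:
        return prefix, rest
    return "eq", term


def matches(value, expression):
    """Return True if value satisfies the expression of a c[...] parameter.

    Assumption: '+' (AND) groups terms first, then ',' (OR) combines the groups.
    """
    return any(
        all(OPERATORS[op](value, operand)
            for op, operand in map(parse_term, group.split("+")))
        for group in expression.split(",")
    )


print(matches("2020-06", "ge:2020-01+le:2020-12"))  # inside the range
print(matches("2021-01", "ge:2020-01+le:2020-12"))  # outside the range
print(matches("M", "A,M"))                          # FREQ is A OR FREQ is M
```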

Response types

The following media types can be used with data queries:

  • application/vnd.sdmx.data+xml;version=3.0.0
  • application/vnd.sdmx.data+csv;version=2.0.0;labels=[id|name|both];timeFormat=[original|normalized];keys=[none|obs|series|both]
  • application/vnd.sdmx.genericdata+xml;version=2.1
  • application/vnd.sdmx.structurespecificdata+xml;version=2.1
  • application/vnd.sdmx.data+csv;version=1.0.0;labels=[id|both];timeFormat=[original|normalized]

The default format is highlighted in bold.

SDMX-CSV allows two parameters to be set via the media type: labels and timeFormat. Both are optional; their default values are id and original respectively.
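As an illustration of setting these parameters, the sketch below builds the SDMX-CSV 2.0.0 media type and attaches it as an Accept header to a request object. The helper name sdmx_csv_media_type is hypothetical. The request is only constructed, not sent.

```python
import urllib.request


def sdmx_csv_media_type(labels="id", time_format="original"):
    """Build the SDMX-CSV 2.0.0 media type with its optional parameters."""
    if labels not in {"id", "name", "both"}:
        raise ValueError(f"unsupported labels value: {labels!r}")
    if time_format not in {"original", "normalized"}:
        raise ValueError(f"unsupported timeFormat value: {time_format!r}")
    return ("application/vnd.sdmx.data+csv;version=2.0.0;"
            f"labels={labels};timeFormat={time_format}")


headers = {"Accept": sdmx_csv_media_type(labels="both")}

# Constructing the Request does not open a connection.
request = urllib.request.Request(
    "https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0"
    "/data/dataflow/ESTAT/DEMO_GIND/1.0",
    headers=headers,
)
print(request.get_header("Accept"))
```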

Key parameter

The key parameter defines the dimension values, in the order in which the dimensions are declared in the data structure.

The component "c" parameter can be used to define additional filters on dimensions, on top of the key parameter.

The c parameter does not filter on attribute or measure values.

Key parameter supports wildcard "*"

Wildcard "*" means that no filtering is applied on the dimension.

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/*.*.*.DE

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/*.*.T2020_20.DE

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/*.PC_GDP.T2020_20.DE

Partial key definition (omitting the last dimension(s)) is supported

Only the trailing dimension position(s) can be omitted. For example:

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/A.PC_GDP.T2020_20.DE

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/A.PC_GDP.T2020_20

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0/A.PC_GDP
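The wildcard and partial-key rules can be sketched in a few lines of Python. The functions expand_key and key_matches are hypothetical helpers written for illustration; they assume the caller knows how many dimensions the structure defines (four for the T2020_20 examples above).

```python
def expand_key(key, n_dimensions):
    """Pad a partial key with trailing wildcards, e.g. 'A.PC_GDP' -> 'A.PC_GDP.*.*'."""
    parts = key.split(".")
    if len(parts) > n_dimensions:
        raise ValueError("key has more positions than the structure has dimensions")
    return ".".join(parts + ["*"] * (n_dimensions - len(parts)))


def key_matches(series_key, pattern):
    """Return True if a full series key is selected by a (possibly partial) key."""
    series_parts = series_key.split(".")
    pattern_parts = expand_key(pattern, len(series_parts)).split(".")
    return all(p == "*" or p == s for s, p in zip(series_parts, pattern_parts))


print(expand_key("A.PC_GDP", 4))
print(key_matches("A.PC_GDP.T2020_20.DE", "*.*.*.DE"))
print(key_matches("A.PC_GDP.T2020_20.AT", "*.*.*.DE"))
```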

Attributes and measures parameter

Combinations of the SDMX 3.0 attributes and measures parameters can be named after the corresponding values of the SDMX 2.1 detail parameter:

| attributes \ measures | no value / "all" | "none" |
|---|---|---|
| no value / "dsd" / "all" | supported (equivalent to detail = "full") | not supported |
| "series" | not supported | supported (equivalent to detail = "nodata") |
| "obs" | not supported | not supported |
| "none" | supported (equivalent to detail = "dataonly") | supported (equivalent to detail = "serieskeysonly") |
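For clients migrating from SDMX 2.1, the supported combinations can be captured in a small lookup. The sketch below is one possible way to translate a detail value into the equivalent attributes and measures query parameters. The names DETAIL_TO_PARAMS and detail_query_string are hypothetical.

```python
# Supported combinations only; any other pairing is "not supported" in the table.
DETAIL_TO_PARAMS = {
    "full": {"attributes": "all", "measures": "all"},
    "dataonly": {"attributes": "none", "measures": "all"},
    "nodata": {"attributes": "series", "measures": "none"},
    "serieskeysonly": {"attributes": "none", "measures": "none"},
}


def detail_query_string(detail):
    """Translate an SDMX 2.1 detail value into SDMX 3.0 query parameters."""
    try:
        params = DETAIL_TO_PARAMS[detail.lower()]
    except KeyError:
        raise ValueError(f"unknown detail value: {detail!r}") from None
    return "&".join(f"{name}={value}" for name, value in params.items())


print(detail_query_string("dataonly"))
print(detail_query_string("serieskeysonly"))
```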

Request FULL

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0?attributes=all&measures=all

Accept: application/vnd.sdmx.data+csv; version=2.0.0

All data is returned (default)

STRUCTURE,STRUCTURE_ID,freq,unit,indic_eu,geo,TIME_PERIOD,OBS_VALUE,OBS_FLAG,TARGET_FLAG,TARGET
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2016,3.12,e,,3.76
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2017,3.06,,,3.76
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2018,3.09,e,,3.76
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2019,3.13,,,3.76

Request DataOnly

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0?attributes=none&measures=all

Accept: application/vnd.sdmx.data+csv; version=2.0.0

The observations (OBS_VALUE only) are returned for each series, but not the attributes

STRUCTURE,STRUCTURE_ID,freq,unit,indic_eu,geo,TIME_PERIOD,OBS_VALUE
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2016,3.12
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2017,3.06
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2018,3.09
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2019,3.13

Request NoData

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0?attributes=series&measures=none

Accept: application/vnd.sdmx.data+csv; version=2.0.0

Only the series-level attributes are returned for each series

DATAFLOW,LAST UPDATE,freq,unit,indic_eu,geo,TARGET_FLAG,TARGET
ESTAT:T2020_20(1.0),14/12/21 23:00:00,A,PC_GDP,T2020_20,AT,,3.76
ESTAT:T2020_20(1.0),14/12/21 23:00:00,A,PC_GDP,T2020_20,BA,,
ESTAT:T2020_20(1.0),14/12/21 23:00:00,A,PC_GDP,T2020_20,BE,,3
ESTAT:T2020_20(1.0),14/12/21 23:00:00,A,PC_GDP,T2020_20,BG,,1.5

Request SeriesKeysOnly

protocol://ws-entry-point/sdmx/3.0/data/dataflow/ESTAT/T2020_20/1.0?attributes=none&measures=none

No data or attributes are returned for each series, only the series key

STRUCTURE,STRUCTURE_ID,freq,unit,indic_eu,geo
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,BA
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,BE
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,BG
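SDMX-CSV responses such as those shown above can be read with any CSV parser. As a rough sketch, the following Python code groups the observations of a full response by series key. The sample text repeats two rows of the FULL response; the function name read_series and the assumption that the caller knows the dimension columns are illustrative.

```python
import csv
import io

SAMPLE = """\
STRUCTURE,STRUCTURE_ID,freq,unit,indic_eu,geo,TIME_PERIOD,OBS_VALUE,OBS_FLAG,TARGET_FLAG,TARGET
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2016,3.12,e,,3.76
dataflow,ESTAT:T2020_20(1.0),A,PC_GDP,T2020_20,AT,2017,3.06,,,3.76
"""

# Dimension columns in structure order (assumed to be known by the caller).
DIMENSIONS = ["freq", "unit", "indic_eu", "geo"]


def read_series(text, dimensions):
    """Group observations by series key: {key: {time_period: value}}."""
    series = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = ".".join(row[d] for d in dimensions)
        # An empty OBS_VALUE cell is kept as None rather than coerced to 0.
        value = float(row["OBS_VALUE"]) if row.get("OBS_VALUE") else None
        series.setdefault(key, {})[row["TIME_PERIOD"]] = value
    return series


series = read_series(SAMPLE, DIMENSIONS)
print(series)
```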

Return data parameter

In SDMX, the boundaries of a dataset can be expressed as a DataConstraint that lists the available positions for each dimension. A position is listed when it is used at least once in a time series.

This list of positions per dimension is used as the basis for building the result of a query.

For example, suppose a dataset exposes yearly data since 2000, but data for Croatia is available only since 2004. The following query in TSV:

GEO = "Croatia" And TIME_PERIOD >= 1990 and TIME_PERIOD <= 2006

returns a TSV file containing the (empty) columns 2000, 2001, 2002 and 2003 in addition to the columns 2004, 2005 and 2006.

Note also that, in some filtered results, positions can be present even though no actual data exists for them.

This information is encoded in the JSONSTAT format extension "positions-with-no-data". Returning to the example above, the positions 2000, 2001, 2002 and 2003 of the TIME_PERIOD dimension are reported as positions that did not match any data.

This is sufficient to understand the boundaries for the cube-oriented JSONSTAT format. TSV, however, is a time-series-oriented format and does not include lines for non-existing time series.
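The Croatia example can be traced with a short sketch. It models the behaviour described above rather than the service's actual implementation: the result columns are the dataset positions that fall within the requested range, and the positions with no data are those columns for which the selected series has no observation. The end year 2022 and both function names are assumptions made for illustration.

```python
def result_columns(available_periods, first, last):
    """Dataset positions that fall inside the requested time range."""
    return [p for p in available_periods if first <= p <= last]


def positions_with_no_data(columns, observed_periods):
    """Columns present in the result that did not match any observation."""
    return [p for p in columns if p not in observed_periods]


# The dataset exposes yearly data since 2000 (end year 2022 is an assumption).
dataset_periods = list(range(2000, 2023))
# Data for Croatia is available only since 2004.
croatia_periods = set(range(2004, 2023))

columns = result_columns(dataset_periods, 1990, 2006)
empty = positions_with_no_data(columns, croatia_periods)

print(columns)  # 2000 to 2006: years before 2000 are outside the dataset
print(empty)    # 2000 to 2003: present as columns but without data
```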

For example, the following query 

https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/DEMO_GIND/1.0?format=tsv&compress=false&c[FREQ]=A&c[INDIC_DE]=NATT,POPTRT&c[GEO]=FR,FX&c[TIME_PERIOD]=ge:2012+le:2021

Returns:


 

Only existing time series are present in the output. This TSV result therefore contains only 3 lines, because the time series c[FREQ]=A&c[INDIC_DE]=POPTRT&c[GEO]=FX does not exist.

This is the default behaviour, as it reduces the response size and is more accurate. API clients are expected to be able to process such results, since this is how Eurostat already offers its content through the heavily used Bulk Download facilities.
If this behaviour is blocking for some client applications, the extra parameter &returnData=ALL can be used so that the TSV contains these expected "missing" lines.
This option is also available for the SDMX-CSV format.
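The difference between the two returnData values can be modelled as follows. This is a sketch of the described behaviour, not of the service's code: the requested series are the combinations of the filtered dimension values, and DATA_ONLY keeps only those that exist. The set of existing series is inferred from the DEMO_GIND example above, and the function name returned_series is hypothetical.

```python
from itertools import product


def returned_series(requested, existing, return_data="DATA_ONLY"):
    """List the series keys that would appear as lines in the output."""
    candidates = list(product(*requested.values()))
    if return_data == "ALL":
        return candidates
    if return_data == "DATA_ONLY":
        return [key for key in candidates if key in existing]
    raise ValueError(f"unsupported returnData value: {return_data!r}")


requested = {"FREQ": ["A"], "INDIC_DE": ["NATT", "POPTRT"], "GEO": ["FR", "FX"]}
# The series A / POPTRT / FX does not exist.
existing = {("A", "NATT", "FR"), ("A", "NATT", "FX"), ("A", "POPTRT", "FR")}

print(len(returned_series(requested, existing)))         # lines with DATA_ONLY
print(len(returned_series(requested, existing, "ALL")))  # lines with ALL
```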