Eionet

European Topic Centre on Air Pollution and Climate Change Mitigation

Topic Centre of European Environment Agency

**AirBase**

- Data query and retrieval
- Annual report and other info
- Data aggregation and statistics

AirBase provides statistical data on air pollutant parameters additional to the reported national measurement data.

It follows the 'criteria for the aggregation of data and the calculation of statistical parameters' of Annex IV in
Commission Decision 2001/752/EC.

**1. Hourly and daily values**

**Aggregation of data**

The air quality statistics in AirBase are based on *hourly values*, *daily (24-hour) average values*
and *daily 8-hour maximum values*. Most of the reported measurement data are hourly values. To
obtain the daily and 8-hour based statistical parameters, the hourly values (if available) are aggregated to
daily and 8-hourly values. If a country reports both hourly and daily values, the reported daily values are
ignored and the daily values calculated from the hourly data are used for the statistics instead. If 3-hourly data are
delivered, these data are aggregated into daily values.

For the aggregation of hourly data to longer averaging periods (8-hourly, daily) a minimum data capture of 75% is required to calculate a valid aggregated value:

- a *daily averaged* (24-hourly) concentration is calculated when at least 18 valid hourly values are available
- an *8-hourly averaged* concentration is calculated when at least 6 valid hourly values are available
- a *daily 8-hour maximum* is calculated when at least 18 valid running 8-hour averages per day are available
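The 75% data capture rule can be sketched as follows. This is a minimal illustration, not the AirBase implementation; the function names and the use of `None` for a missing or invalid hourly value are assumptions made for the example.

```python
from typing import Optional, Sequence


def aggregate_mean(values: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Mean of the valid (non-None) values, or None when fewer than
    min_valid valid values are available."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def daily_average(hourly_24: Sequence[Optional[float]]) -> Optional[float]:
    """Daily (24-hour) average: needs at least 18 of 24 valid hourly values."""
    return aggregate_mean(hourly_24, min_valid=18)


def eight_hour_average(hourly_8: Sequence[Optional[float]]) -> Optional[float]:
    """8-hourly average: needs at least 6 of 8 valid hourly values."""
    return aggregate_mean(hourly_8, min_valid=6)
```

A day with 17 valid hourly values therefore yields no daily average at all, while a day with 18 valid values yields the mean of those 18.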

For the aggregation of 3-hourly data to daily values we have the following rule:

- a *daily averaged* concentration is calculated when at least 5 valid 3-hourly values are available with not more than 2 successive 3-hourly values missing.
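A possible reading of the 3-hourly rule is sketched below. The function name and the `None` convention for missing values are assumptions; the sketch also assumes that gaps are only counted within the day and are not joined with missing values of the neighbouring days.

```python
from typing import Optional, Sequence


def daily_average_from_3hourly(values: Sequence[Optional[float]]) -> Optional[float]:
    """Daily average from the eight 3-hourly values of one day.

    Valid only with at least 5 valid values and no run of more than
    2 successive missing values (missing values are given as None).
    """
    if len(values) != 8:
        raise ValueError("expected eight 3-hourly values for one day")
    longest_gap = 0
    current_gap = 0
    for v in values:
        current_gap = current_gap + 1 if v is None else 0
        longest_gap = max(longest_gap, current_gap)
    valid = [v for v in values if v is not None]
    if len(valid) < 5 or longest_gap > 2:
        return None
    return sum(valid) / len(valid)
```

With 5 valid values there are 3 missing ones; the second condition rejects the day when all 3 are consecutive.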

**Statistics calculation on annual basis**

The following types of annual statistics are calculated depending on the component:

- *General* concentration statistic: annual mean, 50, 95 and 98 percentiles and maximum (only SO_{2} also 99.9 percentile based on hourly values)
- *Exceedances*: hours/days with concentration > y µg/m^{3} (with y = limit or threshold value) and the k^{th} highest value
- *AOT40*: ozone concentrations accumulated dose over a threshold of 40 ppb (AOT40 definition see below)
- *SOMO35*: ozone concentrations accumulated dose over a threshold of 35 ppb (SOMO35 definition see below)

The annual statistical parameters listed in the table below are routinely calculated and stored in AirBase. They are
calculated irrespective of the proportion of valid data (data capture), with one exception: hourly and
daily statistics that are based on one day of data or less are excluded. So statistics with a data coverage lower than 0.275% are not
calculated. For each statistic the data coverage^{(*)} percentage is calculated. This is done as follows:

*Data Coverage = (N _{valid} / N_{year}) · 100%*

where *N_{valid}* is the number of valid hourly/daily values and *N_{year}* is the number of hours/days in the year.

^{
(*) Definition of Coverage:
In the Air Quality Daughter Directives the terms Data Capture and Time Coverage have been defined.
Time Coverage is the percentage of measurement time in a given period.
Data Capture is the percentage of valid measurement values in a given data set.
For each yearly time series the so called Data Coverage has been calculated in AirBase and is defined as:
Data Coverage = Data Capture · Time Coverage
The data capture and time coverage, and hence the data coverage, include losses of data due to the regular calibration or the
normal maintenance of the instrumentation. In the Directives these losses are excluded.
}
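As an illustration of the data coverage formula, the sketch below computes the percentage for hourly or daily series. The function name and the `per` parameter are hypothetical, and taking *N_{year}* as the number of hours or days in the calendar year is an assumption.

```python
import calendar


def data_coverage(n_valid: int, year: int, per: str = "hour") -> float:
    """Data coverage in percent: (N_valid / N_year) * 100.

    N_year is assumed to be the number of hours (per="hour") or
    days (per="day") in the given calendar year.
    """
    days = 366 if calendar.isleap(year) else 365
    n_year = days * 24 if per == "hour" else days
    return 100.0 * n_valid / n_year
```

For example, one full day of hourly values in a non-leap year gives 24 / 8760, which is about 0.274% and thus below the 0.275% cut-off mentioned above.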

The following table gives an overview of the annual statistical parameters available in AirBase. The annual statistical
parameters are calculated on the basis of the *hourly values*, the *daily average values* and the *daily 8-hour
maximum values*. Be aware that the same statistical parameter, based on a different (averaging) time period,
will in most cases lead to a different outcome. This is caused by the different definitions used to derive
the hourly, the daily average and the daily 8-hour maximum values.

*For example*:

When there is a peak in the 1-hour values on a day, this peak value is taken as the maximum 1-hour value of that day. The annual maximum is the maximum of all those daily maximum values of that year.

In deriving the daily average value, such a peak is levelled out by the other hourly values of that day. If the criterion "a daily averaged concentration is calculated when at least 18 valid hourly values are available" is not fulfilled, the daily average value is not calculated at all for that day. In that case the peak does not contribute to any daily average and remains completely 'invisible' in the maximum daily average.

This leads in most cases to large differences between an annual maximum based on hourly values (table column *1-hour values*)
and an annual maximum based on daily average values (table column *daily average values*).

**Table: calculated annual statistics in AirBase**

| Component | Based on 1-hour values | Based on daily average values | Based on daily 8-hour maximum values |
|---|---|---|---|
| Sulphur dioxide (SO_{2}) | • annual mean • 50 percentile • 95 percentile • 98 percentile • 99.9 percentile • maximum • hours with conc. > 350 µg/m^{3} • 25^{th} highest value | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum • days with conc. > 125 µg/m^{3} • 4^{th} highest value | - |
| Nitrogen dioxide (NO_{2}) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum • hours with conc. > 200 µg/m^{3} • 19^{th} highest value | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | - |
| Nitrogen monoxide (NO) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | - |
| Nitrogen oxides (NO_{x}) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | - |
| Ozone (O_{3}) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum • AOT40 | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum • days with conc. > 120 µg/m^{3} • 26^{th} highest value • SOMO35 |
| Carbon monoxide (CO) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum |
| Particulate matter (PM_{10}) | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum • days with conc. > 50 µg/m^{3} • 8^{th} highest value • 36^{th} highest value | - |
| Other | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | • annual mean • 50 percentile • 95 percentile • 98 percentile • maximum | - |

**Calculation of specific aggregations and statistics**

**A. For all components**

**Annual mean**

The annual mean is calculated as follows:

*Annual mean = Σ_{i} C_{i} / N_{valid}*

where *C_{i}* is the valid hourly/daily/day8hmax concentration and the summation is over all valid hourly/daily values measured in the year. *N_{valid}* is the total number of valid hourly/daily values in the year.

**Percentiles**

The *y*^{th} percentile should be selected from the measurement values (valid hourly/daily/day8hmax concentrations). All the values should be listed in increasing order:

*X_{1} ≤ X_{2} ≤ X_{3} ≤ … ≤ X_{k} ≤ … ≤ X_{N-1} ≤ X_{N}*

The *y*^{th} percentile is the concentration *X_{k}*, where the value of *k* is calculated as follows:

*k = (q · N)*

with *q = y/100* and *N* the number of values. The value of *(q · N)* should be rounded off to the nearest whole number (values < 0.499999… are rounded to 0, values = 0.5 are rounded to 1).

**Maximum**

The (annual) maximum is calculated as follows:

*Maximum = max(C_{i})*

where *C_{i}* are the valid hourly/daily/day8hmax concentrations and *i* is running over all valid hourly/daily/day8hmax values measured in the year.
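The three general statistics can be sketched as below. This is an illustration under stated assumptions, not the AirBase code: the function names are hypothetical, the rounding is read as "round half up", and clamping *k* to the range 1..N (so that a very small *q · N* does not point before the first value) is an assumption that the text does not spell out. Exact fractions are used to avoid floating-point surprises at the .5 boundary.

```python
import math
from fractions import Fraction
from typing import Sequence


def annual_mean(values: Sequence[float]) -> float:
    """Sum of the valid values divided by their number."""
    return sum(values) / len(values)


def percentile(values: Sequence[float], y: float) -> float:
    """y-th percentile: the k-th value in increasing order, k = round(q * N)."""
    ordered = sorted(values)
    n = len(ordered)
    q_times_n = Fraction(str(y)) / 100 * n
    k = math.floor(q_times_n + Fraction(1, 2))  # round half up
    k = min(max(k, 1), n)  # assumption: keep k within 1..N
    return ordered[k - 1]


def annual_maximum(values: Sequence[float]) -> float:
    """Largest valid value of the year."""
    return max(values)
```

With ten values 1 to 10, the 50 percentile is the 5th value, and the 95 percentile has *q · N* = 9.5, which rounds up to the 10th value.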

**B. Only for SO _{2}, NO_{2}, PM_{10} and O_{3}**

***k*^{th} highest value**

The *k*^{th} highest value should be selected from the measurement values. All the values should be listed in decreasing order:

*X_{1} ≥ X_{2} ≥ X_{3} ≥ … ≥ X_{k} ≥ … ≥ X_{N-1} ≥ X_{N}*

The *k*^{th} highest value is the concentration *X_{k}*.

*Example*: the limit value for the protection of human health for PM_{10} is that the daily average of 50 µg/m^{3} will not be exceeded on more than 35 days per year. If the 36^{th} highest value is more than 50 µg/m^{3}, the limit value for PM_{10} has been exceeded.

**Number of hours/days with concentration > *y* µg/m^{3}**

The number *n* of hours/days with concentration > *y* µg/m^{3} (with *y* = limit or threshold value) can be calculated from the measurement values:

*X_{1}, X_{2}, X_{3}, … , X_{k}, … , X_{N-1}, X_{N}*

*n* is the number of *X_{k}*-values for which *X_{k}* > *y* µg/m^{3}. If *n* > 35 in the PM_{10} example above, the limit value for PM_{10} has been exceeded.
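Both exceedance statistics can be sketched in a few lines. The function names are hypothetical, and returning `None` when fewer than *k* values exist is an assumption made for the example.

```python
from typing import Optional, Sequence


def kth_highest(values: Sequence[float], k: int) -> Optional[float]:
    """k-th value when the values are listed in decreasing order."""
    if k < 1 or len(values) < k:
        return None  # assumption: undefined when fewer than k values exist
    return sorted(values, reverse=True)[k - 1]


def count_exceedances(values: Sequence[float], threshold: float) -> int:
    """Number of values strictly greater than the limit or threshold value."""
    return sum(1 for v in values if v > threshold)


# PM10 illustration with made-up daily averages: 36 days at 60, the rest at 20.
pm10_daily = [60.0] * 36 + [20.0] * 329
exceeded = kth_highest(pm10_daily, 36) > 50.0 and count_exceedances(pm10_daily, 50.0) > 35
```

The two tests are equivalent by construction: the 36^{th} highest value is above 50 exactly when more than 35 values are above 50.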

**C. Only for O _{3} and CO**

**8-hour running mean**

The 8-hour running averaged value for each hour is calculated as the average of the values for that hour and the 7 foregoing hours (averaging period). So, the averaging period of *hour_{1}* of *day_{n}* is *hour_{17}* of *day_{n-1}* until *hour_{1}* of *day_{n}*. The averaging period of *hour_{24}* of *day_{n}* is *hour_{16}* of *day_{n}* until *hour_{24}* of *day_{n}*.

**Maximum daily 8-hour mean**

The maximum daily 8-hour mean for a day is the maximum of all the 8-hour running averages for that day.
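A minimal sketch of the running mean and the daily maximum follows. It assumes a continuous hourly series stored as a list with `None` for missing values, applies the data capture thresholds from section 1 (6 of 8 hourly values, 18 of 24 running averages), and treats hours before the start of the series as missing. All names are hypothetical.

```python
from typing import List, Optional, Sequence


def running_8h_means(hourly: Sequence[Optional[float]]) -> List[Optional[float]]:
    """8-hour running mean for each hour: that hour plus the 7 foregoing hours."""
    result: List[Optional[float]] = []
    for t in range(len(hourly)):
        window = hourly[max(0, t - 7): t + 1]
        valid = [v for v in window if v is not None]
        result.append(sum(valid) / len(valid) if len(valid) >= 6 else None)
    return result


def daily_max_8h(hourly: Sequence[Optional[float]], day: int) -> Optional[float]:
    """Maximum of the 24 running means assigned to day `day` (0-based)."""
    running = running_8h_means(hourly)[24 * day: 24 * (day + 1)]
    valid = [v for v in running if v is not None]
    return max(valid) if len(valid) >= 18 else None


# Two days of made-up data with an 8-hour plateau of 120 on the second day.
series = [40.0] * 24 + [40.0] * 8 + [120.0] * 8 + [40.0] * 8
```

Because each running mean is assigned to the last hour of its window, the first running means of a day draw on hours of the previous day.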

**D. Only for O _{3}**

**AOT40 (crops)**

(__A__ccumulated dose of ozone __O__ver a __T__hreshold of 40 ppb)

AOT40 means the sum of the differences between hourly concentrations greater than 80 µg/m^{3} (= 40 parts per billion) and 80 µg/m^{3}:

*AOT40_{measured} = Σ_{i} max(0, (C_{i} - 80))*

where *C_{i}* is the hourly mean ozone concentration in µg/m^{3} and the summation is over all hourly values measured between 8.00 – 20.00 Central European Time^{(**)} each day and for days in the 3-month growing season for crops from 1 May to 31 July.

^{(**) In AirBase the time zone was disregarded. So the values between 8.00 - 20.00 of the reported time have been taken.}

AOT40 has a dimension of (µg/m^{3})·hours. AOT40 is sensitive to missing values and a correction to full time coverage has been applied:

*AOT40_{estimate} = (AOT40_{measured} · N_{period}) / N_{valid}*

where *N_{valid}* is the number of valid hourly values and *N_{period}* is the number of hours in the period.

**SOMO35**

(__S__um of __O__zone __M__eans __O__ver 35 ppb)

For quantification of the health impacts the World Health Organisation recommends the use of the SOMO35 indicator. SOMO35 stands for the sum of the differences between maximum daily 8-hour running mean concentrations greater than 70 µg/m^{3} (= 35 parts per billion) and 70 µg/m^{3}:

*SOMO35_{measured} = Σ_{i} max(0, (C_{i} - 70))*

where *C_{i}* is the maximum daily 8-hour running mean ozone concentration in µg/m^{3} and the summation is over all days per calendar year.

SOMO35 has a dimension of (µg/m^{3})·days. SOMO35 is sensitive to missing values and a correction to full time coverage has been applied:

*SOMO35_{estimate} = (SOMO35_{measured} · N_{period}) / N_{valid}*

where *N_{valid}* is the number of valid daily values and *N_{period}* is the number of days per year.
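AOT40 and SOMO35 share the same structure: an accumulated excess over a threshold, scaled to full time coverage. The sketch below shows that shared calculation. The function name is hypothetical, the caller is assumed to have already selected the relevant values (for AOT40 the hourly values in the daytime window from 1 May to 31 July, for SOMO35 the maximum daily 8-hour means of the year), and `None` marks a missing value.

```python
from typing import Optional, Sequence, Tuple

AOT40_THRESHOLD = 80.0   # µg/m3, equals 40 ppb
SOMO35_THRESHOLD = 70.0  # µg/m3, equals 35 ppb


def accumulated_excess(
    values: Sequence[Optional[float]], threshold: float, n_period: int
) -> Optional[Tuple[float, float]]:
    """Return (measured, estimate) for an accumulated dose over a threshold.

    measured = sum of max(0, C_i - threshold) over the valid values
    estimate = measured * N_period / N_valid
    """
    valid = [v for v in values if v is not None]
    if not valid:
        return None  # assumption: undefined without any valid value
    measured = sum(max(0.0, c - threshold) for c in valid)
    return measured, measured * n_period / len(valid)


# AOT40: 92 days (1 May to 31 July) with 12 daytime hours each.
aot40_hours = 92 * 12
# Made-up series in which half of the hourly values are missing.
hourly_o3 = [100.0] * 552 + [None] * 552
aot40 = accumulated_excess(hourly_o3, AOT40_THRESHOLD, aot40_hours)
```

In this example the correction doubles the measured sum, because only half of the hours in the period carry valid values.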

**2. Other than hourly and daily values: n-day (n>1), n-week, n-month, year and var**

Non-automatically measured components (e.g. the components from the 4^{th} Daughter Directive: heavy metals and PAHs)
also have averaging times other than hour and day: week, 2-week, 4-week, month, 3-month, year, etc.

These measurements consist of samples with a start date/time and an end date/time.

The averaging time is the period of the sample (end date/time minus start date/time). If the sample periods of a
component differ 25% or more from a constant averaging time, the averaging time has been defined as "var".

*For Example:* if all periods of 4-week samples are within 21 and 35 days, the averaging time is not "var" but still 4-week.
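The "var" rule might be checked as sketched below. The function name and parameters are hypothetical. The text says "25% or more" while the example treats 21 and 35 days as still acceptable for a 4-week sample; this sketch follows the example and accepts the boundary values, which is an assumption.

```python
from typing import Sequence


def averaging_time_label(sample_days: Sequence[float], nominal_days: float, label: str) -> str:
    """Return `label` when every sample period stays within 25% of the
    nominal averaging time, otherwise "var"."""
    tolerance = 0.25 * nominal_days
    if all(abs(d - nominal_days) <= tolerance for d in sample_days):
        return label
    return "var"
```

For a nominal 4-week period of 28 days the tolerance is 7 days, which gives the 21 to 35 day range of the example.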

The 100% period for an n-month sample has been defined as the period starting from the start date/time of the sample and ending on
the same day number and time *n* months later.

*Some Examples:* *(1)* the sample starts at 5 March at 00:00, the 100% 1-month period is until 5 April at 00:00;
*(2)* the sample starts at 30 January at 00:00, the 100% 1-month period would be until “virtual” 30 February, which
is actually 2 March at 00:00 (no leap year).
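One way to reproduce these examples, including the "virtual" date that rolls over into the next month, is to carry the offset from the first of the month over to the target month. The function name is hypothetical and the sketch only illustrates the definition above.

```python
from datetime import datetime


def full_period_end(start: datetime, n_months: int) -> datetime:
    """End of the 100% period of an n-month sample: same day number and
    time n months later, with non-existent days rolling over."""
    month_index = start.month - 1 + n_months
    target_first = datetime(start.year + month_index // 12, month_index % 12 + 1, 1)
    offset = start - datetime(start.year, start.month, 1)
    return target_first + offset
```

For a start on 30 January the offset is 29 days, so the end date is 1 February plus 29 days, which is 2 March in a non-leap year.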

The only statistics calculated for these averaging times are:

- annual mean
- 50 percentile
- 95 percentile
- 98 percentile
- maximum

^{Note: n-hour values are aggregated into daily values. The statistics are based
on these daily values.}

All statistics calculations are done in analogy to the hourly/daily statistics calculations. The only exceptions are the data coverage calculation and the annual mean calculation. The data coverage is calculated as follows:

*Data Coverage = Σ _{i}N_{valid,i} / N_{year} · 100%*

where *N _{valid,i}* is the number of hours in the valid sample period

The annual means are calculated according to the formula:

*Annual mean = Σ _{i}N_{i}C_{i} / Σ_{i}N_{i}*

where *C_{i}* is the concentration in valid sample period *i* and *N_{i}* is the number of hours in that period

*Remark:* if a period is partially outside the year, only the hours between 1 January and 31 December
of the year are taken into account.
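The duration-weighted mean and the data coverage for samples can be sketched as follows. The function names and the `(start, end, concentration)` tuple layout are assumptions made for the example; the weights are the hours of each valid sample that fall inside the calendar year.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Sample = Tuple[datetime, datetime, float]  # (start, end, concentration)


def hours_within_year(start: datetime, end: datetime, year: int) -> float:
    """Hours of the sample period that fall inside the calendar year."""
    lo = max(start, datetime(year, 1, 1))
    hi = min(end, datetime(year + 1, 1, 1))
    return max(0.0, (hi - lo).total_seconds() / 3600)


def sample_statistics(samples: List[Sample], year: int) -> Tuple[Optional[float], float]:
    """Return (annual mean, data coverage in percent) for valid samples."""
    n_year = (datetime(year + 1, 1, 1) - datetime(year, 1, 1)).total_seconds() / 3600
    weights = [hours_within_year(s, e, year) for s, e, _ in samples]
    total = sum(weights)
    if total == 0:
        return None, 0.0
    mean = sum(w * c for w, (_, _, c) in zip(weights, samples)) / total
    return mean, 100.0 * total / n_year
```

A weekly sample that starts on 28 December of the previous year contributes only its 72 hours from 1 January onwards to both the mean and the coverage.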

**3. Calculation of NOx values**

To obtain a better coverage of NO_{x} measurements, AirBase also provides NO_{x} values that
have been derived from reported NO and NO_{2} measurements using the formula:

*C _{NOx} = C_{NO2} + ((M_{NO2} / M_{NO}) * C_{NO})*

where

*C_{NOx}* = NO_{x} concentration in µg NO_{2}/m^{3}

*C_{NO2}* = NO_{2} concentration in µg/m^{3}

*C_{NO}* = NO concentration in µg/m^{3}

*M_{NO}* = molecular mass of NO = 30

*M_{NO2}* = molecular mass of NO_{2} = 46

The measurement configuration of the derived NO_{x} measurements is taken from the
measurement configuration of NO. If NO, NO_{2} and NO_{x} are all reported, the reported NO_{x} values
have priority over the derived NO_{x} values.
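The derivation and the priority rule can be sketched together. The function names are hypothetical and `None` stands for a value that was not reported; the molecular masses are the ones given above.

```python
from typing import Optional

M_NO = 30.0   # molecular mass of NO
M_NO2 = 46.0  # molecular mass of NO2


def derive_nox(c_no2: float, c_no: float) -> float:
    """NOx in µg NO2/m3 from NO2 and NO concentrations in µg/m3."""
    return c_no2 + (M_NO2 / M_NO) * c_no


def nox_value(
    reported_nox: Optional[float], c_no2: Optional[float], c_no: Optional[float]
) -> Optional[float]:
    """Reported NOx has priority; otherwise derive it when NO and NO2 exist."""
    if reported_nox is not None:
        return reported_nox
    if c_no2 is None or c_no is None:
        return None
    return derive_nox(c_no2, c_no)
```

For example, 20 µg/m^{3} of NO_{2} and 30 µg/m^{3} of NO give 20 + (46/30) · 30 = 66 µg NO_{2}/m^{3}.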

Last modified 2014/08/20.

Please, email any comments on this website to the webmaster.
