Project On-line Deliverables: D05
Model Calibration and Uncertainty Analysis
Technical Report (FINAL)
This deliverable describes model calibration rules, sensitivity studies and uncertainty analysis, and documents the results.
Model calibration and uncertainty analysis are two important tasks to encourage potential users to use the developed software and to ensure a maximum level of truthfulness. The use of state-of-the-art methods for sensitivity analysis (Automatic Differentiation) allows the quantification of the sensitivity of a model output with regard to its input parameters for a given input data set. This forms a basis for a pre-selection of parameters which have a dominant influence on the model result. The methodology is documented for a selected spill release model.
Monte Carlo simulation techniques are generally used for varying input parameter sets of release submodules. Rather than using a single value, a time dependent source strength probability function serves as the input for the dispersion calculation. Usually, the evaluation of atmospheric and aquatic dispersion models is based on large field experiments and cannot be executed within the framework of the HITERM project. However, a comparison of test cases with observed structures or a comparison of numerical model results with analytical solutions for simple cases was carried out, whenever it was possible.
Furthermore, due to the crucial role of the technical reliability and required execution time of the complete system, the overall performance is documented.
Table of Contents
2 UNCERTAINTY AND SENSITIVITY ANALYSIS
3 MODEL EVALUATION
Model performance evaluation is realized by a comparison of the model results with the real world. In the case of atmospheric and air pollution models, an enhanced model evaluation is quite difficult. Data from field campaigns together with data from continuous measuring nets has to be compared with simulated data sets.
This is not within the scope of this project. Moreover, given the possibility to compare simulated with real data, it is hard to distinguish between real model errors and data incorrectness or inadequateness.
A model may be valid (this includes an appropriate modelling of the desired features, a proper selection of the numerical schemes, and a careful implementation), but the outcome can be insufficient due to a lack of appropriate input data.
A validation requires the usage of a perfect input data set to distinguish between model and data. Additionally, there is a principle validation problem which is still the subject of scientific investigations. The reason can be found in the different nature of point measurements and gridded model results as well as in the stochastic behavior of local turbulence patterns.
Most of the selected software components already have a broad range of applications, and can therefore be considered as sufficiently evaluated within their application range. This is the case for the wind field model which has been derived from EPA's DWM (Douglas 1990) and for the different release models. Nevertheless, an evaluation of the system is documented.
The model evaluation methodology in the context of the HITERM project focuses on the following features:
The major source of uncertainties is given by the, in general, sparse and insufficient data structure of emergency cases. Especially the differences caused by unknown release parameters can lead to uncertainties in the concentration calculation of up to one order of magnitude.
2 UNCERTAINTY AND SENSITIVITY ANALYSIS One of the most important input parameters for all kind of dispersion models is the time dependent emission rate of the source. This rate is sufficiently known in the case of a continuously operational source, such as the stack of a power plant or an industrial estate. In the case of an accident, this parameter is very rarely known. Very often, a rough approximation of the total release mass is given and, in some cases, this release mass is unknown as well. This requires the incorporation of different source-strength models. But release and spill models for emergency management tasks suffer from a high degree of uncertainty regarding input parameter set. Depending on the sensitivity of the source function, this is the reason for many unrealistic concentration patterns.
Two different methods have been used to determine the uncertainty range of the source module: Monte Carlo Simulation and Automatic Differentiation. Monte Carlo methods are directly included in the model system whereus Automatic Differentiation has been used in the implementation phase.
The validation of models is a central concern of Applied Mathematics. Sensitivity analysis is carried out in order to examine the robustness of numerical results with respect to changes of the model structure, input parameters, algorithms, machine accuracy etc. The effects of perturbations concerning data or computations can be studied by the application of the Monte Carlo method. Accurate sets of data are perturbed by random data corresponding in their distribution and correlation as closely as possible to those of the real data inaccuracy (Überhuber 1995).
Within HITERM the Monte Carlo approach is used to find a source-strength probability function with the help of measured input parameters and user specified uncertainty ranges for these parameters. The computation of the probability function requires thousands of runs of the same code with varying parameters. In the case of a very complex multi-parameter release module this can be very time-consuming. Additionally, the dependencies of the individual parameters are hidden.
A more general approach is used in automatic differentiation. With automatic differentiation, the sensitivity of the result with regard to the individual parameters can be quantitatively specified. This is especially useful for the preselection of parameters which have to be varied with the Monte Carlo method. If a new release module is added, a sensitivity analysis can be executed using automatic differentiation to filter out the most sensitive parameters for a later Monte Carlo simulation. Additionally, the automatic differentiation approach defines which parameters have to be determined with a maximum of accuracy.
2.1 Automatic Differentiation for the Example:
The variables denote the following parameters:
|t||time elapsed since the spill [s]|
|evap||evaporation rate [g/m**2s]|
|sa||solar altitude [rad]|
|cloud||cloud cover [%]|
|ubar10||wind speed measured at 10 m [m/s]|
|tempa||ambient air temperature [K]|
|diamp||pool diameter or downwind length of pool [m]|
|depthp||initial depth of the pool [m].|
For the results shown in the first row of Table 1, the initial values
of the parameters are given as follows:
sa = /3, cloud = 10, ubar10 = 2, tempa = 290, diamp = 10, depthp = 2.
For the second row of Table 1, only the value of sa was changed to 2/3. The third row of Table 1 represents the results for sa = 2/3 and cloud = 90.
On the one hand, the results reflect the influence of each parameter on the evaporation rate with respect to the selected set of parameters. On the other hand, it is possible to examine the interaction between the single influences. The higher the absolute value of a derivative is, the greater influence it has on the calculation of the evaporation rate.
Moreover, it is quite visible that a change of the percentage of the cloud cover diminishes the influence of the solar altitude whereas its own influence increases. ADIFOR is a convenient tool for selecting the model parameters which should be studied more intensively by a Monte Carlo simulation. Moreover, the system provides facilities to examine the interaction between single parameters with reasonable effort.
The results of a single ADIFOR run are only valid for the selected parameter space. But the systems behaviour is very different for a different set of input parameters. Selecting typical sets of input parameters, ADIFOR can give important advise for a preselection of the input parameters which should be varied by a Monte Carlo method, but it is not suitable for replacing the Monte Carlo approach for constructing a time dependent source strength probability function.
To allow a more realistic determination of the release, a Monte Carlo method is used to construct a probability function of the source-strength for different times of the release duration. With this method, all uncertain parameters are varied in a selected range.
The parallel implementation of the source term models uses a generalized mask to run different types of source term models. The user must specify the number of input parameters which are not precisely known. In addition, an uncertainty range must be given for every parameter. This range can vary in positive or negative direction.
The same evaporation subroutine already used as an example for the application of the automatic differentiation technique has been used to demonstrate the Monte Carlo approach (see Appendix A).
A probability function of the source-strength is constructed using millions of runs of a deterministic release model. The input parameters of this model are varied over the given uncertainty range using a random generator. Figure 1 shows an example of a source-strength probability function for evaporating pool release for a given time.
The input parameter variations for this test run were:
The results exhibits an asymmetric distribution whereas neither the source-strength at the mean nor at the maximum of the probability function is equal to the "exact value" (which is the deterministic outcome of the model by neglecting the uncertainties).
If the release is a complex function of parameters (which is normally the case), the Monte Carlo run is very useful for finding even the most probable values for the source-strength. This is due to the fact that even the most probable values can be considerably different form the exact one.
The mean and the emission rate at the 95th percentile of the time dependent source term probability function are used as an input for the Lagrangian dispersion model. The mean represents the most probable emission rate, whereas the 95th percentile represents a worst case scenario for the emission rate excluding a small probability of 5 percent of higher emission rates.
The diagnostic wind field model is based on the well-established and evaluated DWM (EPA). To allow the user a faster selection of the meteorological condition where only sparse observational data is given, some new parametrization features were added.
Two examples show the typical wind flow in a hypothetical complex terrain. The measured meteorological values at 5 m above ground parameters were:
The assumed roughness length was 0.5, a quite typical average value. The numerical grid has a horizontal resolution of 100 m x 100 m. The complex orography is characterized by fairly steep rise to a mountain range in the easterly parts of the domain, some isolated smoother hills and some sharp gashes of valleys between the range. The simulation was carried out for the same measured values but for different hypothetical stability classes.
Figure 2 (above) represents the flow under very stable conditions, occurring quite often during the night. The wind is forced to flow around the sides of the obstacles. Directly upwind of the hill, some of the air is blocked and becomes nearly stagnant. Additionally, there is a visible tendency to drainage winds (valley winds). The underlying background color represents the orography.
In Figure 3 (above), the meteorological condition is very different. An unstable vertical stratification in the lower atmospheric boundary layer (happens often on hot summer days) leads to a considerably different flow despite the same measured wind direction and speed. The wind flows over the mountain range and exhibits its highest values at the top of the summits. Additionally, there is a thermally generated uphill flow, the so-called mountain winds. As demonstrated above, the diagnostic wind model is able to describe the main features of air flow in complex terrain.
3.2.1 Comparison of LPD results with Gaussian solution for a simple case Gaussian plume models are most commonly used for modelling point source emissions. In case of an (inert) air pollutant and homogeneous conditions the resulting plume represents an exact solution. The restrictive conditions comprise a wind field with a rigid advective wind which is constant in time and space and an even homogeneous orography. Thus the horizontal and vertical wind fluctuations are constant as well. The emission rate at the point source is permanent and constant.
For preconditions given as described, the concentration dispersion of
an arbitrary (inert) air pollutant at ground level is simulated by the LPD
model. It is compared to the analytical Gaussian solution calculated for the
A horizontal grid size of 51 x 51 with a resolution of 100 m is given for the meteorological (input) and the output grid. For the vertical resolution of the input the stratification of a labil wind field is applied while for the output equidistant layers of 10 m are chosen.
The inputs for both models are as follows:
Advective wind field:
The standard deviation of the velocity fluctuation and the Lagrangian time-scale are required for the v and w component and the y, z component, respectively. They are set to the following values:
The coordinates of the emission source are:
For the simulation time of 20 min the emission rate is assumed to be constant with
In the LPD model the number of emitted particles per time step is set to 200. Hence for a time step of 10 s each particle is associated with a mass of .
For the Gaussian solution the air pollutant concentration is calculated by:
with (x,y,z) the position in the three-dimensional coordinate system and H the effective stack height. For the test case the concentration was calculated at H= qz and the level of z = coord_z.
The standard deviations in the y and z direction, respectively, are obtained from the Taylor Theorem based on Taylor's homogeneous diffusion theory (Taylor 1921, Yamada 1987):
is constructed analogous.
Figure 4 (above) represents the resulting dispersion of an air pollutant at the end of the simulation time (20 min) under the described conditions and the level of the source height. The Lagrangian dispersion in the right figure shows a good reproduction of the Gaussian solution (left).
The quality of the Lagrangian dispersion is also displayed in the figures below (figures 5 and 6). The diagrams show the alteration of the concentration along an intersecting line for the Gaussian solution and for the LPD dispersion at identical locations.
In Figure 5 (above), a line intersecting parallel to the main wind direction was chosen whereas in Figure 6 (below) the intersecting line is nearly perpendicular to the main dispersion direction.
the diagram in Figure 5 proves the apparent good reproduction of the Gaussian solution by the Lagrangian dispersion calculation along the centerline of the pollutant plume. The further the distance from the centerline (figure 5) and the emission source, the larger becomes the stochastic error in the LPD model. The stochastic effect is due to the principle of the Lagrangian model theory. It can be reduced by enlarging the number of particles emitted per time step (see Section 3.2.2).
Comparing the LPD dispersion and the Gaussian solution, figure 6 indicates that the error effects are relatively small (see also Section 3.2.2). Discussing this phenomenon, it has to be considered that the solution of Gaussian models gives an exact analytical resolution and hence are not subject to any stochastic influence.
Lagrangian models are based on the determination of trajectories of individual particles. The figures in this section show the dispersion plume for different numbers of emitted particles per time step - from left to right, top to bottom: 50, 100, 200, 300, 400, and 500 particles. The conditions and parameter settings are the same as presented in Section 3.2.1 for the LPD model.
The number of particles emitted per time step influence the quality of the output, as well as the computation time. The more particles are released the clearer is the shape of the plume. The figures also show that with a certain number of particles asymptotic effects appear. These effects are depended on the presently given conditions, like meteorology, orography, etc., and the time parameters (time step, simulation time).
A limiting factor for the number of particles emitted per time step is the computational burden and the available memory. To achieve reasonable results in a reasonable time - which for risk management has to be as short as possible - a compromise has to be made for the appropriate number of particles.
The present section and the results of Section 3.2.1 show that the LPD model represents a good reproduction of the Gaussian plume under convective conditions. It can be assumed that the modelling of pollution dispersion by the LPD model also holds for stable meteorological conditions.
The number of particles emitted should be chosen in the range of 100 to 200 particles per time step. The default value in the LPD model version is set to 200. The smaller the number of particles, the larger is the effect of the stochastical error, and thus the coarser the result (see figures). A larger number of emitted particles often shows a nearly proportional increase in computation time.
For a complex orography the effect of different meteorological conditions on the dispersion calculation by the LPD model is discussed. A point source for the pollutant emission was chosen and the horizontal reference layer set to the source height. The isolines in Figure 8 (below) denote the height above sea level in m. Equal preconditions as for the emission rates, simulation time, time step, etc. are assumed.
Representative for the tests, two cases have been chosen for display.
The left part of Figure 8 shows the dispersion calculated for a slightly
unstable wind field with a mixing height of 800 m (stability class 3).
In the right of Figure 8, the dispersion in case of slightly stable
conditions with a mixing height of 300 m (stability class 5) is displayed.
For unstable meteorological conditions, the area affected by the pollutant is much wider (left part of Figure 8) than for stable case (right part of Figure 8) in the considered layer. The left part also shows that high concentrations of the pollutant occur only near to the emission source due to larger diffusion, causing, e.g., faster rising effects in an unstable wind field.
Under the given stable conditions the pollutants rest more at the ground/source level and perturbation across the main wind direction is less. Therefore the region affected by the pollutant dispersion is smaller but concentrations are higher (right part of Figure 8).
The simulated effects are as expected (also for other than the displayed meteorological conditions) and reproduce known observations gained by field experiments. This implies that the LPD model provides reasonable results for different kinds of meteorological conditions.
In combination with the argumentation in Section 3.2.1 and Section 3.2.2, the conclusion can be drawn that the LPD model represents an appropriate modul for the dispersion calculation for the given requirements.