Project Workplan: Technical Aspects

The project involves the following technical aspects:

Communication Interfaces:

To be practical and economically feasible, HITERM will have to interface with available information resources and information systems, and take advantage of available computing resources linked into distributed parallel machines. The definition of interfaces, efficient communication protocols, and possibly on-line conversion and filter methods will therefore be an important aspect of the practical implementation.

The High-Performance Network will connect several types of nodes distributed over Europe (or a country), providing a message-exchange service:

  • Data acquisition nodes
  • High-Performance computing nodes
  • Fixed clients
  • Mobile clients (including hand-held computers)

Some of the functional needs of the High-Performance Network are:

  • Typically very fast, short messages between nodes
  • Very high reliability and resilience, i.e., certain messages may have to be delivered by alternative or back-up means
  • Automatic choice of alternative physical connections should the default ones fail
  • Security (with emphasis on authentication, encryption, integrity of data, etc.)
  • Automatic choice of physical connection and protocol based on a message's urgency and size (see the sketch after this list)
  • Connection of mobile nodes
  • Support for multi-media
  • Support for wireless connections
  • Point-to-point or broadcast connections.
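
To illustrate the kind of policy implied by the urgency- and size-based connection choice above, the following Python sketch selects a physical connection from a set of candidate links and falls back to the next available one; the link names, capacities, and selection thresholds are purely illustrative assumptions, not part of any existing protocol stack.

    # Illustrative sketch only: link names, capacities and thresholds are invented.
    from dataclasses import dataclass

    @dataclass
    class Link:
        name: str
        bandwidth_kbps: int   # nominal capacity
        latency_ms: int       # typical round-trip latency
        up: bool = True       # result of a (hypothetical) availability probe

    LINKS = [
        Link("internet", 2000, 200),
        Link("euro_isdn", 64, 50),
        Link("gsm", 9, 500),
    ]

    def choose_link(size_kbit: float, urgent: bool) -> Link:
        """Pick a usable link: low latency for urgent messages, high bandwidth
        for bulky ones; fall back to whatever connection is still up."""
        candidates = [l for l in LINKS if l.up]
        if not candidates:
            raise RuntimeError("no physical connection available")
        candidates.sort(key=lambda l: l.latency_ms if urgent else -l.bandwidth_kbps)
        for link in candidates:
            if size_kbit / link.bandwidth_kbps < 60:   # transfer time under a minute
                return link
        return candidates[0]                           # last resort

    print(choose_link(size_kbit=10, urgent=True).name)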

The Network should be based as much as possible on existing infrastructures and protocols; it will use, transparently:

  • existing Internet connections;
  • dedicated Euro-ISDN (with bundled 64 Kbit/s lines);
  • the emerging broad-band B-ISDN (ATM based);
  • and GSM for wireless connections.

B-ISDN (ITU-T I.121) is expected to deliver up to 2.5 Gbit/s, with SDH (Synchronous Digital Hierarchy) starting at 51 Mbit/s, based on the standard Optical Carrier levels (OC-1 to OC-48). It is one of the basic tasks of the project to find the optimal strategies, with open upgrade paths, to achieve this functionality, available at least as an API and ideally based on existing protocols or products.

Parallel simulation modeling:

Available environmental simulation models have been designed for classical architectures; their adaptation for efficient parallel processing, and the linkage of the simulation models with the data acquisition channels and the data interpretation, visualization, and DSS components will be a major element of HITERM. This includes transport (Eulerian, Lagrangian) and chemistry models of the atmosphere as well as dispersion models for soils and water bodies.

To guarantee a flexible, modular system in which alternative models can be integrated as required, an object-oriented approach will be used. Depending on the specific conditions of an accident (location, environmental media involved, substance, and release conditions), alternative models can be selected (based on a small set of selection criteria and rules) for the most efficient overall system performance.
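
As a sketch of how such rule-based model selection could look in an object-oriented setting, the following Python fragment registers alternative models, each with its own applicability rule; the class names, criteria, and rules are illustrative assumptions rather than the actual HITERM design.

    # Illustrative rule-based model selection; all classes and rules are invented.
    from dataclasses import dataclass

    @dataclass
    class Accident:
        medium: str        # "air", "water" or "soil"
        reactive: bool     # does the released substance react chemically?
        extent_km: float   # horizontal extent of the area of interest

    class Model:
        name = "base"
        def applies_to(self, a: Accident) -> bool:
            raise NotImplementedError
        def run(self, a: Accident) -> None:
            print(f"running {self.name} for a {a.medium} release")

    class LagrangianPuff(Model):
        name = "Lagrangian puff model"
        def applies_to(self, a): return a.medium == "air" and not a.reactive and a.extent_km <= 30

    class EulerianChemistry(Model):
        name = "Eulerian grid model with chemistry"
        def applies_to(self, a): return a.medium == "air"

    class SubsurfaceTransport(Model):
        name = "finite-difference flow and transport model"
        def applies_to(self, a): return a.medium in ("water", "soil")

    REGISTERED = [LagrangianPuff(), EulerianChemistry(), SubsurfaceTransport()]

    def select_model(a: Accident) -> Model:
        """Return the first registered model whose selection rule matches."""
        for model in REGISTERED:
            if model.applies_to(a):
                return model
        raise LookupError("no suitable model registered for this accident")

    accident = Accident(medium="air", reactive=True, extent_km=5.0)
    select_model(accident).run(accident)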

For the simulation of air pollution dispersion in case of emergency releases, Eulerian and Lagrangian atmospheric models will be used. If the released substances are reactive, an air-chemistry model must be added. For the simulation of pollution dispersion in water and soil, classical finite-difference or finite-element models will be used for one-, two-, and three-dimensional representations of combined flow and transport models. Parallelization can be based on geometric decomposition.
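
A minimal sketch of geometric decomposition, here for a one-dimensional explicit finite-difference advection-diffusion step: the grid is split into contiguous strips, each of which could be updated by a separate processor given one ghost cell per side. Grid size, coefficients, and the number of workers are illustrative, and the loop below runs the strips sequentially as a stand-in for concurrent execution.

    # Illustrative geometric decomposition of a 1-D transport grid.
    import numpy as np

    NX, NWORKERS = 120, 4
    c = np.zeros(NX); c[NX // 2] = 1.0       # initial concentration spike
    u, D, dx, dt = 0.5, 0.1, 1.0, 0.5        # velocity, diffusivity, spacing, time step

    def strip(rank):
        """Index range [lo, hi) of the sub-domain owned by worker `rank`."""
        n = NX // NWORKERS
        return rank * n, NX if rank == NWORKERS - 1 else (rank + 1) * n

    def local_step(c, lo, hi):
        """Update one strip; in a parallel run only one ghost cell per side is needed."""
        new = c[lo:hi].copy()
        for i in range(max(lo, 1), min(hi, NX - 1)):
            adv = -u * (c[i] - c[i - 1]) / dx                      # upwind advection
            dif = D * (c[i + 1] - 2 * c[i] + c[i - 1]) / dx ** 2   # diffusion
            new[i - lo] = c[i] + dt * (adv + dif)
        return new

    # sequential stand-in for what would run concurrently on NWORKERS processors
    for _ in range(100):
        c = np.concatenate([local_step(c, *strip(r)) for r in range(NWORKERS)])
    print(f"total mass after 100 steps: {c.sum():.3f}")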

Depending on the architecture used (shared memory, i.e., several CPUs sharing the same memory on a single machine connected to a common high-speed bus, or distributed memory, i.e., independent but networked computers), different strategies will be used. For shared memory, the use of threads supporting various synchronization strategies will be explored. On parallel HPC hardware with distributed memory, the parallel implementation of the simulation models will be based on message passing. In this case the source program (commonly in FORTRAN) must be extended by primitives for sending/receiving messages, initiating and suspending processes, etc. This program redesign is supported by various software tools for the message-passing programming model, such as PVM (Parallel Virtual Machine) and MPI (Message-Passing Interface), which are both hardware-independent, and some hardware-dependent packages like PARIX from Parsytec GmbH.
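
The message-passing primitives mentioned above can be sketched, for illustration, with mpi4py (a Python binding of MPI) rather than FORTRAN: each rank owns one strip of the decomposed grid and exchanges a single ghost value with its neighbours in every time step. The strip size and loop count are arbitrary, and the actual numerical update is left as a placeholder.

    # Illustrative ghost-cell exchange with MPI; run e.g. with
    #   mpiexec -n 4 python ghost_exchange.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    local = np.zeros(32)          # this rank's strip of the decomposed grid
    if rank == 0:
        local[0] = 1.0            # arbitrary initial condition

    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL
    ghost_left, ghost_right = np.zeros(1), np.zeros(1)

    for _ in range(10):
        # exchange boundary values with both neighbours without deadlock
        comm.Sendrecv(sendbuf=local[-1:], dest=right, recvbuf=ghost_left, source=left)
        comm.Sendrecv(sendbuf=local[:1], dest=left, recvbuf=ghost_right, source=right)
        # ... local finite-difference update using ghost_left / ghost_right ...

    print(f"rank {rank} of {size} finished")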

Several layers of parallelization will be used to flexibly exploit different HPCN hardware and systems. At the level of the overall system, a task-parallel implementation will be used, based on an object-oriented approach that controls and coordinates the HPC resources as methods available to the various objects. In this scheme, one compute resource, for example, pre-processes the monitoring data, one estimates the wind field, one or several run the particle/plume models, one does the impact calculations, etc. Each of the compute resources may, in turn, be a parallel machine using further, lower-level (instruction-parallel) parallelization. Providing objects with alternative methods, and strategies to choose methods depending on context, should allow implementation on a wide range of different computer equipment, including WAN resources.
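
The task-parallel layer can be sketched as a simple orchestration of the processing chain; in the fragment below a local process pool stands in for distributed HPC nodes, and all stage functions are illustrative stubs, not actual HITERM components.

    # Illustrative task-parallel orchestration; stage functions are stubs.
    from concurrent.futures import ProcessPoolExecutor

    def preprocess_monitoring(raw):       return {"obs": raw}
    def estimate_wind_field(obs):         return {"wind": "interpolated field"}
    def run_plume_model(wind, variant):   return {"plume": f"variant {variant}"}
    def assess_impact(plume):             return f"exposure from {plume['plume']}"

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            obs = pool.submit(preprocess_monitoring, "raw telemetry").result()
            wind = pool.submit(estimate_wind_field, obs).result()
            # several plume-model variants run concurrently on separate workers
            plumes = list(pool.map(run_plume_model, [wind] * 3, range(3)))
            impacts = list(pool.map(assess_impact, plumes))
        print(impacts)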

Uncertainty:

The field of environmental modeling and forecasting involves considerable uncertainties, both in terms of model and parameter uncertainty and in the input data. This involves the initial and boundary conditions of the system to be simulated, including important parameters such as the actual release conditions, environmental variables such as (atmospheric or hydrological) flow fields, and the reaction rates of the substances involved. A possible approach here is to simultaneously run more than one model, with more than one set of parameters, and possibly different (selected or re-sampled) input data sets. This obviously can lead to a combinatorial explosion that, given the requirements for HPCN in the first place, will quickly become technically intractable. Therefore, intelligent strategies to dynamically allocate available computational resources to real-time uncertainty analysis, more efficient than straightforward Monte-Carlo methods but applicable to complex, spatially explicit, non-linear dynamic models, will have to be developed.

The Monte-Carlo approach is based on the definition of a priori probability distributions or density functions (PDFs) for model inputs or parameters. From these distributions, repeated (or parallel) samples are drawn (preserving the autocorrelation structure of the input data to ensure consistency, e.g., of hydro-meteorological data), each resulting in one complete set of input data for a model run. These independent model runs can then be executed in parallel on HPC (parallel) computers. The model results are then combined into an expected (average) behaviour and a probability density function around that mean, resulting from the propagation of the initial input uncertainty.
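
A minimal Monte-Carlo sketch of this procedure, with a trivial stand-in for the simulation model and invented prior distributions: parameter sets are drawn from the assumed PDFs, the independent runs are executed in parallel, and the ensemble is summarized by its mean and a percentile band.

    # Illustrative Monte-Carlo uncertainty propagation; model and priors are invented.
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    rng = np.random.default_rng(1)

    def toy_plume_model(params):
        """Stand-in for one complete simulation run: peak concentration at 1 km."""
        release_rate, wind_speed = params
        return release_rate / max(wind_speed, 0.1)

    def sample_inputs(n):
        """Draw n parameter sets from a-priori distributions (PDFs)."""
        release = rng.lognormal(mean=0.0, sigma=0.5, size=n)      # uncertain source term
        wind = rng.normal(loc=3.0, scale=1.0, size=n).clip(0.5)   # uncertain wind speed
        return list(zip(release, wind))

    if __name__ == "__main__":
        samples = sample_inputs(1000)
        with ProcessPoolExecutor() as pool:        # independent runs in parallel
            results = np.fromiter(pool.map(toy_plume_model, samples), dtype=float)
        mean = results.mean()
        lo, hi = np.percentile(results, [5, 95])
        print(f"expected concentration {mean:.2f}, 90% band [{lo:.2f}, {hi:.2f}]")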

With observation data becoming available in real time, subsequent series of model runs can be re-calibrated (thus narrowing down the sampling space of model inputs) and can use the observed system state as the initial condition for subsequent runs. Given the initial uncertainty about key parameters that usually characterizes industrial or transportation accidents, this uncertainty is an important part of the decision-making process.
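
One simple way to narrow the sampling space once a new observation arrives is to re-weight the earlier Monte-Carlo samples by how well their model output matches the measurement, and to re-sample from that weighted set; the toy model, the Gaussian error model, and all numbers below are illustrative assumptions only.

    # Illustrative re-calibration by importance weighting of earlier samples.
    import numpy as np

    rng = np.random.default_rng(2)
    prior_release = rng.lognormal(0.0, 0.5, size=5000)   # prior source-term samples
    predicted = prior_release / 3.0                      # toy model output per sample
    observed, sigma = 0.4, 0.1                           # new monitoring value and its error

    weights = np.exp(-0.5 * ((predicted - observed) / sigma) ** 2)
    weights /= weights.sum()

    # resample to obtain the narrowed parameter set for the next series of runs
    posterior_release = rng.choice(prior_release, size=5000, p=weights)
    print(f"prior mean {prior_release.mean():.2f} -> posterior mean {posterior_release.mean():.2f}")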

Real-time data interpretation:

The volume of information generated by large-scale 3D dynamic models, in particular when considering more than one model solution generated in parallel, is enormous and overwhelming to the human observer. Therefore, methods of real-time interpretation (including visualization, pattern recognition, classification, etc.) will be required to translate the model output into useful, decision-relevant information that can be presented, in multi-media formats, to the end user in real time.

An important element is the preparation of topical maps, using local GIS data around the accident site. Maps are an effective basis for the communication of complex information because they provide a familiar context. On this basis, and supported by other display styles, graphs, and possibly synthesized text and audio signals, a clear picture of the state and expected evolution of the system, including the uncertainty of the forecasts, has to be presented. This includes spatial interpolation, 3D reconstruction, rendering, and animation, as well as various forms of statistical analysis.
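
As an illustration of one of the steps listed above, spatial interpolation, the following sketch grids scattered monitoring values by inverse-distance weighting so that they could be overlaid on a topical map; the station coordinates and values are invented.

    # Illustrative inverse-distance-weighted gridding of scattered measurements.
    import numpy as np

    stations = np.array([[0.2, 0.3], [0.7, 0.8], [0.5, 0.1]])   # x, y in map units
    values = np.array([12.0, 3.0, 7.5])                         # measured concentrations

    def idw(grid_xy, pts, vals, power=2.0):
        """Inverse-distance-weighted estimate at each grid point."""
        d = np.linalg.norm(grid_xy[:, None, :] - pts[None, :, :], axis=2)
        w = 1.0 / np.maximum(d, 1e-9) ** power
        return (w * vals).sum(axis=1) / w.sum(axis=1)

    xs, ys = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
    grid = np.column_stack([xs.ravel(), ys.ravel()])
    field = idw(grid, stations, values).reshape(xs.shape)
    print(f"interpolated field: min {field.min():.1f}, max {field.max():.1f}")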

The main objective of the visualization is the presentation of the large volume of data, including the representation of uncertainty, in a directly understandable, largely symbolic and graphical form that guarantees safe interpretation even under the special conditions of emergency management. Since most of the graphic rendering is again compute-intensive, the use of HPC is also required at this stage of the overall information processing system.

Decision support:

In addition to visualization, the information generated by the simulation models should also be translated into decision support, advice, and instructions for the human operator in the field, or for a trainee in a simulated emergency management exercise.

The basic DSS paradigm that will be used is based on the multi-criteria selection from a set of discrete alternatives. Similar to the Monte-Carlo simulation approach representing input and model uncertainty, alternative decisions (e.g., on emergency measures such as the evacuation of population or the closing of roads) can be simulated in parallel. From these alternative scenarios of possible action, a non-dominated (Pareto) set can be generated automatically (see, for example, Zhao, Winkelbauer and Fedra, 1985, for an application in hazardous substances transportation). The final selection of the preferred alternative can then either be made interactively by the operator, or automatically, using a reference point (a description of a preferred solution implying certain trade-offs) defined by the operator to find an efficient solution as the default proposed to the operator. This procedure is based on a multi-dimensional metric in the normalized performance space of the model, where a feasible, non-dominated solution nearest (in terms of the metric employed) to the operator-defined reference solution is found.
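
The selection procedure can be sketched as follows: alternatives are scored on criteria to be minimized, the non-dominated (Pareto) set is extracted, the criteria are normalized, and the feasible alternative nearest to an operator-defined reference point is proposed as the default. The alternatives, criteria, and reference point below are illustrative only.

    # Illustrative reference-point selection from a discrete set of alternatives.
    import numpy as np

    # rows: candidate emergency measures; columns: [expected exposure, cost, delay]
    alternatives = np.array([
        [0.9, 2.0, 10.0],   # do nothing
        [0.2, 8.0, 30.0],   # evacuate
        [0.4, 3.0, 15.0],   # shelter in place
        [0.5, 6.0, 40.0],   # close roads only
    ])

    def non_dominated(x):
        """Keep rows not dominated by any other row (all criteria minimized)."""
        keep = []
        for i, a in enumerate(x):
            dominated = any(np.all(b <= a) and np.any(b < a)
                            for j, b in enumerate(x) if j != i)
            if not dominated:
                keep.append(i)
        return np.array(keep)

    idx = non_dominated(alternatives)
    pareto = alternatives[idx]
    norm = (pareto - pareto.min(0)) / (np.ptp(pareto, axis=0) + 1e-12)   # normalized space
    reference = np.array([0.0, 0.2, 0.1])            # operator-defined aspiration point
    best = idx[np.argmin(np.linalg.norm(norm - reference, axis=1))]
    print(f"proposed default: alternative {best}")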

For a dynamic process, which can involve a very large number of possible alternative courses of action that must continually be updated, the reference point approach is one possibility to channel the combinatorial explosion of options a posteriori, based on an analysis of a larger set of plausible strategies rather than a more or less arbitrary a priori choice that is much less flexible and may lead to a sub-optimal response.

As an additional element, and to provide another (parallel) check on the advisory output from the system, a classification approach will be used. Using either a rule-based expert system or an (artificial) neural net, the available information in an emergency will be classified, and the pre-processed response generated on this basis. The classification into standard situations, and the association of standard responses with these situations, will be based on the large volume of simulation experiments generated in the project. Automatic learning algorithms such as ID3 will be used to generate simple production rules, and a neural net can be trained with the simulation examples.
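
A sketch of the classification idea, using a decision tree with an entropy criterion (scikit-learn's CART implementation, here standing in for ID3) trained on archived simulation experiments so that an incoming situation can be mapped to a pre-processed standard response; the features, the labelling rule, and the training data are all invented for the example.

    # Illustrative situation classification; features, labels and data are invented.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(3)
    # features per simulated scenario: [release rate, wind speed, distance to town km]
    X = rng.uniform([0, 1, 0.5], [10, 10, 20], size=(500, 3))
    # toy labelling rule used only to create training data for this sketch
    y = np.where((X[:, 0] / X[:, 1]) * (5.0 / X[:, 2]) > 1.0, "evacuate", "shelter")

    tree = DecisionTreeClassifier(criterion="entropy", max_depth=3).fit(X, y)
    print(export_text(tree, feature_names=["release", "wind", "distance"]))
    print(tree.predict([[8.0, 2.0, 3.0]]))   # classify a new emergency situation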

Again, this approach is only possible through the use of HPC, which provides the means for the generation, and subsequent analysis, of a large set of alternatives. Like the Monte-Carlo approach to uncertainty analysis, it is well suited to a task-parallel implementation on distributed parallel, but also massively parallel, machines.

