School of Information Technology and Electric Engineering, The University of Queensland, Brisbane, Qld 4072, Australia
With the movement of magnetic resonance imaging (MRI) technology towards higher field (and therefore frequency) systems, the interaction of the fields generated by the system with patients, healthcare workers, and internally within the system is attracting more attention. Due to the complexity of the interactions, computational modeling plays an essential role in the analysis, design, and development of modern MRI systems. As a result of the large computational scale associated with most of the MRI models, numerical schemes that rely on a single computer processing unit often require a significant amount of memory and long computational times, which makes modeling of these problems quite inefficient. This paper presents dedicated message passing interface (MPI), OPENMP parallel computing solvers for finite-difference time-domain (FDTD), and quasistatic finite-difference (QSFD) schemes. The FDTD and QSFD methods have been widely used to model/analyze the induction of electric fields/currents in voxel phantoms and MRI system components at high and low frequencies, respectively. The power of the optimized parallel computing architectures is illustrated by distinct, large-scale field calculation problems and shows significant computational advantages over conventional single processing platforms.
1. Introduction
Recent progress in MRI
superconducting technology has lead to a considerable increase in human
exposure to very large static magnetic fields of up to several Tesla. A
typical commercial MRI scanner has a main field strength between 0.5 Tesla and
4 Tesla (in the region of uniformity). However, MRI systems above 4 Tesla, such
as 7 Tesla and 9.4 Tesla, are currently used for research [1–3]. Motion of patients and occupational workers
through these fields can cause the induction of currents in the body [4, 5]. In
addition, new imaging sequences demand larger amplitudes and switching rates in
magnetic field gradients, thus increasing the likelihood of peripheral nerve
stimulation (PNS) [6–9]. When
radiofrequency fields are transmitted to excite a spin ensemble during imaging,
electromagnetic energy is coupled to the tissue and deposited within, which can
cause regional temperature elevations within the patient, thus leading to
possible tissue/cell injury [10, 11]. Apart from interacting with the patient,
electromagnetic fields produced by the system also couple to the conducting
materials in the MRI system to induce eddy currents that degrade image quality
[12]. In practice, it is difficult to measure the unperturbed fields/currents
in the human numerical prediction of the induced fields, can then be very
useful for the evaluation/design of electromagnetic devices. The aim of this
study is to improve the performance of the conventional (quasistatic)
finite-difference method, which has been extensively used to model the
induction of electric fields/currents in inhomogeneous voxel phantoms of man
and animal when exposed to oscillating or static magnetic field sources. The
method has the advantages of simplicity
and can model any shaped object and various excitation sources.
Numerical computational techniques are indispensable
components for analyzing,
predicting, and obtaining approximate solutions to the real world problems. The finite-difference time-domain (FDTD) method is well suited to high-frequency electromagnetic analyses due
to its simplicity and efficiency in wave modeling and the ability to handle field-material interactions
and nonlinear phenomena [13]. In contrast, the quasistatic finite-difference
(QSFD) method is ideal for linear low-frequency problems with time-harmonic
dependency. One example of this is
the modeling of the interaction of magnetic fields due to MRI gradients and
main superconducting coils with the patient or occupational worker [14–16].
In
recent years, we have developed a series of single processor-based
finite-difference schemes [14–24], which can be
used to simulate a wide range of problems related to MRI and other
environments. Our long term aim is to generate a complete temporal model of an
MRI system including all the field generating units, passive system components,
and an electrical, voxel model of the patient. A better understanding of the
complex temporal interaction of the fields within the patient and the system
itself can provide useful insight into system design.
Often,
numerical modeling of electromagnetic field—material
interactions in MRI requires high-spatial resolution, which imposes a
significant computational burden. In solving
large-scale-high-resolution (LSHR) models, single processor-based methods are
limited and often incapable of managing the required large memory and
computational time requirements. Therefore, to improve the performance of these
finite difference solvers, it is necessary to further explore computational
strategies such as parallelization. Parallel environments such as the message
passing interface (MPI), OPENMP, and
LAM/MPI have been used in large-scale computational problems [25–27]. Due to the
geometrical topology of these problems, finite-difference methods and related hybrid
algorithms are very adaptable to parallel computing frameworks, in which the
computing task is divided and assigned to many processors with distributed or
shared memory allocations. This paper outlines MPI and OPENMP parallel
computing solvers dedicated to the FDTD and QSFD method, respectively. In the
case of the QSFD scheme, a parallelized conjugate gradient technique is applied
to solve the sparse matrix
equations typical in these situations. A variety of LSHR problems has been
investigated using parallel schemes. Compared to previous single processing
algorithms, the proposed parallel platforms offer significant advantages in
terms of improved numerical convergence and reduced computing time/memory
usage. Some of the problems have been solved at high-spatial
resolution, which is beyond the capability of single processing methods.
2. Methodology
2.1. Parallel Cartesian and Cylindrical FDTD Methods
Maxwell’s equations can be expressed in Cartesian or cylindrical
coordinate systems with components of the magnetic and electric field intensity formulated in a general form [28], as given by the following
expressions: where and for Cartesian (cylindrical) coordinates .
The variable is the radial displacement [in meters]
employed in the cylindrical FDTD method. For the Cartesian FDTD method, for all values of .
In the discrete FDTD computational domain, the dimensions of the Yee cell
are defined by and .
The relative permeability ,
relative permittivity and material conductivity are assigned at the center of the Yee cell.
Parameter signifies the impressed current density [in ]. The FDTD update coefficients are space-dependent and can be written
as follows: where is the angular frequency. The abbreviations RF and LF denote
radiofrequency and low frequency, respectively. The parameter is the FDTD time step, which in general
coordinate notation, is limited by the Courant-Friedrich-Levy (CFL) stability
condition where and are the permeability and permittivity of free
space, respectively, is the safety coefficient that ensures
numerical stability, and is the minimal radial component to be studied
(here: ). In the Cartesian coordinate system, .
During the updating of the E-field components, the coefficients are linear for
the RF case and nonlinear for modeling exponential
decay of propagating waves inside good conductors at LF.
In LF applications, the conventional FDTD method suffers from prohibitively
long computational time [20]. The weakly coupled-Maxwell’s equations can be
adapted to the low-frequency regime by downscaling the speed of light constant,
which permits the use of larger time steps while maintaining the validity of
the CFL stability condition. In our recent study, this modification was
accomplished by scaling up the permittivity
of free space by a scaling factor α: .
It is clear that when ,
the conventional FDTD methodology for RF applications is obtained. In the cylindrical formulations, the
numerical singularity associated with the polar axis needs to be
considered. A series expansion provides an approximation in the radial
direction that satisfies regularity conditions. The combination of cylindrical
and Cartesian FDTD methods for both LF and HF applications can be particularly
powerful. For instance, the cylindrical FDTD method is particularly suitable
for modeling of problems with cylindrical symmetry such as the generation of
eddy currents in a conducting cryostat vessel during magnetic field gradient
pulsing or the interaction of EM fields produced by radiofrequency resonators
and surrounding system materials. The Cartesian FDTD method allows one to
evaluate the electromagnetic wave propagation through tissue-equivalent body
models. These FDTD algorithms can be easily adapted to a parallel
architecture with the MPI library. With its reputed stability, optional
communication routines, and robust compatibility, MPI is considered to be one
of the prominent environments to support parallel computing.
In FDTD, the problem space has to be specifically defined before the
field calculations commence. In the parallel framework, the
computational space is divided into several domains of approximately equal size
(see Figure 1) and each domain is managed by one process. All processes run on several processors and execute the same
program, while each process operates on memory (computational) domain that is
specifically allocated to it. Therefore, mutual access to the memory among the
parallel processors is often not straightforward and requires management. With
FDTD, additional memory can be assigned to boundary regions of computational
subdomains to maintain a smooth transition between neighbouring EM fields on
these interfaces. In addition, communication routines can be applied between
the neighbouring processes to transmit the E- and H-field components with the
consideration of the half time-step difference of these two fields to maintain
unified electromagnetic field propagation throughout the entire computational
space. Within the MPI frame, a load-balancing scheme is often desired to manage
the task distribution in the processing network to improve the computational
efficiency.
Figure 1: Schematic representation of a parallel network with 125 process domains; transmission of field data between neighbouring processes is indicated.
2.2 Parallel BiCG Method for QSFD Scheme
At low-source
frequencies, where the dimensions of the material medium are small compared to
the wavelength, the induced fields can be treated quasistatically. According to
Faraday’s law, the electric field in a conducting sample is induced by a time
varying magnetic field due to a source. In terms of electric and magnetic potentials,
the total electric field inside the body model can be split into primary field and secondary field , according to Here, denotes the vector magnetic
potential due to the source, and is the scalar electric
potential. The primary electric field causes a flow of current in a conductive sample with
the conductivity . Any conductivity differences along the path of the current,
including the air-tissue interfaces cause nonuniformity of accumulating
electric charges, giving rise to scalar potential , the negative gradient of which is the secondary field , which causes a flow of current .
Based on our
implementation of QSFD method, the computation of the electric fields is given
by the governing surface integral equation where represents the surface of the integral region and is the conductivity of the
sample. The governing equation (5) is subject to a boundary condition of the
Neumann type: which indicates
that the normal component of the induced eddy current on the tissue-air boundary is
zero inside the body. To calculate the scalar potential , we divide the computational space into a large number of cubic
cells and then (5) is approximated for each elementary cell. After
discretization and rearrangement, the scalar potential for cell can be expressed as in which
subscripts and indicate the cell indices, subscript indicates two faces (0 or 1) in directions, respectively, is the unit vector normal to the cell faces, is the cell size, and is the local harmonic
averaged conductivity. To build the matrix formation, we transform (7) into Equation (8) can then
be expressed as a linear relation in the matrix form of , where
where is the coefficient matrix, is the scalar potential solution we seek, and is the known source vector matrix of
form is the number of unknowns and appears
here as matrix dimension. Assuming
the problem scale is ,
all 3-dimensional elements
are indexed as .
Here, represents
the linear coefficients between elements and , and the matrix elements for
row (see (9)), for instance, are defined as follows: The biconjugate gradient (BICG) method is useful here as the system
matrix is nonsymmetric and nonsingular. To solve the linear equations, a row-indexed
compact storage method [29, 30] has been utilized for the coefficient matrix . With the storage of about twice the
number of nonzero matrix elements, compared to other methods requiring as much
as three or five times. This scheme also provides sufficient auxiliary
information on the matrix and efficient matrix operations using continuous
memory. In this way, only nonzero elements have been processed. According to
(9), coefficients are also assigned to air elements that neighbor elements of
the problem space (i.e., those that have nonzero conductivities), thus
resulting in sparsity of matrix in
this linear system. BICG requires computing a matrix-vector product and a transpose product. In certain applications, the latter product
may be impossible to perform, for example, if the matrix is not formed
explicitly and the regular product is only given in operation form, as a
function call evaluation. In a parallel computing environment, the two
matrix-vector products can be performed simultaneously. However, in a
distributed memory environment, there will be extra communication costs
associated with one of the two matrix-vector products, depending upon the
storage scheme for . A duplicate
copy of the matrix alleviates this problem, at the cost of doubling the storage
requirements for the matrix. A shared memory parallel scheme OPENMP is applied
in this case. Without having duplicates on each subprocess, multiple threads
compute on one allocated memory. In that way, one can control the number of
processing threads to maximize efficiency.
2.3. Practical Applications
2.3.1. The RF FDTD Application Examples
(a) Rotary Phased Array Head Coil—Spherical Phantom
To demonstrate the performance of the parallel FDTD
method, we investigate a new type of receive-only, 4-element rotary phased
array (RPA) head coil that is designed to improve the sensitivity
profile at the center of the sample. Shown in Figure 2(a) is the modeled RPA
head coil loaded with a spherical phantom. To mimic a
simple head model, it was
assumed that the FDTD-modeled spherical phantom has a conductivity of 0.616 Sm−1 and permittivity of 48.8 Fm−1 [31]. Each coil element has a “paddle-like” structure
consisting of two main conductors. The main conductors of each coil element
will carry equal currents but in opposite directions, as a result, each coil
element will produce a plane of maximum sensitivity along the axis of the
cylindrical space. That
is, the plane of sensitivity of each coil element will cut radially or
diametrically through the cylindrical space thus improving the sensitivity at
the center of the head coil. The method of moments (MoM) available from a
commercial software package FEKO is firstly used to model the RPA head coil and
also to ensure that coil elements are decoupled, tuned, and matched to 85 MHz.
In addition, MoM is used to calculate the initial Huygence equivalent surface
sources. Once the initial Huygence equivalent surface sources are obtained and
mapped onto FDTD discrete domain, the iteration process commences and is
carried out until convergence is achieved. fields at the mid-section
of the spherical phantom, corresponding to each individual decoupled coil
element of the RPA head coil, are calculated and thereafter used for
calculating the signal intensity (SI) profiles. The calculated SI profiles are used for comparison with the MR images obtained
from the prototype RPA head coil. After obtaining positive simulation results,
a prototype RPA head coil was constructed. Shown in Figure 2(b) is the
constructed prototype RPA head coil loaded with a spherical phantom. The
prototype RPA head coil was tested in a Bruker S200 2T whole-body MRI system,
equipped with four receiver channels. Using a MSME pulse sequence with TR =
1000 msec, TE = 19.3 msec, and axial slices located at the mid-section
of the spherical phantom were acquired in parallel by each coil element of the
RPA head coil. For cylindrical FDTD modeling, both the conventional single processing
and the new parallel FDTD methods were employed and their computational
performances compared.
The parallel cylindrical FDTD method was implemented
on a 3-server cluster network, each with 2 XEON 3.6 GHz processors and 4 GB of
memory, using an MPI library and managed by a MPICH parallel computing
platform. In contrast, the single processing FDTD framework was evaluated on a 3 GHz/1 GB
RAM single-CPU machine, which presents identical computing performance on
servers. Due to multiuser stimulation, it is very hard to find the best
environment to test the single processor FDTD simulation in our server, so we
reported the comparison in this way.
Figure 2: (a) FDTD model setup of the RPA head coil and spherical phantom, (b) prototype RPA coil and (c) Signal intensities corresponding to each coil element evaluated with the FDTD method and obtained experimentally during MR imaging.
(b) Birdcage Resonator—Human Body Model
To further demonstrate the enhanced computing power,
we have subjected the parallel Cartesian FDTD algorithm to a very high-spatial
resolution of 1 mm. For this purpose, the tissue-equivalent male whole-body
models from the Brooks Air force Database [32] at spatial resolutions of 1 mm
and 4 mm were chosen. The models were subjected to radiofrequency fields
generated by a 16-rung whole-body birdcage (volume) resonator operating at 340 MHz
(8 Tesla). The dielectric properties of all body-identified tissue types are
frequency scaled and kept constant at the designated frequency [33]. At
high frequencies, it is impossible to have a global uniform flip angle in the
whole imaging slice, field focusing scheme is therefore implemented. The input
power of the birdcage resonator is numerically adjusted (a scaling is done to
the fields) to provide an averaged 90-degree flip angle in the subregion of the
abdomen and chest, such as the heart. The space domain enclosing the birdcage resonator, perfectly
matched layers (PML) and the human phantom was modeled with cells
at 1 mm and cells at 4 mm resolution. After the steady-state is obtained,
the specific absorption rate (SAR) is computed using the following equation: where is the coordinate of the voxel in designated
coordinate system, is the conductivity of tissue is the peak recorded electric field intensity , and is the specific mass density of tissue .
The models were implemented on an SGI high-performance
computer (HPC) with 10 computing nodes, where each node is equipped with 2
Itanium 2nd CPU 1.5 GHz and 4 GB of memory. The 1 mm- and 4 mm-resolution cases were
performed using 10 and 5 computing nodes, respectively.
2.3.2. The QSFD Application Example
When MRI
operators (radiographers or MR technicians) attend anxious, sedated, or
intubated patients during an MRI exam, they can be repeatedly exposed to switched
magnetic field gradients near the bore entrance at the imager end [5]. In this
study, we use the QSFD-BiCG method to evaluate the exposure of an MRI
occupational worker in the vicinity of the MRI machine. In particular, the
computational performance of the single processing versus parallel QSFD method
is assessed.
(a) Body Models
Anatomically, accurate
whole-body voxel models of male (BROOK) and female (NAOMI, [33]) are used to
represent the occupational workers. The female model was engaged at different
voxel resolutions to study the computational performances of the single processing
and parallel QSFD method. The scales of the female
model in Cartesian dimensions were and (in , and ) at model resolutions of 2 mm,
4 mm, 6 mm, and 8 mm, respectively. In a computationally extreme case, the male model was
studied at an ultra-high resolution of only 1 mm, whereby the scales of the
model in Cartesian dimensions were . The experimentally-determined
conductivity values by Gabriel et al. [35] of some forty body-identified tissue types were aptly
scaled to the frequency of interest and assigned to the appropriate body
voxels.
(b) Gradient Coils
An
actively-shielded whole-body assembly, consisting of the -, -, and -axis
gradient coil is used to evaluate the electric fields and current densities
induced in the worker during gradient switching [12, 35]. For mere
demonstration purposes, all three gradient coils have been designed to have
approximately same axial length of ~1.4 m, yet to remain radially separated, that is, the six coil layers (primary and
secondary) are allocated to different radii assuming a layer thickness
(including former, etc.) of 5 mm. The length of the gradient set is assumed to be
of the same length as the imager including the cryostat vessel. Table 1 lists
some important parameters while Figure 3(a) illustrates the geometries of the
three gradient coils (note that the -gradient coil is identical to the
-gradient coil when rotated by 90 degrees). It is assumed that each gradient
coil generates a normalized gradient field strength of 1 mT/m in the working
volume, and that the gradient coil currents are pulsed trapezoidally at the
frequency of 1 kHz and 100 microseconds rise time.
Table 1: Geometrical
properties of gradient coils.
Figure 3: (a) Three-dimensional plot of transverse and longitudinal gradient coil geometry (the primary and secondary windings are indicated in black and grey, resp.) and (b) corresponding spatial gradients of
. DSV is the diameter spherical volume.
(c) Computational Setup
Figure 4(a) illustrates a sketch of the position and orientation for
both body models near the gradient set end outside the MRI machine. An
iterative method based on the SOR algorithm [14–16] and a new parallel
BiCG method developed in this study are engaged to compare the numerical
convergence and computing performance. All numerical modeling with the parallel QSFD-BiCG method were performed on the aforementioned SGI
HPC with 10 computing nodes. Ten subthreads are assigned by the OPENMP scheme.
The single processor simulations were performed on the same computing platform
with only one computing node assigned.
Figure 4: (a) Typical body posture near the imager where gradient coils are assumed to be pulsed as described in the text, (b) axial slices illustrating the spatial distributions of induced electric field and current density and (c) comparison of the average current density along the superior-inferior axis between the QSFD-BiCG and QSFD-SOR methods indicating less than 1% of relative deviation.
3. Results and Discussion
In our experience,
numerical studies that involve high-resolution (~1 mm) models of human head,
pelvis, or other parts of the body have been published. There is, however, no
(or lack of) work that involves high-resolution whole-body models, such as 1 mm.
This work reports interesting results involving a 1 mm resolution whole-body
model, which previously was not possible on single CPU platforms.
3.1. Example RF FDTD Applications
(a) Current Carrying Loop—Spherical Phantom
Shown in Figure 2(c) are the SI profiles obtained using the MoM/FDTD method and experimentally acquired
MR images of the spherical phantom using the prototype RPA head coil. It can be
observed that both numerical and experimental results agree well and that the
sensitivity at the center of the spherical phantom, as anticipated, has been
improved using the RPA head coil. Both the single processing and parallel FDTD
methods require around 100 MB RAM and yield identical results. The computing
time using the nonparallelized FDTD method was around 182 seconds. In contrast,
the parallel FDTD method is around six times faster than the conventional
scheme.
(b) Birdcage Resonator—Human Body Model
Figure 5(a) shows the model setup involving body and the birdcage resonator. Figure 5(b) compares the specific absorption rate (SAR) at the body-model
resolution of 1 mm and 4 mm. Based on the current hardware settings, the
whole-body model at 1 mm spatial resolution required approximately 28 GB of
memory and 20 hours of computing time for 6000 FDTD iterations to get converged
results. The low-resolution simulation (4 mm) required around 500 MB of memory
and 1.8 hours of computing time. Compared to spatial SAR distribution at 4 mm,
the 1 mm resolution model provides us with clearly more detail and better tissue
specificity.
In terms of problem scale, the 1 mm resolution
case is around times larger than the 4 mm simulation. To each
processing node involved, the time consumption for 1 mm processing is only about
11 times larger than for the lower-resolution case, which illustrates the
higher-computational efficiency of the parallel scheme.
The single processing FDTD routine at 1 mm
resolution could not be implemented due to the very large amount of memory
required. In this sense, the parallelism cannot only improve the computational
efficiency of FDTD, but enable it to solve those problems that conventional
FDTD cannot. In the parallel FDTD computing framework, data communication
occupies much less time than the computation of E-H field components. Thus,
implementations with extensive task distribution can improve computational
efficiency, but the additional communication overhead may affect the overall
performance. In theory, the time complexity is inversely proportional to the
number of parallel threads, provided there is enough random access memory (RAM)
and CPU processing resources. With the inevitable communication overhead
imposed on the individual processes, however, execution efficiency is often
significantly diminished. A load-balancing scheme is applied in the parallelization to reduce the latency time between subprocesses which eventually
alleviates both the data conflicts and iterative latency in the data
transmission. This parallel FDTD scheme has remarkable computing advantages
over conventional single
processing methods, which could be also seen in many other parallel computing
scenarios though in different applications [25–27].
Figure 5: (a) Model setup involving the whole-body birdcage resonator and the body model, (b) coronal and sagittal profiles of in situ SAR spatial distributions in the male Brooks Airforce model (BROOK) at 1 mm and 4 mm voxel resolution.
3.2. Example QSFD Application
Figure 4(a) depicts the typical posture of the radiographer on the side
of the patient bed, which was used in the QSFD computation. Figure 4(b)
represents the spatial distributions of the in situ electric field and current density-induced in the female
model during the exposure to the combination of switched gradient coils. Figure
4(c) shows a comparison of the induced average current density versus
superior-inferior axis obtained using the parallel BiCG and SOR.
Although the large-scale model used is very inhomogeneous and the source
field varies dramatically in the spatial domain, both methods provided
remarkably similar simulation results in terms of resulting field magnitudes
and spatial distributions, with less than 1% relative difference. According to
the convergence performance results detailed in Table 2, the QSFD-BiCG method
clearly outperforms the SOR-based QSFD technique at all nominated resolutions
of the female body model.
Table 2: Computing performance.
According to Table 2, the coefficient matrix dimension grows
dramatically up to 8 million elements with 2 mm resolution, which brings about
tremendous computational burdens. Via the computing parallelization, both SOR
and BiCG methods have been improved, while the latter shows better convergence
performance.
Using the 1 mm resolution male body model, the parallel QSFD-BiCG method
took around 23 hours and 28 GB of RAM to evaluate in situ electric fields/currents. The number of elements in
matrix A was ~.
In contrast, the single processing equivalent was unable to perform this
particular simulation due to immense memory requirements. This example
demonstrates one of the clear advantages of the parallel QSFD computing
platform over its conventional single processor-based counterpart. Figures 6(a)
and 6(b) illustrate very fine details of induced electric field and current
density distributions for the male body model, respectively. According to the
simulation results, the largest electric fields and current densities are
located in the frontal left region of the body, as this is the part of the body
that is closest to the gradient coil conductors. With well-conditioned band
diagonal sparse coefficient matrices, the proposed parallel QSFD-BiCG scheme demonstrates
robust performance in the handling of low-frequency electromagnetic fields
problems. The BiCG method also shows efficient memory usage, as it only
consumes about 1/3 of memory compared to the standard SOR scheme.
Figure 6: Coronal, sagittal and axial distributions of (a) electric field and (b) current density induced in the 1mm-resolution male body model during gradient pulsing. Figure
5(a) illustrates the body posture of the model relative to the gradient set.
This level of high-resolution analyses is imperative and significant for
those interested in a variety of numerical problems such as but not limited to:
thermodynamic modeling involving a fine vascular structure, cardiac electric
field propagation, hyperthermia, eddy current modeling, occupational exposure
of workers to electromagnetic fields, and so forth. Most of these and other
related problems would benefit from employing the whole body in the computational
process, as this represents a more realistic physical scenario.
4. Conclusion
Computationally intensive numerical software has become necessary to
handle the increasing complexity of electromagnetic field problems in MRI,
particularly at high-field strengths. In this work, high-performance finite-difference
solvers architected within a parallel framework have been presented. The
potential of the optimized parallel scheme has been demonstrated in typical MRI
applications. The case studies indicate that the power of the software can enable
straightforward adaptation to applications involving optimization of system
components.
Acknowledgments
The financial support for
this project from the Australian Research Council and The Health and Safety
Executive (UK) (the project managed by MCL , UK)
is gratefully acknowledged.