Journal of Sensors

Volume 2015, Article ID 267462, 10 pages

http://dx.doi.org/10.1155/2015/267462

## Uncertainty Quantification in Application of the Enrichment Meter Principle for Nondestructive Assay of Special Nuclear Material

^{1}International Atomic Energy Agency, 1400 Vienna, Austria

^{2}Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

^{3}Pacific Northwest National Laboratory, Richland, WA 99354, USA

Received 5 May 2015; Accepted 18 June 2015

Academic Editor: Jesus Corres

Copyright © 2015 Tom Burr et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Nondestructive assay (NDA) of special nuclear material (SNM) is used in nonproliferation applications, including identifying SNM at border crossings and quantifying SNM at safeguarded facilities. No assay method is complete without “error bars,” which provide one widely used way to express confidence in assay results. NDA specialists typically partition total uncertainty into “random” and “systematic” components so that, for example, an error bar can be developed for the SNM mass estimate in one item or for the total SNM mass estimate in multiple items. Uncertainty quantification (UQ) for NDA has always been important, but greater rigor is needed and achievable using modern statistical methods. To this end, we describe the extent to which the Guide to the Expression of Uncertainty in Measurement (GUM) can be used for NDA. We also describe possible extensions to the GUM by illustrating UQ challenges in NDA that it does not address, including calibration with errors in predictors, model error, and item-specific biases. A case study is presented using gamma spectra and applying the enrichment meter principle to estimate the ^{235}U mass in an item. The case study illustrates how to update the ASTM International standard test method for application of the enrichment meter principle using gamma spectra.

#### 1. Introduction

As world reliance on nuclear energy increases, concerns about proliferation of materials that could be used for weapons also increase. Fissile nuclear materials can be detected and/or characterized by observing radiation released by fission, such as gamma-rays and neutrons [1]. Therefore, neutron and gamma detectors are deployed in many nonproliferation efforts such as cargo screening at border crossings and assay at facilities that process special nuclear material (SNM), which is the main application we consider.

Nondestructive assay (NDA) of items containing SNM uses calibration and modeling to infer item characteristics on the basis of detected radiation such as neutron and gamma emissions. For example, the amount of ^{235}U in an item can be estimated by using a measured net weight of uranium U in the compound and a measured ^{235}U enrichment (the ratio ^{235}U/U). Enrichment can be measured using the 185.7 keV gamma-rays emitted from ^{235}U by applying the enrichment meter principle (EMP), which we consider here as our case study [2].

Uncertainty quantification (UQ) for NDA has always been important, but it is now recognized that greater rigor is needed and is achievable by using modern statistical methods and by giving UQ a more prominent role in assay development and assessment. UQ is often difficult but, if done well, can lead to improvements in the assay procedure itself. Therefore, we describe the extent to which the Guide to the Expression of Uncertainty in Measurement (GUM) can be used for NDA [3–5]. This paper also takes steps toward better UQ for NDA by illustrating UQ challenges that are not addressed by the GUM. These challenges include item-specific biases, calibration with errors in predictors, and model error, especially when the model is a key step in the assay. A case study is presented using low-resolution NaI spectra and applying the enrichment meter principle to estimate the ^{235}U mass in an item. The case study illustrates how to update the current ASTM International standard test method for application of the enrichment meter principle using gamma spectra from a NaI detector.

The paper is organized as follows. Section 2 gives additional background on NDA and UQ for NDA. Section 3 describes the GUM. Section 4 is the EMP case study. Section 5 is a discussion and summary.

#### 2. Background on NDA and UQ for NDA

NDA is widely used in nuclear nonproliferation because most detectors are rugged and portable and so can be brought to the location of the item for an in situ measurement [6–9]. In contrast, destructive analytical chemistry assay (DA) methods such as mass spectrometry require that a sample from the item be brought to the instrument. Typically, NDA has smaller sampling errors but also tends to have larger overall errors than DA, because there is no item preparation step, although there are exceptions. Overall error includes all types of random and systematic error and describes the total variation around the measurand’s true value [10]. An error is systematic if it impacts or could impact more than one assay. For example, errors in estimated calibration model parameters lead to systematic errors for all assays made during the same calibration period with the same instrument. An error due to variation in the container thickness of an item is systematic to that item but is most likely to be random across items. We discuss container thickness as an example of item-specific systematic error in the EMP case study.

In NDA nonproliferation applications, items emit neutrons and/or gamma-rays that provide information about the source material, such as isotopic content. However, item properties such as density, which relates to neutron and/or gamma absorption behavior of the item, can partially obscure the relation between the detected radiation and the source material; this adds a source of uncertainty to the estimated amount of SNM in the item. One can express item-specific impacts on uncertainty using a model such as

$$CR = g(M, X_1, X_2, \ldots, X_N) + R, \tag{1}$$

where $CR$ is the item’s neutron or gamma count rate, $M$ is the item mass, and $X_1, X_2, \ldots, X_N$ are auxiliary predictor variables such as item density, source SNM heterogeneity within the item, and container thickness, which will generally be estimated or measured with error and so are regarded as random variables [6]. Regarding notation, we use capital letters to denote random variables. The random error term $R$ can include variation in background that cannot be perfectly adjusted for, Poisson counting statistics effects, and random effects related to estimating the counts in a spectral region that are associated with the particular source SNM. In the EMP case study, the spectral region centers on the 185.7 keV gamma full-energy peak that is the basis for estimating the ^{235}U enrichment in a sample.

In principle, the $X_k$ could be estimated for each item as part of the assay protocol. However, there would still be modeling error because the function $g$ must be chosen or somehow inferred, possibly using purely empirical data analysis applied to calibration data [6, 11], or physics-based radiation transport codes such as Monte Carlo N-Particle (MCNP [12]). Typically, only some of $X_1, X_2, \ldots, X_N$ will be measured as part of the assay protocol, as we illustrate in the EMP case study.

Readers familiar with errors in variables (also known as errors in predictors) might wonder if errors in variables techniques will be needed [13–16]. We consider errors in variables in the EMP case study. Readers familiar with Bayesian data analysis might wonder if the true item mass is to be regarded as a random variable, as would be done in a Bayesian approach. This paper regards the mass as a random quantity, so we use capital $M$ or capital $T$ (for true value).

##### 2.1. UQ for NDA

Perhaps surprisingly, a thorough approach for quantifying and reporting uncertainty does not yet exist even for the relatively simple and widely fielded EMP technique (which is our case study). As the complexity of the measurement system increases (such as instruments deploying multiple correction algorithms and operated in unattended mode [8]), exactly how to do effective UQ becomes less clear.

By comparison to UQ for NDA, UQ for DA is more mature. Space constraints do not permit a full comparison of how UQ is done in DA versus NDA. However, any such comparison must be made between similar quantities. For example, when samples are collected and DA results are reported, sampling uncertainty (how representative the samples are of the entire item) is often not carried through the entire calculation. In other cases, some uncertainties may not be evaluated well enough to propagate through to the total measurement uncertainty. For example, there are uncertainties because gamma measurements sample only the surface of an item: the item itself attenuates and absorbs gamma-rays emitted from its central region. Differences between DA and NDA arise primarily from the fact that, in DA, the sample is often modified (chemically treated) to match the analysis technique, allowing for more control of the measurement conditions, while, in NDA, the analysis technique is modified to match the item and measurement conditions. This means that NDA requires process knowledge in order to determine some components of the uncertainty, while DA requires process knowledge in order to prepare and measure the sample. Also, standards used in DA are much closer to the sample being measured than in NDA because it is not feasible to prepare a set of standards (isotopics, matrix, packaging, etc.) to fit all NDA measurement regimes. The DA and NDA communities both estimate uncertainty associated with certified standards [17]. Both communities also endorse sample exchange programs in which multiple laboratories measure the same measurand, providing data for a “top-down” approach to UQ. This paper is concerned with a “bottom-up” approach to UQ for NDA, where each estimated quantity in the assay procedure is assessed for its contribution to the estimate of the overall uncertainty.

NDA is used for many material types, including well-characterized and consistent product material, and poorly characterized and inconsistent scrap and waste. Particularly for the less well-characterized and/or inconsistent material types, some type of model is used to adjust radiation count rates as in (1). Therefore, uncertainty in the model itself (how well the item conforms to the model assumptions) can be an important and difficult-to-characterize source of uncertainty.

UQ for NDA typically needs to allow for both an overall bias and an item-specific bias, as well as the catch-all “random” error. Therefore, one of the simplest but most useful error models is

$$M_{ij} = T_i + S + S_i + R_{ij}, \tag{2}$$

where $M_{ij}$ is the $j$th measurement on the $i$th item, $T_i$ is the unknown true value, $S$ is the overall bias, $S_i$ is the item-specific bias, and $R_{ij}$ is the random error [10]. Although not shown explicitly, the GUM [5] endorses a reduced version of (2) in top-down UQ (not our focus here), given by $M_{ij} = T_i + S + R_{ij}$, which redefines $R_{ij}$ to include $S_i$ and the $R_{ij}$ in (2). Because item-specific systematic error propagates across items in the same way that random error does, this is sometimes adequate. However, it has been demonstrated that (2) is needed in some NDA settings [8, 10, 18]. To complete specification of the measurement model given by (2), one further assumes that $S_i$ and $R_{ij}$ each have mean zero, and that the variance of $S_i$, denoted $\sigma_{S_i}^2$, and the variance of $R_{ij}$, denoted $\sigma_R^2$, can be estimated from data [8, 10] in top-down UQ.
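The way the three error terms of an additive model of this kind (measurement = true value + overall bias + item-specific bias + random error) propagate to a total over several items can be checked with a short simulation. The sketch below is illustrative only: it assumes one measurement per item, treats the overall bias as a single zero-mean random draw shared by all items, and uses hypothetical standard deviations and function names that do not come from the paper.

```python
import numpy as np

def total_error_variance(n_items, sd_overall, sd_item, sd_random):
    """Analytic variance of the error in a total over n_items (one measurement each).

    The overall bias is shared by every item, so it adds linearly (n * S);
    item-specific bias and random error are independent across items.
    """
    return (n_items * sd_overall) ** 2 + n_items * (sd_item ** 2 + sd_random ** 2)

def simulate_total_errors(n_items, sd_overall, sd_item, sd_random, n_rep, seed=0):
    """Simulate the error in the total (sum of measured minus sum of true values)."""
    rng = np.random.default_rng(seed)
    s = rng.normal(0.0, sd_overall, size=(n_rep, 1))        # one draw per replicate, shared by all items
    s_i = rng.normal(0.0, sd_item, size=(n_rep, n_items))   # one draw per item
    r = rng.normal(0.0, sd_random, size=(n_rep, n_items))   # one draw per measurement
    return (s + s_i + r).sum(axis=1)

# Hypothetical values chosen only for illustration.
analytic = total_error_variance(20, 0.5, 1.0, 1.0)
simulated = simulate_total_errors(20, 0.5, 1.0, 1.0, n_rep=50_000).var()
print(analytic, simulated)
```

In this setting the item-specific bias and the random error enter the variance of the total in the same way, which is why the reduced model is sometimes adequate for totals. The two terms behave differently only when an item is measured more than once, because repeated measurements average down the random error but not the item-specific bias.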

Model uncertainty impacts the overall bias $S$ and the item-specific bias $S_i$ in (2). One component of model error is model parameter estimation error, which is addressed in the EMP case study. Uncertainty in nuclear data, such as attenuation coefficients and emission intensities, is a special case of model parameter estimation error, which is addressed in [19]. Also addressed in [19] is model error itself. For example, model-based estimates of detector response functions, derived from a radiation transport model such as MCNP [12], are used in one option to infer the relative abundance of the isotopes of plutonium in samples using gamma spectroscopy. References [6, 11] also consider model error in the simpler setting of fitting multiple candidate models to the same calibration data.

#### 3. The GUM

In metrology, uncertainty is a parameter that characterizes the dispersion of the *estimates* of a true quantity known as the measurand, and the GUM describes one main approach to estimate uncertainty. The GUM did not attempt to be comprehensive, so it is not surprising that subsequent specialized supplements have been developed, mostly influenced by UQ needs for DA. In the case of NDA, there are ASTM guides for every commonly used NDA method. The GUM, its eight technical appendices, and the supplements to the GUM are too lengthy to fully review here. But briefly, the main technical tool is a first-order Taylor approximation to the measurand equation

$$Y = f(X_1, X_2, \ldots, X_N), \tag{3}$$

which relates input quantities $X_1, X_2, \ldots, X_N$ to the measurand $Y$. Some of the input quantities can be estimates of other measurands, or of calibration parameters, so the measurand equation is quite general. Note that (3) does not include model error, which is sometimes needed [4, 11, 18, 19]. Also, note that (3) is aimed primarily at bottom-up UQ, using either steps in the assay method and uncertainties in the quantities $X_1, X_2, \ldots, X_N$ or using calibration data (see the EMP case study in Section 4). However, supplements to the GUM describe analysis of variance in the context of top-down UQ using measurement results from multiple laboratories and/or assay methods to measure the same measurand. The GUM does not explicitly present any measurement error models such as (2) but only considers the model for the measurand (3). However, the GUM endorses the notion of a measurement error model such as (2) in its top-down UQ. Note that (1) can be expressed as a model for the measurand (the item mass $M$), $M = f(X_1, X_2, \ldots, X_N)$, by algebraic rearrangement and redefining the random error term $R$ of (1) so that it can be included among the $X_i$ and also including the measured CR among the $X_i$. Although it is beyond our scope here, one could conceivably impose the effects of model error and/or measurement bias in the probability distributions for some of the $X_i$ and then reexpress (3) in terms of a measurement error model such as (2).
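As a rough illustration of the first-order Taylor approach, the sketch below propagates input uncertainties through a measurand equation of the form $Y = f(X_1, \ldots, X_N)$ using a numerically estimated gradient $c$ and the usual first-order result $u^2(y) \approx c^{\top} \Sigma_X c$, where $\Sigma_X$ is the covariance matrix of the inputs. The function name `propagate_first_order`, the finite-difference step, and the numerical inputs are assumptions made for this example; the measurand used here (uranium mass times enrichment) is the simple product mentioned in the Introduction, with hypothetical values.

```python
import numpy as np

def propagate_first_order(f, x, cov, rel_step=1e-6):
    """Return (y, u_y): the estimate f(x) and its first-order standard uncertainty.

    The gradient of f at x is approximated by central finite differences.
    """
    x = np.asarray(x, dtype=float)
    cov = np.asarray(cov, dtype=float)
    grad = np.empty_like(x)
    for k in range(x.size):
        h = rel_step * max(abs(x[k]), 1.0)
        up, down = x.copy(), x.copy()
        up[k] += h
        down[k] -= h
        grad[k] = (f(up) - f(down)) / (2.0 * h)
    return f(x), float(np.sqrt(grad @ cov @ grad))

def u235_mass(x):
    """Measurand: 235U mass = uranium mass (g) * enrichment (mass fraction)."""
    return x[0] * x[1]

# Hypothetical inputs: 1000 g U with SD 1 g; enrichment 0.045 with SD 0.00045.
x_hat = [1000.0, 0.045]
cov_x = np.diag([1.0 ** 2, 0.00045 ** 2])  # inputs assumed uncorrelated
y_hat, u_y = propagate_first_order(u235_mass, x_hat, cov_x)
print(y_hat, u_y)
```

For this product the first-order result is about 0.45 g on a 45 g estimate, dominated by the enrichment term. The sketch does not account for model error or for nonlinearity beyond first order, which are among the limitations discussed in the text.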

The purpose of a measurement is to provide information about the measurand, such as the SNM mass. Both frequentist and Bayesian viewpoints are used in estimating the measurand and in characterizing the estimate’s uncertainty. Elster [4] and Willink [20] point out that the GUM invokes both Bayesian and frequentist approaches in a manner that is potentially confusing. To modify the GUM so that a consistent approach is taken for all types of uncertainty, Bich [3] suggests an entirely frequentist approach, while others suggest an entirely Bayesian approach. Bich [3] also points out confusion between frequentist and Bayesian terminology and approaches in the GUM, which is one reason he believes it would be useful to revise the GUM. No matter which approach is used, making it clear which quantities are viewed as random and which are viewed as unknown constants will avoid needless confusion. However, the real challenges involve choosing a likelihood for the data, a model to express how the measurand is estimated, and a model to describe the measurement process. In NDA, a model is often used to adjust test items to calibration items. These challenges are present in both frequentist and Bayesian approaches.

Ambiguities in the GUM arise for at least three reasons [3, 4, 20]. (1) The GUM divides the treatment of errors into those evaluated by type A evaluation (traditional data-based empirical assessment) and those addressed by type B evaluation (expert opinion, experience with other similar measurements). However, type B evaluations are primarily Bayesian (degree of belief) without explicitly stating so (and need not be), while type A evaluations are primarily frequentist (and need not be). The jargon used in describing type B evaluations implies that the true value has a variance (a Bayesian view based on quantification of our state of knowledge). The jargon used in describing type A evaluations is frequentist, with statements such as $y \sim N(\mu, \sigma)$, with the interpretation that $y$ varies randomly around the fitted quantity $\mu$, where $\sigma$ is the known measurement error standard deviation. We endorse either view, when clearly explained, but typically write $y \sim N(\mu, \hat{\sigma})$, where the hat notation conveys that the standard deviation is an unknown parameter that must be estimated, so $\hat{\sigma} \approx \sigma$. (2) The GUM uses the same symbol for a measurement result and for a true value, which also confuses the frequentist and Bayesian views. (3) There is vague use of the term “quantity.” In addition, although the GUM attempted to clarify confusion between “error” and “uncertainty,” it did not clearly use the term “error” when measurement error (which has a sign, positive or negative) was meant. Willink [20] aims to resolve these ambiguities by paying attention to notation and jargon, being careful to separate Bayesian from frequentist views, and pointing out a confusion of true values with measurements of true values. A recent issue of Metrologia devoted several papers (see, e.g., [3, 4]) to the success of the GUM and to features of the next version(s) of the GUM. Clearly delineating which terms are random and which are fixed but unknown is important.

Also, the GUM does not explicitly address calibration; however, because calibration is almost never a completely straightforward application of ordinary regression, we agree with [4] that UQ for calibration deserves attention, as we illustrate next.

#### 4. Case Study: Enrichment Meter Principle

##### 4.1. EMP Description

The enrichment meter principle (EMP) aims to infer the fraction (enrichment) of ^{235}U in U by measuring the count rate of the strongest-intensity direct (full-energy) gamma-ray from decay of ^{235}U, which is emitted at 185.7 keV [2, 18]. The EMP makes three assumptions: that the detector field of view into each item is identical to that in the calibration items; that the item is homogeneous with respect to both the ^{235}U enrichment and chemical composition; and that the container attenuation of gamma-rays is identical or at least similar to that in the calibration items [2], so that empirical correction factors have modest impact and are reasonably effective. If these three assumptions are met, the known physics implies that the enrichment of ^{235}U in the U is directly proportional to the count rate of the 185.7 keV gamma-rays emitted from the item. It has been shown that, under good measurement conditions, the EMP can have a random error relative standard deviation of less than 0.5% and bias of less than 1%, depending on the detector resolution, stability, and extent of corrections needed to adjust items to calibration conditions.
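To give a feel for why the container-attenuation assumption matters, the following sketch uses simple narrow-beam exponential attenuation to estimate the relative bias in an uncorrected EMP result when an item's wall thickness differs from that of the calibration items. This is not the correction procedure of the paper or of the ASTM standard. The function name, the mass attenuation coefficient (0.15 cm²/g), and the wall density (7.9 g/cm³) are assumed, illustrative values for a steel-like wall near 186 keV and should not be read as evaluated nuclear data.

```python
import math

def relative_bias_from_wall(thickness_cm, cal_thickness_cm,
                            mu_rho_cm2_per_g, density_g_per_cm3):
    """Relative bias in an uncorrected enrichment estimate due to a wall mismatch.

    Assumes the count rate is proportional to enrichment and is reduced by
    exp(-mu * t) in a wall of thickness t (narrow-beam attenuation).
    """
    mu_linear = mu_rho_cm2_per_g * density_g_per_cm3  # linear attenuation coefficient, 1/cm
    return math.exp(-mu_linear * (thickness_cm - cal_thickness_cm)) - 1.0

# Hypothetical case: item wall 0.21 cm thick, calibration items 0.20 cm thick.
bias = relative_bias_from_wall(0.21, 0.20, mu_rho_cm2_per_g=0.15, density_g_per_cm3=7.9)
print(f"{100 * bias:.2f}%")
```

Under these assumed values, a 0.1 mm thicker wall lowers the uncorrected result by roughly 1.2%. Because the wall thickness is fixed for a given item but varies from item to item, the resulting error is an example of an item-specific bias.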

##### 4.2. EMP Calibration

Calibration is performed using standard certified reference materials of “known” enrichment. Here, “known” is in quotes because both NDA and DA communities provide uncertainty statements for primary standard reference materials [17]. Typically “uncertainty” is defined as the total error (random plus systematic) standard deviation, of an item (a test item or a standard reference material) and is sometimes expressed as . Corrections are made for attenuating materials between the uranium-bearing material and the detector and for chemical compounds different from the reference materials used for calibration. Detectors of any resolution (such as scintillators or semiconductors) can be used, and the EMP method can be used for the entire range of ^{235}U fraction (enrichment) as a weight percent, from 0.2% ^{235}U to 97.5% ^{235}U.

There are several analysis options for EMP data. First, regarding the measurement itself, one must choose how to estimate the net peak area count rate associated with 185.7 keV gamma-rays. Then, one must choose how to deal with errors in the estimated count rate of 185.7 keV gamma-rays during both calibration and testing. Finally, one must determine the impact on uncertainty of model departures, such as variations in attenuation of the 185.7 keV gamma-rays due to variations in container thicknesses, for example.
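The effect of errors in a predictor can be seen in a small simulation. The sketch below is a generic illustration of the classical measurement-error result for straight-line regression, not the authors' analysis: when the predictor is observed with independent additive error, the ordinary least squares slope is attenuated toward zero by approximately the factor $\sigma_x^2 / (\sigma_x^2 + \sigma_u^2)$, where $\sigma_x^2$ is the variance of the true predictor and $\sigma_u^2$ is the variance of its measurement error. All names and numerical values are hypothetical.

```python
import numpy as np

def naive_slope_with_noisy_predictor(beta, sd_x, sd_u, sd_e, n, seed=0):
    """Fit y on an error-contaminated predictor by ordinary least squares."""
    rng = np.random.default_rng(seed)
    x = rng.normal(10.0, sd_x, n)             # true predictor values (unobserved)
    w = x + rng.normal(0.0, sd_u, n)          # observed predictor = truth + measurement error
    y = beta * x + rng.normal(0.0, sd_e, n)   # response depends on the true predictor
    slope, _intercept = np.polyfit(w, y, 1)   # coefficients are returned highest degree first
    return slope

beta, sd_x, sd_u = 2.0, 3.0, 1.5
slope = naive_slope_with_noisy_predictor(beta, sd_x, sd_u, sd_e=0.5, n=100_000)
expected = beta * sd_x ** 2 / (sd_x ** 2 + sd_u ** 2)  # classical attenuation of the slope
print(slope, expected)
```

With these values the naive slope is close to 1.6 rather than the true 2.0. Whether such an effect is large enough to matter in a given calibration depends on how the predictor error compares with the spread of the predictor values across the calibration standards.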

##### 4.3. EMP Examples

Regarding the measurement data, Figure 1 plots the counts in energy bins near the 185.7 keV energy from an item measured at Los Alamos National Laboratory using a relatively low-resolution hand-held NaI detector. Notice that the peak occurs at an apparent energy slightly higher than 185.7 keV. This is not unusual; it can occur because of energy calibration drift or because of interfering gamma energies from other isotopes slightly above 185.7 keV. One could use some type of peak fitting or background fitting to improve the estimate of the 185.7 keV count rate. In our analyses here, we use an option that fits the known enrichment in each of several standards to observed counts in a few energy channels near the 185.7 keV energy as the “peak” region and to the counts in a few energy channels just below and just above the 185.7 keV energy to estimate background, expressed as

$$E = \beta_1 P + \beta_2 B + R, \tag{4}$$

where $E$ is the enrichment, $P$ is the observed peak count rate near 185.7 keV, $B$ is the observed background count rate in a few neighboring energy channels near the 185.7 keV peak region, and $R$ is random error (see Figure 1). Calibration data are used to estimate $\beta_1$ and $\beta_2$. One could constrain the estimates of $\beta_1$ and of $\beta_2$ to be equal in magnitude (and opposite in sign) in the case where the same number of energy channels is used for both the peak and background. That constraint would correspond to assuming a constant (nonsloping) background throughout the peak region, which does not appear to be appropriate for data such as those in Figure 1. Therefore, in this example, we do not force the constraint $\beta_2 = -\beta_1$.
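A two-coefficient calibration of this kind, with enrichment modeled as a linear combination of the peak-region and background-region count rates and no intercept, can be fit by ordinary least squares. The sketch below uses synthetic data only: the enrichment values, count rates, noise level, and the function name `fit_two_window_calibration` are hypothetical, and the random error is added to the enrichment for simplicity, so errors in the count rates themselves are not modeled here.

```python
import numpy as np

def fit_two_window_calibration(peak_rate, bkg_rate, enrichment):
    """Least squares fit of enrichment = b1 * peak_rate + b2 * bkg_rate (no intercept)."""
    design = np.column_stack([peak_rate, bkg_rate])
    coef, *_ = np.linalg.lstsq(design, enrichment, rcond=None)
    return coef  # array [b1, b2]

# Synthetic calibration standards (hypothetical values, wt% 235U).
rng = np.random.default_rng(1)
true_enrichment = np.array([0.5, 1.0, 2.0, 3.0, 5.0, 10.0, 20.0, 50.0, 90.0])
bkg_rate = rng.uniform(40.0, 60.0, size=true_enrichment.size)  # counts/s in background channels
k, c = 12.0, 0.8   # assumed counts/s per wt%, and fraction of bkg_rate lying under the peak
peak_rate = k * true_enrichment + c * bkg_rate                 # counts/s in peak channels
declared = true_enrichment + rng.normal(0.0, 0.01, size=true_enrichment.size)

b1, b2 = fit_two_window_calibration(peak_rate, bkg_rate, declared)
print(b1, b2)

# Apply the fitted calibration to a hypothetical test item.
test_enrichment = b1 * 300.0 + b2 * 50.0
```

Because the synthetic background under the peak is only 0.8 times the rate in the neighboring channels, the fitted coefficients come out near 1/12 and -0.8/12 and are not equal in magnitude, which mirrors the reason given in the text for leaving the two coefficients unconstrained.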