Abstract

This study couples an iterative sparse coding in a transformed space with an ensemble smoother with multiple data assimilation (ES-MDA) for providing a set of geologically plausible models that preserve the non-Gaussian distribution of lithofacies in a channelized reservoir. Discrete cosine transform (DCT) of sand-shale facies is followed by the repetition of K-singular value decomposition (K-SVD) in order to construct sparse geologic dictionaries that archive geologic features of the channelized reservoir such as pattern and continuity. Integration of ES-MDA, DCT, and K-SVD is conducted in a complementary way as the initially static dictionaries are updated with dynamic data in each assimilation of ES-MDA. This update of dictionaries allows the coupled algorithm to yield an ensemble well conditioned to static and dynamic data at affordable computational costs. Applications of the proposed algorithm to history matching of two channelized gas reservoirs show that the hybridization of DCT and iterative K-SVD enhances the matching performance of gas rate, water rate, bottomhole pressure, and channel properties with geological plausibility.

1. Introduction

Calibration of a subsurface system is an essential process to forecast fluid behaviors in a variety of geoenvironments such as aquifers, geothermal reservoirs, and petroleum reservoirs. History matching is an inverse process to find reservoir model parameters honoring observations by integration of static (e.g., core, logging, and seismic) and dynamic data (e.g., oil and gas rate, water cut, bottomhole pressure, and subsidence/uplift) [1]. Ensemble-based data assimilation approaches have been successfully utilized for history matching to provide subsurface models that are well conditioned to observations. For example, the ensemble Kalman filter (EnKF) [25], ensemble smoother (ES) [6, 7], and ES with multiple data assimilation (ES-MDA) [810]. However, the ensemble-based data assimilation approaches have difficulty in preserving non-Gaussian distributions of model parameters such as lithofacies [1114]. In the ensemble-based data assimilation approaches, model parameters lose the non-Gaussianity of their original distributions that are initially constrained and the distributions of the model parameters get close to Gaussian ones.

Shin et al. [14] and Zhou et al. [15] suggested using normal score transform in the ensemble-based data assimilation approaches to preserve non-Gaussian distributions of model parameters. Non-Gaussian model parameters are transformed into Gaussian model parameters using normal score transform, and then finally, updated model parameters are backtransformed. Moreover, transformation can take advantage of parameterization if the number of essential transformed parameters is smaller than the number of original parameters in terms of saving computational cost and figuring out main features of parameters. For example, the discrete cosine transform (DCT) [1621], fast Fourier transform [22, 23], grid connectivity transform [24], level set [13, 25], and sparse geologic dictionaries [26, 27] have been applied to reservoir characterization. In particular, Fourier transform-based methods such as DCT are capable of capturing essential traits such as main shapes and patterns of a facies channel reservoir [16, 17] but reveal a deficiency in describing a crisp contrast among different facies because of data loss from inverse transformation [28].

Sparse coding refers to the process of computing representation coefficients based on the given signal and dictionaries [29]. In sparse coding, the dictionaries indicate groups of features capable of brief expressions to represent various phenomena in the environment [30]. In geological modeling, sparse geologic dictionaries are used to represent models with sparse linear combinations of basis vectors that are essential geologic features of a reservoir [29]. Extracting essential geologic features and reducing the number of reservoir variables can be accomplished using sparse coding, thereby facilitating ensemble-based history matching [27]. Aharon et al. [29] showed the efficacy of K-singular value decomposition (K-SVD) resulting from the accelerated convergence for image reconstruction, which led Sana et al. [31] to build an archive of essential geologic features called sparse geologic dictionaries from thousands of static reservoir models using K-SVD and calibrate reservoir models with the dictionaries using EnKF. One drawback of K-SVD is its large size of sparse geologic dictionaries. References showed that sparse coding with a transformation of parameter space could reduce both computational complexity and costs that are simultaneously required for model calibration [26, 27, 32]. In this study, we note that the previous works have not considered the quality of sparse geologic dictionaries. The quality indicates how well reservoir models can be properly reconstructed by prototypes of dictionaries. Also, we expect an improvement in the history-matching performance by enhancing the quality of sparse geologic dictionaries.

This study proposes a hybridized ES-MDA algorithm that implements sparse coding in a transformed space to outperform previous history-matching methods by providing more accurate reconstructions of highly non-Gaussian model parameters. The proposed algorithm transforms multimodal facies into coefficients of discrete cosine functions using DCT. Invoking DCT is followed by iterating K-SVD for updating sparse geologic dictionaries. In each assimilation of ES-MDA, the combination of DCT and iterative K-SVD is performed to update the dictionaries and improve the quality of reservoir models. For brevity, the proposed algorithm with updated sparse geologic dictionaries is called ES-MDA-DCT-i-K-SVD in this paper. The performance of ES-MDA-DCT-i-K-SVD is tested with applications for channelized gas reservoirs and is compared with those of four ES-MDA algorithms: conventional ES-MDA, ES-MDA coupled with DCT (called ES-MDA-DCT in this paper), ES-MDA coupled with K-SVD (called ES-MDA-K-SVD in this paper), and ES-MDA coupled with DCT and K-SVD (called ES-MDA-DCT-K-SVD in this paper).

2. Methodology

The novelty of the proposed algorithm ES-MDA-DCT-i-K-SVD is the integration of ES-MDA (Section 2.1), the dimensionality reduction of the parameter space using DCT (Section 2.2), and construction (Section 2.3) and updating (Section 2.4) of geologic dictionaries using sparse coding in the reduced space. Section 2.5 describes how the methods operate in the framework of ES-MDA-DCT-i-K-SVD in a complementary manner.

2.1. ES-MDA

The goal of history matching can be formulated as where is the objective function of history matching and is the state vector composed of reservoir variables (e.g., permeability and facies).

The typical form of for ensemble-based history matching is presented as [33] where is the state vector before update and the superscript refers to background; is the covariance matrix of ; is the observed responses; is the dynamic vector composed of simulated responses obtained by running a reservoir simulator for the state vector ; and is the covariance matrix of observation error. Note that the right-hand side of (2) is the addition of background and observation error terms [33]. Because can contain any unknown variables such as facies indexes, coefficients of discrete cosine functions or dictionary coefficients depending on the type of algorithms were used in this study.

can be used to derive the update equation for as [8, 33] where the subscript refers to the ith ensemble member; is the cross-covariance matrix of and ; is the autocovariance matrix of ; is the coefficient to inflate , which is the covariance matrix of the observed data measurement error [8]; is the observation data perturbed by the inflated observed data measurement error; and is the ensemble size (i.e., number of reservoir models in the ensemble). Conventionally, ensemble-based history matching updates reservoir models simultaneously. In (3), refers to Kalman gain , which is computed with regularization by SVD using 99.9% of the total energy in singular values [8].

The main difference between ES and ES-MDA is the update process of the state vector . ES updates the state vector of each ensemble member using observation data measured at all time steps (Emerick and Reynolds, 2011). Compared to the single assimilation of ES, ES-MDA assimilates every state vector times using an inflated covariance matrix of measurement error [8, 9]. Here, is the number of assimilations in ES-MDA.

Definitions of and are as follows: where is the mean of state vectors and is the mean of dynamic vectors.

In ES-MDA, is constrained to

In ES, and due to its single assimilation.

The perturbed observation data shown in (3) is computed as

The second term on the right-hand side of (6) is the perturbation term, which reflects the uncertainty associated with data measurement, processing, and interpretation. Stochastic characteristics of are reflected by . is the random error matrix to observations, which is generated with a mean of zero and a standard deviation of , where is the number of time steps in observations.

2.2. Extraction of Geologic Features Using Discrete Cosine Transform

Discrete cosine transform (DCT) has been utilized as an image-processing tool for characterization of channelized reservoirs due to the periodicity of cosine functions [34]. DCT converts parameters into coefficients of discrete cosine functions. The coefficients are sorted in descending order from the top left, capturing the overall trend of channel patterns, to bottom right, delineating details in channel patterns. Previous studies have shown that non-Gaussian channel patterns can be reproduced sufficiently via inverse transform of essential DCT coefficients [18, 28, 35]. Updating the truncated DCT coefficients can yield a calibrated model set. Another advantage of DCT is the improvement in computational efficiency resulting from data compression, which is effective in constructing sparse geologic dictionaries described in Section 2.3.

Figure 1 illustrates how to extract geologic features from an image of a target channelized reservoir using a truncated DCT and reproduce the target reservoir through an inverse DCT (IDCT). Two images in the first row represent the physical state of sand and shale facies in the target reservoir, that is, the original image on the left and the reproduced image on the right. Let and denote the number of gridblocks of the reservoir model and the number of essential DCT coefficients, respectively. Applying DCT to the original 75 by 75 image yields an image composed of DCT coefficients, as shown in the bottom-left corner of Figure 1. In the bottom-right corner of Figure 1, filtering the coefficient state selects 465 components in the dotted triangle as essential ones. It seems that this small number of components is sufficient to restore the original image of the physical state (i.e., channel patterns) when comparing the two subfigures in the first row.

2.3. Construction of Geologic Dictionaries Using Sparse Coding

Figure 2 presents a procedure of sparse coding to construct geologic dictionaries using static data in the DCT domain. Sparse coding starts from building a library matrix , which is a by matrix. is the number of parameters in each reservoir model, and is the number of reservoir models in . Equiprobable reservoir models are generated using a geostatistical technique. Figure 2(a) shows a training image used for creating initial channelized reservoir models (see Figure 2(b)) by invoking single normal equation simulation (SNESim) [36]. In Figures 2(a) and 2(b), each reservoir model consists of two facies: shale and sand with 0 and 1 indexes, respectively. Each column vector of corresponds to either a reservoir model or an encoded reservoir model because is determined depending on the type of assimilation algorithms in this study. (e.g., in Figure 2(b)) in the conventional ES-MDA, while (e.g., in Figure 2(c)) if DCT is applied to ES-MDA. In Figure 2(c), each column vector of consists of DCT coefficients filtered from the corresponding column vector in . That is, the data compression ratio is . In Figure 2, . Data compression using DCT reduces the dimension of the parameter space, thereby saving computing time required for sparse coding [37]. Meanwhile, a sufficiently large needs to be chosen to cover a variety of geologically plausible scenarios in . Previous investigators adopted in the range of 1000 to 2000 [31, 37]. In this paper, is a constant of 3000 for maintaining the diversity of library models. Note that generating models is computationally inexpensive because all the models are static. No dynamic reservoir simulation is required for any static model.

Library matrix is numerically decomposed into dictionary matrix and weight matrix using K-SVD: . and are an by matrix and an by matrix, respectively, where is the number of reservoir models in . Thus, both and are sets of reservoir models. Each column vector of , called a realization in this paper, represents either an original or an encoded reservoir model. We expect that every realization exhibits distinguishable geological features in a well-organized dictionary. is to be predetermined considering computational costs associated with the sparse coding process. In this paper, is fixed as one third of . As a rule of thumb, .

Figure 2(d) illustrates how to decompose into and using K-SVD in the DCT coefficient domain. Note that column vectors in are sorted at random, while column vectors in are sorted from left to right in descending order of energy to help us conveniently check the reservoir model quality. The term energy indicates the norm of a row vector in corresponding to the realization in . Greater norms in indicate more essential principal components in .

The next task is how to decompose and from . Due to the large size of the three matrices, matrix decomposition is conducted numerically to minimize the discrepancy between and . In other words, where is the Frobenius norm and .

More specifically, matrix decomposition is performed by iterating sparse coding that is an orthogonal matching pursuit (OMP) [38, 39] followed by K-SVD [29], as described by Sana et al. [31]. The first step of matrix decomposition is to initialize and . The second step is to compute with a fixed using OMP. The third step is update and simultaneously using K-SVD through the following three equations. where is the ith column vector (i.e., realization) in and is the ith row vector in .

The right-hand side of (8) is rearranged as follows: where the subscript indicates the pivot and is the discrepancy term.

To achieve (9), optimal and are explored from to . The right-hand side of (9) is rearranged as where is the matrix of which every element is either zero or one. At each , multiplied by returns a row vector of nonzero elements in . The size of is by the number of nonzero elements in . is decomposed using SVD for determining and at each , which results in , where is the first column vector from the left-singular vectors’ matrix, is the first element of the singular values’ matrix, and is the first column vector from the right-singular vectors’ matrix of , respectively. As a consequence, is the first column vector of and is the first column vector of multiplied by the first diagonal element of . is obtained by multiplying the inverse . Note that the updated and are used for the calculation of (9) at the subsequent . Performing (10) at all completes the update of and .

After the update of the dictionary matrices, the second step of matrix decomposition is reinvoked to tune for further achieving (7). Then, the tuned and are reimported to (8) for conducting the third step again. This sequence of the second and third steps is iterated until a convergence criterion is satisfied. The criterion could be sparsity of , a threshold to accept the discrepancy shown in (7) or the maximum number of iterations. In this study, the criterion is the maximum number of iterations (set to ten) for all experiments.

In summary, Figures 2(a)–2(d) describe how to construct the original sparse geologic dictionary matrices , , and using static data. Figure 2(e) is an example to show that performing an IDCT yields while capturing geological features of the original despite a diffusion in facies designation. Note that the diffusion is filtered out using a cutoff method in this study. We employ the arithmetic mean of the two facies indexes (i.e., 0.5) as the threshold to determine either shale (if facies index < 0.5) or sand (if facies index ≥ 0.5) in each gridblock.

2.4. Update of Sparse Geologic Dictionaries during Multiple Data Assimilation

Once original sparse geologic dictionary matrices , , and are constructed using static data, the proposed algorithm updates the dictionary matrices by conditioning dynamic observations (e.g., gas rate and bottomhole pressure (BHP)) to the ensemble. Figure 3 explains how to update the dictionary matrices during assimilations of the proposed algorithm. Let and in the proposed algorithm.

The weight matrix is the state vector of the proposed algorithm. Only for the first assimilation, realizations are randomly selected from in order to compose the initial . Thus, is a by matrix while is a by matrix. Let the superscript a denotes assimilation. Then, denotes the updated weight matrix after assimilation. Using from Section 2.3 (Figure 3(a)), where is the by matrix composed of DCT coefficients of ensemble members.

Through IDCT, the updated DCT coefficients of are restored to facies indexes (Figure 3(b)). Each facies index map is transformed into its facies distribution map using a cutoff (Figure 3(c)): . The facies models are converted into petrophysical models and used for reservoir simulation. The simulation results are compared with observations for checking a quality of each reservoir model in . The quality is quantified in terms of , which indicates the discrepancy between observation and simulation results of an ensemble member: where is the number of data types.

Let be the number of qualified models selected from the ensemble. models are selected among models: , as seen in Figure 3(d). Figure 4, for example, shows to select realizations from realizations regarding gas rate and BHP. It seems that the simulated responses of the selected models are closer to the reference results than the rest of the ensemble members. Figure 3(e) shows that facies of the models are transformed into DCT coefficients: , where is the by matrix composed of DCT coefficients of qualified ensemble members.

After obtaining the qualified models, the library matrix is updated as (Figure 3(f))

Note that (14) enhances both the quantity and quality of geologic libraries by conditioning (which is conditioned to static observation data initially) to dynamic observation data with a relatively small number of reservoir simulation runs () for qualified ensemble members.

The updated is decomposed to obtain newly updated and using OMP and K-SVD, as addressed in Section 2.3 (Figure 3(g)). With the new and in (11), is reupdated using OMP: . This reupdated is used as in (2) as the state vector of the next assimilation.

2.5. Framework of ES-MDA-DCT-i-K-SVD

Figure 5 is a flowchart of the proposed ES-MDA algorithm that is compared with those of the other four ES-MDA algorithms investigated in this study. The general operating procedure for the algorithms is as follows: generation of an initial ensemble, reservoir simulation and selection of the qualified ensemble, Kalman-gain calculation and model update, facies designation, iteration of the above processes until the iteration number reaches the number of assimilations , and acquisition of qualified reservoir models. Differences between the algorithms are found in the types of state vectors and model update processes, as presented in Figures 5(b) and 5(c).

Table 1 compares state vectors and sparse geologic dictionaries for the algorithms. ES-MDA updates gridblock facies indexes (originally assigned 0 and 1 for shale and sand, resp.). ES-MDA-DCT updates filtered DCT coefficients in the facies domain. ES-MDA-K-SVD tunes weights in facies domain with only one application of K-SVD before ES-MDA, while ES-MDA-DCT-K-SVD tunes weights in the DCT domain with only one application of K-SVD before ES-MDA. Finally, the proposed ES-MDA-DCT-i-K-SVD updates weights in the DCT domain with iterative K-SVD during ES-MDA.

3. Results and Discussions

The performance of the proposed algorithm (i.e., ES-MDA-DCT-i-K-SVD) is tested with applications to history matching of two channelized gas reservoirs. The algorithm performance is compared to those of four ES-MDA algorithms described in Table 1 to investigate coupling effects of dimensionality reduction and iterative sparse coding. Note that the developed algorithm updates dictionaries with dynamic data in each assimilation of ES-MDA. Neither ES-MDA nor ES-MDA-DCT adopts dictionaries. Dictionaries generated using static data are unchanged during data assimilation for ES-MDA-K-SVD and ES-MDA-DCT-K-SVD.

3.1. Field Description

Table 2 summarizes gas reservoir properties used in Case 1 and Case 2. For each case, the gas reservoir consists of sand and shale facies. Each facies has its relative permeability curves and absolute permeability value; permeabilities of sand and shale facies are 300 and 0.1, respectively. Reservoir boundaries are surrounded by numerical aquifers modeled by employing pore volume multipliers.

Figure 6 compares training images and reference models of the two cases. The size of the reservoir domain, well location, and well name are the same in both cases; eight vertical wells (P1, P4, P6, P7, P9, P12, P14, and P15) are drilled in the sand formation, and the other eight vertical wells (P2, P3, P5, P8, P10, P11, P13, and P16) are drilled in the shale formation. These 16 facies data at the well locations are regarded as hard data used for model generation. Both the reference model and reservoir models are generated with hard data using SNESim that employs the training image shown in Figure 6(a) for Case 1.

The main differences between Case 1 and Case 2 are the shapes and widths of the channels. First, the reference model of Case 1 contains an S-shaped sand channel in the shale background, while the reference model of Case 2 has three parallel channels for which the orientation is NW-SE direction. Secondly, the channel width of the training image of Case 2 is 1.2 times thicker than that of Case 1, making that the channel width of the reference model of Case 2 is 1.2 times thicker as well. Despite the differences, this study intentionally reuses the initial ensemble designed for Case 1 as the initial ensemble of Case 2. The manipulation of an initial ensemble amplifies intrinsic reservoir uncertainty that is hard to infer from prior information (i.e., training image) for Case 2.

Table 3 describes operating conditions of 16 wells (from P1 to P16), the well coordinates of which are shown in Figure 6(a). Table 4 presents the number of parameters used in the five ES-MDA algorithms. Implementing DCT yields the data compression ratio . All experiments were set up such that and according to (5). The proportion of the ensemble update in sparse geologic dictionaries is 0.67%.

3.2. Case 1

Figure 7 shows the evolution of realizations in achieved by invoking the proposed algorithm. We present five realizations having higher energy than the others, which are the first to the fifth column vectors in , in each assimilation. Qualified realizations vary until the second assimilation is complete, implying a deficiency in the assimilation process due to a large dispersion of realizations. In the third assimilation, it seems that the first and fourth realizations conform to the reference model (Figure 6(a)). The first and second realizations in the fourth assimilation capture the S-shaped channel pattern of the reference model. This increase in the similarity among realizations implies an increase in the probability that sparse geologic dictionaries sufficiently represent the target gas reservoir.

Figure 8 compares history-matching results (day 1–day 3500) and prediction results (day 3501–day 7000) of the initial and updated ensembles against the reference results. Gas production rate and BHP are matching parameters of each ES-MDA algorithm. Water production rate is excluded from the matching dataset because it is a watch parameter. The three data types measured at wells P1, P4, P9, and P15 are shown because these wells installed in sand facies exhibit larger uncertainty than the other wells. In particular, water breakthrough does not occur at well P15 during the history-matching period. Thus, well P15 has the largest uncertainty in this case study. Solid gray, blue, dark blue, and red lines present well behaviors of the initial ensemble, the updated ensemble, the mean of the updated ensemble, and the reference model, respectively. The same initial ensemble is provided for each ES-MDA algorithm for a fair comparison. ES-MDA and ES-MDA-DCT expose a weakness in matching the reference results (Figures 8(a) and 8(b)). Improvements in matching accuracy are accomplished in the simulation results of the updated ensembles obtained using ES-MDA coupled with sparse coding (Figures 8(c)–8(e)). Executing sparse coding in the reduced parameter space has the result that most subfigures include the reference results within the bandwidth of the simulation results of the updated ensembles (Figures 8(d) and 8(e)), thereby improving the matching accuracy. Figure 8(d) results from the execution of ES-MDA-DCT-K-SVD of which the library matrix is constructed using static data only. In Figure 8(d), simulation results of some updated ensemble members deviate from the reference results at wells P4 and P9. It can be said that the update of described in Section 2.4 yields an increase in matching accuracy and a reduction in dispersion of simulation results (Figure 8(e)). For water breakthrough, underestimation of reservoir uncertainty is seen at well P15 in the simulation results obtained using the proposed algorithm. P15 showed the worst matching results among the 16 wells as the breakthrough time is not detected during the matching period. It is reasonable to have moderate results at some matching targets despite the overall good matching performance [40, 41].

The performances of the algorithms in terms of both history-matching (HM) and prediction (PD) periods are summarized in Table 5. The quality of the updated ensembles was assessed using the following: where is the normalized average of error (see (12)) and is the normalized standard deviation of . The superscripts and refer to the updated ensemble member and the initial ensemble member, respectively. Smaller values of indicate a more accurate updated ensemble. Smaller values of indicate a smaller degree of reservoir uncertainty associated with the ensemble. Any value larger than 100% indicates deterioration of the ensemble quality, as revealed in the results obtained using ES-MDA and ES-MDA-DCT. In terms of gas rate, for example, the updated ensemble of ES-MDA-DCT amplifies the degree of dispersion compared to the initial ensemble. Furthermore, the behaviors of its ensemble mean do not follow those of the reference model. The proposed algorithm yields acceptable values and values that are smaller than those of the other ES-MDA algorithms. Most observations are included within the bandwidths of the simulated profiles. The inclusion is also captured in the field cumulative production profiles of gas and water (Figure 9). Nonetheless, a remaining task is the construction of more robust dictionary matrices to improve the matching quality at every well target.

Figure 10 compares ensemble mean maps and histograms of permeability obtained using the five ES-MDA algorithms. A task herein is to investigate whether the ES-MDA algorithms can capture the overall trend of the sand channel while preserving shale facies in the background. In the reference model, for example, the shortest path between wells P6 and P9 is filled with shale. Each ensemble member consists of sand facies with a permeability of 300 md and shale facies with a permeability of 0.1 md. However, the histogram of the initial ensemble mean (Figure 10(a)) does not follow the bimodal distribution. On the other hand, it seems that the histogram of each updated ensemble follows the bimodal distribution. The ensemble mean maps obtained using ES-MDA and ES-MDA-DCT (Figures 10(b) and 10(c)) conform less to the reference model than those obtained using the algorithms coupled with K-SVD. When comparing the log-scale permeability histograms, it seems that the unconformity is proportional to the frequencies of undesirable permeabilities between ln 0.1 (≈−2.3) md and ln 300 (≈5.7) md in the histograms. DCT is advantageous for preserving channel connectivity; however, an artificial sand channel connects wells P6 and P9. In addition, truncated DCT coefficients smooth out borders between the sand channel and background shale, as shown in Figure 10(c). ES-MDA-K-SVD, ES-MDA-DCT-K-SVD, and ES-MDA-DCT-i-K-SVD reproduce channel patterns and the connectivity of the reference model (Figures 10(d)–10(f)), although Figure 10(e) contains undesirable sand facies between wells P6 and P9. A crisp contrast between shale and sand facies is observed in the outcome of the assimilation algorithms coupling DCT and K-SVD (Figures 10(e) and 10(f)). The developed algorithm reveals clearer shale facies in the path than the other algorithms (Figure 10(f)).

Figure 11 compares characterization results of a channelized reservoir with regards to sand-shale facies distribution. The initial five realizations reveal diverse channel patterns in terms of orientation and shape (Figure 11(a)). Pixel-based perturbation using ES-MDA delivers a broken channel with a low-channel connectivity after assimilation (Figure 11(b)). In Figure 11(c), the shortest path between wells P6 and P9 is filled not with shale but sand in the second and fifth realizations, while part of the channel is disconnected in the first and third realizations. Interestingly, data loss during DCT made the channel width irregular. K-SVD helps conserve the connectivity and pattern of the channel (Figure 11(d)). Parameterization followed by sparse coding delivers satisfactory results in preserving overall channel characteristics (Figures 11(e) and 11(f)). Figure 11(e) connects well P6 and well P9 with an artificial sand channel, but Figure 11(f) does not. That is, conditioning dictionaries to dynamic data through the iterative sparse coding provides opportunities to calibrate the rock facies distribution between well P6 and well P9. The five realizations of the proposed algorithm keep the channel patterns similar to the reference, while maintaining shale facies on the shortest path between well P6 and well P9 (Figure 11(f)). Most realizations have a broken sand channel near the upper left corner of the domain. This phenomenon would be an intrinsic limitation of geostatistical methods arising from a lack of observation data.

Table 6 compares computational costs paid for the construction and update of dictionaries. The hardware specification utilized for comparison was Intel® Core™ i5-7200U @ 2.5 GHz with 8 GB RAM. The runtime of the reservoir simulation was excluded from the comparison. ES-MDA and ES-MDA-DCT cost nothing for dictionary construction. ES-MDA-K-SVD was a relatively costly algorithm. It took 218 minutes to obtain the original dictionary matrices and via despite the matrix construction one time. DCT helped save computational costs required for sparse coding. ES-MDA-DCT-K-SVD was the cheapest algorithm (taking 5.7 minutes) thanks to the reduced number of parameters due to DCT. Updating sparse geologic dictionaries in each assimilation resulted in an additional 15.7 minutes in the subsequent four assimilations. Nevertheless, it seems this amount of extra cost is within a reasonable range. ES-MDA-DCT-K-SVD was 38.2 times as fast as ES-MDA-K-SVD, and ES-MDA-DCT-i-K-SVD was 10.2 times faster than ES-MDA-K-SVD.

3.3. Case 2

This case study checks whether the updated ensemble can describe the shape and orientation of three parallel sand channels under uncertainty associated with the underestimated channel width. Figure 12 shows the evolution of realizations in the dictionary matrix obtained using the proposed algorithm during multiple data assimilation. The trend of evolution is consistent with that presented in Case 1 (Figure 7). For example, the first and second realizations in the fourth assimilation sufficiently resemble the reference model. This similarity implies that the quality of geologic dictionaries is improved by adding models conditioned to dynamic observations to the initially static library pool in each assimilation.

Figure 13 compares production profiles obtained using the five ES-MDA algorithms. Each algorithm reuses the initial ensemble introduced in Case 1, as mentioned in Section 3.1. The proposed algorithm yields smaller standard deviations than the other algorithms and reduces reservoir uncertainty with reasonable matching accuracy (Table 7). The matching accuracy is comparable to those of the algorithms coupling sparse coding. Matches and prediction results for gas and water rate are decent. In terms of BHP, the proposed method shows the best conformance to observations except for well P9. The overestimated BHP is related to the underestimation of the water rate at well P9. Our inference is that this happens because of incorrect prior knowledge of the channel width. Field cumulative production profiles of gas and water are presented in Figure 14. Figure 15 compares averaged permeability distribution of the updated ensemble obtained using the five algorithms. The proposed algorithm outperforms the other algorithms by preserving the separation among the channels with NW-SE orientation and a distinct contrast at the borders between the shale background and sand channels. Facies index maps of individual ensemble members obtained after history matching also support the reliability of the proposed algorithm (Figure 16).

4. Conclusions

The hybridized ES-MDA algorithm coupled with iterative sparse coding in reduced parameter space was able to calibrate channelized reservoir models using and updating the repository of geologic features called sparse geologic dictionaries. The first library composed of thousands of reservoir models generated using SNESim was only conditioned to static data. Dimensionality reduction performed using DCT was effective in reducing the size of the library by converting gridblock facies into coefficients of discrete cosine functions. The weight matrix obtained by decomposing the library was imported to ES-MDA as a state vector. In each assimilation of ES-MDA, update of weights resulted in reservoir models that were well conditioned to both static and dynamic data. Adding the history-well-matched reservoir models as new members in the existing library was the critical factor for attaining improved matching accuracy and reduced model dispersion because of the reflection of dynamic data in the updated dictionaries. The unprecedented consideration of dictionary update was the originality of this study and the contribution to researches about combinations of machine learning and ensemble-based data assimilation methods.

History-matching results of two channelized gas reservoirs (i.e., the S-shaped channel for Case 1 and three parallel channels for Case 2) indicated that the achievements arose from an iterative update of dictionaries of the proposed algorithm: the improved matching accuracy in both history and forecast in terms of well and total production, the reduced dispersion of production behaviors and permeability distribution, and the well-connected channel body of reservoir models with geological plausibility. ES-MDA with the dictionary update yielded higher matching accuracy values and lower dispersion values than ES-MDA incorporated with a fixed dictionary matrix. The increase in computational costs paid during the dictionary update was affordable compared to the assimilation algorithms not coupled with sparse coding. Improving the matching accuracy at some well-based levels remains as an outstanding task for the proposed technique despite the overall enhanced matching quality. We also anticipate that future works will adopt a more generalized sparse coding for diversifying the utility of the proposed framework in a variety of geoenvironments (e.g., naturally fractured reservoirs).

Nomenclature

:Covariance matrix of state vectors
:Covariance matrix of observed measurement error
:Autocovariance matrix of simulation data
:Cross-covariance matrix of state vector and simulation data
:Simulation data
:Observation data
:Perturbed observation data
:Column vector of the dictionary matrix (i.e., realization)
:Mean of simulation data
:Dictionary matrix
:Discrepancy between the library matrix and the reconstructed library matrix except the jth
:Reservoir simulator
:Objective function
:State vector of a reservoir model
:Mean of state vectors
:State vector of a reservoir model before update
: by identity matrix
:Kalman gain
:Number of assimilations
:Number of time steps in observations
:Number of essential DCT coefficients
:Number of reservoir models in the dictionary matrix
:Number of ensemble members
:Number of gridblocks in a reservoir model
:Number of reservoir models in the library matrix
:Number of parameters in a reservoir model
:Number of qualified ensemble members
:Number of data types
:Covariance matrix of observation error
:Row vector of the weight matrix
:Weight matrix
:Library matrix
:Reconstructed library matrix
:Random error to observations
:Inflation coefficients of
:Error
:Discrepancy between observation and simulation results of an ensemble member
:Average
:Standard deviation.
Subscripts
:Discrete cosine transform
:Dictionary
:Ensemble
:Library
:Qualified
:Parameter.
Superscripts
:Assimilation
:Initial
:Observation
:Update.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

Dr. Baehyun Min was funded by the National Research Foundation of Korea (NRF) grants (no. NRF-2017R1C1B5017767 and no. NRF-2017K2A9A1A01092734). Dr. Kyungbook Lee was supported by the Korea Institute of Geoscience and Mineral Resources (GP2017-024) and MOTIE projects (NP2017-021, 20172510102090). Dr. Hoonyoung Jeong is thankful to the Research Institute of Energy and Resources, Seoul National University. The authors are grateful for the support of Korea Gas Corporation (KOGAS).