Abstract

Tight oil and gas accumulations are commonly heterogeneous within the reservoir formation. This heterogeneity is critical to understand but is difficult to investigate with conventional geological and (organic) geochemical tools. Here, we applied multivariate statistical analysis to reveal the heterogeneity, based on a case study of the lacustrine tight oil accumulation in the middle Permian Lucaogou Formation of the Jimusar sag, Junggar Basin, NW China. Clustering heat maps and multidimensional scaling analysis revealed the heterogeneity of tight oil accumulation. The heterogeneity is reflected by the complex relationship between the two reservoir sweet spots and by vertical and lateral oil migration and accumulation, in contrast to the previous view that the formation is a closed system associated with proximal hydrocarbon accumulation. Multiple biomarkers show that the source rocks and reservoirs have similar characteristics in the lower part of the formation, reflecting a proximal hydrocarbon accumulation pattern in the lower sweet spot (near-source accumulation, abbreviated as NA). This represents a relatively closed system. However, the upper sweet spot and the middle section mudstone sequence between the two sweet spots do not form a completely closed system in a strict sense. These sequences can be divided into three tight oil segments, i.e., lower, middle, and upper from deep to shallow. The lower segment lies in the lower part of the middle section mudstone sequence. The middle segment is composed of the upper part of the middle section mudstone sequence and the lower part of the upper sweet spot. The upper segment is composed of the upper part of the upper sweet spot and the overlying upper Permian Wutonggou Formation reservoirs. Oils generated in the lower segment migrated vertically to upper sweet spot reservoirs through faults/fractures, and laterally to distal reservoirs. Oils generated in the middle segment were preserved in reservoirs of the upper sweet spot. Oil accumulation in the upper segment required vertical and lateral migration through faults/fractures. As such, tight oil accumulation in the Lucaogou Formation is complex. From base to top, the accumulation mechanisms in the Lucaogou Formation were NA, VLMA (vertical and lateral migration and accumulation), NA and VLMA, thereby showing strong heterogeneity. Our data suggest that these processes might be typical of tight oil accumulations in general, and that future exploration and exploitation in the region should consider this heterogeneity rather than assume a closed system. Multivariate statistical analysis is an effective tool for investigating complex oil-source correlations and accumulation in petroleum basins.

1. Introduction

Tight oil is crude oil accumulated in low-permeability rocks, typically shales or tight sandstones. Exploration and exploitation of tight oil are becoming increasingly important in the petroleum industry [1–3]. According to the depositional environment, tight oil can be divided into marine and lacustrine facies [4, 5]. Marine facies tight oil is important in North America, with the Bakken and Eagle Ford resources as typical examples [6–8]. In China, however, lacustrine facies tight oil is important, with typical examples in the Paleogene Shahejie Formation in the Bohai Bay Basin [9], the Permian Lucaogou Formation in the Junggar Basin [10], and the Triassic Yanchang Formation in the Ordos Basin [11]. Whether marine or lacustrine, tight oils exhibit significant heterogeneity within the reservoir formation, both vertically and laterally [12]. Characterizing this heterogeneity is important for understanding tight oil accumulation and for reducing exploration and exploitation risks.

The heterogeneity of tight oil accumulation is commonly related to sand-mud interbeds in the strata on meter and millimeter scales, resulting in complex and multicyclic petroleum generation, migration, and accumulation [12, 13]. However, to date, it is not clear whether the heterogeneity results from differences in the hydrocarbon generation potential of the shales or from the migration of crude oil generated in shale strata into sandstone strata, nor how to identify and determine the main mechanisms and processes [14, 15]. This limits our current understanding of tight oil generation and accumulation and, as such, is an important research direction [3].

Research into the heterogeneity of tight oil accumulation can involve geological and geochemical approaches. Geological approaches investigate whether the formation of tight oil is heterogeneous in the context of tectonics, paleoclimate, and sedimentology on a macroscale [16]. In contrast, geochemical approaches, such as the use of biomarkers and isotopes [17], may be used to characterize the heterogeneity on a microscale [18, 19]. Geochemical methods are widely used for oil-source correlations. However, some oil and gas geochemistry data are poorly correlated [20], and multisource mixing, hydrocarbon maturation, and secondary effects can complicate interpretations of geochemical data [17]. Therefore, it is desirable to develop new methods for investigating tight oil accumulation.

Multivariate statistical analysis is an important method for complex data processing. Such techniques can classify and summarize large numbers of different types of data. The most important feature of this approach is that it can quickly process large volumes of multidimensional data with the "5 V" characteristics (i.e., volume, velocity, variety, veracity, and value) [21], which makes it suitable for evaluating high-dimensional oil and gas geochemistry data [22]. Therefore, multivariate statistical analysis is a promising method for studying the heterogeneity of tight oil.

The Jimusar sag is located along the southeastern margin of the Junggar Basin and is currently one of the largest tight oil exploration and development zones in China [23]. The paleoenvironmental setting of the Lucaogou Formation was complex and variable, resulting in lithological heterogeneity and thin and uneven oil layers (0.04–4.52 m thick) [24]. Therefore, the source and accumulation of tight oil in the Lucaogou Formation are an important area of ongoing research. Previous observations have shown that the hydrocarbon source rocks in the Lucaogou Formation comprise mudstones, carbonates, and fine-grained clastic sediments. Hydrocarbon sweet spots in the upper parts of the formation are in carbonates deposited in littoral and shallow lake facies, whereas in the lower parts they are located in fine-grained clastic sediments deposited in a delta front and as sand sheets [16]. Previous studies have suggested that near-source oil accumulation was the main accumulation mechanism in this formation [23]. However, the complex sandstone-mudstone interbeds in the Lucaogou Formation indicate that near-source oil accumulation might not be the only mechanism of accumulation. For example, some biomarkers, such as the C24 tetracyclic terpane and gammacerane index values, vary within the upper sweet spot [25], which cannot be completely explained by near-source accumulation. Therefore, the accumulation of tight oil in the Lucaogou Formation requires further research.

As such, this study used multivariate statistical methods to investigate tight oil accumulation in the Lucaogou Formation for the first time. The purpose of this research was to test the applicability of this method and to provide new insights into tight oil accumulation and exploration in the region.

2. Geological Setting

The Jimusar sag is located in the southwestern part of the eastern uplift zone of the Junggar Basin, NW China (Figure 1(a)). The sag is bounded by faults, including the Jimusar fault in the north, Santai fault in the south, and Laozhuangwam and Xidi faults in the west (Figure 1(b)) [12, 26–28]. The sag is a depression formed by Early Carboniferous sedimentation onto folded basement, which is deep in the west and shallow in the east. The structure of the sag is gentle and simple (Figure 2). The sag extends ca. 30 km N-S and ca. 60 km E-W and has a total area of ca. 1278 km². It has experienced multiple tectonic events, including the Hercynian (Middle Devonian; 386–258 Ma), Indosinian (Triassic; 258–205 Ma), Yanshanian (Jurassic-Cretaceous; 205–65 Ma), and Himalayan (Oligocene-middle Pleistocene; 24.6–0.78 Ma). Permian volcanism in the northern Tianshan led to continuous subsidence and deposition in the Jimusar sag, forming the interbedded mudstone, sandstone, and carbonate sequence of the middle Permian Lucaogou Formation [26, 29]. Subsequently, the Yanshanian and Himalayan orogenies resulted in uplift in the eastern Jimusar sag and thrust faulting in the western Jimusar sag [30]. The Yanshanian movement also caused rapid and strong uplift of the Shaqi uplift, and fractures and faults are relatively well developed in the Lucaogou Formation [31, 32]. For example, wide fractures are developed in Well Ji 174, although they cannot be discerned seismically because of resolution limits [31, 32].

The strata from Carboniferous to Jurassic in the study area are well preserved [33]. The key strata of the middle Permian Lucaogou Formation are widely distributed across the study area, are currently buried at depths of 2500–4255 m, and are deep in the west and shallow in the east (Figure 2). The formation was deposited by mixed salt-lacustrine sedimentation and contains complex and variable lithologies, including sandstones, siltstones, mudstones, and carbonates [16]. The strata of the Lucaogou Formation generally comprise two sedimentary cycles from coarse- to fine-grained in its upper and lower parts [34], and can be further divided into P2l12, P2l11, P2l22, and P2l21 from base to top according to logging data (Figure 3). Based on the physical properties of the rocks, there are good reservoirs in the upper (P2l22) and lower (P2l12) sweet spots [12]. The lower reservoirs are thick and concentrated, whereas the upper reservoirs are thin and dispersed. In addition, the upper and lower reservoirs are separated by mudstones with high total organic carbon (TOC) contents (0.03–15.51 wt.%) and a thickness of 97–182 m.

In general, the two reservoir sections described above are thick (>200 m) and widely distributed (725 km²) [12, 16]. The crude oil in these reservoirs has a viscosity of 45.6–434.9 mPa·s (at 50°C). The total resource reserve is 1.22 billion tons.

3. Samples and Methods

3.1. Samples

A total of 112 samples were collected to determine the vertical and spatial relationships among the crude oils at different stratigraphic locations and their genetic relationships with source rocks. Sixty-nine are mudstone samples from Well Ji 174 (28 from the upper sweet spot sequence, 10 from the lower sweet spot sequence, and 31 from the intervening mudstones). Sixteen are sandstone samples from Well Ji 174 (7 from upper sweet spot reservoirs, 6 from lower sweet spot reservoirs, and 3 from the intervening mudstone sequence; Figure 3). Well Ji 174, located in the depocenter of the Jimusar sag during the P2l period, is the focus of this study because the Lucaogou Formation is laterally uniform in the depression, which makes the well representative of the whole area [12]. The results obtained from Well Ji 174 will be tested in the future as samples and data from other wells become available; this is not possible at present because of the lack of sufficient samples.

Nineteen crude oil samples were collected from the Lucaogou Formation in seven wells across the study area (Wells Ji 30, Ji 33, Ji 28, Ji 173, Ji 251, Ji 174, and Ji 31). They cover the oils discovered from the margin to the center of the tight oil system in the Lucaogou Formation (Figure 1) and were used to study lateral migration within the system. Seven crude oil samples were collected from the overlying upper Permian Wutonggou Formation (P3wt) in four wells (Wells Ji 002, Ji 003, Ji 014, and Ji 171) to investigate vertical migration between the Lucaogou and Wutonggou formations.

3.2. Methods

In addition to the cluster analysis and multidimensional scaling that make up the multivariate statistical approach, we conducted biomarker analysis and source rock evaluation.

The 85 rock samples were extracted using a Soxhlet apparatus with CHCl3 for 72 h, yielding the bitumen extracts. The bitumen extracts were then fractionated using open silica gel column chromatography with n-hexane to yield saturated hydrocarbons, aromatic hydrocarbons, and NSO fractions. The saturated hydrocarbons were further analyzed by gas chromatography (GC) and gas chromatography-mass spectrometry (GC-MS) for molecular geochemical compositions. The GC analysis used an HP6890 gas chromatograph fitted with a  mm i.d. HP-5 column with a film thickness of 0.25 μm, using N2 as the carrier gas. The GC oven temperature was initially held at 80°C for 5 min, ramped from 80°C to 290°C at 4°C/min, and then held at 290°C for 30 min. The GC-MS analysis was conducted using an Agilent 5973I mass spectrometer interfaced with an HP6890 gas chromatograph fitted with the same type of column as that used for the GC analysis, with He as the carrier gas. The GC oven temperature during the GC-MS analysis was initially held at 60°C for 5 min, ramped to 120°C at 8°C/min, then from 120°C to 290°C at 2°C/min, and finally held at 290°C for 30 min.

Organic matter maturity was evaluated using vitrinite reflectance (VRo) measurements [35, 36]. Eighty-five polished block samples were analyzed using a Zeiss Axioskop 40 Pol incident light microscope at a wavelength (λ) of 546 nm with an oil immersion objective. An yttrium aluminum garnet standard (GWB13401) with a reflectance of 0.588% was used for calibration, and at least 50 measurements were performed on each sample.

Cluster analysis and multidimensional scaling were used to statistically classify the accumulation models of the Lucaogou Formation in the Jimusar sag. Hierarchical (pedigree) cluster analysis has commonly been used in previous oil-oil and oil-source correlation studies [37]. However, this method has some shortcomings for oil-source correlation in the Lucaogou Formation: (1) it relies on the straight-line distance between points in parameter space, which is the simplest measure but whose validity remains to be demonstrated; and (2) the basis for the classification is unclear, and the contribution of each indicator cannot be seen directly, so subjective factors dominate the interpretation. To address these two limitations, this paper uses clustering heat maps programmed in RStudio as the comparison method [38] to objectively analyze and explain the oil-source relationships in the Lucaogou Formation.

In addition, multidimensional scaling (MDS) offers a new approach to oil-source correlation. Multidimensional scaling, also known as similarity structure analysis, is a multivariate analysis method. The MDS analysis used in this study statistically handles the biomarker parameters using the Bray-Curtis algorithm. New parameters, named MDS1, MDS2, MDS3, etc., were generated from similar organic matter information (e.g., organic inputs, types, and maturities). MDS1 and MDS2 therefore represent a classification and do not themselves carry specific geochemical information from the database. Hence, we use MDS1 and MDS2 as the two axes of a 2-D space in which the distance between samples reflects their relationship. The smaller the stress value, the better the fit and the higher the reliability. The stress value in this study is as low as 0.1097, indicating high reliability.
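To illustrate what the stress value measures, the following minimal sketch computes Kruskal's stress-1 for a given 2-D configuration. It is not the authors' R code. As a simplifying assumption it compares the embedded distances with the raw dissimilarities, whereas non-metric MDS implementations usually compare them with monotone-regression disparities. The function names and the toy data are hypothetical.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points in the MDS plane."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def stress1(dissim, coords):
    """Kruskal stress-1: misfit between embedded distances and dissimilarities.

    Assumption: raw dissimilarities are used as targets (metric variant).
    """
    num = 0.0
    den = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = euclidean(coords[i], coords[j])
            num += (d_ij - dissim[i][j]) ** 2
            den += d_ij ** 2
    return math.sqrt(num / den)


# Hypothetical dissimilarities that a 3-4-5 triangle reproduces exactly.
dissim = [[0.0, 3.0, 4.0],
          [3.0, 0.0, 5.0],
          [4.0, 5.0, 0.0]]
exact = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
distorted = [(0.0, 0.0), (3.0, 0.0), (0.0, 5.0)]

stress_exact = stress1(dissim, exact)          # perfect fit
stress_distorted = stress1(dissim, distorted)  # worse fit, larger stress
```

A configuration that reproduces every dissimilarity has zero stress, and the value grows as the 2-D map distorts the original relationships between samples.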

This study used high-dimensional statistical analysis of biomarker data. Prior to the analysis, the parameters that best represent the original characteristics of the source-rock organic facies were systematically evaluated. A total of 32 biomarker indicators that showed the strongest correlations, regularity, and distinctiveness were finally selected (see discussion below for details). The biomarkers include normal alkanes (OEP, CPI, and C21/C22+), acyclic isoprenoids (Pr/Ph, Pr/nC17, and Ph/nC18), β-carotane index, tricyclic terpanes (TT) (C19/C20TT, C19/C21TT, C20/C21TT, C20/C23TT, and TT main carbon peak/C30hopane), tetracyclic terpanes (C24Tet/C30hopane, C24Tet/(C24Tet+C26TT), and C24Tet/C26TT), hopanes (Tm/C30H, C30M/C30H, C29H/C30H, C31H/C30H, and gammacerane/C30hopane), and steranes (percentage of 20S C27–29steranes, diasteranes/steranes, (pregnane+homopregnane)/C29 regular sterane, pregnane/homopregnane, and C29sterane maturity indexes).

We used the Bray–Curtis algorithm to construct and analyze clustering heat maps and conduct multidimensional scaling [39]. The clustering heat map can objectively classify processed data [40], while the multidimensional scaling algorithm can reveal the correlation between different samples [22], thereby identifying the source(s) and accumulation of crude oil.
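As a minimal sketch of the dissimilarity step, the Bray-Curtis measure between two samples is the sum of absolute parameter differences divided by the sum of all parameter values. The code below builds the pairwise matrix that a clustering heat map or MDS would take as input. The sample names and ratio values are hypothetical and only four parameters are shown. It is an illustration, not the authors' R workflow.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative parameter vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def dissimilarity_matrix(samples):
    """Return sample names and the symmetric pairwise Bray-Curtis matrix."""
    names = list(samples)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        value = bray_curtis(samples[names[i]], samples[names[j]])
        matrix[i][j] = matrix[j][i] = value
    return names, matrix


# Hypothetical values for Pr/Ph, Pr/nC17, Ph/nC18, gammacerane/C30 hopane.
samples = {
    "mudstone_A": [1.10, 0.80, 0.70, 0.20],
    "sandstone_B": [1.05, 0.85, 0.75, 0.22],
    "oil_C": [0.60, 1.40, 1.60, 0.45],
}
names, matrix = dissimilarity_matrix(samples)
```

In this toy case the mudstone and sandstone are far more similar to each other than either is to the oil, which is the kind of contrast the heat map and MDS plots rely on.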

4. Results and Discussion

4.1. Clustering Heat Map Analysis
4.1.1. Five Groups of Tight Oil Accumulation Zones

Figure 4 shows the clustering heat maps for all the sample data in this study, including crude oils and rocks from the Lucaogou Formation and crude oils from the Wutonggou Formation (Supplementary Table). A total of 112 samples were analyzed to determine the vertical and spatial relationships among the crude oils at different stratigraphic locations and their genetic relationships with source rocks. Well Ji 174 is the focus of this study because it is the most representative well and has the most samples available to date in the study area.

The distributions of the source rocks, sandstones, and crude oils in the maps are complex, which indicates that without considering the geological context, multivariate statistical analysis and clustering heat maps cannot be applied to geochemical data. As such, removing all the crude oil samples to reduce the complexity, the correlations between sources (mudstones) and reservoirs (sandstones) in each well were first determined. Only source rock and sandstone reservoir samples from Well Ji 174 were used in this analysis. This produced a better classification result in the clustering heat map (Figure 5), which identified five groups. Groups 1–5 are named sequentially from left to right in Figure 5. The biomarker characteristics of Group 4 source rocks and sandstone reservoir samples are different from those of the other four groups, and all Group 4 samples are from the lower reservoir region. In comparison, samples from Groups 1, 2, 3, and 5 were not well distinguished in terms of biomarker parameters, and these samples are all from the upper reservoir region and intervening mudstone section. This might reflect that crude oil accumulation in these sections was more complex than in the lower sweet spot region.
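The grouping step behind a clustering heat map can be sketched as agglomerative clustering on a dissimilarity matrix, stopping when the desired number of groups remains. The linkage rule used in the study is not stated, so average linkage is an assumption here, and the matrix values are hypothetical.

```python
def average_linkage(dist, n_groups):
    """Merge the closest clusters (mean pairwise distance) until n_groups remain.

    Assumption: average linkage; the study does not state its linkage rule.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or mean_d < best[0]:
                    best = (mean_d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical dissimilarities: two tight pairs and one distinct sample.
dist = [
    [0.00, 0.05, 0.40, 0.42, 0.80],
    [0.05, 0.00, 0.38, 0.41, 0.82],
    [0.40, 0.38, 0.00, 0.06, 0.79],
    [0.42, 0.41, 0.06, 0.00, 0.81],
    [0.80, 0.82, 0.79, 0.81, 0.00],
]
groups = average_linkage(dist, 3)
```

Here the distinct fifth sample stays in its own group, in the way the Group 4 samples separate from the others in Figure 5.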

Therefore, the grouping of these samples was further investigated with respect to the stratigraphy of the Lucaogou Formation to understand the vertical characteristics and patterns of oil-source correlation (Figure 6). The results show that the Group 4 samples all belong to the lower sweet spot region and can be readily distinguished vertically from the other four groups. Setting aside the Group 4 samples, Group 1 and 2 samples are distributed in the lower-middle P2l22, and their distribution is relatively concentrated. This interval is named the “middle tight oil segment”, and the intervals above and below it are termed the “upper tight oil segment” and “lower tight oil segment”, respectively. Group 3 and 5 samples are distributed in both the upper and lower parts of P2l11 and P2l22, respectively, and thus occur in both the upper and lower tight oil segments.

4.1.2. Biomarker Characteristics of Five Groups of Tight Oil

As discussed above, the Group 4 samples are independent of the other samples. This is evidenced by the environmental information recorded by the biomarkers. Firstly, among the acyclic isoprenoids, Pr/Ph, Pr/nC17, and Ph/nC18 all reflect weakly reducing depositional environments, particularly the relatively low Pr/Ph values. In general, the ∑C21/C22+ values in the Group 4 samples are high, and the β-carotane index values are also higher than those of the other four groups. This indicates that the depositional environment of the Group 4 samples was more reducing than that of the other samples and also had a higher salinity and a greater contribution from aquatic organisms [17, 41, 42]. In addition, the OEP and CPI values of the Group 4 samples are all about 1.0, indicating that they are in the mature stage. In comparison, the values for the other groups deviate from 1.0, indicating a relatively lower maturity.
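As a worked illustration of why CPI values near 1.0 point to mature organic matter, the sketch below computes CPI from n-alkane abundances. The paper does not state which CPI definition was used, so the widely used C24–C34 formulation is assumed, and the abundances are hypothetical.

```python
def cpi(abund):
    """Carbon preference index over C24-C34 (assumed formulation).

    abund maps carbon number to n-alkane peak area or concentration.
    """
    odd = sum(abund[n] for n in (25, 27, 29, 31, 33))
    even_low = sum(abund[n] for n in (24, 26, 28, 30, 32))
    even_high = sum(abund[n] for n in (26, 28, 30, 32, 34))
    return 0.5 * (odd / even_low + odd / even_high)


# Hypothetical smooth distribution with no odd-over-even preference.
smooth = {n: 100.0 - 5.0 * (n - 24) for n in range(24, 35)}

# Hypothetical immature-like distribution: odd homologues doubled.
odd_dominant = {n: v * (2.0 if n % 2 else 1.0) for n, v in smooth.items()}

cpi_smooth = cpi(smooth)
cpi_odd_dominant = cpi(odd_dominant)
```

A smooth distribution gives a CPI close to 1.0, whereas a strong odd-carbon predominance, typical of less mature terrestrial input, pushes the value well above 1.0.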

The conventional identification map was unable to distinguish the small differences in the terpanes between the Group 4 samples and the other sample groups (Figures 7(a) and 7(b)) and could only show that all samples have lacustrine characteristics. In addition, Figures 7(c) and 7(d) show that the salinity was high and the organic input was mainly algae. The Group 4 C19/C20, C19/C21, and C20/C23 tricyclic terpane ratios are small (Figure 5), indicating that prokaryotes were the main organic input. However, these three ratios show that the four other sample groups mainly had a higher plant organic input. This is consistent with the C21/C22+ data described above and shows that the Group 4 samples are different from the other four groups and have more uniform compositions.

The C24 tetracyclic terpane content can also be used to trace organic inputs. It is generally considered that when the tetracyclic terpane content is high, the terrestrial organic input is greater [43, 44]. The C24Tet/C26TT values shown in Figure 5 highlight that the Group 4 samples have low values. However, some Group 1 and 2 samples have generally higher values, while the Group 3 and 5 samples have generally lower values, but still higher than the Group 4 values. It indicates that Group 4 is clearly different from the other four groups in terms of organic input. Group 4 samples had little or no terrestrial organic inputs, whereas the other four groups had some terrestrial organic inputs. In addition, abundant C24 tetracyclic terpanes can also indicate a depositional environment of carbonate and evaporitic rocks [45], thus indicating that some samples from Groups 1, 2, 3, and 5 formed in such settings.

The C29/C30hopane ratio is also an indicator of the depositional environment of carbonate and evaporitic rocks [42, 46, 47]. The samples with high C24Tet/C26TT also have a high C29/C30hopane ratio, consistent with the source rocks in Groups 1, 2, 3, and 5 having formed in a sedimentary environment typical of carbonate and evaporitic rocks.

Steranes are highly specific biomarker compounds that indicate the input of eukaryotes. The conventional C27–C28–C29 sterane ternary diagram (Figure 8(a)) can identify the organic inputs into a sample [48]. On this diagram, the lower sweet spot region can be distinguished from the upper region and intervening mudstone section. However, the upper sweet spot region and intervening mudstone section are not highly differentiated, which is consistent with the normal alkane and terpane data discussed above. Based on this analysis, the main organic inputs into the lower sweet spot region were prokaryotic plankton, whereas the upper sweet spot region and intervening mudstone section had inputs from small amounts of bryophytes and terrestrial plants.

The C29 sterane 20S/(20S+20R) and C29 sterane ββ/(αα+ββ) values (Figure 8(b)) indicate that the source rocks of the sweet spot region and intervening mudstone section are in the low-maturity to mature stage. The measured present-day vitrinite reflectance values (0.78–0.94%) indicate moderate maturity. The wide range of maturity recorded by the source rock extracts implies that the shale system strongly adsorbs hydrocarbons, so that hydrocarbons generated at different burial and maturation stages have remained in the source rocks. As a result, the sterane maturity indexes span a relatively wide range (from low maturity to mature) within a thickness of 200 m, whereas the present-day maturity of the source rocks is mature.
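The two sterane maturity parameters are simple isomer ratios, and both increase with maturity until they reach equilibrium. A minimal sketch of the calculation from integrated peak areas follows. The function name, argument names, and peak areas are hypothetical, and the 20S/(20S+20R) ratio is assumed to be taken on the ααα isomers, as is conventional.

```python
def c29_sterane_maturity(aaa_20s, aaa_20r, abb_20s, abb_20r):
    """Return (20S/(20S+20R), bb/(aa+bb)) for C29 steranes from peak areas.

    Assumption: 20S/(20S+20R) is computed on the 5a,14a,17a isomers and
    bb/(aa+bb) sums the 20S and 20R epimers of each configuration.
    """
    aa = aaa_20s + aaa_20r
    bb = abb_20s + abb_20r
    s_ratio = aaa_20s / aa
    bb_ratio = bb / (aa + bb)
    return s_ratio, bb_ratio


# Hypothetical peak areas for one source rock extract.
s_ratio, bb_ratio = c29_sterane_maturity(
    aaa_20s=40.0, aaa_20r=60.0, abb_20s=45.0, abb_20r=55.0
)
```

Plotting one ratio against the other for each sample gives a cross-plot of the kind shown in Figure 8(b).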

The ratio of diasteranes to steranes is related to the source rock sedimentary environment. However, the controversy regarding the formation mechanism of diasteranes means that the reliability of this ratio is still uncertain [17]. Many studies have shown that anhydrite formed in evaporitic salt marshes has high contents of diasteranes [49]. Diasterane/sterane ratios show that the Group 4 samples contain no evaporitic sedimentary component compared with the other four sample groups (Figure 5). Groups 1, 2, 3, and 5 have high diasterane contents, consistent with an evaporitic depositional environment. The ratio of diasteranes to steranes is also a maturity indicator from the early mature to the overmature stage. In general, thermal decomposition in the mature stage will lead to the break-up of biomarker compounds, resulting in a low diasterane/sterane ratio [50].

Based on the above analysis, it is evident that the lower sweet spot region (Group 4) has different biomarker characteristics as compared with both the upper sweet spot region and intervening mudstone section (i.e., the other four groups) in the Lucaogou Formation. This reflects differences in tight oil accumulation mechanisms. In detail, the main organic inputs in the lower sweet spot region were prokaryotic plankton, whereas elsewhere it was mainly phytoplankton, along with some terrestrial plants and bryophytes. The source rocks of the upper sweet spot region and intervening mudstone section were formed in a carbonate and evaporitic sedimentary environment. During sedimentation, the water salinity was high and the water was stratified. The maturity of the Lucaogou Formation decreases upwards from the mature to low-maturity stage. The lower sweet spot region is in the mature stage.

In summary, it is difficult to differentiate between the lower sweet spot region and overlying regions in the Lucaogou Formation using conventional methods, whereas the clustering heat maps reveal differences. Group 4 samples are clearly different in terms of their organic inputs, sedimentary environment, and water salinity.

4.2. Multidimensional Scaling Analysis

The clustering heat map analysis divided the samples into five groups, and Group 4 (from the lower sweet spot region) is very different from the other four groups, as discussed above. However, the other four groups are not clearly distinguishable in the clustering heat maps. Using the sample names in the heat map (Figure 5), each rock sample in the other four groups can be matched to its depth. The results are shown in Figure 6 and the Supplementary Table. The Group 1 and 2 samples are distributed in the depth range between 3146.16 m and 3183.80 m, whereas the Group 3 and 5 samples are distributed in two depth ranges, 3114.73–3145.44 m and 3190.57–3292.28 m. Thus, the strata of the upper sweet spot and middle section mudstones can be divided into three segments according to depth range, which we named the upper, middle, and lower tight oil segments. In particular, the upper tight oil segment and lower tight oil segment both contain Group 3 and 5 samples (Figure 6). As such, multidimensional scaling analysis was used to further analyze and distinguish the Group 1, 2, 3, and 5 samples.
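The depth-based subdivision described above can be written as a small lookup. The depth limits are the sample depth ranges reported for Well Ji 174 in the text. The function and constant names are hypothetical, and depths that fall between the sampled ranges are deliberately left unassigned.

```python
# Sample depth ranges (m) reported for Well Ji 174, shallow to deep.
SEGMENTS = [
    ("upper tight oil segment", 3114.73, 3145.44),
    ("middle tight oil segment", 3146.16, 3183.80),
    ("lower tight oil segment", 3190.57, 3292.28),
]


def segment_for_depth(depth_m):
    """Return the tight oil segment containing depth_m, or None if unsampled."""
    for name, top, base in SEGMENTS:
        if top <= depth_m <= base:
            return name
    return None


example = segment_for_depth(3150.0)
```

Applying the lookup to each sample depth reproduces the assignment of Group 1 and 2 samples to the middle segment and of Group 3 and 5 samples to the upper and lower segments.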

In addition, in view of the overlapping characteristics of the four groups, it is theoretically possible that crude oil migrated upward to the reservoir region. Therefore, data for crude oil samples from the Lucaogou and Wutonggou formations were included in the further analysis, as shown in Figure 9. By combining the multivariate statistical analysis with the spatial positions of the samples, small differences can be measured and interpreted geochemically. Groups 3 and 5 can be divided vertically into four subgroups with small differences (Group 3-up, Group 3-low, Group 5-up, and Group 5-low; Figure 9). Group 3-up and Group 5-up samples were more similar in spatial distribution than the Group 3-low and Group 5-low samples and, similarly, Group 3-low samples were more similar in spatial distribution than Group 5-low samples. Such distributions indicate that even though the clustering heat map analysis subdivides the upper tight oil segment and the lower tight oil segment (Figure 6) into Group 3 and Group 5, these groups show a stronger affinity to each other when they are closely related spatially. In contrast, Group 1 and 2 samples cluster together, have similar spatial distributions, and show a good correlation with adjacent sandstone reservoirs. Therefore, the middle tight oil segment appears to have a near-source accumulation mechanism.

In summary, the lower tight oil segment correlates well with crude oils from Wells Ji 174, Ji 30, Ji 33, and Ji 31 in the upper tight oil segment of the Lucaogou Formation, indicating that vertical and lateral migration occurred. Where faults developed in the middle tight oil segment, oils in the lower tight oil segment had a migration pathway, allowing vertical and lateral migration. Conversely, where no faults developed in the middle tight oil segment, oils in the lower tight oil segment had no migration pathway, resulting in a weak correlation with crude oil from the upper tight oil segment. Wells Ji 251, Ji 28, and Ji 173 show this feature in Figure 9. Therefore, we propose that hydrocarbons generated in the lower tight oil segment migrated vertically and laterally only where faults developed in the middle tight oil segment.

Similar features also characterize the upper tight oil segment. Firstly, this segment is closely related to the upper sandstone reservoirs, indicating near-source accumulation. In addition, crude oil in other parts of the Lucaogou Formation also has a strong correlation with the upper tight oil segment, reflecting lateral migration in this segment. Finally, samples from the upper-tight-oil-segment also show a good correlation with crude oils in the upper Permian Wutonggou Formation. Considering that the Wutonggou Formation does not contain source rocks [51], it is concluded that the crude oil in the Wutonggou Formation originated from the source rocks of the upper tight oil segment.

As such, the upper tight oil segment had three accumulation mechanisms: lateral migration, vertical migration, and near-source accumulation. The middle tight oil segment was generally characterized by near-source accumulation, and the lower tight oil segment by vertical and lateral migration accumulation.

4.3. Accumulation Models of Tight Oil

Previous research has suggested that the key to tight oil accumulation in brackish lacustrine systems such as the one studied here is the location of the “sweet spot” sections, which is associated with the development of high-quality reservoirs. As discussed above, such reservoirs are commonly heterogeneous and complex. However, because of their relatively poor reservoir physical properties and the difficulty of hydrocarbon migration, brackish lacustrine tight oil systems have commonly been thought to be characterized by near-source accumulation [12]. In addition, some studies have indicated that oils generated from the source rocks migrated only very short vertical distances and that not all mudstones across the formation contributed to the Lucaogou tight oil accumulation [10, 52]. Previous research on the Lucaogou Formation has indicated that the main accumulation model in this area is near-source accumulation, with some vertical migration in places [10, 52].

However, in light of the geological context [12, 16, 34], the near-source accumulation model for the Lucaogou Formation in the study area needs to be reevaluated. First, the middle section mudstones are expected to contribute to the shale oils because they are of good quality, with average TOC and HI values of 3.20% and 315.66 mg HC/g TOC, respectively. Second, the sandstone and carbonate interlayers within the mudstones (sweet spots) are laterally distributed, which implies that lateral hydrocarbon migration along these layers is possible. Third, some microfaults and fractures are developed even though the formation is generally tight [31, 32]. As a result, vertical hydrocarbon migration is also possible.

All these assumptions were tested in this study. Our multi-dimensional scaling and clustering heat map analyses allow new tight oil accumulation models to be constructed for the Lucaogou Formation in the Jimusar sag. The accumulation mechanisms from bottom to top are near-source accumulation (NA), vertical and lateral migration and accumulation (VLMA), NA, VLMA, and NA (Figure 10). In particular, the upper sweet spot region and the intervening mudstone section experienced three different mechanisms: (1) vertical and lateral migration and accumulation in the lower tight oil segment, (2) near-source accumulation in the middle tight oil segment, and (3) lateral migration and near-source accumulation in the upper tight oil segment, together with vertical migration into the overlying Wutonggou Formation.
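For readers unfamiliar with multi-dimensional scaling, the sketch below shows one common variant, classical (Torgerson) MDS, which converts a matrix of pairwise sample dissimilarities into two coordinates (the MDS1 and MDS2 axes referred to in the supplementary materials). It is an illustration under stated assumptions rather than the implementation used here: the text does not specify the MDS variant or software, the function name `classical_mds_2d`, the sample labels, and the distance values are hypothetical, and the eigenvectors are obtained by simple power iteration only to keep the example free of dependencies.

```python
import math


def classical_mds_2d(dist, iterations=500):
    """Embed samples in 2-D from a symmetric matrix of pairwise distances.

    Classical (Torgerson) MDS: double-center the squared distances, then use
    the two leading eigenpairs of the centered matrix as coordinates.
    Assumes the dissimilarities are close to Euclidean, so that the leading
    eigenvalues are non-negative.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J, with J the centering matrix.
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        # Deterministic, non-constant start vector, normalized to unit length.
        v = [math.sin(i + 1 + axis) for i in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # The Rayleigh quotient gives the eigenvalue of the converged vector.
        eigenvalue = sum(
            v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n)
        )
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
        # Deflate so that the next pass finds the following eigenpair.
        b = [[b[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical dissimilarities among four samples (illustrative only).
labels = ["mudstone_1", "mudstone_2", "oil_1", "oil_2"]
d = math.sqrt(17.0)
dist = [
    [0.0, 4.0, 1.0, d],
    [4.0, 0.0, d, 1.0],
    [1.0, d, 0.0, 4.0],
    [d, 1.0, 4.0, 0.0],
]
coords = classical_mds_2d(dist)
for label, (mds1, mds2) in zip(labels, coords):
    print(f"{label}: MDS1={mds1:.3f}, MDS2={mds2:.3f}")
```

Samples that plot close together in the resulting MDS1-MDS2 space have similar dissimilarity profiles, which is the property used to group source rocks with the oils they most resemble. For real biomarker data, a tested library routine would normally be preferable to this hand-rolled eigen-solver.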

Our results have some implications for the accumulation of lacustrine tight oil, as follows:

(1) The lithologies of tight oil sequences are variable and heterogeneous on various scales. For example, the lithologies in the Lucaogou Formation in Well Ji 174 show significant biomarker differences, reflecting heterogeneity caused by changes in climate during sediment deposition.

(2) When sufficient high-quality source rocks and reservoirs are present, near-source accumulation readily occurs.

(3) When the reservoir is tight or faults are present between the source rocks and reservoirs, migration needs to occur, as was the case for the upper sweet spot region and the intervening mudstone section in this study. The upper sweet spot region has less reservoir capacity than the lower sweet spot region. Therefore, the upper sweet spot region cannot store all the hydrocarbons generated in this region, and lateral and vertical migration must take place. Our multivariate analysis suggests that the upper tight oil section near the faults shows more characteristics of mixed oil and gas than the lower tight oil section because fractures caused vertical migration. Such oil mixing is more common in the upper sweet spot than in the lower sweet spot in the study area.

(4) Owing to the strong heterogeneities of brackish lacustrine deposits, hydrocarbon generation and reservoir development differ considerably between locations in the Lucaogou reservoir targeted in this study. It is not a completely closed system in a strict sense. As such, the possibility of interference from other oil sources and of multistage hydrocarbon generation should be considered. This is why we used the multivariate statistical method and considered as many indices as possible to study the complex oil accumulation in the system. The results show that multistage hydrocarbon generation occurred in this area and that vertical and lateral tight oil migration occurred in the system.

5. Conclusions

Tight oil accumulations commonly show significant heterogeneities, which are critical and challenging for petroleum exploration and exploitation. Our study shows that such heterogeneities can be investigated using big data techniques, e.g., the clustering heat maps and multi-dimensional scaling analysis used in this study. The accumulation mechanisms and patterns of the Permian Lucaogou Formation in the Jimusar sag were successfully classified using these techniques, revealing the heterogeneity of tight oil accumulation. From base to top, the accumulation mechanisms in the Lucaogou Formation were NA (near-source accumulation), VLMA (vertical and lateral migration and accumulation), NA, and VLMA.

When the source rocks and reservoirs are well developed and in close proximity, near-source accumulation occurs, as in the lower sweet spot region in this study. When the reservoir is not well developed or there is a gap/barrier between the source rocks and reservoirs, hydrocarbons migrate where faults/fractures are developed, as in the upper sweet spot region in this study.

Data Availability

Data are available on request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We thank the engineers from PetroChina Xinjiang Oilfield Company for insightful discussions. This work was jointly funded by the National Natural Science Foundation of China (Grant Nos. 41802145 and 41830425), the PetroChina Science and Technology Major Project (Grant No. 2017E-0401), the National Postdoctoral Program for Innovative Talents of China (Grant No. BX201700114), the China Postdoctoral Science Foundation funded project (Grant No. 2017M621702), and the Fundamental Research Funds for the Central Universities of China (Grant No. 020614380070).

Supplementary Materials

Analytical data for geochemical characteristics of mudstone, sandstone, and crude oil samples in the Lucaogou Formation and Wutonggou Formation, Jimusar sag. MDS1 and MDS2 define a 2-D space that allows the best spatial representation of sample similarities, without any inference of measured biomarker parameters.