Geostatistics Glossary#
Michael J. Pyrcz, Professor, The University of Texas at Austin
Twitter | GitHub | Website | GoogleScholar | Book | YouTube | Applied Geostats in Python e-book | LinkedIn
Chapter of e-book “Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy”.
Cite this e-Book as:
Pyrcz, M.J., 2024, Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy, https://geostatsguy.github.io/GeostatsPyDemos_Book.
The workflows in this book and more are available here:
Cite the GeostatsPyDemos GitHub Repository as:
Pyrcz, M.J., 2024, GeostatsPyDemos: GeostatsPy Python Package for Spatial Data Analytics and Geostatistics Demonstration Workflows Repository (0.0.1). Zenodo. https://zenodo.org/doi/10.5281/zenodo.12667035
This chapter is a summary of essential Geostatistics Terminology.
Motivation for Geostatistics Concepts#
Firstly, why do this? I have received requests for a course glossary from the students in my Data Analytics and Geostatistics undergraduate course. While I usually dedicate a definition slide in the lecture slide decks for salient terms, I know my students would appreciate a single source. Also, since I've already written these out, it should be pretty efficient to do this.
Let me begin with a confession. There is a Geostatistical Glossary and Multilingual Dictionary written by my good friend Dr. Ricardo A. Olea, an excellent geologist and statistician from the USGS. For those seeking an in-depth, complete list of geostatistical terms, please use this book!
By writing my own I can limit the scope and descriptions to course content. Also, I will eventually populate all the chapters with hyperlinks to the glossary. Finally, like the rest of the book, I want the glossary to be an evergreen, living document.
Addition Rule (probability)#
Probability Concepts: when we add probabilities (the union of outcomes), the probability of A or B is calculated with the probability addition rule,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
given mutually exclusive events we can generalize the addition rule as,
$$P(A_1 \cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^{n} P(A_i)$$
Affine Correction#
Distribution Transformations: a distribution rescaling that can be thought of as shifting, and stretching or squeezing, of a distribution. For the case of affine correction of \(X\) to \(Y\),
$$y = \overline{y} + \frac{\sigma_Y}{\sigma_X} \left( x - \overline{x} \right)$$
We can see above that the affine correction method first centers the distribution, then rescales the dispersion based on the ratio of the new standard deviation to the original standard deviation, and then shifts the distribution to be centered on the target mean.
there is no shape change for affine correction. For shape change consider a Distribution Transformation like Gaussian Anamorphosis.
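As a minimal sketch of the affine correction above (the sample values and target statistics are hypothetical, and this is plain NumPy rather than a GeostatsPy call):

```python
import numpy as np

def affine_correction(x, target_mean, target_stdev):
    """Affine correction: center, rescale the dispersion, and shift to the target mean."""
    x = np.asarray(x, dtype=float)
    return target_mean + (target_stdev / np.std(x)) * (x - np.mean(x))

# hypothetical porosity samples (fraction) rescaled to a new mean and standard deviation
porosity = np.array([0.08, 0.11, 0.13, 0.16, 0.21])
porosity_corrected = affine_correction(porosity, target_mean=0.15, target_stdev=0.03)
print(porosity_corrected.mean(), porosity_corrected.std())  # ~0.15 and ~0.03
```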
Anisotropic or Directional (variogram)#
Variogram Calculation: variogram calculated or modelled such that direction (azimuth) is considered
for an anisotropic experimental variogram set the azimuth tolerance less than 90 degrees, then the experimental variogram is sensitive to direction.
for an anisotropic variogram model set the major range greater than the minor range, then the variogram model is sensitive to direction.
Azimuth Tolerance (variogram)#
Variogram Calculation: the tolerance \(+/-\Delta\) in azimuth applied to pool pairs of data for that specific azimuth.
for example, given an azimuth of 90 degrees (along the positive x axis in map view), we could assign an azimuth tolerance of 20 degrees and then all pairs with azimuths between 70 degrees and 110 degrees will be pooled to calculate the experimental variogram for this azimuth
it is common practice for directional experimental variograms to use an azimuth tolerance of 22.5 degrees, resulting in a total angular tolerance of 45 degrees. This is seen as a good balance to provide smooth, interpretable variograms without mixing spatial continuity information over many directions
this may be increased to smooth the experimental variogram for improved interpretability
for isotropic (also called omnidirectional) experimental variograms (not sensitive to direction) use an azimuth tolerance of 90.0 degrees, resulting in a total angular tolerance of 180 degrees.
Bandwidth (variogram)#
Variogram Calculation: the maximum orthogonal deviation from the azimuth vector when identifying data pairs to calculate an experimental variogram.
at large lag distances with azimuth (dip) tolerance, data could start to mix between stratigraphic units. Bandwidth is typically applied in the vertical direction to reduce this mixing.
bandwidth should be set very large (to remove its influence) for isotropic (also called omnidirectional) variograms
Bayesian Probability#
Probability Concepts: probabilities based on a degree of belief (expert experience) in an event,
updated as new information is available
solves probability problems for which we cannot use simple frequencies, i.e., the frequentist probability approach
Bayesian updating is modeled with Bayes’ Theorem
Bayes’ Theorem (probability)#
Probability Concepts: the mathematical model central to Bayesian probability for Bayesian updating from a prior probability, with a likelihood probability from new information, to a posterior probability,
$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$
where \(P(A)\) is the prior, \(P(B|A)\) is the likelihood, \(P(B)\) is the evidence term and \(P(A|B)\) is the posterior. It is convenient to substitute more descriptive labels for A and B, e.g., the model and the data.
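For a quick numeric illustration of Bayesian updating (all probabilities below are hypothetical):

```python
# hypothetical values: prior probability of reservoir sand, P(sand),
# likelihood of a bright seismic amplitude given sand, P(bright | sand),
# and the evidence term, P(bright)
p_sand = 0.30                      # prior
p_bright_given_sand = 0.80         # likelihood
p_bright = 0.40                    # evidence

# Bayes' Theorem: posterior = likelihood * prior / evidence
p_sand_given_bright = p_bright_given_sand * p_sand / p_bright
print(f"P(sand | bright) = {p_sand_given_bright:.2f}")  # 0.60
```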
Big Data#
Geostatistical Concepts: you have big data if your data has a combination of these criteria:
Data Volume - many data samples and features, difficult to store, transmit and visualize
Data Velocity - high rate of collection, continuous data collection relative to decision making cycles, challenges keeping up with the new data while updating the models
Data Variety - data from various sources, with various types of data, types of information, and scales
Data Variability - data acquisition changes during the project, even for a single feature there may be multiple vintages of data with different scales, distributions and veracity
Data Veracity - data has various levels of accuracy, the data is not certain
For most subsurface applications most if not all of these criteria are met. Subsurface engineering and geoscience is often working with big data!
Big Data Analytics#
Geostatistical Concepts: the process of examining large and varied data sets to discover patterns and make decisions, the application of statistics to big data.
Bootstrap#
Distribution Transformations: a statistical resampling procedure to calculate uncertainty in a calculated statistic from the sample data itself.
uses repeated (\(L\) times) Monte Carlo simulation from the dataset CDF, i.e., repeated sampling of \(n\) values (the number of data) from the sample data with replacement
calculates the entire distribution of uncertainty in any statistic; from this uncertainty model you can calculate any summary statistic, e.g., the mean, P10 and P90
We can apply bootstrap to build the uncertainty distributions for predictor feature random variables (RVs) as the inputs for a Monte Carlo Simulation (MCS) workflow when that input is a statistic, e.g., average porosity.
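A minimal bootstrap sketch with NumPy follows; the porosity samples and the number of realizations \(L\) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(seed=73)

# hypothetical porosity samples (%)
porosity = np.array([9.5, 11.2, 13.0, 14.8, 15.1, 16.4, 18.0, 21.3])

L = 1000                                      # number of bootstrap realizations
n = len(porosity)                             # number of data
boot_means = np.array([rng.choice(porosity, size=n, replace=True).mean()
                       for _ in range(L)])    # resample with replacement, n at a time

# summarize the uncertainty distribution for the mean
print(boot_means.mean(), np.percentile(boot_means, [10, 90]))
```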
Categorical Feature#
Geostatistical Concepts: a feature that can take one of a limited, and usually fixed, number of possible values
Categorical Nominal Feature#
Geostatistical Concepts: a categorical feature without any natural ordering, for example,
facies = {boundstone, wackestone, packstone, breccia}
minerals = {quartz, feldspar, calcite}
Categorical Ordinal Feature#
Geostatistical Concepts: a categorical feature with a natural ordering, for example,
geologic age = {Miocene, Pliocene, Pleistocene}
Mohs hardness = \({1, 2, \ldots, 10}\)
Cell-based Declustering#
Declustering to Correct Sampling Bias: a declustering method to assign weights to spatial samples based on local sampling density, such that the weighted statistics are likely more representative of the population. Data weights are assigned so that,
samples in densely sampled areas receive less weight
samples in sparsely sampled areas receive more weight
Cell-based declustering proceeds as follows:
a cell mesh is placed over the spatial data and weights are set as proportional to the inverse of the number of samples in the mesh
the cell mesh size is varied and the cell size that minimizes the declustered mean (if the sample mean is biased high) or maximizes the declustered mean (if the sample mean is biased low) is selected
to remove the impact of cell mesh position, the cell mesh is randomly moved several times and the resulting weights are averaged for each datum
The weights are calculated as,
$$w(\bf{u}_j) = \frac{n}{n_l \cdot L_o}$$
where \(n_l\) is the number of data in the cell containing datum \(j\), \(L_o\) is the number of cells with data, and \(n\) is the total number of data.
Here are some highlights for cell-based declustering,
expert judgement on the cell size, based on the nominal sample spacing (before infill drilling), will often improve performance over the automated cell size selection based on the minimum or maximum declustered mean
cell-based declustering is not aware of the boundaries of the area of interest; therefore, data near the boundary of the area of interest may appear to be more sparsely sampled and receive more weight
cell-based declustering was developed by Professor Andre Journel in 1983, [Jou83]
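Below is a simplified cell-based declustering sketch in Python, assuming a single cell size and no random mesh offsets (the coordinates are hypothetical; the full method scans cell sizes and averages over offset meshes as described above):

```python
import numpy as np

def cell_declustering_weights(x, y, cell_size):
    """Assign weights inversely proportional to the number of data in each cell,
    standardized so the weights sum to the number of data."""
    ix = np.floor(np.asarray(x) / cell_size).astype(int)    # cell index in x
    iy = np.floor(np.asarray(y) / cell_size).astype(int)    # cell index in y
    cells = list(zip(ix, iy))
    counts = {c: cells.count(c) for c in set(cells)}        # data per occupied cell
    L_o = len(counts)                                        # number of occupied cells
    n = len(cells)                                           # number of data
    return np.array([n / (counts[c] * L_o) for c in cells])  # weights sum to n

# hypothetical clustered sample locations
x = np.array([100, 110, 120, 105, 900, 500])
y = np.array([100, 105, 115, 120, 800, 500])
weights = cell_declustering_weights(x, y, cell_size=100.0)
print(weights, weights.sum())   # clustered data receive less weight; sum equals number of data
```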
Continuous Feature#
Geostatistical Concepts: a feature that can take on any value between a lower and upper bound. For example,
porosity = \({13.01\%, 5.23\%, 24.62\%}\)
gold grade = \({4.56 g/t, 8.72 g/t, 12.45 g/t}\)
Continuous, Interval Feature#
Geostatistical Concepts: a continuous feature where the intervals between numbers are equal, for example, the difference between 1.50 and 2.50 is the same as the difference between 2.50 and 3.50, but the values do not have an objective reality (exist on an arbitrary scale)
for example, Celsius scale of temperature (an arbitrary scale based on water freezing at 0 and boiling at 100)
Continuous, Ratio Feature#
Geostatistical Concepts: a continuous feature where the intervals between numbers are equal, for example, the difference between 1.50 and 2.50 is the same as the difference between 2.50 and 3.50, but the values do have an objective reality (measure an actual phenomenon)
for example, Kelvin scale of temperature, porosity, permeability, saturation
Cognitive Biases#
Geostatistical Concepts: automated (subconscious) thought processes used by the human brain to simplify information processing, built from a large amount of personal experience and learned preferences. While these have been critical to our evolution on this planet, they can lead to the following issues in data science:
Anchoring Bias, too much emphasis on the first piece of information. Studies have shown that the first piece of information could be completely irrelevant, as we were just starting to learn about the topic!
Availability Heuristic, overestimating the importance of the information available to us, for example, "My grandpa smoked 3 packs a day and lived to 100"
Bandwagon Effect, probability increases with the number of people holding the belief
Blind-spot Effect, fail to see your own cognitive biases
Choice-supportive Bias, perceived probability increases after a commitment or decision is made
Clustering Illusion, seeing patterns in random events
Confirmation Bias, only consider new information that supports current model
Conservatism Bias, favor old data over newly collected data
Recency Bias, favor the most recently collected data
Survivorship Bias, focus on success cases only
Robust use of statistics / data analytics protects us from bias.
Complementary Events (probability)#
Probability Concepts: the NOT operator for probability; if we define A, then A complement, \(A^c\), is not A and we have this resulting closure relationship, \(P(A) + P(A^c) = 1\)
complementary events may be considered beyond univariate problems, for example, consider this bivariate closure, \(P(A|B) + P(A^c|B) = 1\)
Note, the given term must be the same.
Conditional Probability#
Probability Concepts: the probability of an event, given another event has occurred,
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
we read this as the probability of A given B has occurred, the joint divided by the marginal. We can extend conditional probabilities to any multivariate case by adding joints to either component. For example, \(P(A|B \cap C) = \frac{P(A \cap B \cap C)}{P(B \cap C)}\)
Core Data#
Geostatistical Concepts: the primary sampling method for direct measurement of subsurface resources (cuttings analyses are also direct measurements, with greater uncertainty and a smaller, irregular scale). Comments on core data,
expensive / time consuming to collect for oil and gas, interrupts drilling operations, sparse and selective (very biased) coverage
very common in mining with regular patterns and tight spacing
What do we learn from core data?
petrological features (sedimentary structures, mineral grades), petrophysical features (porosity, permeability), and mechanical features (elastic modulus, Poisson's ratio)
stratigraphy and ore body geometry through interpolation between wells and drill holes
Core data are critical to support subsurface resource interpretations. They anchor the entire reservoir concept / framework for prediction.
for example, core data collocated with well log data are used to calibrate (ground truth) facies and porosity from well logs
Correlogram#
Variogram Calculation: a measure of similarity vs. distance, calculated as the average product of values separated by a lag vector, centered by the square of the mean and standardized by the variance.
When the feature is standardized to have a variance of 1.0, the correlogram is equal to the covariance function.
and in that case the correlogram is the variogram upside down, \(\rho_X(\bf{h}) = 1 - \gamma_X(\bf{h})\)
The correlogram is very easy to interpret since it is the correlation at the specified lag distance.
Covariance Function#
Variogram Calculation: a measure of similarity vs. distance, calculated as the average product of values separated by a lag vector, centered by the square of the mean.
The covariance function is the variogram upside down, \(C_X(\bf{h}) = \sigma^2_X - \gamma_X(\bf{h})\)
We model variograms, but inside the kriging and simulation methods they are converted to covariance values for numerical convenience, i.e., covariances result in diagonally dominant matrices that are more stable when inverted to calculate the kriging weights.
Cumulative Distribution Function (CDF)#
Univariate Distributions: the sum of a discrete PDF or the integral of a continuous PDF. Here are the important concepts,
the CDF is stated as \(F_x(x)\), note the PDF is stated as \(f_x(x)\)
is the probability that a random sample, \(X\), is less than or equal to a specific value \(x\); therefore, the y axis is cumulative probability
for CDFs there is no bin assumption; therefore, bins are at the resolution of the data.
monotonically non-decreasing function, because a negative slope would indicate negative probability over an interval.
The requirements for a valid CDF include,
non-negativity constraint, \(F_X(x) \ge 0\), for all \(x\)
valid probability, \(0 \le F_X(x) \le 1\)
cannot have a negative slope, \(F_X(x_1) \le F_X(x_2)\) for all \(x_1 < x_2\)
minimum and maximum (ensuring probability closure) values, \(F_X(-\infty) = 0\) and \(F_X(+\infty) = 1\)
Cyclicity (variogram model)#
Variogram Calculation and Modeling: may be linked to underlying geological periodicity, cycles in the deposition that result in layers.
sometimes noise in the experimental variogram due to too few data is mistaken as cyclicity
the wavelength of the cycles in the experimental variogram is the wavelength of the spatial cycles, i.e. the extent of the layers
Data (data aspects)#
Geostatistical Concepts: when describing spatial dataset these are the fundamental aspects,
Data Coverage - what proportion of the population has been sampled? For example,
100,000 meters of drill hole samples gathered during initial discovery to delineation
2 meters depth of penetration from the 7 near vertical wells through the reservoir
the entire reservoir has been imaged by seismic data
Data Scale (support size)
What is the scale or volume sampled by the individual samples? For example,
core tomography images core samples at the pore scale, 1 - 50 \(\mu m\)
gamma ray well log sampled at 0.3 m intervals with 1 m penetration away from the bore hole
ground-based gravity gradiometry map with 20 m x 20 m x 100 m resolution
Data Information Type
What does the data tell us about the subsurface? For example,
grain size distribution that may be applied to calibrate permeability and saturations
fluid type to assess the location of the oil water contact
dip and continuity of important reservoir layers to assess connectivity
mineral grade to map high, mid and low grade ore shells for mine planning
Data Analytics#
Geostatistical Concepts: the use of statistics with visualization to support decision making.
Dr. Pyrcz says that data analytics is the same as statistics.
Debiasing with Secondary Data#
Declustering to Correct Sampling Bias: when the entire range of the feature of interest is not sampled we cannot use data weights to debias our statistics, instead we use,
a secondary feature that is sampled over the entire area of interest
a relationship between the secondary feature and the primary feature
to impute the missing part of the primary feature spatial distribution.
The relationship between the primary and secondary feature may be a statistical model with extrapolation to the missing part of the primary feature distribution, or based on some other information such as a physical model.
Decision Criteria#
Geostatistical Concepts: a feature that is calculated by applying the transfer function to the subsurface model(s) to support decision making. The decision criteria represents value, health, environment and safety. For example:
contaminant recovery rate to support design of a pump and treat soil remediation project
oil-in-place resources to determine if a reservoir should be developed
Lorenz coefficient heterogeneity measure to classify a reservoir and determine mature analogs
recovery factor or production rate to schedule production and determine optimum facilities
recovered mineral grade and tonnage to determine economic ultimate pit shell
Declustering#
Declustering to Correct Sampling Bias: various methods that assign weights to spatial samples based on local sampling density, such that the weighted statistics are likely more representative of the population. Data weights are assigned so that,
samples in densely sampled areas receive less weight
samples in sparsely sampled areas receive more weight
There are various declustering methods:
cell-based declustering
polygonal declustering
kriging-based declustering
It is important to note that no declustering method can prove that for every data set the resulting weighted statistics will improve the prediction of the population parameters, but in expectation these methods tend to reduce the bias.
Declustering (statistics)#
Declustering to Correct Sampling Bias: once declustering weights are calculated for a spatial dataset, the declustered statistics are applied as inputs for all subsequent analysis or modeling. For example,
the declustered mean is assigned as the stationary, global mean for simple kriging
the weighted CDF from all the data with weights are applied to sequential Gaussian simulation to ensure the back-transformed realizations approach the declustered distribution
Any statistic can be weighted, including the entire CDF! Here are some examples of weighted statistics, given declustering weights, \(w(\bf{u}_j)\), for all data \(j=1,\ldots,n\).
weighted sample mean,
$$\overline{x}_{wt} = \frac{1}{n} \sum_{j=1}^{n} w(\bf{u}_j) \cdot x(\bf{u}_j)$$
where \(n\) is the number of data.
weighted sample variance,
$$\sigma^2_{wt} = \frac{1}{n} \sum_{j=1}^{n} w(\bf{u}_j) \cdot \left( x(\bf{u}_j) - \overline{x}_{wt} \right)^2$$
where \(\overline{x}_{wt}\) is the declustered mean.
weighted covariance,
$$C_{XY,wt} = \frac{1}{n} \sum_{j=1}^{n} w(\bf{u}_j) \cdot \left( x(\bf{u}_j) - \overline{x}_{wt} \right) \left( y(\bf{u}_j) - \overline{y}_{wt} \right)$$
where \(\overline{x}_{wt}\) and \(\overline{y}_{wt}\) are the declustered means for features \(X\) and \(Y\).
the entire CDF,
$$F_{X,wt}(z) \approx \frac{1}{n} \sum_{j=1}^{n(Z<z)} w(\bf{u}_j)$$
where \(n(Z<z)\) is the number of sorted ascending data less than threshold \(z\). We show this as approximate because it is simplified, at data resolution, and without an interpolation model.
It is important to note that no declustering method can prove that for every data set the resulting weighted statistics will improve the prediction of the population parameters, but in expectation these methods tend to reduce the bias.
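A short Python sketch of these weighted statistics, assuming declustering weights that sum to the number of data as above (the values and weights are hypothetical):

```python
import numpy as np

# hypothetical samples and declustering weights (weights sum to n)
x = np.array([0.10, 0.12, 0.15, 0.18, 0.22])
w = np.array([0.6, 0.6, 0.8, 1.5, 1.5])
n = len(x)

mean_wt = np.sum(w * x) / n                          # declustered (weighted) mean
var_wt = np.sum(w * (x - mean_wt) ** 2) / n          # declustered variance

# weighted CDF at data resolution: sort the data and accumulate the weights
order = np.argsort(x)
cum_prob = np.cumsum(w[order]) / n                   # cumulative probability for each sorted datum
print(mean_wt, var_wt, list(zip(x[order], cum_prob)))
```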
Deterministic Model#
Geostatistical Concepts: a model that assumes the system or process is completely predictable
often based on engineering and geoscience physics and expert judgement
for example, numerical flow simulation or stratigraphic bounding surfaces interpreted from seismic
for this course we also state that data-driven estimation models like inverse distance and kriging are deterministic, since they provide a single estimate at each location
Advantages:
integration of physics and expert knowledge
integration of various information sources
Disadvantages:
often quite time consuming
often no assessment of uncertainty, focus on building one model
Discrete Feature#
Geostatistical Concepts: a categorical feature or a continuous feature that is binned or grouped, for example,
porosity between 0 and 20% assigned to 10 bins = {0 - 2%, 2% - 4%, \(\ldots\), 18% - 20%}
Mohs hardness = \(\{1, 2, \ldots, 10\}\) (same as the categorical feature)
Distribution Transformations#
Distribution Transformations: a mapping from one distribution to another through percentile values, resulting in a new PDF, CDF. We perform distribution transformations in geostatistical methods and workflows because,
inference - to correct a feature distribution to an expected shape, e.g., correcting for too few or biased data
theory - a specific distribution assumption is required for a workflow step, e.g., a Gaussian distribution with a mean of 0.0 and a variance of 1.0 is required for sequential Gaussian simulation
data preparation or cleaning - to correct for outliers, the transformation will map the outliers into the target distribution so they are no longer outliers
How do we perform distribution transformations?
we are transforming the values from the cumulative distribution function (CDF), \(F_{X}\), to a new CDF, \(G_{Y}\). This can be generalized with the quantile - quantile transformation applied to all the sample data:
The forward transform: \(y = G_Y^{-1}\left(F_X(x)\right)\)
The reverse transform: \(x = F_X^{-1}\left(G_Y(y)\right)\)
This may be applied to any data, nonparametric or samples from a parametric distribution. We just need to be able to map from one distribution to another through percentiles, so it is a:
rank preserving transform
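A minimal quantile-to-quantile (rank preserving) transform sketch in Python, mapping hypothetical sample data to a standard normal target as required for sequential Gaussian simulation; ties and tail extrapolation are ignored for simplicity:

```python
import numpy as np
from scipy import stats

# hypothetical sample data (any distribution shape)
x = np.array([2.1, 3.4, 3.9, 5.0, 7.2, 9.8, 15.5])

# forward transform: y = G^-1(F(x)), here G is the standard normal CDF
p = (stats.rankdata(x) - 0.5) / len(x)     # cumulative probabilities (no ties assumed)
y = stats.norm.ppf(p)                      # normal score values

# reverse transform: map a normal score back to the data distribution by interpolation
y_new = 0.5                                # hypothetical simulated normal score
x_back = np.interp(stats.norm.cdf(y_new), np.sort(p), np.sort(x))
print(y, x_back)
```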
Estimation#
Geostatistical Concepts: the process of obtaining the single best value to represent a feature at an unsampled location, or time. Some additional concepts,
local accuracy takes precedence over global spatial variability.
too ‘smooth’, not appropriate for forecasting
for example, inverse distance and kriging
many predictive machine learning models focus on estimation (e.g., k-nearest neighbours, decision tree, random forest, etc.)
Feature (also variable)#
Geostatistical Concepts: any property measured or observed in a study
for example, porosity, permeability, mineral concentrations, saturations, contaminant concentration, etc.
in geostatistics this is often called a variable, while in data mining / machine learning it is known as a feature
measurement often requires significant analysis, interpretation, etc.
Frequentist Probability#
Probability Concepts: a measure of the likelihood that an event will occur based on frequencies observed from an experiment. For random experiments and well-defined settings (such as coin tosses),
$$P(A) = \frac{n(A)}{n}$$
where \(n(A)\) is the number of times event \(A\) occurred and \(n\) is the number of trials.
Example: the probability of drilling a dry hole for the next well, of encountering sandstone at a location (\(\bf{u}_{\alpha}\)), or of exceeding a rock porosity of \(15\%\) at a location (\(\bf{u}_{\alpha}\)).
Geometric Anisotropy (variogram interpretation)#
Variogram Calculation: the same variogram structures are observed over all directions, but the range depends on the direction
commonly, the vertical range of correlation is much less than the horizontal range due to 'layering' formed by sedimentary processes
the ratio of the horizontal:vertical range is commonly known as the horizontal to vertical anisotropy ratio
geometric anisotropy is also common between the horizontal directions; the ratio of the horizontal major direction range to the horizontal minor direction range is commonly known as the horizontal major to minor anisotropy ratio
Geometric Anisotropy (variogram model)#
Variogram Calculation and Modeling: we assume geometric anisotropy to model the 2D or 3D variogram from experimental variograms calculated only in the primary directions.
this model provides a valid interpolation of the variogram between the primary directions
the geometric anisotropy model is based on this range-standardized lag distance,
$$h = \sqrt{ \left(\frac{\bf{h}_{maj}}{a_{maj}}\right)^2 + \left(\frac{\bf{h}_{min}}{a_{min}}\right)^2 + \left(\frac{\bf{h}_{vert}}{a_{vert}}\right)^2 }$$
where \(a_{maj}, a_{min}, a_{vert}\) are the ranges in the major, minor and vertical directions and \(\bf{h}_{maj}, \bf{h}_{min}, \bf{h}_{vert}\) are the lag distance components in the major, minor and vertical directions.
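A small Python sketch of this range-standardized lag distance, assuming the coordinates are already rotated into the major/minor/vertical system (the lag components and ranges are hypothetical):

```python
import numpy as np

def normalized_lag(h_maj, h_min, h_vert, a_maj, a_min, a_vert):
    """Range-standardized lag distance for geometric anisotropy; a value of 1.0
    corresponds to the variogram range in any direction."""
    return np.sqrt((h_maj / a_maj) ** 2 + (h_min / a_min) ** 2 + (h_vert / a_vert) ** 2)

# hypothetical lag components (m) and ranges (m)
print(normalized_lag(h_maj=400.0, h_min=100.0, h_vert=2.0,
                     a_maj=800.0, a_min=200.0, a_vert=4.0))  # ~0.87, inside the range
```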
Geostatistics#
Geostatistical Concepts: a branch of applied statistics that integrates:
the spatial (geological) context
the spatial relationship
volumetric support / scale
uncertainty
Hard Data#
Geostatistical Concepts: data that has a high degree of certainty, usually based on a direct measurement from the rock
for example, well core-based and well log-based porosity, lithofacies
Histogram#
Univariate Distributions: a representation of the univariate statistical distribution with a plot of frequency over an exhaustive set of bins over the range of possible values. These are the steps to build a histogram,
Divide the continuous feature range of possible values into \(K\) equal size bins of width, \(\delta x = \frac{x_{max} - x_{min}}{K}\),
or use the available categories for categorical features.
Count the number of samples (frequency) in each bin, \(n_k\), \(\forall k=1,\ldots,K\).
Plot the frequency vs. the bin label (use bin centroid if continuous)
Note, histograms are typically plotted as a bar chart.
Hybrid Model#
Geostatistical Concepts: system or process that includes a combination of both deterministic model and stochastic model
most geostatistical models are hybrid models
for example, an additive deterministic trend model and a stochastic residual
Independence (probability)#
Probability Concepts: events A and B are independent if and only if the following relations are true,
\(P(A \cap B) = P(A) \cdot P(B)\)
\(P(A|B) = P(A)\)
\(P(B|A) = P(B)\)
If any of these are violated we can suspect that there exists some form of relationship.
Inference, Inferential Statistics#
Geostatistical Concepts: given a random sample from a population, describe the population
for example, given the well(s) samples, describe the reservoir
Intersection of Events (probability)#
Probability Concepts: the intersection of outcomes, the probability of A and B is represented as, \(P(A \cap B)\)
only under the assumption of independence of A and B can it be calculated from the probabilities of A and B as, \(P(A \cap B) = P(A) \cdot P(B)\)
Isotropic or Omnidirectional (variogram)#
Variogram Calculation: variogram calculated or modelled such that direction (azimuth) is not considered
for an isotropic experimental variogram set the azimuth tolerance as 90 degrees, then the experimental variogram is insensitive to direction.
for an isotropic variogram model set the major range equal to the minor range, then the variogram model is insensitive to direction.
Joint Probability#
Probability Concepts: probability that considers more than one event occurring together, the probability of A and B is represented as, \(P(A \cap B)\)
or the probability of A, B and C is represented as, \(P(A \cap B \cap C)\)
Kriging#
Simple and Ordinary Kriging: spatial estimation approach that relies on linear weights that account for spatial continuity, data closeness and redundancy.
weights are unbiased and minimize the estimation variance
The simple kriging weights are calculated by solving a linear system of equations,
$$\sum_{j=1}^{n} \lambda_j \cdot C(\bf{u}_i,\bf{u}_j) = C(\bf{u},\bf{u}_i), \quad i = 1,\ldots,n$$
this system integrates the,
spatial continuity as quantified by the variogram (and covariance function to calculate the covariance, \(C\), values)
redundancy, the degree of spatial continuity between all of the available data with themselves, \(C(\bf{u}_i,\bf{u}_j)\)
closeness, the degree of spatial continuity between the available data and the estimation location, \(C(\bf{u},\bf{u}_i)\)
Kriging provides a measure of estimation accuracy known as kriging variance (a specific case of estimation variance).
Kriging estimates are best in that they minimize the above estimation variance.
Properties of kriging estimates include,
Exact interpolator - kriging estimates are equal to the data values at the data locations
Kriging variance can be calculated before getting the sample information, as the kriging estimation variance is not dependent on the values of the data nor the kriging estimate, i.e. the kriging estimator is homoscedastic.
Spatial context - in addition to the statements on spatial continuity, closeness and redundancy above, kriging accounts for the configuration of the data and the structural continuity of the variable being estimated.
Scale - kriging may be generalized to account for the support volume of the data and estimate. We will cover this later.
Multivariate - kriging may be generalized to account for multiple secondary data in the spatial estimate with the cokriging system. We will cover this later.
Smoothing effect of kriging can be forecast. We will use this to build stochastic simulations later.
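A minimal simple kriging sketch with NumPy, assuming an isotropic exponential covariance and hypothetical data locations, values and global mean (a full 2D/3D implementation with search and variogram models is beyond this sketch):

```python
import numpy as np

def exp_cov(h, c0=1.0, a=300.0):
    """Isotropic exponential covariance with sill c0 and practical range a."""
    return c0 * np.exp(-3.0 * h / a)

# hypothetical data locations, values, and the location to estimate
xy = np.array([[100.0, 100.0], [200.0, 150.0], [150.0, 300.0]])
z = np.array([0.12, 0.16, 0.10])
u = np.array([160.0, 180.0])
mean = 0.13                                         # stationary global mean (simple kriging)

# data-to-data covariance matrix (redundancy) and data-to-location vector (closeness)
d_dd = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
d_d0 = np.linalg.norm(xy - u, axis=1)
C = exp_cov(d_dd)
c = exp_cov(d_d0)

lam = np.linalg.solve(C, c)                         # simple kriging weights
est = mean + lam @ (z - mean)                       # simple kriging estimate
var = exp_cov(0.0) - lam @ c                        # simple kriging variance
print(lam, est, var)
```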
Kriging (simple vs. ordinary)#
Simple and Ordinary Kriging: the difference between simple and ordinary kriging is related to the assumption of stationarity in the mean.
Simple Kriging - global stationary mean provided by the user. 1.0 minus the sum of the data weights is applied to the global stationary mean. At data locations all weight is applied to the collocated datum, and beyond the variogram range from all data all weight is applied to the global mean.
Ordinary Kriging - local nonstationary mean calculated by the kriging system. This is accomplished with the addition of a constraint that the data kriging weights must sum to 1.0. At data locations all weight is applied to the collocated datum, and beyond the variogram range from all data, all weight is applied to the locally calculated mean.
Kriging-based Declustering#
Declustering to Correct Sampling Bias: a declustering method to assign weights to spatial samples based on local sampling density, such that the weighted statistics are likely more representative of the population. Data weights are assigned so that,
samples in densely sampled areas receive less weight
samples in sparsely sampled areas receive more weight
Kriging-based declustering proceeds as follows:
calculate and model the experimental variogram
apply kriging to calculate estimates over a high resolution grid covering the area of interest
calculate the sum of the weights assigned to each datum
assign data weights proportional to this sum of weights
The weights are calculated as:
where \(nx\) and \(ny\) are the number of cells in the grid, \(n\) is the number of data, and \(\lambda_{j,ix,iy}\) is the weight assigned to the \(j^{th}\) datum at the \(ix,iy\) grid cell.
Here is an important highlight for kriging-based declustering,
like polygonal declustering, kriging-based declustering is sensitive to the boundaries of the area of interest; therefore, the weights assigned to the data near the boundary of the area of interest may change radically as the area of interest is expanded or contracted
Also, kriging-based declustering integrates the spatial continuity model from variogram model. Consider the following possible impacts of the variogram model on the declustering weights,
if there is a 100% relative nugget effect, there is no spatial continuity and, therefore, all data receive equal weight. Note, for the equation above this results in a divide by 0.0 error that must be checked for in the code.
geometric anisotropy may significantly impact the weights as data aligned over specific azimuths are assessed as closer or further in terms of covariance
Kolmogorov’s 3 Probability Axioms#
Probability Concepts: these are Kolmogorov’s 3 axioms for probabilities,
Probability of an event is a non-negative number.
Probability of the entire sample space is one (unity), also known as probability closure.
Additivity of mutually exclusive events for unions.
e.g., the probability of mutually exclusive events \(A_1\) and \(A_2\) is, \(Prob(A_1 + A_2) = Prob(A_1) + Prob(A_2)\)
Lag Distance (variogram)#
Variogram Calculation: the separation between paired data described by a vector, \(\bf{h}\). The experimental variogram characterizes spatial continuity over a variety of lag distances (magnitudes and orientations) and the variogram model is applied to calculate spatial continuity for all possible lag distances (all possible separation distances and orientations).
Lag Distance Tolerance (variogram)#
Variogram Calculation: the tolerance \(+/-\Delta\) in lag distance applied to pool pairs of data for that specific lag distance.
for example, given a lag distance of 300 m, we could assign a lag tolerance of 50 m and then all pairs separated by 250 m - 350 m will be pooled to calculate the experimental variogram for this lag distance
it is common practice to use half the unit lag distance for lag tolerance. This may be increased to smooth the experimental variogram for improved interpretability
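A sketch of an isotropic experimental semivariogram in Python that pools pairs by unit lag distance and lag tolerance as described above (the data and lag parameters are hypothetical):

```python
import numpy as np

def experimental_variogram(xy, z, unit_lag, n_lags, lag_tol=None):
    """Isotropic experimental semivariogram: half the average squared difference
    of pairs pooled into lag distance bins of half-width lag_tol."""
    if lag_tol is None:
        lag_tol = unit_lag / 2.0                  # common practice: half the unit lag
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    iu = np.triu_indices(len(z), k=1)             # each pair counted once
    d, dz2 = d[iu], (z[:, None] - z[None, :])[iu] ** 2
    lags = unit_lag * np.arange(1, n_lags + 1)
    gamma = [0.5 * dz2[np.abs(d - h) <= lag_tol].mean()
             if np.any(np.abs(d - h) <= lag_tol) else np.nan for h in lags]
    return lags, np.array(gamma)

# hypothetical 1D transect laid out in 2D coordinates
xy = np.column_stack((np.arange(0, 500, 50.0), np.zeros(10)))
z = np.array([0.10, 0.12, 0.11, 0.15, 0.18, 0.17, 0.14, 0.12, 0.13, 0.16])
print(experimental_variogram(xy, z, unit_lag=50.0, n_lags=5))
```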
Location Map#
Univariate Distributions: a data plot where the 2 axes are locations, e.g., X and Y, Easting and Northing, Latitude and Longitude to show the locations of the spatial data. Often the data points are colored to represent the scale of feature to visualize the sampled feature over the area or volume of interest.
advantage, visualize the data without any model that may bias our impression of the data
disadvantage, may be difficult to visualize large datasets and data in 3D
Major Direction (variogram)#
Variogram Calculation: when calculating an experimental variogram, the direction with the largest variogram range
commonly the major and minor directions describe spatial continuity for 2D phenomenon or the horizontal (or stratigraphically aligned) coordinates in 3D and are then augmented with a vertical (orthogonal to major and minor) direction
Marginal Probability#
Probability Concepts: probability that considers only one event occurring, the probability of A, \(P(A)\)
marginal probabilities may be calculated from joint probabilities through the process of marginalization,
$$P(A) = \int_{-\infty}^{\infty} P(A, B = b) \, db$$
where we integrate over all cases of the other event, \(B\), to remove it. Given discrete possible cases of event B we can simply sum the probabilities over all possible cases of B,
$$P(A) = \sum_{i=1}^{n_B} P(A \cap B_i)$$
Minor Direction (variogram)#
Variogram Calculation: when calculating an experimental variogram, the direction with the smallest variogram range, orthogonal to the major direction.
commonly the major and minor directions describe spatial continuity for 2D phenomenon or the horizontal (or stratigraphically aligned) coordinates in 3D and are then augmented with a vertical (orthogonal to major and minor) direction
Monte Carlo Simulation (MCS)#
Distribution Transformations: a random sample from a statistical distribution, random variable, represented by a CDF. The steps for MCS are:
model the feature CDF
draw random value from a uniform [0,1] distribution, this is a random cumulative probability value, known as a p-value, \(p^{\ell}\)
apply the inverse of the CDF to calculate the associated realization, \(x^{\ell} = F_X^{-1}(p^{\ell})\)
repeat to calculate enough realizations for the subsequent analysis
Monte Carlo Simulation Workflow#
Distribution Transformations: a convenient stochastic workflow for propagating uncertainty through a transfer function by sampling with Monte Carlo Simulation (MCS). The workflow includes the following steps,
Model all the input features’ distributions, CDFs
Monte Carlo simulate a realization of each of the inputs,
Apply the input realizations to the transfer function to calculate a realization of the model output
Repeat to calculate enough realizations to model the response feature distribution.
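A minimal MCS workflow sketch in Python; the input distributions, their parameters, and the product transfer function are all hypothetical assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=73)
L = 10_000                                            # number of realizations

# steps 1-2: model the input feature distributions and Monte Carlo simulate realizations
porosity = rng.normal(loc=0.15, scale=0.03, size=L)   # hypothetical porosity CDF (Gaussian)
thickness = rng.uniform(low=10.0, high=30.0, size=L)  # hypothetical thickness CDF (uniform)

# step 3: apply the transfer function to each set of input realizations
pore_volume_per_area = porosity * thickness           # hypothetical transfer function

# step 4: assemble the output distribution and summarize decision criteria
print(np.percentile(pore_volume_per_area, [10, 50, 90]))
```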
Multiplication Rule (probability)#
Probability Concepts: we can calculate the joint probability of A and B as the product of the conditional probability of B given A with the marginal probability of A, \(P(A \cap B) = P(B|A) \cdot P(A)\)
Mutually Exclusive Events (probability)#
Probability Concepts: the events do not intersect or do not have any common outcomes. We represent this as, \(A \cap B = \emptyset\); therefore, \(P(A \cap B) = 0\)
Normalized Histogram#
Univariate Distributions: a representation of the univariate statistical distribution with a plot of normalized frequency (probability) over an exhaustive set of bins over the range of possible values. These are the steps to build a normalized histogram,
Divide the continuous feature range of possible values into \(K\) equal size bins of width, \(\delta x = \frac{x_{max} - x_{min}}{K}\),
or use the available categories for categorical features.
Count the number of samples (frequency) in each bin, \(n_k\), \(\forall k=1,\ldots,K\), and divide each by the total number of data, \(n\), to calculate the probability of each bin, \(p_k = \frac{n_k}{n}\)
Plot the probability vs. the bin label (use the bin centroid if continuous)
Note, normalized histograms are typically plotted as a bar chart.
Nugget Effect (variogram)#
Variogram Calculation: a discontinuity in the variogram at lag distances less than the minimum data spacing. Often communicated as the ratio of the nugget effect to the sill, known as the relative nugget effect (%).
measurement error will cause an apparent nugget effect
Parameters (statistics)#
Geostatistical Concepts: a summary measure of a population
for example, the population mean and population standard deviation; we rarely have access to these
model parameters are different, and we will cover this later.
Parameters (machine learning)#
Geostatistical Concepts: trainable coefficients for a machine learning model that control the fit to the training data
model parameters are calculated by optimization to minimize error at the training locations through an analytical solution or iteration, e.g., gradient descent.
Polygonal Declustering#
Declustering to Correct Sampling Bias: a declustering method to assign weights to spatial samples based on local sampling density, such that the weighted statistics are likely more representative of the population. Data weights are assigned so that,
samples in densely sampled areas receive less weight
samples in sparsely sampled areas receive more weight
Polygonal declustering proceeds as follows:
Split up the area of interest with Voronoi polygons. These are constructed by intersecting perpendicular bisectors between adjacent data points. The polygons group the area of interest by the nearest data point
Assign a weight to each datum proportional to the area of the associated Voronoi polygon,
$$w(\bf{u}_j) = n \cdot \frac{A_j}{\sum_{k=1}^{n} A_k}$$
where \(w(\bf{u}_j)\) is the weight for the \(j^{th}\) datum and \(A_j\) is the area of its Voronoi polygon. Note, the sum of the weights is \(n\); therefore, 1.0 is the nominal weight, the weight each datum would receive if the data were equally spaced over the area of interest.
Here are some highlights for polygonal declustering,
polygonal declustering is sensitive to the boundaries of the area of interest; therefore, the weights assigned to the data near the boundary of the area of interest may change radically as the area of interest is expanded or contracted
polygonal declustering is the same as the Thiessen polygon method for the calculation of precipitation averages developed by Alfred H. Thiessen in 1911, [Thi11]
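A sketch of polygonal declustering in Python that approximates the Voronoi polygon areas by assigning the nodes of a fine grid to the nearest datum (the locations and area of interest are hypothetical; this avoids clipping exact polygons to the boundary):

```python
import numpy as np

def polygonal_declustering_weights(x, y, xmin, xmax, ymin, ymax, ngrid=200):
    """Approximate Voronoi polygon areas by nearest-datum assignment on a fine
    grid over the area of interest; weights are standardized to sum to n."""
    gx, gy = np.meshgrid(np.linspace(xmin, xmax, ngrid), np.linspace(ymin, ymax, ngrid))
    grid = np.column_stack((gx.ravel(), gy.ravel()))
    data = np.column_stack((x, y))
    nearest = np.argmin(np.linalg.norm(grid[:, None, :] - data[None, :, :], axis=2), axis=1)
    counts = np.bincount(nearest, minlength=len(x))    # grid nodes (proxy for area) per datum
    return len(x) * counts / counts.sum()              # weights sum to the number of data

# hypothetical clustered sample locations in a 1000 m x 1000 m area of interest
x = np.array([100.0, 120.0, 140.0, 800.0, 500.0])
y = np.array([100.0, 130.0, 110.0, 850.0, 500.0])
print(polygonal_declustering_weights(x, y, 0.0, 1000.0, 0.0, 1000.0))
```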
Population#
Geostatistical Concepts: the exhaustive, finite list of the property of interest over the area of interest. Generally, the entire population is not accessible
for example, exhaustive set of porosity measures at each location within a reservoir
Prediction, Predictive Statistics#
Geostatistical Concepts: estimate the next samples given assumptions about or a model of the population
for example, given our model of the reservoir, predict the next well (pre-drill assessment) sample, e.g., porosity, permeability, production rate, etc.
Predictor Feature#
Primary Data#
Geostatistical Concepts: the feature of interest, the target for building a model, for example,
porosity measures from cores and logs used to build a full 3D porosity model. Porosity is the primary data.
Probability Density Function (PDF)#
Univariate Distributions: a representation of a statistical distribution with a function, \(f(x)\), of probability density across the range of all possible feature values, \(x\). These are the concepts for PDFs,
non-negativity constraint, the density cannot be negative, \(f_X(x) \ge 0\) for all \(x\)
for continuous features the density may be > 1.0, because density is a measure of likelihood not probability
integrate density over a range of \(x\) to calculate probability, \(P(a < X \le b) = \int_{a}^{b} f_X(x) \, dx\)
probability closure, the sum of the area under the PDF curve is equal to 1.0, \(\int_{-\infty}^{\infty} f_X(x) \, dx = 1.0\)
PDFs are calculated with kernels (usually a small Gaussian distribution) that are summed over all data; therefore, there is an implicit scale (smoothness) parameter when calculating a PDF.
Too large a kernel will smooth out important information about the univariate distribution
Too narrow a kernel will result in an overly noisy PDF that is difficult to interpret.
This is analogous to the choice of bin size for a histogram or normalized histogram.
Probability Non-negativity, Normalization#
Probability Concepts: the fundamental constraints on probability include,
Bounded, \(0.0 \le P(A) \le 1.0\)
Closure, \(P(\Omega) = 1.0\)
Null Sets, \(P(\emptyset) = 0.0\)
Probability Operators#
Probability Concepts: here’s some common probability operators that are essential to working with probability and uncertainty problems,
Union of Events - the union of outcomes, the probability of A or B is calculated with the probability addition rule, \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
Intersection of Events - the intersection of outcomes, the probability of A and B is represented as, \(P(A \cap B)\)
only under the assumption of independence of A and B can it be calculated from the probabilities of A and B as, \(P(A \cap B) = P(A) \cdot P(B)\)
Complementary Events - the NOT operator for probability; if we define A, then A complement, \(A^c\), is not A and we have this resulting closure relationship, \(P(A) + P(A^c) = 1\)
complementary events may be considered beyond univariate problems, for example, consider this bivariate closure, \(P(A|B) + P(A^c|B) = 1\)
Note, the given term must be the same.
Mutually Exclusive Events - the events do not intersect or do not have any common outcomes. We represent this as, \(A \cap B = \emptyset\); therefore, \(P(A \cap B) = 0\)
Probability Perspectives#
Probability Concepts: the 3 primary perspectives for calculating probability:
Long-term frequencies - probability as a ratio of outcomes, requires repeated observations of an experiment. The basis for frequentist probability.
Physical tendencies or propensities - probability from knowledge about or modeling the system, e.g., we could know the probability of coin toss without the experiment.
Degrees of belief - reflect our certainty about a result, very flexible, assign probability to anything, updating with new information. The basis for Bayesian probability.
Production Data#
Geostatistical Concepts: includes bottom hole pressure, fluid production rates, types, temperatures, etc., commingled over multiple wells, for a single well, and even at times along the well bore of a single well. Some additional comments,
production from a single well may be commingled over multiple producing intervals, unless production logging tool (PLT) results are available
important ground truth to be matched with reservoir model forecasts through the geomodel inversion process called historical production matching
Qualitative Features#
Geostatistical Concepts: information about quantities that cannot be directly measured; they require interpretation of measurements and are described with words (not numbers), for example,
rock type = sandstone, zonation = bornite-chalcopyrite-gold higher grade copper zone
Quantitative Features#
Geostatistical Concepts: features that can be measured and represented by numbers, for example,
age = 10 Ma (millions of years), porosity = 0.134 (fraction of volume is void space), saturation = 80.5% (volume percentage)
like qualitative features there is often the requirement for interpretation, for example, total porosity may be measured but should be converted to effective porosity through interpretation and a model
Random Function#
Variogram Calculation: set of random variables correlated over space or time
we represent a random variable with an upper-case variable, e.g., \(X\),
and we represent a random function with an upper-case variable with location vectors, e.g., \(X(\bf{u}_1), X(\bf{u}_2), \ldots, X(\bf{u}_n)\)
and then the possible joint outcomes, called realizations, or the data samples are represented with lower case, e.g., \(x(\bf{u}_1), x(\bf{u}_2), \ldots, x(\bf{u}_n)\)
we often represent realizations with the \(\ell\) notation, e.g., \(x^{\ell}(\bf{u}_1), x^{\ell}(\bf{u}_2), \ldots, x^{\ell}(\bf{u}_n)\) for \(\ell = 1,\ldots,L\) realizations.
Random Variable#
Univariate Distributions: we do not know the value of a feature at a location or time; it can take on a range of possible values, fully described with a statistical distribution, PDF or CDF.
represented as an upper-case variable, e.g., \(X\), while possible outcomes or data measures are represented with lower case, e.g., \(x\).
for spatial phenomenon we add a location vector, \(\bf{u}\), for a random variable, e.g., \(X(\bf{u})\), while possible spatial outcomes or data measures are represented with lower case, e.g., \(x(\bf{u})\).
Range (variogram)#
Variogram Calculation: lag distance where the experimental variogram reaches the sill. Here’s some more details about the range,
for lag distances less than the range there is spatial correlation, information over distance
for lag distances at and beyond the range there is no spatial correlation, no information over distance
range is also a parameter applied to fit positive definite, permissible variogram models
when the range varies by direction this is called geometric anisotropy
Realizations (uncertainty)#
Geostatistical Concepts: hold input parameters constant and change random number seed to calculate spatial uncertainty
for example, hold the porosity mean constant and observe changes in porosity away from the wells over multiple realizations
Univariate Distributions: in our discussion about bootstrap and Monte Carlo Simulation we added these descriptions to the definition of a realization,
an outcome from a random variable, \(X\), (or joint set of outcomes from a random function)
represented with lower case, e.g., \(x\)
in a spatial context it is common to use a location vector, \(\bf{u}\), to describe a location, e.g., \(x(\bf{u})\) as an outcome of \(X(\bf{u})\)
resulting from simulation, e.g., Monte Carlo simulation or sequential Gaussian simulation, a method to sample (jointly) from the RV (RF); each realization is considered equiprobable
Reservoir Modeling Workflow#
Geostatistical Concepts: the following is the standard, common geostatistical reservoir modeling workflow:
Integrate all available information to build multiple scenarios and realizations to sample the uncertainty space
Apply all the models to the transfer function to map to a decision criteria
Assemble the distribution of the decision criteria
Make the optimum decision accounting for this uncertainty model
Response Feature#
Sample#
Geostatistical Concepts: the set of values and locations that have been measured
for example, 1,000 porosity measures from well-logs over the wells in the reservoir
or 1,000,000 acoustic impedance measurements over a 1000 x 1000 2D grid for a reservoir unit of interest
Scenarios (uncertainty)#
Geostatistical Concepts: calculated by changing the input parameters or other modeling choices to represent the uncertainty in the model parameters and choices
for example, porosity mean low, mid and high over the models
Secondary Feature#
Geostatistical Concepts: additional feature(s) that provides information about the primary data through a relationship / calibration, for example,
acoustic impedance (secondary feature) to support calculating a model of porosity (primary feature)
porosity (secondary feature) to support calculating a model of permeability (primary feature)
Seismic Data#
Geostatistical Concepts: indirect measurement with remote sensing, reflection seismic applies acoustic source(s) and receivers (geophones) to map acoustic reflections with high coverage and generally low resolution. Some more details,
seismic reflections (amplitude) data inverted to rock properties, e.g., acoustic impedance, consistent with and positionally anchored with well sonic logs
provides a framework, bounding surfaces for the extents and shapes of reservoirs, along with soft information on reservoir properties, e.g., porosity and facies.
Sill (variogram)#
Variogram Calculation: the feature variance. We interpret spatial correlation relative to the variogram sill,
at the distance that the experimental variogram reaches the sill there is no spatial correlation. This is called the range.
Experimental variogram points above the sill indicate negative correlation, although we do not model above the sill for kriging nor simulation. We model to the sill and assume no correlation beyond the range.
Experimental variogram points that rise approximately linearly above the sill indicate a spatial trend in the data.
Simulation#
Geostatistical Concepts: the process of obtaining one or more good values of a reservoir property at an unsampled location.
global accuracy, matches the global statistics
simulation methods tend to produce more realistic feature spatial and univariate distributions.
for example, Monte Carlo simulation, sequential Gaussian simulation, indicator simulation, object-based simulation, etc.
Use simulation when,
we need to reproduce the distributions of features of interest, the extreme values matter
we need realistic models for flow simulation
Soft Data#
Geostatistical Concepts: data that has a high degree of uncertainty that must be accounted for when it is integrated into the model
for example, probability density function for local porosity calibrated from acoustic impedance
Spatial Estimation#
Trend Modeling: the process of obtaining the single best value to represent a feature at an unsampled location, or time. Some additional concepts,
local accuracy takes precedence over global spatial variability.
too ‘smooth’, not appropriate for forecasting
for example, inverse distance and kriging
many predictive machine learning models focus on estimation (e.g., k-nearest neighbours, decision tree, random forest, etc.)
Given spatial data, \(z(\bf{u}_1), \ldots, z(\bf{u}_n)\), we can estimate at an unknown location \(\bf{u}\) with a linear combination of the data as,
$$z^{*}(\bf{u}) = \sum_{\alpha=1}^{n} \lambda_{\alpha} \cdot z(\bf{u}_{\alpha})$$
we add an unbiasedness constraint by assigning the remainder of the weight (one minus the sum of the weights) to the global average; therefore, if we have no informative data we will estimate with the global average of the property of interest, while satisfying the unbiasedness constraint.
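A small Python sketch of this linear estimator with the remainder of the weight assigned to the global mean (the data values, weights, and global mean are hypothetical; in kriging the weights would come from solving the kriging system):

```python
import numpy as np

def linear_estimate(z_data, weights, global_mean):
    """Generic linear spatial estimator: weighted data plus the remainder
    of the weight (1 - sum of weights) applied to the global mean."""
    weights = np.asarray(weights)
    return np.sum(weights * z_data) + (1.0 - np.sum(weights)) * global_mean

# hypothetical data values, weights (e.g., from simple kriging), and global mean
z_data = np.array([0.12, 0.16, 0.10])
weights = np.array([0.35, 0.25, 0.10])     # sum to 0.70, so 0.30 goes to the global mean
print(linear_estimate(z_data, weights, global_mean=0.13))
```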
Spatial Continuity#
Variogram Calculation: correlation between spatial samples of a feature over distance.
no spatial continuity indicates no correlation between spatial samples over distance, effectively random values at each location in space regardless of separation distance.
homogeneous phenomena have perfect spatial continuity; since all values are the same (or very similar) they are correlated over all distances.
Spatial Sampling (biased)#
Declustering to Correct Sampling Bias: the sample statistics are not representative of the population parameters. For example,
the sample mean is not the same as the population mean
the sample variance is not the same as the population variance
Of course, the population parameters are not accessible, so we cannot directly check for this bias.
Spatial Sampling (clustered)#
Declustering to Correct Sampling Bias: spatial samples with locations preferentially selected, i.e., clustered, resulting in biased statistics,
typically spatial samples are clustered in locations with higher value samples, e.g., high porosity and permeability, good quality shale for unconventional reservoirs, low acoustic impedance indicating higher porosity, etc.
Spatial Sampling (common practice)#
Declustering to Correct Sampling Bias: data is collected to:
reduce uncertainty by answering questions, for example,
how far does the contaminant plume extend? – sample peripheries
where is the fault? – drill based on seismic interpretation
what is the highest mineral grade? – sample the best part
how far does the reservoir extend? – offset drilling
maximize NPV directly
maximize production rates
maximize tonnage of mineral extracted
Spatial Sampling (representative)#
Declustering to Correct Sampling Bias: if we are sampling for representativity of the sample set and resulting sample statistics, by sampling theory we have 2 options:
random sampling - each potential sample from the population is equally likely to be sampled at each step
Selecting a specific location has no impact on the selection of subsequent locations.
assumes the population size is much larger than the sample size, so there is no significant correlation imposed by sampling without replacement (the constraint that you can only sample a location once). Note, this is generally not an issue for the subsurface with massive populations sparsely sampled
regular sampling - sampling at equal space or time intervals
as long as we don’t align with natural periodicity)
Stationarity#
Variogram Calculation: any statistic requires replicates, repeated sampling (e.g., air or water samples from a monitoring station).
in our geospatial problems repeated samples are not available at a location in the subsurface
instead of time, we must pool samples over space to calculate our statistics.
This decision to pool samples is the decision of stationarity. It is the decision that the subset of the subsurface is all the “same stuff”.
Why must we pool data? Ultimately it is all about making inferences about the population from a limited sample
to calculate statistics
to build models
The decision of the stationary domain for sampling is an expert choice.
without it we are stuck in the “hole” and cannot calculate any statistics nor say anything about the behavior of the subsurface between the sample data.
An example geological definition of stationarity could be like this,
The rock over the stationary domain is sourced, deposited, preserved, and post-depositionally altered in a similar manner, the domain is map-able and may be used for local prediction or as information for analogous locations within the subsurface; therefore, it is useful to pool information over this expert mapped volume of the subsurface.
this expert interpretation is applied to map a statistical definition of stationarity that is applied in modeling.
There are 2 aspects of any decision of stationarity,
Import License - choice to pool specific samples to evaluate a statistic
Export License - choice of where in the subsurface this statistic is applicable
To state a decision of stationarity we must include,
the statistic that is stationary, e.g., the mean, the variance, the CDF, etc.
the area or volume over which the statistic is stationary, e.g., the entire model, the sand facies, the proximal region, etc.
An example statistical definition of stationarity could be like this,
stationary mean, \(E\{Z(\bf{u})\} = m\), for all locations \(\bf{u}\) in the domain
stationary CDF, \(F_{Z(\bf{u})}(z) = F_Z(z)\), for all locations \(\bf{u}\) in the domain
stationary semivariogram, \(\gamma(\bf{u},\bf{u}+\bf{h}) = \gamma(\bf{h})\), for all locations \(\bf{u}\) and \(\bf{u}+\bf{h}\) in the domain
This may be extended to any statistic of interest including, facies proportions, bivariate distributions and multiple point statistics.
Here’s some additional considerations for stationarity,
Stationarity is a decision, not a hypothesis; therefore, it cannot be tested, although data may demonstrate that it is inappropriate.
The stationarity assessment depends on scale. This choice of modeling scale(s) should be based on the specific problem and project needs.
We cannot avoid a decision of stationarity. No stationarity decision and we cannot move beyond the data. Conversely, assuming broad stationarity over all the data and over large volumes of the earth is naïve.
Geomodeling stationarity is the decision: (1) over what region to pool data (import license) and (2) over what region to use the resulting statistics (export license).
Nonstationary trends may be mapped, and the remaining stationary residual modelled statistically / stochastically; the trends themselves may be treated as uncertain. This is the hybrid modeling approach.
Stochastic Model#
Geostatistical Concepts: system or process that is uncertain, multiple models constrained by statistics
for example, data-driven models that integrate uncertainty like geostatistical simulation models
Advantages:
speed
uncertainty assessment
report significance, confidence / prediction intervals
honor many types of data
data-driven approaches
Disadvantages:
limited physics used
statistical model assumptions / simplification
Statistics (practice)#
Geostatistical Concepts: the theory and practice for collecting, organizing, and interpreting data, as well as drawing conclusions and making decisions.
Statistics (measurement)#
Geostatistical Concepts: summary measure of a sample, for example,
sample mean - \(\overline{x}\)
sample standard deviation - \(s\),
we use statistics as estimates of the parameters, summary measures of the population
Statistical Distribution#
Univariate Distributions: for a variable / feature a description of the probability of occurrence over the range of possible values. What do we learn from a statistical distribution? For example,
what is the minimum and maximum?
do we have a lot of low values?
do we have a lot of high values?
do we have outliers (values that don’t make sense and need explaining)?
Transfer Function (reservoir modeling workflow)#
Geostatistical Concepts: a calculation applied to the spatial, subsurface model to compute a decision criterion, a metric used to support decision making representing value, health, safety, and environment. For example:
transport and bioattenuation of a soil contaminant to forecast contaminant concentration over time
volumetric calculation for oil-in-place to calculate the resource in place (a minimal sketch follows this list)
heterogeneity metric to estimate the recovery factor for reserves
flow simulation for pre-drill production forecast for a planned well
mine plan and rock homogenization to calculate mineral resources and the ultimate pit shell
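For instance, a minimal sketch of the volumetric example above, using the standard oil-in-place relation in field units (7758 barrels per acre-foot); the area, thickness, saturation, and formation volume factor inputs are hypothetical:

```python
import numpy as np

def ooip_stb(area_acres, thickness_ft, porosity, sw, bo):
    """Volumetric original oil-in-place (STB), field units: 7758 bbl per acre-ft."""
    return 7758.0 * area_acres * thickness_ft * porosity * (1.0 - sw) / bo

# apply the transfer function over multiple subsurface model realizations (hypothetical inputs)
porosity_realizations = np.array([0.11, 0.13, 0.15])   # e.g., average porosity from 3 realizations
ooip = ooip_stb(area_acres=640.0, thickness_ft=50.0,
                porosity=porosity_realizations, sw=0.25, bo=1.2)
print('OOIP realizations (MMSTB):', np.round(ooip / 1e6, 1))

# the distribution of the decision criterion supports decision making
print('P10, P50, P90 (MMSTB):', np.round(np.percentile(ooip / 1e6, [10, 50, 90]), 1))
```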
Trend#
Trend Modeling: the feature is nonstationary in space and changes systematically over the area of interest. For example,
the porosity decreases with depth
the copper grade increases toward the highly faulted zone
Trend is also used to describe a model of the nonstationarity in any statistic or metric of interest, as in a trend model.
Trend (variogram model)#
Variogram Calculation and Modeling: experimental variogram points rise approximately linearly above the sill.
Indicates a trend, e.g., fining upward, compacting with depth, etc.
could be interpreted as a fractal, i.e., a model without a finite variance or sill, fit with a power law function. Note that variogram models that rise above the sill are not permissible for simulation methods like sequential Gaussian simulation
a common workflow is to remove the trend and work with the residual; if the trend is removed, the residual variogram will plateau at the sill
Trend Modeling#
Trend Modeling: modeling of local features, based on data and interpretation, that are deemed certain (known). The trend is subtracted from the data, leaving a residual that is modeled stochastically with uncertainty (treated as unknown). The following steps are applied (a minimal sketch follows the steps):
model the nonstationary, spatial, deterministic trend for a feature of interest
subtract the trend from the data to calculate the residual
model the residual with geostatistical spatial estimation or simulation
add the deterministic trend to the geostatistical (deterministic if kriging or stochastic if simulation) residual
check the model
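A minimal sketch of these steps in 1D, assuming a hypothetical porosity-versus-depth data set and a simple linear trend fit; in practice the trend model would honor interpretation and the residual would be modeled with kriging or simulation:

```python
import numpy as np

# hypothetical data: porosity decreasing with depth plus noise
depth = np.linspace(2000.0, 2100.0, 21)                              # m
rng = np.random.default_rng(73073)
porosity = 0.25 - 0.0005 * (depth - 2000.0) + rng.normal(0.0, 0.01, depth.size)

# 1. model the nonstationary, deterministic trend (here a simple linear fit vs. depth)
coef = np.polyfit(depth, porosity, deg=1)
trend = np.polyval(coef, depth)

# 2. subtract the trend from the data to calculate the residual
residual = porosity - trend

# 3. model the residual geostatistically (placeholder: here we just reuse the residual;
#    in practice this would be kriging or simulation of the residual)
residual_model = residual

# 4. add the deterministic trend back to the modeled residual
porosity_model = trend + residual_model

# 5. check the model (should reproduce the data at the data locations in this trivial case)
print('max absolute difference:', np.abs(porosity_model - porosity).max())
```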
Uncertainty Modeling#
Geostatistical Concepts: calculate the range of possible values for a feature at a location, or jointly over many locations at the same time
quantification of the limitations in the precision of our samples or model predictions
uncertainty is a model, there is no objective uncertainty
uncertainty is caused by our ignorance
sparse sampling, measurement error and bias, and heterogeneity increase uncertainty
we represent uncertainty by multiple models, scenarios and realizations (see the sketch below):
Scenarios - calculated by changing the input parameters or other modeling choices to represent model parameter and choice uncertainty, e.g., low, mid, and high porosity means over the models
Realizations - hold the input parameters constant and change the random number seed to calculate spatial uncertainty, e.g., hold the porosity mean constant and observe changes in porosity away from the wells over multiple realizations
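A minimal sketch of the scenarios vs. realizations distinction, using a purely illustrative, hypothetical `simulate_porosity` stand-in (it just draws values from a normal distribution) instead of a real spatial simulation:

```python
import numpy as np

def simulate_porosity(mean, seed, n=100):
    """Hypothetical stand-in for a spatial simulation: draw n porosity values."""
    rng = np.random.default_rng(seed)
    return np.clip(rng.normal(mean, 0.03, n), 0.0, 0.4)

# scenarios - change an input parameter (low, mid, high porosity mean)
for scenario_mean in [0.10, 0.13, 0.16]:
    real = simulate_porosity(mean=scenario_mean, seed=73073)
    print(f'scenario mean {scenario_mean:.2f} -> realization average {real.mean():.3f}')

# realizations - hold the parameters constant and change only the random number seed
for seed in [73073, 73074, 73075]:
    real = simulate_porosity(mean=0.13, seed=seed)
    print(f'seed {seed} -> realization average {real.mean():.3f}')
```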
Union of Events (probability)#
Probability Concepts: the union of outcomes; the probability of A or B is calculated with the probability addition rule, \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\).
Unit Lag Distance (variogram)#
Variogram Calculation: the increments for lag distance applied to calculate an experimental variogram
for example if the unit lag distance is 50 m and 5 lags are calculated, the experimental variogram will have points at lag distance = 50 m, 100 m, 150 m, 200 m, and 250 m
typically the unit lag distance is set as the nominal minimum data spacing in the specific direction. We don't use the absolute minimum data spacing, as there would be only one pair available and the experimental variogram point would be unreliable.
nominal indicates that there are enough pairs to calculate a reliable experimental variogram point
Univariate Parameters#
Univariate Distributions: summary measures based on one feature measured over the population
Univariate Statistics#
Univariate Distributions: summary measures based on one feature measured over the samples
Variable (also feature)#
Geostatistical Concepts: any property measured or observed in a study, for example,
porosity, permeability, mineral concentrations, saturations, contaminant concentration
in data mining / machine learning this is known as a feature
measurement often requires significant analysis, interpretation, etc.
Variogram#
Variogram Calculation: function of difference over distance.
Experimental variogram calculated at specific distances and plotted as points, then permissible variogram models are fit to the experimental variogram while integrating other domain and local knowledge.
The variogram is calculated as one half the average squared difference over the lag distance, \(\bf{h}\), over all possible pairs of data, \(\gamma_z(\bf{h}) = \frac{1}{2N(\bf{h})} \sum_{\alpha=1}^{N(\bf{h})} \left[ z(\bf{u}_\alpha) - z(\bf{u}_\alpha + \bf{h}) \right]^2\), where \(N(\bf{h})\) is the number of pairs (a minimal calculation sketch is given at the end of this entry).
The precise term is semivariogram (or variogram if you remove the \(\frac{1}{2}\) in the equation above), but in practice, the term variogram is always used for the semivariogram.
The \(\frac{1}{2}\) term is added so that the covariance function, \(C_z(\bf{h})\), and variogram, \(\gamma_z(\bf{h})\), may be related directly, \(\gamma_z(\bf{h}) = C_z(\bf{0}) - C_z(\bf{h}) = \sigma_z^2 - C_z(\bf{h})\), given stationarity.
Note the correlogram, \(\rho_z(\bf{h})\), is related to the covariance function, \(C_z(\bf{h})\), as \(\rho_z(\bf{h}) = \frac{C_z(\bf{h})}{\sigma_z^2}\).
Here are some general observations about the variogram,
As the lag distance, \(\bf{h}\), increases, the variability over the lag distance increases (in general).
The variogram is calculated over all possible pairs of data separated by the lag vector, \(\bf{h}\).
We need to plot the sill with the experimental variogram to know the degree of correlation.
The sill is the variance, \(\sigma_z^2\), given stationarity of the variance and variogram, \(\gamma_z(\bf{h})\), so \(\gamma_z(\bf{h} \rightarrow \infty) = \sigma_z^2\),
and given a standardized feature, \(\sigma_z^2 = 1.0\), the sill is 1.0.
The distance from the experimental variogram up to the sill is the correlation coefficient at that specific lag distance (for a standardized feature).
The lag distance at which the variogram reaches the sill is known as the range.
at the range, knowing the data value at the tail provides no information about a value at the head.
Sometimes there is a discontinuity in the variogram at distances less than the minimum data spacing. This is known as the nugget effect.
the ratio of the nugget to the sill is known as the relative nugget effect, reported as a percentage, e.g., 10% relative nugget effect
we model the nugget effect as a no-correlation structure over all lags, \(\bf{h} \gt \epsilon\), where \(\epsilon\) is an infinitesimal distance
measurement error causes an apparent nugget effect
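A minimal sketch of the experimental semivariogram calculation for regularly spaced 1D data (hypothetical data; GeostatsPy provides the general 2D/3D calculation used throughout this e-book):

```python
import numpy as np

def semivariogram_1d(z, spacing, nlag):
    """Experimental semivariogram for regularly spaced 1D data."""
    gamma = np.zeros(nlag)
    for i in range(1, nlag + 1):
        diff = z[i:] - z[:-i]                    # all pairs separated by i data spacings
        gamma[i - 1] = 0.5 * np.mean(diff ** 2)  # one half the average squared difference
    return spacing * np.arange(1, nlag + 1), gamma

# hypothetical, spatially correlated 1D data (a smoothed random series) for illustration
rng = np.random.default_rng(73073)
z = np.convolve(rng.normal(size=200), np.ones(10) / 10, mode='valid')
z = (z - z.mean()) / z.std()                     # standardize so the sill is 1.0

lags, gamma = semivariogram_1d(z, spacing=1.0, nlag=10)
print(np.round(gamma, 2))                        # rises toward the sill of 1.0 with lag distance
```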
Variogram Map#
Variogram Calculation: calculating the variogram over all distances and directions at once.
use a mesh template; the cell size controls the resolution like the lag distance (assuming the lag tolerance is \(\frac{1}{2}\) the lag distance), and the number of cells controls the extent of the calculation
can be useful to visualize and determine the major and minor directions of spatial continuity
typically variogram maps require more data than regular isotropic and anisotropic experimental variograms and are not practical with sparse spatial data, i.e., with too few data they tend to be very noisy
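A minimal, brute-force sketch of a semivariogram map for scattered 2D data (hypothetical data; pair differences are binned by the lag vector into a grid of cells, so it is only practical for modest data counts):

```python
import numpy as np

def variogram_map(x, y, z, cell_size, n_cells):
    """Semivariogram map: bin 0.5*(z_i - z_j)^2 by the lag vector (dx, dy)."""
    half = n_cells // 2
    gsum = np.zeros((n_cells, n_cells))
    npair = np.zeros((n_cells, n_cells))
    for i in range(len(x)):
        for j in range(len(x)):
            if i == j:
                continue
            ix = int(np.floor((x[j] - x[i]) / cell_size)) + half
            iy = int(np.floor((y[j] - y[i]) / cell_size)) + half
            if 0 <= ix < n_cells and 0 <= iy < n_cells:
                gsum[iy, ix] += 0.5 * (z[i] - z[j]) ** 2
                npair[iy, ix] += 1
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(npair > 0, gsum / npair, np.nan)

# hypothetical scattered data for illustration: z varies with x only, so more continuity along y
rng = np.random.default_rng(73073)
x, y = rng.uniform(0, 1000, 200), rng.uniform(0, 1000, 200)
z = np.sin(x / 300.0) + rng.normal(0.0, 0.1, 200)
vmap = variogram_map(x, y, z, cell_size=100.0, n_cells=11)
print(np.round(vmap, 2))                         # low values near the center and along the y-lag axis
```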
Variogram Model#
Variogram Calculation and Modeling: the variogram model is fit based on the experimental variograms and expert interpretation. Here are the reasons for variogram modeling:
Interpolate all distances and directions - the variogram must be known for all possible \(\bf{h}\) lags, distances and directions, not just the limited lags calculated for the experimental variogram(s)
Integrate geological knowledge - the variogram modeling process is an opportunity to integrate our geological knowledge. For example, geometric anisotropy ratios from knowledge of the depositional setting.
Valid model of difference vs. offset - the variogram model must be positive definite (a legitimate measure of distance), that is, the variance of any linear combination must be positive. The variogram modeling workflow with additive, nested structures ensures this.
The steps for variogram modeling with nested structures to describe variance (sill) components are,
Nugget effect - assign a single, isotropic nugget effect, the lowest nugget effect observed over the directions
Number of structures - choose the same number of variogram structures for all directions based on the most complex direction
Contributions for each structure - ensure that the same contribution parameter is used for each variogram structure in all directions and that the sum of all structures’ contributions is equal to the sill (variance). Model to the sill and not above the sill.
Apparent Nugget Effect - for apparent nugget effect structures (nugget effect in only some directions) use a 0.0 range in the apparent nugget effect directions.
Geometric anisotropy - vary the range parameters in each direction (anisotropy ratios)
Zonal anisotropy - if the variogram does not reach the sill in some directions, then assign a very large value for the range in those directions to represent zonal anisotropy
Here are some additional comments to assist with variogram modeling.
initial coordinate and data transformation may be required; if the lags cross folded beds, then the range will likely be underestimated
focus on fundamental variogram interpretations including trend, cyclicity, geometric anisotropy, and zonal anisotropy. If a trend is present, calculate a spatial trend model for the feature and then work with the residual, which does not have the trend
often the short scale structure is most important, as most estimates and simulated realizations interpolate between data; the data themselves often control the long scale structures
nugget effect due to measurement error should not be modeled, as it is not a feature of the geology. Nugget effect is very rare in oil and gas.
the vertical direction is typically better informed due to the number and spacing of samples along near-vertical wells; model the vertical direction first and then fit an anisotropy ratio to the limited horizontal experimental variogram points
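A minimal sketch of a nested variogram model built from a nugget effect plus spherical structures, with shorter ranges in the minor direction for geometric anisotropy (GeostatsPy has its own variogram model utilities used elsewhere in this e-book; this is only an illustration of the nested-structure idea):

```python
import numpy as np

def spherical(h, a):
    """Spherical structure: rises to 1.0 at range a, then stays at 1.0."""
    hr = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return 1.5 * hr - 0.5 * hr ** 3

def variogram_model(h, nugget, structures):
    """Nested variogram model: nugget plus a sum of (contribution, range) spherical structures."""
    h = np.asarray(h, dtype=float)
    gamma = np.where(h > 0.0, nugget, 0.0)        # the variogram is exactly 0.0 at h = 0
    for cc, a in structures:
        gamma = gamma + np.where(h > 0.0, cc * spherical(h, a), 0.0)
    return gamma

h = np.linspace(0.0, 1000.0, 11)

# major direction: nugget 0.1 + two structures with contributions 0.5 and 0.4 (sill = 1.0)
gamma_major = variogram_model(h, nugget=0.1, structures=[(0.5, 300.0), (0.4, 800.0)])

# minor direction: same nugget and contributions, shorter ranges (geometric anisotropy)
gamma_minor = variogram_model(h, nugget=0.1, structures=[(0.5, 150.0), (0.4, 400.0)])

print(np.round(gamma_major, 2))
print(np.round(gamma_minor, 2))                   # both plateau at the sill of 1.0
```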
Variogram Modeling Principles#
Variogram Calculation and Modeling: the variogram is very important in spatial data analytics and geostatistical studies as the measure of geological (or other spatial feature) distance vs. spatial distance
Venn Diagrams#
Probability Concepts: a plot, a visual tool for communicating probability. What do we learn from a Venn diagram?
size of regions \(\propto\) probability of occurrence
proportion of \(\Omega\), all possible outcomes represented by a box, i.e., probability of \(1.0\)
overlap \(\propto\) probability of joint occurrence
Venn diagrams are an excellent tool to visualize marginal, joint and conditional probability.
Well Log Data#
Geostatistical Concepts: as a much cheaper method to sample wells that does not interrupt drilling operations, well logs are very common; often all wells have various well logs available. For example,
gamma ray on pilot vertical wells to assess the locations and quality of shales for targeting (landing) horizontal wells
neutron porosity to assess the location of high porosity reservoir sands
gamma ray in drill holes to map thorium mineralization
well log data are critical to support subsurface resource interpretations. Once anchored by core data they provide the essential coverage and resolution to model the entire reservoir concept / framework for prediction, for example,
well log data calibrated by collocated core data are used to map the critical stratigraphic layers, including reservoir and seal units
well logs are applied to depth-correct features inverted from seismic data that have location imprecision due to uncertainty in the rock velocity over the volume of interest
Well Log Data, Image Logs#
Geostatistical Concepts: a special case of well logs where the measurements are repeated at various azimuthal intervals within the well bore, resulting in a 2D (unwrapped) image instead of a 1D line along the well bore. For example, the Fullbore Formation MicroImager (FMI) with:
80% borehole coverage
0.2 inch (0.5 cm) resolution vertical and horizontal
30 inch (76 cm) depth of investigation
observe lithology change, bed dips and sedimentary structures
Zonal Anisotropy (variogram model)#
Variogram Calculation and Modeling: when the experimental variogram does not reach the sill in a direction, i.e., the experimental variogram levels off below the sill
zonal anisotropy is often paired with cyclicity or trend in the other (orthogonal) direction, e.g., zonal anisotropy in the major direction and trend in the minor direction
the variance at which the variogram levels off is called an apparent sill
Want to Work Together?#
I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.
Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I’d be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!
I can be reached at mpyrcz@austin.utexas.edu.
I’m always happy to discuss,
Michael
Michael Pyrcz, Ph.D., P.Eng. Professor, Cockrell School of Engineering and The Jackson School of Geosciences, The University of Texas at Austin
Comments#
This was a basic introduction to geostatistics. If you would like more on these fundamental concepts I recommend the Introduction, Modeling Principles and Modeling Prerequisites chapters from my textbook, Geostatistical Reservoir Modeling {cite}`pyrcz2014`.
I hope this is helpful,
Michael