
Decision Making with Uncertainty#
Michael J. Pyrcz, Professor, The University of Texas at Austin
Twitter | GitHub | Website | GoogleScholar | Book | YouTube | Applied Geostats in Python e-book | LinkedIn
Chapter of e-book “Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy”.
Cite this e-Book as:
Pyrcz, M.J., 2024, Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy [e-book]. Zenodo. doi:10.5281/zenodo.15169133
The workflows in this book and more are available here:
Cite the GeostatsPyDemos GitHub Repository as:
Pyrcz, M.J., 2024, GeostatsPyDemos: GeostatsPy Python Package for Spatial Data Analytics and Geostatistics Demonstration Workflows Repository (0.0.1) [Software]. Zenodo. doi:10.5281/zenodo.12667036. GitHub Repository: GeostatsGuy/GeostatsPyDemos
By Michael J. Pyrcz
© Copyright 2024.
This chapter is a summary of Decision Making with Uncertainty including essential concepts:
loss functions
expected loss vs. estimate
YouTube Lecture: check out my lectures on:
For convenience here’s a summary of the salient points.
Motivation for Decision Making with Uncertainty#
We have built uncertainty models, now we need to make decisions in the presence of uncertainty.

Building Uncertainty Models#
How Did We Calculate Uncertainty Models? Some examples:
Bayesian Updating - prior plus likelihood to calculate the posterior, for example, probability that a coin is fair, uncertainty in reservoir OIP
Bootstrap + Monte Carlo Simulation and Transfer Function - uncertainty the mean porosity by bootstrap, then Monte Carlo simulation for uncertainty in oil-in-place
Kriging - kriging estimate and variance with Gaussian assumption for uncertainty in mineral grade at a location
Simulation and Transfer Function - reservoir features realizations applied to flow simulation to calculate pre-drill production uncertainty at a new well
One way or another, we have calculated a reasonable uncertainty model, for example,
connected volume of water or hydrocarbon to a location
mineral or hydrocarbon resource in place
recover factor mineral or hydrocarbon extraction

and now we must make a decision with that uncertainty model, for examples,
number and locations of wells
time and volume injected for water of chemical flood, pump and treat remediation
dig limits and mine plan
Types of Decisions#
For example well site selection methods include,
Integer programing problem, output must be an integer, not a fraction
Optimization combined with simulation one potential well at a time
Experimental design and response surface methodology
Optimize the profitability of reservoir production,
High dimensional - complicated by large number of parameters and timing
Interactions - locations of injector and producers,
Multivariate - injection, well completions
We could make \(L\) optimum choices, a set of choices that are optimum for each model. Or we could make one globally optimum choice.

Expected Profit Workflow#
These are the steps in the expected profit workflow for optimum decision making in the presence of uncertainty,
Calculate the uncertainty model, \(\ell = 1,\ldots,L\) subsurface realizations
Establish \(s = 1,\dots,S\) development scenarios
Establish \(P\) profit metric
Calculate profit, \(P\), over all \(L\) and \(S\)
Calculate the expected profit for each development scenario, \(E\{P(S)\}\)
Select the \(𝑠\) development scenario that maximizes expected profit.
Now we walk through each of these steps and provide more details.
The Uncertainty Model#
Calculate the uncertainty model, \(\ell = 1,\ldots,L\), subsurface models. This includes,
Scenarios - change the model inputs, and assumptions to capture uncertainty in the modeling decisions
Realizations - change the random number seed to capture spatial uncertainty
Multivariate - include all needed features, and their relations
The uncertainty model is represented by multiple models of the subsurface.

The Development Scenarios#
Discrete alternatives for developing the spatial, subsurface resource
Includes all details needed to simulate extraction, e.g., well locations, completion types, dig limits, mining block sequence
For example, 3 potential well locations with drainage radius,

The Profit Metric#
The profit metric,
is a transfer function to map from spatial features to value, \(P(\ell,s)\)
include inputs subsurface realization, \(\ell\), and development scenario, \(s\)
In the units of the thing are we trying to maximize? For example, net present value, tonnage contaminant removed from soil, or volume of resource recovered

We calculated profit over all \(L\) subsurface realizations, and \(S\) development scenarios. If possible we work with the,
full combinatorial of \(L\) subsurface realizations and \(S\) development scenarios, \(P(\ell = 1,\ldots,L,s=1,\ldots,S)\)

Expected Profit#
Calculation expected profit \(E\{P(S)\}\)
Expected profit is a probability, 𝜆_𝑙, weighted average of profit over subsurface uncertainty
if all models are equiprobable,
Maximize Expected Profit#
Now we maximize the expected profit \(E\{P(S)\}\),
select the development scenario that maximizes expected profit over the subsurface realizations, \(L\)

This is the fundamental workflow for making an optimum decision in the presence of uncertainty.
Below we demonstrate this approach to select a single estimate. But, let’s first talk about uncertainty.
How do we make optimum estimates in the presence of uncertainty?
we cannot provide an uncertainty distribution to operations for decision making.
we must provide a single estimate. We must choose a single estimate in the presence of uncertainty
Let’s think about making this single estimate, we could just choose the average, expectation assuming equal probability for each outcome.
The expectation estimate assumes the cost of under- and over-estimation is symmetric and squared, i.e., the squared loss L2 norm
Risk Adverse – cost of overestimation > cost of underestimation
Risk Seeking – cost of overestimation < cost of underestimation
Let’s formalize this with the concept of loss functions.
Loss Functions#
To make a decision in the presence of uncertainty we need to quantify the loss function,
loss due to over- and underestimation of the true value
no loss if the estimate is correct, i.e., if Estimate – Truth = 0
Note: estimating with the mean minimizes the quadratic loss function
Here’s the L2 loss, symmetric and increasing with the square away from 0, perfectly accurate prediction,

For a more complicated example, overestimation is more costly than underestimation and at some threshold underestimation any further has no more cost,

To further illustrate loss functions, consider the loss function to support the decision to carry an umbrella given uncertainty in the weather, rain or no rain from Professor Clayton Deutsch’s geostatistics course.
Overestimation - you estimate rain, but no rain happens
Underestimation = you estimate no rain, but rain happens
Should you carry an umbrella? The loss function depend on where you are going?

Decision Making with Loss Functions#
This is the workflow for decision making in the presence of uncertainty,
quantify cost of over and underestimation in a loss function
apply loss function to the random variable of interest for a range of estimates
calculate the expected loss for each estimate
make decision that minimizes loss

To understand the change in expected profit over different estimates, let’s put the flipped loss on the uncertainty distribution.

Now we can explain the results,
the black estimate is the lowest because the mode of the uncertainty distribution (highest probability) includes the lowest loss portions of the loss function.
the red estimate is the highest because the cost of over estimation is more costly and the entire uncertainty distribution is in the over estimated portion of the loss function
the blue estimate is quite low, but the cost of underestimation levels off, so the loss is not that high
Let’s actually demonstrate the calculation of expected loss for a discrete case, we turn the uncertainty model into a normalized histogram with probability to be in each bin,
Now we can calculate the expected loss for an estimate of 100 MMbbl as,

and the expected loss for an estimate of 400 MMbbl as,

and the expected loss for an estimate of 700 MMbbl as,

Now I provide an interactive dashboard for decision making in the presence of uncertainty.
this will not render in the book, so in the future I will add static examples for the book.
here is the interactive dashboard.

Load the Required Libraries#
The following code loads the required libraries.
import geostatspy.GSLIB as GSLIB # GSLIB utilies, visualization and wrapper
import geostatspy.geostats as geostats # GSLIB methods convert to Python
We will also need some standard packages. These should have been installed with Anaconda 3.
import numpy as np # ndarrys for gridded data
import pandas as pd # DataFrames for tabular data
import os # set working directory, run executables
import matplotlib.pyplot as plt # for plotting
from scipy import stats # summary statistics
import math # trig etc.
import random
from ipywidgets import interactive # widgets and interactivity
from ipywidgets import widgets
from ipywidgets import Layout
from ipywidgets import Label
from ipywidgets import VBox, HBox
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[2], line 8
6 import math # trig etc.
7 import random
----> 8 from ipywidgets import interactive # widgets and interactivity
9 from ipywidgets import widgets
10 from ipywidgets import Layout
ModuleNotFoundError: No module named 'ipywidgets'
Set Up the Interactive Dashboard#
The following code includes the:
dashboard
numerical methods
graphical displays
# interactive calculation of the random sample set (control of source parametric distribution and number of samples)
l = widgets.Text(value=' Optimum Decision Making Demonstration, Michael Pyrcz, Professor, The University of Texas at Austin',layout=Layout(width='950px', height='30px'))
mean = widgets.FloatSlider(min=0.0, max = 1.0, value = 0.5,step=0.01,description = r'$\mu$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
mean.style.handle_color = 'darkorange'
stdev = widgets.FloatSlider(min=0.001, max = 0.25, value = 0.15,step=0.01,description = r'$\sigma$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
stdev.style.handle_color = 'darkorange'
ui1 = widgets.VBox([mean,stdev],kwargs = {'justify_content':'center'})
slope_under = widgets.FloatSlider(min=0.00, max = 1.0,value = 0.1,step=0.01,description = r'$m_{under}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
slope_under.style.handle_color = 'green'
power_under = widgets.FloatSlider(min=1.0, max = 5.0, value = 0.0, description = r'$p_{under}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
power_under.style.handle_color = 'green'
ui2 = widgets.VBox([slope_under,power_under],kwargs = {'justify_content':'center'})
slope_over = widgets.FloatSlider(min=0.00, max = 1.0,value = 0.1,step=0.1,description = r'$m_{over}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
slope_over.style.handle_color = 'blue'
power_over = widgets.FloatSlider(min=1.0, max = 5.0, value = 0.0, description = r'$p_{over}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
power_over.style.handle_color = 'blue'
ui3 = widgets.VBox([slope_over,power_over],kwargs = {'justify_content':'center'})
step = widgets.FloatLogSlider(min=-4, max = -1, value = 0.02, description = 'Step',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
step.style.handle_color = 'gray'
ui4 = widgets.VBox([step],kwargs = {'justify_content':'none'})
ui = widgets.HBox([ui1,ui2,ui3,ui4])
ui_title = widgets.VBox([l,ui],)
def make_plot(mean,stdev,slope_under,power_under,slope_over,power_over,step):
seed = 73073
xmin = mean - 5.0 * stdev
xmax = mean + 5.0 * stdev
X = np.arange(xmin,xmax,step) # build uncertainty distribution
pdf = stats.norm.pdf(X, loc = mean, scale = stdev)
y = stats.norm.rvs(loc=mean,scale=stdev,size=10000,random_state=seed)
delta = np.arange(-1*(xmax-xmin)*4,(xmax-xmin)*4,step) # build loss function
loss = np.zeros(len(delta))
rloss = np.zeros(len(delta)) # flip for convolve operator
for id, d in enumerate(delta):
if d < 0.0:
loss[id] = pow(abs(d),power_under)*slope_under
rloss[len(delta)-id-1] = pow(abs(d),power_under)*slope_under
else:
loss[id] = pow(d,power_over)*slope_over
rloss[len(delta)-id-1] = pow(d,power_over)*slope_over
pdf_norm = pdf / pdf.sum() # calculate expected loss
exp_loss = np.convolve(pdf_norm,loss,mode='valid')
x_exp_loss = np.arange(-0.5*step*(len(exp_loss))+mean,0.5*step*(len(exp_loss))+mean,step)
x_exp_loss = x_exp_loss[0:len(exp_loss)] # avoid rounding error
inside = np.logical_and(x_exp_loss >= xmin, x_exp_loss <= xmax)
max_loss_plot = np.max(exp_loss,where = inside,initial=0.0)
best_estimate = x_exp_loss[np.argmin(exp_loss)]
min_loss = np.min(exp_loss)
plt.subplot(131) # plot uncertainty distribution
plt.plot(X,pdf,color='darkorange',alpha=0.7,zorder=10,lw=3,ls='--')
hist = plt.hist(y,bins=X,edgecolor='black',color='darkorange',alpha=0.7,density=True,stacked=True,
histtype=u'step',linewidth=2)
plt.hist(y,bins=X,color='grey',alpha=0.2,zorder=20,density=True)
if best_estimate >= 0 and best_estimate <= 1.0:
plt.plot([best_estimate,best_estimate],[0,np.max(hist[0])*1.2],color='black',linestyle='--',zorder=100)
plt.annotate('Optimum Estimate = ' + str(np.round(best_estimate,2)),[best_estimate,np.max(hist[0])*0.7], rotation=270)
elif best_estimate < 0.0:
plt.annotate('Optimum Estimate = ' + str(r'$-\infty$'),[0,np.max(hist[0])*0.7], rotation=270)
else:
plt.annotate('Optimum Estimate = ' + str(r'$+\infty$'),[0.96,np.max(hist[0])*0.7], rotation=270)
plt.xlim(0.0,1.0); plt.xlabel(r"Feature, $y_1$"); plt.ylim([0,np.max(hist[0])*1.2])
plt.title('Uncertainty Distribution'); plt.ylabel('Density')
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplot(132) # plot loss function, loss vs. estimate error
plt.plot(delta[delta>0],loss[delta>0],color='blue',alpha=0.8,lw=3)
plt.plot(delta[delta<0],loss[delta<0],color='green',alpha=0.8,lw=3)
# plt.plot([0,0],[0,np.max(loss)],color='black',linestyle='--',alpha=0.3)
plt.plot([0,0],[0,1.0],color='black',linestyle='--',alpha=0.8)
# plt.annotate('Underestimation',[(xmax-xmin)*-0.1,np.max(loss)*0.8], rotation=0,horizontalalignment='right')
# plt.annotate('Overestimation',[(xmax-xmin)*0.1,np.max(loss)*0.8], rotation=0,horizontalalignment='left')
plt.annotate('Underestimation',[(xmax-xmin)*-0.1,0.7], rotation=0,horizontalalignment='right')
plt.annotate('Overestimation',[(xmax-xmin)*0.1,0.7], rotation=0,horizontalalignment='left')
# plt.xlim([-1*(xmax-xmin)*3,(xmax-xmin)*3])
# plt.ylim([0,np.max(loss)])
plt.xlim([-1,1]); plt.ylim([0,1.0])
plt.xlabel('Error In Estimate'); plt.title('Loss Function'); plt.ylabel('Loss')
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplot(133) # plot expected loss vs. estimate
#plt.plot(X,pdf/1000*stdev,'--',color='black',alpha=0.7,zorder=1)
plt.plot(x_exp_loss,exp_loss,color='red',alpha=0.8,lw=3)
if best_estimate >= 0 and best_estimate <= 1.0:
plt.plot([best_estimate,best_estimate],[0,max_loss_plot],color='black',linestyle='--',lw=2)
plt.annotate('Optimum Estimate = ' + str(np.round(best_estimate,2)),[best_estimate,max_loss_plot*0.6], rotation=270)
elif best_estimate < 0:
plt.annotate('Optimum Estimate = ' + str(r'$-\infty$'),[0,max_loss_plot*0.6], rotation=270)
else:
plt.annotate('Optimum Estimate = ' + str(r'$+\infty$'),[0.96,max_loss_plot*0.6], rotation=270)
plt.xlim([0.0,1.0]); # plt.ylim([0,1.0])
# plt.plot([xmin,xmax],[min_loss,min_loss],color='black',linestyle='--',alpha=0.3)
plt.plot([0,1],[min_loss,min_loss],color='black',linestyle='--',alpha=0.7,lw=2)
plt.annotate('Minimum Expected Loss',[xmin+(xmax-xmin)*0.2,min_loss*1.1], rotation=0,horizontalalignment='left')
plt.ylim([0,max_loss_plot])
plt.xlabel(r'Estimate, $y^*_1$'); plt.title('Expected Loss vs. Estimate'); plt.ylabel('Expected Loss')
ax2 = plt.gca().twinx()
ax2.plot(X,pdf,'--',color='darkorange',alpha=0.7,zorder=1,lw=3)
ax2.set_ylim([0,np.max(hist[0])*1.2])
ax2.set_ylabel('Density',rotation=270,labelpad=10)
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplots_adjust(left=0.0, bottom=0.0, right=3.0, top=1.2, wspace=0.2, hspace=0.2)
plt.show()
interactive_plot = widgets.interactive_output(make_plot, {'mean':mean,'stdev':stdev,'slope_under':slope_under,'power_under':power_under,'slope_over':slope_over,'power_over':power_over,'step':step})
interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating
Optimum Decision Making Demonstration#
specify the uncertainty distribuiton \(\mu\), \(\sigma\) and loss function \(m\) and \(p\) and observe the loss function, expected loss and optimum estimate
display(ui_title, interactive_plot) # display the interactive plot
Want to Work Together?#
I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.
Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I’d be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PI is Professor John Foster)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!
I can be reached at mpyrcz@austin.utexas.edu.
I’m always happy to discuss,
Michael
Michael Pyrcz, Ph.D., P.Eng. Professor, Cockrell School of Engineering and The Jackson School of Geosciences, The University of Texas at Austin
More Resources Available at: Twitter | GitHub | Website | GoogleScholar | Geostatistics Book | YouTube | Applied Geostats in Python e-book | Applied Machine Learning in Python e-book | LinkedIn
Comments#
This was a basic demonstration of optimum decision making in the precense of uncertainty.
we build uncertainty models all the time to support decision making. Here’s how it gets done.
I have other demonstrations on the basics of working with DataFrames, ndarrays, univariate statistics, plotting data, declustering, data transformations, trend modeling, multivariate analysis and many other workflows available at GeostatsGuy/PythonNumericalDemos and GeostatsGuy/GeostatsPy.
I hope this is helpful,
Michael