First laod the required packages and usefull functions: geoR et fields. Check that the following packahes are loaded: sp, splancs, RandomFields and spam. If not, do it!

library(geoR)
library(fields)

I. Loading and sampling in the data file

I.1 Loading

Simu = read.table('simu1.txt')
names(Simu)=c('x','y','z')
head(Simu)

##   x y         z
## 1 0 0 -1.138420
## 2 1 0 -1.264911
## 3 2 0 -1.707630
## 4 3 0 -1.770876
## 5 4 0 -1.264911
## 6 5 0 -1.391402

I.2 Sampling

IndEch=sample(1:10201,100)
Ech=Simu[IndEch,]
plot(Ech$x,Ech$y)

I.3 First representations of the dataset

I.3.a Histogram

hist(Ech$z)

Plot of the histogram of the values of the random field

I.3.b geodata and first representations

Transform the dataset to a geodataset to be able to use geostatistics functions and first geostatistics representation.

geodata=as.geodata(Ech)
plot(geodata)

The function plot(geodata) produces a 2 x 2 display with the following plots: the first indicates the spatial locations assign different colors to data in different quartiles, the next two shows data against the X and Y coordinates and the last is an histogram of the data values or optionally, a 3-D plot with spatial locations and associated data values. colors for the quartiles in the first plot. ‘’qt.col’’ If missing defaults to blue, green, yellow and red.

I.3.c polynomial regression lines

plot(geodata,lowess=T)

Add a polynomial regression line between the values of \(Z\) and \(x\) (or \(Z\) and \(y\)).

I.3.d 3-dimensions representation

For a 3 dimensions representation, one can load the package scatterplot3d if not already loaded. Then add the option scatter3d=TRUE}.

plot(geodata, scatter3d=TRUE)

I.3.e Other representations

Other representations of the observations are possible: circles with size proportional to the \(Z\) value, circle with fixed size but color nuances in concordance of the \(Z\) value.

To get circles with size proportional to the value of \(Z\):

points(geodata,main="Observed values")

To get fixed size circles with color nuances:

points(geodata,cex.min=2,cex.max=2,col="gray")

To get a representation in function of the 5 quintiles of the distribution of \(Z\).

points(geodata,cex.min=2,cex.max=2,pt.div="quint")

In function of the 10 deciles

points(geodata,cex.min=2,cex.max=2,pt.div="dec", col =tim.colors(64))

To get a summary of the geostatistics data

summary(geodata)

## Number of data points: 100 
## 
## Coordinates summary
##       x   y
## min   1   0
## max 100 100
## 
## Distance summary
##      min      max 
##   1.0000 133.7348 
## 
## Data summary
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -3.6680 -0.7115 -0.1897 -0.1214  0.6483  2.9090

II. Estimation of the variogram

II.1. Empirical estimation

One can use the function variog to get estimates of the variogram.

II.1.a Variogram cloud

vario.c=variog(geodata,op="cloud")
plot(vario.c,main = "Variogram Cloud",pch='+')

II.1.b Default isotropic empirical variogram

vario.b=variog(geodata)
plot(vario.b,main = "Empirical Variogram",type="b")

The defaults If arguments breaks and uvec are not provided, the binning is defined as follows: 1. read the argument max.dist. If not provided it is set to the maximum distance between the pairs of points. 2. the center of the bins are initially defined by the sequence u = seq(0, max.dist, l = 13). 3. the interval spanned by each bin is given by the mid-points between the centers of the bins. If an vector is passed to the argument breaks its elements are taken as the limits of the bins (classes of distance) and the argument uvec is ignored.

II.1.c Options of the variog function

vario.b=variog(geodata,max.dist=70,uvec=seq(0,70,10))
plot(vario.b,main = "Empirical Variogram",type="b")

max.dist. A numerical value defining the maximum distance for the variogram. Pairs of locations separated for distance larger than this value are ignored for the variogram calculation. If not provided defaults takes the maximum distance among all pairs of data locations. classes of distances=bins Bin centers are given by the series seq(0,70,10). Their widths ensure that all the data are taken into account (middle of the window)

vario.b=variog(geodata,max.dist=70,breaks=seq(0,70,10))
plot(vario.b,main = "Empirical Variogram",type="b")

Here the bounds of the bins are given by seq(0,70,10) and the value of the variogram is given in the middle of each window.

vario.b$n # Number of data in each bin

## [1] 134 373 538 593 620 667 637

vario.b$v # Values of the empirical variogram (per interval)

## [1] 0.715403 1.144949 1.412108 1.582327 1.517052 1.412291 1.352057

vario.bc=variog(geodata,max.dist=70,breaks=seq(0,70,10),bin.cloud=TRUE)
plot(vario.bc,main = "Box-plot for each bin",bin.cloud=TRUE)

Box-plot for each bin.

II.1.d Empirical anisotropic variogram

vario.b0=variog(geodata,max.dist=70,breaks=seq(0,70,10), dir=0)
vario.b45=variog(geodata,max.dist=70,breaks=seq(0,70,10), dir=pi/4)
vario.b90=variog(geodata,max.dist=70,breaks=seq(0,70,10), dir=pi/2)
vario.b135=variog(geodata,max.dist=70,breaks=seq(0,70,10), dir=3*pi/4)

ylim1=max(c(vario.b0$v, vario.b45$v, vario.b90$v, vario.b135$v))# to get an unique graphic scale 
plot(vario.b0,main = "Directional Empricial Variograms", ylim=c(0,ylim1),type="b", pch=0)
par(new=TRUE)
plot(vario.b45,ylim=c(0,ylim1),type="b",col="red", pch=1)
par(new=TRUE)
plot(vario.b90,ylim=c(0,ylim1),type="b",col="blue", pch=2)
par(new=TRUE)
plot(vario.b135,ylim=c(0,ylim1),type="b",col="green", pch=3)

II.2. Parametric estimation

II.2.a Least Square Estimation

Test of different models

varioestexp=variofit(vario.b,cov.model = "exponential", ini.cov.pars=c(1.5,30))
plot(vario.b,main="Parametric Estimate of the Variogam with OLS")
lines(varioestexp)
# ini.cov.pars are the initial values used in the minimization algorithm
varioestsph=variofit(vario.b,cov.model = "spherical", ini.cov.pars=c(1.5,30))
lines(varioestsph,col="red")
varioestgaus=variofit(vario.b,cov.model = "gaussian", ini.cov.pars=c(1.5,30))
lines(varioestgaus,col="blue")
varioestnug=variofit(vario.b,cov.model = "pure.nugget")
lines(varioestnug,col="violet")
varioestexppep = variofit(vario.b,cov.model = "exponential",ini.cov.pars=c(1.5,30),fix.nugget=TRUE,nugget=0.1)
lines(varioestexppep,col="gray")
legend("bottomright", legend = c("Empirical","Exponential", "Spherical", "Gaussian","Nugget","Exponential & Nugget"), col = c("black","black","red", "blue", "violet","gray"), pch = c('o','-','-','-','-','-','-'), bty = "o", pt.cex = 1, cex = 0.8, text.col = "black", horiz = FALSE, inset = c(0.1, 0.1))

Adequation

We can get the values of the Loss function for each model.

varioestexp$value

## [1] 26.24905

varioestsph$value

## [1] 19.81734

varioestgaus$value

## [1] 20.35512

varioestnug$value

## [1] 0.5753297

varioestexppep$value

## [1] 116.782

We choose the spherical model (results for pure nugget effects are strange…). One can get the parameter values

varioestsph$cov.pars

## [1]  0.9818383 30.7579613

cov.pars: a two elements vector with estimated values of the covariance parameters \(\sigma^2\) and \(\phi\), respectively.

\(C(h) = \sigma^2 * \rho(h)=C(0)-\gamma(h)\)

\(\sigma^2=C(0)\)

\(\rho(h) = 1 - 1.5 * (h/\phi) + 0.5(h/\phi)^3\) if \(h < \phi\) , 0 otherwise

Remark : \(\phi=a\) in my lecture and \(\sigma^2=Sill=C=Var(X_s)=\lim \gamma(h)\) in the definition of the spherical model. Thus \(\gamma(h)=1.5 * (h/\phi) - 0.5(h/\phi)^3\).

II.2.b Maximum Likelihood Estimation

varioestlikexp= likfit(geodata,cov.model = "exponential", ini.cov.pars=c(1.5,30))
plot(vario.b,main="Parametric Estimate of the Variogam with ML")
lines(varioestlikexp)
# ini.cov.pars sont les valeurs initiales utilisées dans l'algorithme de minimisation
varioestliksph= likfit(geodata,cov.model = "spherical", ini.cov.pars=c(1.5,30))
lines(varioestliksph,col="red")
varioestlikgaus= likfit(geodata,cov.model = "gaussian", ini.cov.pars=c(1.5,30))
lines(varioestlikgaus,col="blue")
varioestliknug= likfit(geodata,cov.model = "pure.nugget", ini.cov.pars=c(1.5,30))
lines(varioestliknug,col="violet")
varioestlikexppep = likfit(geodata,cov.model = "exponential",ini.cov.pars=c(1.5,30),fix.nugget=TRUE,nugget=0.1)
lines(varioestlikexppep,col="gray")
legend("bottomright", legend = c("Empirical","Exponential", "Spherical", "Gaussian","Nugget","Exponential & Nugget"), col = c("black","black","red", "blue", "violet","gray"), pch = c('o','-','-','-','-','-','-'), bty = "o", pt.cex = 1, cex = 0.8, text.col = "black", horiz = FALSE, inset = c(0.1, 0.1))

varioestlikexp$loglik

## [1] -133.6762

varioestliksph$loglik

## [1] -132.9702

varioestlikgaus$loglik

## [1] -132.8799

varioestliknug$loglik

## [1] -158.6393

varioestlikexppep$loglik

## [1] -133.7472

II.2.c Comparison of the two approaches

plot(vario.b,main="Comparison of the two estimations : LS versus ML", sub="Spherical Model")
varioestsph=variofit(vario.b,cov.model = "spherical", ini.cov.pars=c(1.5,30))
lines(varioestsph,col="red")
varioestliksph= likfit(geodata,cov.model = "spherical", ini.cov.pars=c(1.5,30))
lines(varioestliksph,col="blue")
legend("bottomright", legend = c("LS","ML"), col = c("red", "blue"), pch = '-', bty = "o", pt.cex = 1, cex = 0.8, text.col = "black", horiz = FALSE, inset = c(0.1, 0.1))

III. Kriging

III.1. Simple Kriging

Remind that this approach assumes that the mean function is known. Here we can estimate the mean function of the full dataset.

geodataCentre=geodata
geodataCentre$data= geodataCentre$data-mean(Simu$z)

Then we can do simple kriging.

grille=expand.grid(0:100,0:100)
Kcontrol=krige.control(type.krige="sk",obj.model=varioestsph)
KS=krige.conv(geodataCentre,loc=grille,krige=Kcontrol)
KS$predict= KS$predict+ mean(Simu$z)
image(KS, col=tim.colors(64) , x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
#  ajoute des points noirs aux sites observés utilsés pour le Krigeage
title(main="Simple Kriging on centered dataset")

Remark. coords.data= adds black points where the random field is observed

But we can do all this in one step, using the beta option.

Kcontrol=krige.control(type.krige="sk",obj.model=varioestsph,beta=mean(Simu$z))
KS=krige.conv(geodata,loc=grille,krige=Kcontrol)
image(KS, col=tim.colors(64) , x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Simple Kriging with option beta")

And we can see that we obtain excatly the same figure.

We can get other types of image.

contour(KS, col=tim.colors(64) , coords.data=geodata$coords)
title(main="Level lines for the Simple Kriging")

persp(KS, col=tim.colors(64))
title(main="3-dimension image of the Simple Kriging")

We can also represent the Simple Kriging Variance

titre="Simple Kriging Variance"
image(KS,col =tim.colors(64),ylim=c(-10,100),zlim=c(0,2),val = sqrt(KS$krige.var),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Simple Kriging Variance")

III.2. Ordinary Kriging

Kcontrol=krige.control(type.krige="ok",obj.model=varioestsph)
KO=krige.conv(geodata,loc=grille,krige=Kcontrol)


image(KO,col=tim.colors(64),ylim=c(-10,100),zlim=c(-5,5),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Ordinary Kriging")

image(KO,col=tim.colors(64),ylim=c(-10,100),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Ordinary Kriging")

contour(KO,col=tim.colors(64),coords.data=geodata$coords)
title(main="Ordinary Kriging")

persp(KO,col=tim.colors(64))
title(main="Ordinary Kriging")

image(KO,col =tim.colors(64),ylim=c(-10,100),val = sqrt(KO$krige.var),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Variance of the Ordinary kriging")

image(KO,col =tim.colors(64),ylim=c(-10,100),zlim=c(0,2),val = sqrt(KO$krige.var),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Variance of the Ordinary kriging")

contour(KO,col =tim.colors(64),ylim=c(-10,100),zlim=c(0,2),val = sqrt(KO$krige.var),coords.data=geodata$coords)
title(main="Variance of the Ordinary kriging")

persp(KO,col =tim.colors(64),ylim=c(-10,100),val = sqrt(KO$krige.var))
title(main="Variance of the Ordinary kriging")

III.3. Universal Kriging

III.3.a Variogram of random fields with trend

We first have to calculate the new variogram with trend (of first order or second order) and then use them in the universal kriging. First order: polynom of degree 1; second order: polynom of degree 2).

vario.b=variog(geodata,max.dist=70,breaks=seq(0,70,10))
variotrend1.b=variog(geodata,max.dist=70,breaks=seq(0,70,10), trend="1st")
variotrend2.b=variog(geodata,max.dist=70,breaks=seq(0,70,10), trend="2nd")

III.3.b Plot of the different variograms

We first have to choose a common scale for the plot of the different function on the same graphic.

ylim1=max(c(vario.b$v, variotrend1.b$v, variotrend2.b$v))

plot(vario.b,main = "Empirical variograms with different trends", ylim=c(0,ylim1) ,type="b",col="red", pch=1)
par(new=TRUE)
plot(variotrend1.b,type="b",col="blue",ylim=c(0,ylim1) , pch=2)
par(new=TRUE)
plot(variotrend2.b,type="b",col="green",ylim=c(0,ylim1) , pch=3)
legend("bottomright", legend = c("Without trend","With polynomial trend of degree 1", "With polynomial trend of degree 2"), col = c("red","blue", "green"), pch = c(1,2,3), bty = "o", pt.cex = 1, cex = 0.8, text.col = "black", horiz = FALSE, inset = c(0.1, 0.1))

III.3.c Kriging with different models

variotrend1estsph=variofit(variotrend1.b,cov.model = "spherical", ini.cov.pars=c(1.5,30))
Kcontrol=krige.control(type.krige="ok",trend.d="1st", trend.l="1st",obj.model= variotrend1estsph)#  trend.d spécifie trend aux points observés, trend.l spécifie le trend aux points prédits
KU1=krige.conv(geodata,loc=grille,krige=Kcontrol)
variotrend2estsph=variofit(variotrend2.b,cov.model = "spherical", ini.cov.pars=c(1.5,30))
Kcontrol=krige.control(type.krige="ok",trend.d="2nd", trend.l="2nd",obj.model= variotrend2estsph)
KU2=krige.conv(geodata,loc=grille,krige=Kcontrol)

image(KU1,col=tim.colors(64),ylim=c(-10,100),zlim=c(-5,5),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Universal Kriging, Trend=1st")

image(KU2,col=tim.colors(64),ylim=c(-10,100),zlim=c(-5,5),x.leg=c(0,100),y.leg=c(-10,-5),coords.data=geodata$coords)
title(main="Universal Kriging, Trend=2nd")

contour(KU1,col=tim.colors(64),coords.data=geodata$coords)
title(main="Universal Kriging, Trend=1st")

contour(KU2,col=tim.colors(64),coords.data=geodata$coords)
title(main="Universal Kriging, Trend=2nd")

persp(KU1,col=tim.colors(64))
title(main="Universal Kriging, Trend=1st")

persp(KU2,col=tim.colors(64))
title(main="Universal Kriging, Trend=2nd")

We can also do the same thing for the variance…

Geostatistics with R

Jean-Yves Dauxois

21 juillet 2016