Kriging

Kriging is a group of geostatistical techniques to interpolate the value of a random field (e.g. the elevation Z of the landscape as a function of the geographic location) at an unobserved location from observations of its value at nearby locations. The theory behind interpolation and extrapolation by kriging was developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. The English verb is to krige and the most common adjective is kriging.

Kriging interpolation


Kriging belongs to the family of linear least squares estimation algorithms. As illustrated in Figure 1, the aim of kriging is to estimate the value of an unknown real function $$f$$ at a point $$x^*$$, given the values of the function at some other points $$x_1,\ldots, x_n$$. A kriging estimator is said to be linear because the predicted value $$\hat f(x^*)$$ is a linear combination that may be written as
 * $$\hat f(x^*) = \sum_{i=1}^n \lambda_i f(x_i)$$.

The weights $$\lambda_i$$ are solutions of a system of linear equations which is obtained by assuming that $$f$$ is a sample-path of a random process $$F(x)$$, and that the error of prediction
 * $$\varepsilon(x) = F(x) - \sum_{i=1}^n \lambda_i F(x_i) $$

is to be minimized in some sense. For instance, the so-called simple kriging assumption is that the mean and the covariance of $$F(x)$$ are known; the kriging predictor is then the one that minimizes the variance of the prediction error.
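
As a sketch of why the weights solve a linear system, assume a zero mean and write $$c(x,y)=\mathrm{Cov}(F(x),F(y))$$ (the symbol $$c$$ is introduced here for illustration and matches the covariance notation used in the general equations). The variance of the prediction error at $$x^*$$ then expands to
 * $$\mathrm{Var}\left(\varepsilon(x^*)\right) = c(x^*,x^*) - 2\sum_{i=1}^n \lambda_i c(x_i,x^*) + \sum_{i=1}^n\sum_{j=1}^n \lambda_i \lambda_j c(x_i,x_j)$$

Setting the derivative with respect to each $$\lambda_k$$ to zero gives
 * $$\sum_{j=1}^n \lambda_j c(x_k,x_j) = c(x_k,x^*), \quad k=1,\ldots,n$$

which is a system of $$n$$ linear equations in the $$n$$ unknown weights.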

From the geological point of view, the practice of kriging rests on assuming continued mineralization between measured values. The assumed prior knowledge encapsulates how minerals co-occur as a function of space; given an ordered set of measured grades, interpolation by kriging then predicts mineral concentrations at unobserved points.

Applications of kriging
The application of kriging to problems in geology and mining, as well as to hydrology, started in the mid-1960s and grew especially in the 1970s with the work of Georges Matheron. The connection between kriging and geostatistics remains strong today.

Kriging is used, for example, in:

 * Mining
 * Hydrogeology
 * Natural resources
 * Environmental science
 * Remote sensing
 * Black box modelling in computer experiments

Controversy in mineral exploration and mining
The question of whether spatial dependence may be assumed or ought to be verified by applying Fisher's F-test to the variance of a set of measured values and the first variance term of the ordered set prior to interpolation by kriging is of particular relevance in mineral exploration and mining. For example, Clark’s hypothetical uranium data in Practical Geostatistics do not display a significant degree of spatial dependence but the author reports a kriged estimate for some selected coordinates within this sample space anyway. The practice of kriging lends itself to abuse, particularly when applied to a model ore distribution based on the assumption that ore concentrations display a significant degree of spatial dependency in the sample space under examination. Spatial dependence between borehole grades was assumed at Bre-X's Busang property, Hecla's Grouse Creek mine and scores of others where gold grades turned out to be lower than predicted. A significant degree of spatial dependence is required to justify interpolation between measured values in ordered sets. Failing to pass a test for spatial dependence would indicate that a constant model cannot be distinguished from a kriging model without further information or knowledge.

General equations of kriging
Kriging is a group of geostatistical techniques to interpolate the value $$Z(x_0)$$ of a random field $$Z(x)$$ (e.g. the elevation $$Z$$ of the landscape as a function of the geographic location $$x$$) at an unobserved location $$x_0$$ from observations $$z_i=Z(x_i),\;i=1,\ldots,n$$ of the random field at nearby locations $$x_1,\ldots,x_n$$. Kriging computes the best linear unbiased estimator $$\hat{Z}(x_0)$$ of $$Z(x_0)$$ based on a stochastic model of the spatial dependence, quantified either by the variogram $$\gamma(x,y)$$ or by the expectation $$\mu(x)=E[Z(x)]$$ and the covariance function $$c(x,y)$$ of the random field.

The kriging estimator is given by a linear combination


 * $$\hat{Z}(x_0)=\sum_{i=1}^n w_i(x_0) Z(x_i)$$

of the observed values $$z_i=Z(x_i)$$ with weights $$w_i(x_0),\;i=1,\ldots,n$$ chosen such that the variance (also called kriging variance or kriging error)

 * $$\sigma^2_k(x_0):=\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right)=\sum_{i=1}^n\sum_{j=1}^n w_i(x_0) w_j(x_0) c(x_i,x_j) + \mathrm{Var}\left(Z(x_0)\right)-2\sum_{i=1}^nw_i(x_0)c(x_i,x_0)$$

is minimized subject to the unbiasedness condition

 * $$\mathrm{E}[\hat{Z}(x_0)-Z(x_0)]=\sum_{i=1}^n w_i(x_0)\mu(x_i) - \mu(x_0) =0$$

Depending on the stochastic properties of the random field, different types of kriging apply. For each type of kriging the unbiasedness condition is rewritten as a different linear constraint on the weights $$w_i$$.

The kriging variance must not be confused with the variance

 * $$\mathrm{Var}\left(\hat{Z}(x_0)\right)=\mathrm{Var}\left(\sum_{i=1}^n w_iZ(x_i)\right)=\sum_{i=1}^n\sum_{j=1}^n w_i w_j c(x_i,x_j)$$

of the kriging predictor $$\hat{Z}(x_0)$$ itself.
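
The distinction between the two variances can be illustrated numerically. The following is a minimal sketch that assumes a one-dimensional field with a hypothetical exponential covariance `exp_cov` and an arbitrary set of weights; none of these choices come from the article, they only make the two formulas concrete.

```python
import numpy as np

def exp_cov(x, y, sill=1.0, length=2.0):
    """Hypothetical covariance model: c(x, y) = sill * exp(-|x - y| / length)."""
    return sill * np.exp(-np.abs(x - y) / length)

def predictor_variance(w, xs, cov):
    """Var(Z_hat(x0)) = sum_ij w_i w_j c(x_i, x_j)."""
    C = cov(xs[:, None], xs[None, :])  # n x n matrix of c(x_i, x_j)
    return w @ C @ w

def kriging_variance(w, xs, x0, cov):
    """sigma_k^2(x0) = sum_ij w_i w_j c(x_i, x_j) + Var(Z(x0)) - 2 sum_i w_i c(x_i, x0)."""
    c0 = cov(xs, x0)                   # vector of c(x_i, x0)
    return predictor_variance(w, xs, cov) + cov(x0, x0) - 2.0 * (w @ c0)

xs = np.array([0.0, 1.0, 3.0])  # observation locations (illustrative)
x0 = 2.0                        # prediction location (illustrative)
w = np.array([0.2, 0.3, 0.5])   # arbitrary weights, not optimized

var_pred = predictor_variance(w, xs, exp_cov)
sigma2_k = kriging_variance(w, xs, x0, exp_cov)
print("variance of the predictor:", var_pred)
print("kriging variance:         ", sigma2_k)
```

The two numbers differ because the kriging variance also accounts for the variance of the unknown value $$Z(x_0)$$ and for its covariance with the observations.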

The types of kriging
Classical types of kriging are


 * Simple kriging assuming a known constant trend: $$\mu(x)=0$$.
 * Ordinary kriging assuming an unknown constant trend: $$\mu(x)=\mu$$.
 * Universal kriging assuming a general linear trend model $$\mu(x)=\sum_{k=0}^p \beta_k f_k(x)$$.
 * IRFk-kriging assuming $$\mu(x)$$ to be an unknown polynomial in $$x$$.
 * Indicator kriging using indicator functions instead of the process itself, in order to estimate transition probabilities.
 * Multiple-indicator kriging (MIK) is a version of indicator kriging working with a family of indicators. MIK has fallen out of favour as an interpolation technique in recent years because of inherent difficulties in operation and model validation; conditional simulation is fast becoming the accepted replacement technique in this case.
 * Disjunctive kriging is a nonlinear generalisation of kriging.
 * Lognormal kriging interpolates positive data by means of logarithms.

Simple kriging
Simple kriging is the simplest kind of kriging. It assumes the expectation of the random field to be known and relies on a covariance function. In most applications, however, neither the expectation nor the covariance is known beforehand.

Simple kriging assumptions
The practical assumptions for the application of simple kriging are:
 * Wide-sense stationarity of the field.
 * The expectation is zero everywhere: $$\mu(x)=0$$.
 * Known covariance function $$c(x,y)=\mathrm{Cov}(Z(x),Z(y))$$.

Simple kriging equation
The kriging weights of simple kriging have no unbiasedness condition and are given by the simple kriging equation system:
 * $$\begin{pmatrix}w_1 \\ \vdots \\ w_n \end{pmatrix}= \begin{pmatrix}c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0) \end{pmatrix}$$

Simple kriging interpolation
The interpolation by simple kriging is given by:
 * $$\hat{Z}(x_0)=\begin{pmatrix}z_1 \\ \vdots \\ z_n \end{pmatrix}' \begin{pmatrix}c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0)\end{pmatrix}$$

Simple kriging error
The kriging error is given by:
 * $$\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right)=\underbrace{c(x_0,x_0)}_{\mathrm{Var}(Z(x_0))}- \underbrace{\begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0)\end{pmatrix}' \begin{pmatrix} c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0) \end{pmatrix}}_{\mathrm{Var}(\hat{Z}(x_0))}$$

which leads to the generalised least squares version of the Gauss-Markov theorem (Chiles & Delfiner 1999, p. 159):
 * $$\mathrm{Var}(Z(x_0))=\mathrm{Var}(\hat{Z}(x_0))+\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right).$$
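
The simple kriging weights, interpolation and error can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the exponential covariance `exp_cov`, the one-dimensional locations and the observed values are all assumptions made for the example, and the field is taken to have zero mean as in the assumptions of simple kriging.

```python
import numpy as np

def exp_cov(x, y, sill=1.0, length=2.0):
    """Hypothetical covariance model: c(x, y) = sill * exp(-|x - y| / length)."""
    return sill * np.exp(-np.abs(x - y) / length)

def simple_kriging(xs, zs, x0, cov):
    """Return weights, prediction, kriging error and predictor variance at x0."""
    C = cov(xs[:, None], xs[None, :])  # matrix of c(x_i, x_j)
    c0 = cov(xs, x0)                   # vector of c(x_i, x0)
    w = np.linalg.solve(C, c0)         # solves C w = c0 without forming C^{-1}
    z_hat = w @ zs                     # linear combination of the observations
    var_pred = c0 @ w                  # c0' C^{-1} c0 = Var(Z_hat(x0))
    var_err = cov(x0, x0) - var_pred   # Var(Z(x0)) - Var(Z_hat(x0))
    return w, z_hat, var_err, var_pred

xs = np.array([0.0, 1.0, 3.0])   # observation locations (illustrative)
zs = np.array([0.5, -0.2, 0.8])  # observed values of a zero-mean field (illustrative)
w, z_hat, var_err, var_pred = simple_kriging(xs, zs, 2.0, exp_cov)
print("weights:      ", w)
print("prediction:   ", z_hat)
print("kriging error:", var_err)
```

Solving the linear system with `np.linalg.solve` rather than inverting the covariance matrix explicitly is a common numerical choice; the result is the same set of weights as in the matrix formula.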

Ordinary kriging
Ordinary kriging is the most commonly used type of kriging. It assumes a constant but unknown mean.

Typical ordinary kriging assumptions
The typical assumptions for the practical application of ordinary kriging are:
 * Intrinsic stationarity or wide-sense stationarity of the field.
 * Enough observations to estimate the variogram.
 * The mean $$E[Z(x)]=\mu$$ is unknown but constant
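 * The variogram $$\gamma(x,y)=\tfrac{1}{2}E[(Z(x)-Z(y))^2]$$ of $$Z(x)$$ is known. The factor $$\tfrac{1}{2}$$ (the semivariogram convention) is the one for which the ordinary kriging error formula holds as written.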

Ordinary kriging equation
The kriging weights of ordinary kriging fulfill the unbiasedness condition


 * $$\sum_{i=1}^n w_i = 1$$

and are given by the ordinary kriging equation system:
 * $$\begin{pmatrix}w_1 \\ \vdots \\ w_n \\ \lambda \end{pmatrix}= \begin{pmatrix}\gamma(x_1,x_1) & \cdots & \gamma(x_1,x_n) &1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n,x_1) & \cdots & \gamma(x_n,x_n) & 1 \\ 1 &\cdots& 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}$$

The additional parameter $$\lambda$$ is a Lagrange multiplier used in the minimization of the kriging error $$\sigma_k^2(x^*)$$ to honor the unbiasedness condition.

Ordinary kriging interpolation
The interpolation by ordinary kriging is given by:
 * $$\hat{Z}(x^*)=\begin{pmatrix}z_1 \\ \vdots \\ z_n \\ 0 \end{pmatrix}' \begin{pmatrix}\gamma(x_1,x_1) & \cdots & \gamma(x_1,x_n) &1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n,x_1) & \cdots & \gamma(x_n,x_n) & 1 \\ 1 &\cdots& 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}$$

Ordinary kriging error
The kriging error is given by:
 * $$\mathrm{Var}\left(\hat{Z}(x^*)-Z(x^*)\right)= \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}' \begin{pmatrix} \gamma(x_1,x_1) & \cdots & \gamma(x_1,x_n) &1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n,x_1) & \cdots & \gamma(x_n,x_n) & 1 \\ 1 &\cdots& 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}$$
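
The ordinary kriging system, interpolation and error can be sketched as follows. The example assumes that $$\gamma$$ is the semivariogram (half the expected squared increment), which is the convention under which the error formula gives the variance of the prediction error. The exponential model `exp_semivariogram`, the locations and the data values are hypothetical choices made only for illustration.

```python
import numpy as np

def exp_semivariogram(x, y, sill=1.0, length=2.0):
    """Hypothetical semivariogram model: gamma(x, y) = sill * (1 - exp(-|x - y| / length))."""
    return sill * (1.0 - np.exp(-np.abs(x - y) / length))

def ordinary_kriging(xs, zs, x_star, gamma):
    """Return prediction, kriging error, weights and Lagrange multiplier at x_star."""
    n = len(xs)
    A = np.ones((n + 1, n + 1))                  # bordered matrix of the system
    A[:n, :n] = gamma(xs[:, None], xs[None, :])  # gamma(x_i, x_j) block
    A[n, n] = 0.0                                # lower-right corner
    b = np.ones(n + 1)                           # right-hand side, last entry 1
    b[:n] = gamma(xs, x_star)                    # gamma(x_i, x_star)
    sol = np.linalg.solve(A, b)                  # (w_1, ..., w_n, lambda)
    w, lam = sol[:n], sol[n]
    z_hat = w @ zs                               # weighted average of the data
    var_err = b @ sol                            # sum_i w_i gamma(x_i, x_star) + lambda
    return z_hat, var_err, w, lam

xs = np.array([0.0, 1.0, 3.0])   # observation locations (illustrative)
zs = np.array([2.1, 1.7, 2.6])   # observed values (illustrative)
z_hat, var_err, w, lam = ordinary_kriging(xs, zs, 2.0, exp_semivariogram)
print("weights (sum to one):", w, w.sum())
print("prediction:          ", z_hat)
print("kriging error:       ", var_err)
```

The last row of the bordered matrix enforces the unbiasedness condition, so the weights sum to one without any separate normalisation step.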

Properties of kriging
(Cressie 1993, Chiles & Delfiner 1999, Wackernagel 1995)
 * The kriging estimation is unbiased: $$E[\hat{Z}(x_i)]=E[Z(x_i)]$$
 * The kriging estimation honors the actually observed value: $$\hat{Z}(x_i)=Z(x_i)$$
 * The kriging estimation $$\hat{Z}(x)$$ is the best linear unbiased estimator of $$Z(x)$$ if the assumptions hold.
However (e.g. Cressie 1993):
 * As with any method, if the assumptions do not hold, kriging may perform poorly.
 * There might be better nonlinear and/or biased methods.
 * No properties are guaranteed when the wrong variogram is used, although a 'good' interpolation is typically still achieved.
 * Best is not necessarily good: for example, in the case of no spatial dependence the kriging interpolation is only as good as the arithmetic mean.
 * Kriging provides $$\sigma_k^2$$ as a measure of precision. However, this measure relies on the correctness of the variogram.
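
Two of the properties above can be checked numerically: the exactness at observed locations, and the reduction to the arithmetic mean when there is no spatial dependence. The sketch below assumes ordinary kriging with a hypothetical pure nugget semivariogram (zero at coinciding locations, constant otherwise); the data are invented for illustration.

```python
import numpy as np

def nugget_semivariogram(x, y, sill=1.0):
    """Pure nugget model: no spatial dependence between distinct locations."""
    return np.where(x == y, 0.0, sill)

def ordinary_kriging(xs, zs, x_star, gamma):
    """Return weights, prediction and kriging error at x_star."""
    n = len(xs)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(xs[:, None], xs[None, :])
    A[n, n] = 0.0
    b = np.append(gamma(xs, x_star), 1.0)
    sol = np.linalg.solve(A, b)
    w = sol[:n]
    return w, w @ zs, b @ sol

xs = np.array([0.0, 1.0, 3.0, 4.0])   # observation locations (illustrative)
zs = np.array([2.0, 3.5, 1.0, 2.5])   # observed values (illustrative)

# At an unobserved location every weight equals 1/n, so the prediction is
# the arithmetic mean and the error is sill * (1 + 1/n).
w, z_hat, var_err = ordinary_kriging(xs, zs, 2.0, nugget_semivariogram)
print("weights:", w, "prediction:", z_hat, "mean:", zs.mean())

# At an observed location the prediction reproduces the observed value
# and the kriging error is zero.
w_obs, z_obs, var_obs = ordinary_kriging(xs, zs, 1.0, nugget_semivariogram)
print("prediction at x = 1.0:", z_obs, "error:", var_obs)
```

With a pure nugget model kriging is still the best linear unbiased estimator, but it carries no more information than the sample mean, which is the sense in which best is not necessarily good.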

Kriging terms
A series of related terms were also named after Krige, including kriged estimate, kriged estimator, kriging variance, kriging covariance, zero kriging variance, unity kriging covariance, kriging matrix, kriging method, kriging model, kriging plan, kriging process, kriging system, block kriging, co-kriging, disjunctive kriging, linear kriging, ordinary kriging, point kriging, random kriging, regular grid kriging, simple kriging and universal kriging.

Related methods
Kriging is mathematically closely related to regression analysis. Both theories derive a best linear unbiased estimator based on assumptions on covariances, make use of the Gauss-Markov theorem to prove independence of the estimate and error, and make use of very similar formulae. They are nevertheless useful in different frameworks: kriging is designed for interpolation of a single realisation of a random field, while regression models are based on multiple observations of a multivariate dataset.

In the statistical community the same technique is also known as Gaussian process regression, Kolmogorov-Wiener prediction, or best linear unbiased prediction.

The kriging interpolation may also be seen as a spline in a reproducing kernel Hilbert space, with reproducing kernel given by the covariance function. The difference with the classical kriging approach is provided by the interpretation: while the spline is motivated by a minimum norm interpolation based on a Hilbert space structure, kriging is motivated by an expected squared prediction error based on a stochastic model.

Kriging with polynomial trend surfaces is mathematically identical to generalized least squares polynomial curve fitting.

Kriging can also be understood as a form of Bayesian inference. Kriging starts with a prior distribution over functions. This prior takes the form of a Gaussian process: $$N$$ samples from a function are jointly normally distributed, where the covariance between any two samples is the covariance function (or kernel) of the Gaussian process evaluated at the spatial locations of the two points. A set of values is then observed, each value associated with a spatial location. A new value can then be predicted at any new spatial location by combining the Gaussian prior with a Gaussian likelihood function for each of the observed values. The resulting posterior distribution is also Gaussian, with a mean and covariance that can be simply computed from the observed values, their variance, and the kernel matrix derived from the prior.
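
The posterior mean and covariance described above can be sketched as follows. This is a minimal illustration under stated assumptions: a zero-mean prior, a hypothetical squared-exponential kernel `rbf_kernel`, and a single noise variance `noise_var` shared by all observations. None of these specific choices come from the article.

```python
import numpy as np

def rbf_kernel(a, b, variance=1.0, length=1.0):
    """Hypothetical squared-exponential kernel between 1-D location arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, kernel, noise_var):
    """Posterior mean and covariance of a zero-mean Gaussian process at x_new."""
    K = kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))  # prior + noise
    K_s = kernel(x_obs, x_new)    # covariances between observed and new points
    K_ss = kernel(x_new, x_new)   # prior covariance at the new points
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

x_obs = np.array([0.0, 1.0, 3.0])    # observed locations (illustrative)
y_obs = np.array([0.5, -0.2, 0.8])   # observed values (illustrative)
x_new = np.array([0.5, 2.0, 4.0])    # prediction locations (illustrative)

mean, cov = gp_posterior(x_obs, y_obs, x_new, rbf_kernel, noise_var=0.01)
print("posterior mean:     ", mean)
print("posterior variances:", np.diag(cov))
```

With `noise_var` close to zero the posterior mean coincides with the simple kriging interpolation for the same covariance function, and the diagonal of the posterior covariance coincides with the simple kriging error.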

History
The theory of kriging was developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex. The method was called krigeage for the first time in Matheron's 1960 Krigeage d’un Panneau Rectangulaire par sa Périphérie. In this Note Géostatistique No 28, Matheron derives k*, his 'estimateur' and a precursor to the kriged estimate or kriged estimator. In classical statistics, Matheron’s k* is the length-weighted average grade of each of his panneaux in his set. What Matheron did not derive was var(k*), the variance of his estimateur: he computed the length-weighted average grade of each panneau but did not compute the variance of its central value. In time, he replaced length-weighted average grades for three-dimensional sample spaces such as Matheronian blocks of ore with more abundant distance-weighted average grades for zero-dimensional sample spaces such as Matheronian points.

A central doctrine of geostatistics is that spatial dependence need not be verified but may be assumed to exist between two or more Matheronian points, determined in samples selected at positions with different coordinates. This doctrine of assumed causality is the quintessence of Matheron's new science of geostatistics. The question remains whether assumed causality makes sense in any other scientific discipline, all the more so because central values such as distance- and length-weighted averages metamorphosed so smoothly into either kriged estimates or kriged estimators.

Matheron’s 1967 "Kriging, or Polynomial Interpolation Procedures? A contribution to polemics in mathematical geology" praises the precise probabilistic background of kriging and finds least-squares polynomial interpolation wanting. In fact, Matheron preferred kriging because it gives infinite sets of kriged estimates or kriged estimators in finite three-dimensional sample spaces. Infinite sets of points on polynomials were rather restrictive for Matheron’s new science of geostatistics.

Books on kriging

 * David, M (1988) Handbook of Applied Advanced Geostatistical Ore Reserve Estimation, Elsevier Scientific Publishing
 * Cressie, N (1993) Statistics for spatial data, Wiley, New York
 * Journel, A.G. and C.J. Huijbregts (1978) Mining Geostatistics, Academic Press London
 * Goovaerts, P. (1997) Geostatistics for Natural Resources Evaluation, Oxford University Press, New York
 * Wackernagel, H. (1995) Multivariate Geostatistics: An Introduction with Applications, Springer, Berlin
 * Chiles, J.-P. and P. Delfiner (1999) Geostatistics: Modeling Spatial Uncertainty, Wiley Series in Probability and Statistics

Historical references

 * 1) Agterberg, F P, Geomathematics, Mathematical Background and Geo-Science Applications, Elsevier Scientific Publishing Company, Amsterdam, 1974
 * 2) Krige, D.G, A statistical approach to some mine valuations and allied problems at the Witwatersrand, Master's thesis of the University of Witwatersrand, 1951
 * 3) Link, R F and Koch, G S, Experimental Designs and Trend-Surface Analysis, Geostatistics, A colloquium, Plenum Press, New York, 1970
 * 4) Matheron, G., "Principles of geostatistics", Economic Geology, 58, pp 1246-1266, 1963
 * 5) Matheron, G., "The intrinsic random functions, and their applications", Adv. Appl. Prob., 5, pp 439-468, 1973
 * 6) Merriam, D F, Editor, Geostatistics, a colloquium, Plenum Press, New York, 1970