Data assimilation

Recursive Bayesian estimation is known in geosciences applications as data assimilation, perhaps most importantly in weather forecasting and hydrology. Data assimilation proceeds by analysis cycles. In each analysis cycle, observations of the current (and possibly, past) state of a system are combined with the results from a mathematical model (the forecast) to produce an analysis, which is considered as 'the best' estimate of the current state of the system. This is called the analysis step. Essentially, the analysis step tries to balance the uncertainty in the data and in the forecast. The model is then advanced in time and its result becomes the forecast in the next analysis cycle.

The analysis and forecasts are best thought of as probability distributions. The analysis step is an application of the Bayes theorem. Advancing the probability distribution in time would be done exactly in the general case by the Fokker-Planck equation, but that is unrealistically expensive, so various approximations operating on simplified representations of the probability distributions are used instead. If the probability distributions are normal, they can be represented by their mean and covariance, which gives rise to the Kalman filter. However it is not feasible to maintain the covariance because of the large number of degrees of freedom in the state, so various approximations are used instead.

Many methods represent the probability distributions only by the mean and impute some covariance instead. In the basic form, such analysis step is known as optimal statistical interpolation. Adjusting the initial value of the mathematical model instead of changing the state directly at the analysis time is the essence of the variational methods, 3DVAR and 4DVAR. Nudging, also known as Newtonian relaxation or 4DDA, is essentially the same as proceeding in continuous time rather than in discrete analysis cycles (the Kalman-Bucy filter), again with imputing simplified covariance.

Ensemble Kalman filters represent the probability distribution by an ensemble of simulations, and the covariance is approximated by sample covariance.

In addition to weather forecasting, other uses of DA include trajectory estimation for the Apollo program, GPS, and atmospheric chemistry.

Weather forecasting applications
Data assimilation is a concept encompassing any method for combining observations of variables like temperature, and atmospheric pressure into numerical models as the ones used to predict weather.

In weather forecasting there are 2 main types of data assimilation: 3 dimensional (3DDA) and 4 dimensional (4DDA). In 3DDA only those observations are used available at the time of analyses. In 4DDA the past observations are included (thus, time dimension added).

History of Data Assimilation in Weather forecasting
The first data assimilation methods were called the "objective analyses" (e.g., Cressman algorithm). This was in contrast to the "subjective analyses", when (in the past practice) numerical weather predictions (NWP) forecasts were adjusted by meteorologists using their operational expertise. The objective methods used simple interpolation approaches, and thus were the kind of 3DDA methods.

The similar 4DDA methods, called "nudging" also exist (e.g. in MM5 NWP model). They are based on the simple idea of Newtonian relaxation (the 2nd axiom of Newton). The idea is to add in the right part of dynamical equations of the model the term, proportional to the difference of the calculated meteorological variable and the observation value. This term, that has a negative sign "keeps" the calculated state vector closer to the observations. Nudging can be interpreted as a variant of the Kalman-Bucy filter (a continuous time version of the Kalman filter) with the gain matrix prescribed rather than obtained from covariances.

The breakthrough in the field of data assimilation was achieved by L. Gandin (1963) who introduced the "statistical interpolation" (or "optimal interpolation" ) method. His work developed the previous ideas of Kolmogorov. That method is the 3DDA method and is a kind of regression analyses, which utilizes the information about the spatial distributions of covariance functions of the errors of the "first guess" field (previous forecast) and "true field". These functions are never known. However, the different approximations were assumed.

In fact the optimal interpolation algorithm is the reduced version of the Kalman filtering (KF) algorithm, when the covariance matrices are not calculated from the dynamical equations, but are pre-determined in advance. The attempts to introduce the KF algorithms as a 4DDA tool for NWP models were performed later. However, this was (and remains) a very difficult task, since the full version of KF algorithm requires solution of the enormous large number of additional equations (~N*N~10**12, where N=Nx*Ny*Nz is the size of the state vector, Nx~100, Ny~100, Nz~100 - the dimensions of the computational grid). To overcome that difficulty the special kind of KF algorithms (approximate or suboptimal KF's) for NWP models were developed. These include, e.g., the Ensemble Kalman filter and the Reduced-Rank Kalman filters (RRSQRT) (e.g., Todling and Cohn, 1994).

Another significant advance in the development of the 4DDA methods was utilizing the optimal control theory (variational approach) in the works of Le Dimet and Talagrand, 1986, based on the previous works of G. Marchuk, who was the first to apply that theory in the environmental modeling. The significant advantage of the variational approaches is that the meteorological fields satisfy the dynamical equations of the NWP model and at the same time they minimize the functional, characterizing their difference from observations. Thus, the problem of constrained minimization is solved. The 3DDA variational methods were for the first time developed by (Sasaki, 1958).

As was shown by Lorenz, 1986, the all abovementioned kinds of 4DDA methods are in some limit equivalent, i.e. under some assumptions they minimize the same cost function. However, in practical applications these assumptions are never fulfilled, the different methods perform differently and generally it is not clear, what approach (Kalman filtering or variational) is better. The fundamental questions also arise in application of the advanced DA techniques such as convergence of the computational method to the global minimum of the functional to be minimised. For instance, cost function or the set in which the solution is sought can be not convex.

Future Development in NWP
The rapid development of the various data assimilation methods for NWP models is connected with the two main points in the field of numerical weather prediction:


 * 1) Utilizing the observations currently seems to be the most promising chance to improve the quality of the forecasts at the different spatial scales (from the planetary scale to the local city, or even street scale) and time scales.
 * 2) The number of different kinds of available observations (sodars, radars, satellite) is rapidly growing.

The question is: can the principal limit of the predictability of the weather forecast models be overcome (and to what extend) with the help of data assimilation?

Cost function
The process of creating the analysis in data assimilation often involves minimization of a "cost function." A typical cost function would be the sum of the squared deviations of the analysis values from the observations weighted by the accuracy of the observations, plus the sum of the squared deviations of the forecast fields and the analyzed fields weighted by the accuracy of the forecast. This has the effect of making sure that the analysis does not drift too far away from observations and forecasts that are known to usually be reliable.

1. 3D-Var

$$J(\mathbf{x}) = (\mathbf{x}-\mathbf{x}_{b})^{\mathrm{T}}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_{b}) + (\mathbf{y}-\mathit{H}[\mathbf{x}])^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y}-\mathit{H}[\mathbf{x}]),$$

where $$\mathbf{B}$$ denotes the background error covariance, $$\mathbf{R}$$ the observational error covariance.

$$\nabla J(\mathbf{x}) = 2\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_{b}) - 2\mathit{H}\mathbf{R}^{-1}(\mathbf{y}-\mathit{H}[\mathbf{x}])$$

2. 4D-var

$$J(\mathbf{x}) = (\mathbf{x}-\mathbf{x}_{b})^{\mathrm{T}}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_{b}) + \sum_{i=0}^{n}(\mathbf{y}_{i}-\mathit{H}_{i}[\mathbf{x}_{i}])^{\mathrm{T}}\mathbf{R}_{i}^{-1}(\mathbf{y}_{i}-\mathit{H}_{i}[\mathbf{x}_{i}])$$

provided that $$\mathit{H}$$ is linear operator (matrix).

Other applications of Data Assimilation
Data assimilation methods are currently also used in other environmental forecasting problems, e.g. in hydrological forecasting. Basically, the same types of data assimilation methods as those described above are in use there. An example of chemical data assimilation using AutoChem can be found at CDACentral.

Data assimilation is a part of the challenge for every forecasting problem.

Dealing with biased data is a serious challenge in data assimilation. Further developing methods to deal with biases will be of particular use. If there are several instruments observing the same variable then intercomparing them using probability distribution functions can be instructive. Such an analysis is available on line at PDFCentral designed for the validation of observations from the NASA Aura satellite.