Autoregressive integrated moving average

Overview
In statistics, an autoregressive integrated moving average (ARIMA) model is a generalisation of an autoregressive moving average (ARMA) model. These models are fitted to time series data either to better understand the data or to predict future points in the series. The model is generally referred to as an ARIMA(p,d,q) model, where p, d, and q are non-negative integers that give the order of the autoregressive, integrated, and moving average parts of the model respectively.

Given a time series of data $$X_t$$, where $$t$$ is an integer index and the $$X_t$$ are real numbers, an ARMA(p,q) model is given by
 * $$\left(1 - \sum_{i=1}^p \phi_i L^i\right) X_t = \left(1 + \sum_{i=1}^q \theta_i L^i\right) \varepsilon_t\,$$

where $$L$$ is the lag operator, the $$\phi_i$$ are the parameters of the autoregressive part of the model, the $$\theta_i$$ are the parameters of the moving average part and the $$\varepsilon_t$$ are error terms. The error terms $$\varepsilon_t$$ are generally assumed to be independent, identically distributed variables sampled from a normal distribution with zero mean.
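As a rough illustration, the ARMA(p,q) equation can be rearranged into the recursion $$X_t = \sum_{i=1}^p \phi_i X_{t-i} + \varepsilon_t + \sum_{i=1}^q \theta_i \varepsilon_{t-i}$$ and evaluated term by term. The Python sketch below does this under the assumption that all values before the start of the series are zero; the function name `arma_from_noise` and its arguments are hypothetical and not part of any library.

```python
import random


def arma_from_noise(phi, theta, eps):
    """Build an ARMA(p,q) series from a given list of error terms.

    phi   -- autoregressive parameters [phi_1, ..., phi_p]
    theta -- moving average parameters [theta_1, ..., theta_q]
    eps   -- error terms [eps_0, eps_1, ...]
    Assumption: X_t and eps_t are taken to be zero for t < 0.
    """
    x = []
    for t in range(len(eps)):
        # Autoregressive part: weighted sum of earlier values of the series.
        ar = sum(phi[i] * x[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        # Moving average part: weighted sum of earlier error terms.
        ma = sum(theta[i] * eps[t - 1 - i]
                 for i in range(len(theta)) if t - 1 - i >= 0)
        x.append(ar + eps[t] + ma)
    return x


# Usage: an ARMA(1,1) series driven by zero-mean normal errors.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
series = arma_from_noise([0.6], [0.3], noise)
```

Because of the zero starting values, the first few points of such a series are not representative of its long-run behaviour, so in practice an initial stretch is often discarded.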

An ARIMA(p,d,q) process is obtained by integrating an ARMA(p,q) process. That is,
 * $$\left(1 - \sum_{i=1}^p \phi_i L^i\right) (1-L)^d X_t = \left(1 + \sum_{i=1}^q \theta_i L^i\right) \varepsilon_t\,$$

where d is a non-negative integer that controls the level of differencing (if $$d=0$$, the model reduces to an ARMA model). Conversely, differencing an ARIMA(p,d,q) process d times gives an ARMA(p,q) process. Note that the differencing operator $$(1-L)^d$$ appears only on the autoregressive side of the equation, because the moving average component is always I(0), that is, stationary without any differencing.
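The operator $$(1-L)^d$$ amounts to taking first differences d times, and integration is the reverse operation, a repeated cumulative sum. A minimal Python sketch of the two operations follows; the names `difference` and `integrate` are hypothetical, and the starting value of each cumulative sum is assumed to be zero unless stated otherwise.

```python
def difference(x, d=1):
    """Apply (1 - L)^d: take first differences d times.

    Each pass shortens the series by one element.
    """
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


def integrate(y, d=1, start=0.0):
    """Undo `difference`: take cumulative sums d times.

    Assumption: every cumulative sum starts from `start`.
    """
    for _ in range(d):
        out = [start]
        for value in y:
            out.append(out[-1] + value)
        y = out
    return y


# Integrating twice and then differencing twice returns the original values.
stationary_part = [1.0, 2.0, 3.0]
integrated = integrate(stationary_part, d=2)
recovered = difference(integrated, d=2)
```

In this sketch, `stationary_part` stands in for an ARMA(p,q) series, so `integrated` plays the role of the corresponding ARIMA(p,2,q) series.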

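Not all choices of parameters produce well-behaved models. In particular, if the differenced series $$(1-L)^d X_t$$ is required to be stationary, then the autoregressive parameters must satisfy certain conditions: a stationary, causal solution requires all roots of the polynomial $$1 - \sum_{i=1}^p \phi_i z^i$$ to lie outside the unit circle.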
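One way to check such a condition numerically, without computing polynomial roots, is the step-down (Schur-Cohn type) recursion: the autoregressive part admits a stationary, causal solution exactly when every partial autocorrelation recovered from the $$\phi_i$$ has magnitude below one, which is equivalent to all roots of $$1 - \sum_{i=1}^p \phi_i z^i$$ lying outside the unit circle. The sketch below uses this criterion as a stand-in for a root finder; the choice of method and the function name `is_stationary_ar` are assumptions made for illustration.

```python
def is_stationary_ar(phi):
    """Return True if AR parameters [phi_1, ..., phi_p] give a
    stationary, causal autoregressive part.

    Uses the step-down recursion: the last coefficient of an AR(k)
    model is its lag-k partial autocorrelation, which must have
    magnitude below one; the model is then reduced to order k - 1.
    """
    coeffs = list(phi)
    while coeffs:
        r = coeffs[-1]  # lag-k partial autocorrelation
        if abs(r) >= 1.0:
            return False
        k = len(coeffs)
        # Coefficients of the implied AR(k-1) model.
        coeffs = [(coeffs[j] + r * coeffs[k - 2 - j]) / (1.0 - r * r)
                  for j in range(k - 1)]
    return True


# Usage: AR(2) examples on either side of the boundary.
ok = is_stationary_ar([0.5, 0.3])    # phi_1 + phi_2 < 1
bad = is_stationary_ar([0.5, 0.6])   # phi_1 + phi_2 > 1
```

Note that `is_stationary_ar([1.0])` is false: a unit autoregressive coefficient corresponds to a unit root, which is the case handled by differencing rather than by the autoregressive part.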

Some well-known special cases arise naturally. For example, an ARIMA(0,1,0) model is given by:
 * $$X_t = X_{t-1} + \varepsilon_t$$

which is simply a random walk.
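Since each step adds an independent error term, the variance of a random walk started at a fixed point grows in proportion to the number of steps, which is one way to see that the process is not stationary. The following Python sketch estimates this by simulation; the function names, the number of paths, and the seed are arbitrary choices made for illustration.

```python
import random


def random_walk(n_steps, rng, sigma=1.0, x0=0.0):
    """Simulate X_t = X_{t-1} + eps_t with normal, zero-mean errors."""
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


def endpoint_variance(n_steps, n_paths=1000, seed=0):
    """Sample variance of the final value over many independent walks."""
    rng = random.Random(seed)
    ends = [random_walk(n_steps, rng)[-1] for _ in range(n_paths)]
    mean = sum(ends) / n_paths
    return sum((e - mean) ** 2 for e in ends) / (n_paths - 1)


# With unit-variance errors the estimate should be close to the step count.
estimate = endpoint_variance(50)
```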

A number of variations on the ARIMA model are commonly used. For example, if multiple time series are used, then the $$X_t$$ can be thought of as vectors and a VARIMA model may be appropriate. Sometimes a seasonal effect is suspected in the data. For example, consider a model of daily road traffic volumes: weekends clearly exhibit different behaviour from weekdays. In this case it is often considered better to use a SARIMA (seasonal ARIMA) model than to increase the order of the AR or MA parts of the model. If the time series is suspected to exhibit long-range dependence, then the $$d$$ parameter may be allowed to take certain non-integer values, giving a fractional ARIMA model (FARIMA, sometimes also called ARFIMA).
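For non-integer $$d$$, the operator $$(1-L)^d$$ can be interpreted through its binomial expansion $$\sum_{k \ge 0} w_k L^k$$, with $$w_0 = 1$$ and $$w_k = w_{k-1}(k-1-d)/k$$. The Python sketch below computes the first few weights; the function name `frac_diff_weights` is hypothetical, and truncating the infinite expansion at a fixed number of terms is an assumption made for illustration.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of the expansion of (1 - L)^d.

    w_0 = 1 and w_k = w_{k-1} * (k - 1 - d) / k for k >= 1.
    For integer d the weights vanish after lag d; for non-integer d
    they decay slowly and never become exactly zero.
    """
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


# Usage: ordinary first differencing versus a fractional order.
first_difference = frac_diff_weights(1, 4)
half_difference = frac_diff_weights(0.5, 3)
```

With $$d=1$$ the weights reduce to those of the ordinary first difference $$X_t - X_{t-1}$$, whereas a fractional order gives weight to every past value.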