class: center, middle, inverse, title-slide

# Autoregressive moving average models

### Kevin Kotzé

---

<!-- layout: true -->

<!-- background-image: url(image/logo.svg) -->
<!-- background-position: 2% 98% -->
<!-- background-size: 10% -->

---

# Contents

1. Introduction
1. Moving Average Models
1. Autoregressive Models
1. ARMA Models
1. Seasonal ARMA Models
1. Model Specification and Parameter Estimation
1. Structural Breaks
1. Conclusion

---

# Univariate models for persistent data

- A dominant feature of many time series is that today's values are close to tomorrow's values
- Observations are not independent, but autocorrelated
- Need to account for this behaviour in the explained part of the model, otherwise it will be captured by the error, which violates the assumptions of the model
- Example of a stochastic process:

`\begin{equation} y_t = 0.7 y_{t-1} + \varepsilon_t \end{equation}`

- This is an example of a linear stochastic difference equation that is defined at discrete points in time
- The systematic part of the behaviour should be captured by the coefficient, while random noise should be contained in the error

---

# Moving average models

- Linear combination of white noise (i.e. `\(\varepsilon_{t}\)`), such that the `\(MA(1)\)` may take the form,

`\begin{equation} y_{t}=\mu +\varepsilon_{t}+\theta \varepsilon_{t-1} \end{equation}`

- where `\(\mu\)` is a constant, while `\(\varepsilon_{t}\)` and `\(\varepsilon_{t-1}\)` are independent and identically distributed white noise, `\(\varepsilon_{t}\sim \mathsf{i.i.d.} \;\; \mathcal{N}(0,\sigma^{2})\)`
- To determine whether the `\(MA(1)\)` process is stationary, we calculate the different moments

---

# MA models - Expected Mean

- Note that `\(\mathbb{E}[\varepsilon_{t}] =0\)` and `\(\mathbb{E}[\varepsilon_{t}^2] = \sigma^2\)`,

`\begin{eqnarray} \mathbb{E}\left[ y_{t}\right] &=&\mathbb{E}[\mu +\varepsilon_{t}+\theta \varepsilon_{t-1}] \\ &=&\mu +\mathbb{E}[\varepsilon_{t}]+\theta \mathbb{E}\left[ \varepsilon_{t-1}\right] \\ &=&\mu \end{eqnarray}`

- Since the error terms are `\(\mathsf{i.i.d.}\)` with an expected value of zero
- Hence, the mean of this process is `\(\mu\)`, which is constant and does not depend on time

---

# MA models - Variance

`\begin{eqnarray} \mathsf{var}[y_{t}] &=&\mathbb{E}\big[ y_{t}-\mathbb{E}[y_{t}] \big]^2 \\ &=&\mathbb{E}\big[ \left( \mu +\varepsilon_{t}+\theta \varepsilon_{t-1}\right) -\mu \big]^2 \\ &=&\mathbb{E}[\varepsilon_{t}^{2}]+2\theta \mathbb{E}[\varepsilon_{t}\varepsilon_{t-1}]+\theta^{2}\mathbb{E}[\varepsilon_{t-1}^{2}] \\ &=& \sigma^{2} + 0 + \theta^{2} \sigma^2 \\ &=&\left( 1+\theta^{2}\right) \sigma^{2} \end{eqnarray}`

- which is constant and does not depend on time

---

# MA models - Covariance

- For the first lag,

`\begin{eqnarray} \mathsf{cov}[y_{t},y_{t-1}] &=&\mathbb{E}\Big[ \big( y_{t}-\mathbb{E}\left[ y_{t}\right] \big) \big( y_{t-1}-\mathbb{E}\left[ y_{t-1}\right] \big) \Big] \\ &=&\mathbb{E}\big[ \left( \varepsilon_{t}+\theta \varepsilon_{t-1}\right) \left( \varepsilon_{t-1}+\theta \varepsilon_{t-2}\right) \big] \\ &=&\mathbb{E}\left[ \varepsilon_{t}\varepsilon_{t-1}\right]+\theta \mathbb{E}[\varepsilon^{2}_{t-1}]+\theta \mathbb{E}\left[ \varepsilon_{t}\varepsilon_{t-2}\right] +\theta^{2} \mathbb{E}[\varepsilon_{t-1}\varepsilon_{t-2}] \\ &=&0+\theta \sigma^{2}+0+0 \\ &=&\theta \sigma^{2} \end{eqnarray}`

- which is constant and does not depend on time

---

# MA models - Covariance

- For the general case of `\(j\)` lags,

`\begin{eqnarray} \mathsf{cov}[y_{t},y_{t-j}] &=&\mathbb{E}\Big[ \big( y_{t}-\mathbb{E}\left[ y_{t}\right] \big) \big( y_{t-j}-\mathbb{E}\left[ y_{t-j}\right] \big) \Big] \\ &=&\mathbb{E}\big[ \left( \varepsilon_{t}+\theta \varepsilon_{t-1}\right) \left(\varepsilon_{t-j}+\theta \varepsilon_{t-j-1}\right) \big] \\ &=&0 \;\;\;\; \text{for} \;\; j > 1 \end{eqnarray}`

- which is constant and does not depend on time
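---

# MA models - Checking the moments in R

The moments above can be verified by simulation. The following is a minimal sketch (the use of R's `arima.sim` and `acf`, the seed, the parameter values and the sample size are all assumptions for illustration), comparing the sample moments of a simulated `\(MA(1)\)` with their theoretical counterparts.

```r
# Simulate y_t = e_t - 0.5 e_{t-1} (the same process as in Figure 1) and
# compare the sample moments with the values derived on the previous slides
set.seed(42)
theta <- -0.5
sigma <- 1
y <- arima.sim(model = list(ma = theta), n = 1000, sd = sigma)

mean(y)                                                  # theoretical mean: mu = 0
acf(y, lag.max = 3, type = "covariance", plot = FALSE)   # sample autocovariances
c(1 + theta^2, theta, 0, 0) * sigma^2                    # gamma(0), gamma(1), gamma(2), gamma(3)
```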
---

# MA models - Stationarity

- Neither the mean, variance nor covariances depend on time
- Hence the `\(MA(1)\)` process is covariance stationary
- Such a `\(MA(1)\)` process is stationary regardless of the value of `\(\theta\)`

---

# MA models - ACFs

- The ACF for a `\(MA(1)\)` may then be derived from the expression,

`\begin{eqnarray} \rho \left(j\right) \equiv \frac{\gamma \left( j\right) }{\gamma \left( 0\right) } = \frac{\mathsf{cov} [ y_{t},y_{t-j} ] }{\mathsf{var} [ y_{t} ] } \end{eqnarray}`

- Hence,

`\begin{eqnarray} \rho \left( 1\right) &=&\frac{\theta }{\left( 1+\theta^{2}\right) } \\ \rho \left( j\right) &=&0 \;\;\;\; \text{for } \;\; j > 1 \end{eqnarray}`

- for lag orders `\(j > 1\)`, the autocorrelations are zero

---
background-image: url(image/ma.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 1: Simulated `\(MA(1)\)`: `\(\varepsilon_t - 0.5\varepsilon_{t-1}\)`

---
background-image: url(image/ma_acf.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 2: Autocorrelation Functions for `\(MA(1)\)`: `\(\varepsilon_t - 0.5\varepsilon_{t-1}\)`

---

# MA models - Higher Order

- A finite-order `\(MA(q)\)` process may be written as,

`\begin{equation} y_{t}=\mu +\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2} \varepsilon_{t-2}+ \ldots +\theta_{q}\varepsilon_{t-q} \end{equation}`

- The infinite-order moving average process, `\(MA(\infty)\)`, is

`\begin{equation} y_{t}=\mu +\overset{\infty }{\underset{j=0}{\sum }}\theta_{j}\varepsilon_{t-j}=\mu +\theta_{0}\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2}\varepsilon_{t-2}+ \ldots \end{equation}`

- with `\(\theta _{0} = 1\)`

---

# MA models - Higher Order

- After excluding extreme cases, we require

`\begin{equation} \overset{\infty }{\underset{j=0}{\sum }}|\theta_{j}|<\infty \end{equation}`

- which implies that the coefficients are absolutely summable
- Moreover, the process is covariance-stationary when,

`\begin{equation} \overset{\infty }{\underset{j=0}{\sum }}|\gamma_{j}|<\infty \end{equation}`

---

# MA models - Identifying the order

- With a `\(MA(1)\)` process, only the shock `\(\varepsilon_{t-1}\)` (in addition to `\(\varepsilon_{t}\)`) affects the value of `\(y_t\)`
- Hence, the value of the first autocorrelation, `\(\rho(1)\)`, should differ from zero, while the others should not
- With a `\(MA(2)\)` process, the shocks `\(\varepsilon_{t-1}\)` and `\(\varepsilon_{t-2}\)` affect the value of `\(y_t\)`
- Hence, the values of the first two autocorrelations, `\(\rho(1)\)` and `\(\rho(2)\)`, should differ from zero, while the others should not
- This allows us to use the ACF to identify the order of an `\(MA(q)\)` process

---
background-image: url(image/ma_acf1_3.svg)
background-position: top
background-size: 85% 85%
class: clear, center, bottom

Figure 3: Identifying the order - `\(MA(1)\)`, `\(MA(2)\)` & `\(MA(3)\)` process
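---

# MA models - Identifying the order in R

As an illustration of the cut-off pattern in Figure 3, the theoretical ACFs below are computed with `ARMAacf` from R's `stats` package (an assumed tool choice; the coefficient values are also assumptions for the example).

```r
# Theoretical ACFs: the MA(q) autocorrelations are zero beyond lag q
ARMAacf(ma = c(0.7), lag.max = 6)              # MA(1): cuts off after lag 1
ARMAacf(ma = c(0.7, 0.5), lag.max = 6)         # MA(2): cuts off after lag 2
ARMAacf(ma = c(0.7, 0.5, 0.4), lag.max = 6)    # MA(3): cuts off after lag 3
```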
---

# AR models - Solutions

- Given the `\(AR(1)\)`,

`\begin{equation} y_{t}=\phi y_{t-1}+\varepsilon_{t} \end{equation}`

- Relates the value of a variable `\(y\)` at time `\(t\)` to its previous value at time `\((t-1)\)` and a random disturbance `\(\varepsilon\)`, also at time `\(t\)`
- Assuming that `\(\varepsilon_{t}\)` is independent and identically distributed white noise, `\(\varepsilon_{t}\sim \mathsf{i.i.d.} \; \mathcal{N}(0,\sigma^{2})\)`
- We showed that if `\(|\phi |<1\)`, the `\(AR(1)\)` is covariance-stationary,

`\begin{eqnarray} \mathbb{E}\left[ y_{t}\right] &=&0 \\ \mathsf{var}[y_{t}] &=&\frac{\sigma^{2}}{1-\phi^2 } \\ \mathsf{cov}[y_{t},y_{t-j}] &=&\phi^{j} \mathsf{var}[y_{t}] \end{eqnarray}`

- To prove this we can use recursive substitution, the method of undetermined coefficients, or lag operators

---

# AR models - Recursive Substitution

- Substituting backwards over `\(j\)` periods,

`\begin{eqnarray} y_{t} &=&\phi y_{t-1}+\varepsilon_{t} \\ &=& \phi (\phi y_{t-2}+\varepsilon_{t-1})+\varepsilon_{t} \\ &=& \phi ^{2}(\phi y_{t-3}+\varepsilon_{t-2})+\phi \varepsilon_{t-1}+\varepsilon_{t} \\ &=& \vdots \\ &=& \phi^{j+1}y_{t-(j+1)}+\phi^{j}\varepsilon_{t-j} + \ldots + \phi^{2}\varepsilon_{t-2} + \phi \varepsilon_{t-1} + \varepsilon_{t} \end{eqnarray}`

- Explains `\(y\)` as a linear function of the initial value `\(y_{t-(j+1)}\)` and the historical values of `\(\varepsilon_{t}\)`
- If `\(|\phi | <1\)` and `\(j\)` becomes large, `\(\phi^{j+1}y_{t-(j+1)}\rightarrow 0\)`
- Thus, the `\(AR(1)\)` can be expressed as an `\(MA(\infty)\)`
- Note that if `\(|\phi | >1\)` and `\(j\)` becomes large, `\(\phi^{j}\rightarrow \infty\)`
- Hence, the moving average equivalent of an explosive autoregression (or of a random walk, where `\(\phi =1\)`) has coefficients that are not summable

---

# AR models - Lag operators

- Lag operators are particularly useful when dealing with more complex model structures
- The straightforward `\(AR(1)\)` model can be written as,

`\begin{equation} \left( 1-\phi L\right) y_{t}=\varepsilon_{t} \end{equation}`

- Such a sequence `\(\left\{ y_{t}\right\}_{t=-\infty }^{\infty}\)` is bounded if there exists a finite number `\(k\)`, such that `\(|y_{t}| <k\)` for all `\(t\)`
- Provided `\(|\phi | <1\)` and we restrict ourselves to bounded sequences, we can multiply by `\(\left(1-\phi L\right) ^{-1}\)` on both sides of the equality (the lag polynomial is invertible),

`\begin{eqnarray} \left( 1-\phi L\right)^{-1} \left( 1-\phi L\right) y_{t}&=&\left( 1-\phi L\right)^{-1}\varepsilon_{t} \\ y_{t}&=&\left( 1-\phi L\right)^{-1}\varepsilon_{t} \end{eqnarray}`

---

# AR models - Lag operators

- Under the assumption that `\(|\phi |<1\)`, we can apply the geometric rule,

`\begin{equation} \left( 1-\phi L\right)^{-1}=\underset{j\rightarrow \infty }{\lim }\left( 1+\phi L+\left( \phi L\right)^{2}+ \ldots +\left( \phi L\right)^{j}\right) \end{equation}`

- This is based on the expression, `\(\left( 1-z\right)^{-1}=1+z+z^{2}+z^{3}+ \ldots \;\)`, which holds if `\(|z| < 1\)`
- Using this we can solve for,

`\begin{equation} y_{t}=\varepsilon_{t}+\phi \varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+\phi^{3}\varepsilon_{t-3}+ \ldots =\overset{\infty }{\underset{j=0}{\sum }}\phi^{j}\varepsilon_{t-j} \end{equation}`

---

# AR models - Lag operators

- This expression could be written as a `\(MA(\infty)\)`,

`\begin{equation} y_{t}=\varepsilon_{t}+\theta_{1}\varepsilon_{t-1}+\theta_{2}\varepsilon_{t-2}+\theta_{3}\varepsilon_{t-3}+ \ldots =\overset{\infty}{\underset{j=0}{\sum }}\theta_{j}\varepsilon_{t-j} \end{equation}`

- Therefore, when `\(|\phi |<1\)`,

`\begin{equation} \overset{\infty }{\underset{j=0}{\sum }}|\theta _{j}|=\overset{\infty }{\underset{j=0}{\sum }}|\phi ^{j}| < \infty \end{equation}`
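---

# AR models - The `\(MA(\infty)\)` weights in R

A minimal sketch of the result above: `ARMAtoMA` from R's `stats` package returns the moving average weights of an ARMA model, which for an `\(AR(1)\)` should equal `\(\phi^{j}\)` (the value `\(\phi = 0.7\)` is an assumption for the example).

```r
# psi-weights of the MA(infinity) representation of an AR(1) with phi = 0.7
phi <- 0.7
ARMAtoMA(ar = phi, ma = 0, lag.max = 8)
phi^(1:8)    # identical, and absolutely summable since |phi| < 1
```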
---

# AR models - Unconditional Moments

- The unconditional first- and second-order moments of a stable `\(AR(1)\)` process may be derived from its `\(MA(\infty)\)` representation
- where for `\(y_{t}=\phi y_{t-1}+\varepsilon_{t}\)`,

`\begin{equation} \mathbb{E}\left[ y_{t}\right] = \mathbb{E}\left[ \varepsilon_{t}+\phi \varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+\phi^{3}\varepsilon_{t-3}+ \ldots \right] =0 \end{equation}`

- The variance is then,

`\begin{eqnarray} \gamma \left( 0\right) &=&\mathsf{var}\left[ y_{t}\right] =\mathbb{E}\big[ y_{t}-\mathbb{E}\left[ y_{t}\right] \big]^{2} \\ &=&\mathbb{E}\left[ \varepsilon_{t}+\phi \varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+\phi^{3}\varepsilon_{t-3}+ \ldots \right]^{2} \\ &=&\mathsf{var}\left[ \varepsilon_{t}\right] +\phi ^{2}\mathsf{var}\left[ \varepsilon_{t-1}\right] +\phi^{4}\mathsf{var}\left[\varepsilon_{t-2}\right] +\phi^{6}\mathsf{var}\left[ \varepsilon_{t-3}\right] + \ldots \\ &=&\left( 1+\phi^{2}+\phi^{4}+\phi^{6}+ \ldots \; \right) \sigma^{2} \\ &=&\frac{1}{1-\phi^{2}}\sigma^{2} \end{eqnarray}`

---

# AR models - Unconditional Moments

- The first-order covariance is then,

`\begin{eqnarray} \gamma \left( 1\right) &=&\mathbb{E}\Big[ \big(y_{t}-\mathbb{E}\left[ y_{t}\right] \big)\big(y_{t-1}-\mathbb{E}\left[ y_{t-1}\right] \big) \Big] \\ &=&\mathbb{E}\left[ (\varepsilon_{t}+\phi \varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+ \ldots )\times (\varepsilon_{t-1}+\phi \varepsilon_{t-2}+ \ldots )\right] \\ &=&\left( \phi +\phi^{3}+\phi^{5}+ \ldots \right) \sigma^{2}=\phi \left( 1+\phi^{2}+\phi^{4}+ \ldots \right) \sigma^{2} \\ &=&\phi \frac{1}{1-\phi^{2}}\sigma^{2} \\ &=&\phi \mathsf{var}\left[ y_{t}\right] \end{eqnarray}`

- While for `\(j>1\)` we have,

`\begin{equation} \gamma \left( j\right) =\mathbb{E}\Big[ \big(y_{t}-\mathbb{E}\left[ y_{t}\right] \big)\big( y_{t-j}-\mathbb{E}\left[ y_{t-j}\right] \big) \Big] =\phi^{j} \mathsf{var} \left[ y_{t}\right] \end{equation}`

- which proves the result relating to the stationarity of the `\(AR(1)\)` model when `\(|\phi|<1\)`

---

# AR models - Unconditional Moments

- As noted previously, the ACF for an `\(AR(1)\)` process coincides with its impulse response function
- where the ACF of an `\(AR(1)\)` for `\(j = 1, \ldots ,J\)` is

`\begin{equation} \rho \left( 0\right) =\frac{\gamma \left( 0\right) }{\gamma \left( 0\right) } =1,\ \rho \left( 1\right) =\frac{\gamma \left( 1\right) }{\gamma \left( 0\right) }=\phi , \ldots , \rho \left( j\right) =\frac{\gamma \left( j\right) }{\gamma \left( 0\right) }=\phi^{j} \end{equation}`

- which equals the dynamic multipliers that may be summarised by the impulse response function

`\begin{equation} \frac{\partial y_{t}}{\partial \varepsilon_{t}}=1,\frac{\partial y_{t}}{\partial \varepsilon_{t-1}}=\phi , \ldots , \frac{\partial y_{t}}{\partial \varepsilon_{t-j}}=\phi^{j} \end{equation}`

---
background-image: url(image/ar12_acf.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 4: Autocorrelation functions for `\(AR(1)\)` processes

---

# AR models - Adding a constant

- To ascertain how the results change after adding a constant,

`\begin{equation} y_{t}=\mu +\phi y_{t-1}+\varepsilon_{t} \end{equation}`

- We can define `\(\upsilon_{t}=\mu +\varepsilon_{t}\)`, such that,

`\begin{eqnarray} y_{t} &=&\phi y_{t-1}+\upsilon_{t} \\ y_{t} &=&(1-\phi L)^{-1}\upsilon_{t} \\ &=&\left( \frac{1}{1-\phi }\right) \mu +\varepsilon_{t}+\phi \varepsilon_{t-1}+\phi^{2}\varepsilon_{t-2}+ \ldots \end{eqnarray}`

- with unconditional first moment,

`\begin{equation} \mathbb{E}\left[ y_{t}\right] =\left( \frac{1}{1-\phi }\right) \mu \end{equation}`

- which does not depend on time
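---

# AR models - Simulating an `\(AR(1)\)` with a constant

A minimal simulation sketch of the two results above (the parameter values `\(\mu = 2\)` and `\(\phi = 0.8\)`, the seed and the sample size are assumptions for the example): the sample mean should settle near `\(\mu/(1-\phi)\)` and the sample ACF should decay roughly like `\(\phi^{j}\)`.

```r
# Simulate y_t = mu + phi * y_{t-1} + e_t
set.seed(123)
mu   <- 2
phi  <- 0.8
n    <- 5000
e    <- rnorm(n)
y    <- numeric(n)
y[1] <- mu / (1 - phi)                    # start at the unconditional mean
for (t in 2:n) y[t] <- mu + phi * y[t - 1] + e[t]

mean(y)                                   # theoretical mean: mu / (1 - phi) = 10
acf(y, lag.max = 5, plot = FALSE)         # sample autocorrelations
phi^(1:5)                                 # theoretical rho(j) = phi^j
```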
---

# AR models - Higher order processes

- For higher-order autoregressive processes, things become a bit more complicated, where

`\begin{equation} y_{t}=\phi_{1}y_{t-1}+\phi_{2}y_{t-2}+\varepsilon_{t} \end{equation}`

- We are no longer able to consider the value of `\(\phi_{1}\)` alone to determine whether the process is stationary
- To assess stationarity, we rewrite the `\(AR(2)\)` expression as a first-order difference equation

---

# AR models - Higher order processes

- Using a vector, `\(Z_{t}\)`, which is of dimension `\((2 \times 1)\)`,

`\begin{equation} Z_{t}= \left[ \begin{array}{c} {y_{t}}\\ {y_{t-1}} \end{array}\right] \end{equation}`

- With a vector for the errors,

`\begin{equation} \upsilon _{t}= \left[ \begin{array}{c} {\varepsilon_{t}}\\ {0} \end{array}\right] \end{equation}`

- And the `\((2 \times 2)\)` matrix for the coefficients,

`\begin{equation} \Gamma =\left[ \begin{array}{cc} \phi_{1} & \phi_{2} \\ 1 & 0 \end{array}\right] \end{equation}`

---

# AR models - Higher order processes

- The first-order vector difference equation can be written,

`\begin{equation} Z_{t}=\Gamma Z_{t-1}+ \upsilon_{t} \end{equation}`

- The matrix `\(\Gamma\)` is termed the *companion form* matrix of the `\(AR(2)\)` process
- To check for stationarity we can compute the eigenvalues of this matrix
- Moreover, the eigenvalues of `\(\Gamma\)` are the two values of `\(x\)` that satisfy the characteristic equation:

`\begin{equation} x^{2}-\phi_{1}x-\phi_{2}=0 \end{equation}`

---

# AR models - Higher order processes

- These eigenvalues `\((m_{1}\)` and `\(m_{2})\)` must then satisfy `\(\left( x-m_{1}\right) \left( x-m_{2}\right) = 0\)`, and can be found from the formula:

`\begin{equation} m_{1},m_{2}=\frac{\left( \phi_{1} \pm \sqrt{\phi_{1}^{2}+4\phi_{2}}\right)}{2} \end{equation}`

- Stationarity requires that the eigenvalues are less than one in absolute value
- In the `\(AR(2)\)` case, one can show that this will be the case if,

`\begin{eqnarray} \phi_{1}+\phi_{2} &<&1 \\ -\phi_{1}+\phi_{2} &<&1 \\ \phi_{2} &>&-1 \end{eqnarray}`

---
background-image: url(image/eigen.svg)
background-position: top
background-size: 85% 85%
class: clear, center, bottom

Figure 5: Eigenvalues for difference equation `\(x^{2}- 0.6 x - 0.2=0\)`

---

# AR models - Higher order processes

- The `\(AR(p)\)` can then be written as,

`\begin{equation} y_{t}=\phi_{1} y_{t-1}+\phi_{2} y_{t-2}+ \ldots + \phi_{p}y_{t-p}+\varepsilon_{t} \end{equation}`

- Checking for stationarity involves similar calculations
- In this case the `\(\Gamma\)` matrix will be of the form:

`\begin{equation} \Gamma =\left[ \begin{array}{cccccc} \phi _{1} & \phi _{2} & \phi _{3} & \dots & \phi _{p-1} & \phi _{p} \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 0 & 1 & 0 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & 0 \end{array}\right] \end{equation}`

- Provided the eigenvalues are less than one in absolute value (i.e. they lie within the unit circle), the `\(p^{\text{th}}\)` order autoregression is stable
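---

# AR models - Checking stationarity in R

A minimal sketch of the eigenvalue check above (the coefficients `\(\phi_1 = 0.6\)` and `\(\phi_2 = 0.2\)` are taken from the characteristic equation in Figure 5; the use of base R's `eigen` is an assumption).

```r
# Companion matrix of an AR(2) and the moduli of its eigenvalues
phi1 <- 0.6
phi2 <- 0.2
Gamma <- matrix(c(phi1, phi2,
                  1,    0), nrow = 2, byrow = TRUE)

abs(eigen(Gamma)$values)      # stationary if both moduli are less than one

# Equivalently, the two roots of x^2 - phi1 * x - phi2 = 0
(phi1 + c(1, -1) * sqrt(phi1^2 + 4 * phi2)) / 2
```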
---

# AR models - Identify the order of AR(p)

- As in the case of the `\(MA(q)\)` processes, one could try to use the ACF coefficients to identify the order of the `\(AR(p)\)` process
- However, since the `\(AR(p)\)` process passes its persistence on to successive lags, the ACF does not cut off and is of little use here
- As the PACF removes the effects of the persistence that is passed on through the intervening lags of the `\(AR(p)\)` process, it may be used to identify the order of an `\(AR(p)\)` process

---

# ARMA models

- We can specify an `\(ARMA(1,1)\)` process as,

`\begin{equation} y_{t}=\phi y_{t-1}+\varepsilon_{t}+\theta \varepsilon_{t-1} \end{equation}`

- Or, using the lag polynomials, a general form of an ARMA model is,

`\begin{equation} \phi \left( L\right) y_{t}= \theta \left( L\right) \varepsilon_{t} \end{equation}`

- Note that the number of lags, `\((p)\)` and `\((q)\)`, can differ
- For instance, an `\(ARMA(2,1)\)` combines an `\(AR(2)\)` with an `\(MA(1)\)`:

`\begin{eqnarray} \left( 1-\phi_{1}L-\phi_{2}L^{2}\right) y_{t} &=&\left( 1+\theta_{1}L\right) \varepsilon_{t}\\ y_{t} &=&\phi_{1}y_{t-1}+\phi_{2}y_{t-2}+\varepsilon_{t} +\theta_{1}\varepsilon_{t-1} \end{eqnarray}`

---

# ARMA processes

- Whether an `\(ARMA(p,q)\)` process is stationary depends solely on its autoregressive part
- Assume an `\(ARMA(1,1)\)` and use the lag operator,

`\begin{equation} \left( 1-\phi L\right) y_{t}=\left( 1+\theta L\right) \varepsilon_{t} \end{equation}`

- Multiplying by `\(\left( 1-\phi L\right)^{-1}\)` on both sides,

`\begin{eqnarray} y_{t} &=&\frac{\left( 1+\theta L\right) }{\left( 1-\phi L\right) } \varepsilon_{t} \\ &=&\left( 1-\phi L\right)^{-1}\varepsilon_{t} + \left( 1-\phi L \right)^{-1} \theta \varepsilon_{t-1} \end{eqnarray}`

---

# ARMA processes

- When `\(|\phi | < 1\)`, this can be written as the geometric process,

`\begin{eqnarray} y_{t} &=&\overset{\infty }{\underset{j=0}{\sum }}\left(\phi L\right)^{j}\varepsilon_{t}+\theta \overset{\infty }{\underset{j=0}{\sum }}\left(\phi L\right)^{j}\varepsilon_{t-1}\\ &=&\varepsilon_{t}+\overset{\infty }{\underset{j = 1}{\sum }}\phi^{j}\varepsilon_{t-j}+\theta \overset{\infty }{\underset{j=1}{\sum }}\phi^{j-1}\varepsilon_{t-j}\\ &=&\varepsilon_{t}+\overset{\infty }{\underset{j=1}{\sum }}\left( \phi^{j}+\theta \phi^{j-1}\right) \varepsilon_{t-j} \end{eqnarray}`

---
background-image: url(image/arma1_acf.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 6: ACF and PACF for `\(AR(1)\)` with `\(\phi=0.5\)`

---
background-image: url(image/arma2_acf.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 7: ACF and PACF for `\(MA(1)\)` with `\(\theta=0.6\)`

---
background-image: url(image/arma3_acf.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 8: ACF and PACF for `\(ARMA(1,1)\)` with `\(\phi=0.5\)` and `\(\theta=0.6\)`

---

# Autocorrelation patterns

- When combining the AR and MA correlation functions, the results may be somewhat unclear
- The patterns could be consistent with an `\(ARMA(2,2)\)`, `\(ARMA(1,2)\)`, `\(ARMA(2,1)\)`, `\(ARMA(1,1)\)`, `\(MA(2)\)` or `\(AR(2)\)`
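---

# Autocorrelation patterns - Theoretical ACF and PACF in R

A minimal sketch of the patterns in Figures 6 to 8, using `ARMAacf` from R's `stats` package (an assumed tool choice) with the same parameter values as in the figures.

```r
# AR(1): ACF decays geometrically, PACF cuts off after lag 1
ARMAacf(ar = 0.5, lag.max = 6)
ARMAacf(ar = 0.5, lag.max = 6, pacf = TRUE)

# MA(1): ACF cuts off after lag 1, PACF decays
ARMAacf(ma = 0.6, lag.max = 6)
ARMAacf(ma = 0.6, lag.max = 6, pacf = TRUE)

# ARMA(1,1): both the ACF and the PACF tail off
ARMAacf(ar = 0.5, ma = 0.6, lag.max = 6)
ARMAacf(ar = 0.5, ma = 0.6, lag.max = 6, pacf = TRUE)
```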
---

# Seasonal ARMA Models

- In several cases the dependence on the past occurs at a seasonal lag `\(s\)`
- With monthly economic data, the behaviour in Jan 2010 may be related to that in Jan 2011
- Could introduce autoregressive and moving average terms that arise at a seasonal interval
- For example, an `\(ARMA(p,q)_s\)` model that takes the form `\(ARMA(1,1)_{12}\)` would be written as,

`\begin{eqnarray*} y_t = \phi y_{t-12} + \varepsilon_t + \theta \varepsilon_{t-12} \end{eqnarray*}`

- Estimation is relatively straightforward

---

# Seasonal ARMA Models - Identification

- The `\(MA(1)\)` with a seasonal lag `\((s = 12)\)` could be written as, `\(y_t = \varepsilon_t + \theta \varepsilon_{t-12}\)`
- It is easy to verify that

`\begin{eqnarray} \gamma(0) &=& (1 + \theta^2)\sigma^2\\ \gamma(12) &=& \theta \sigma^2\\ \gamma(j) &=& 0, \;\; \text{otherwise} \end{eqnarray}`

- The only non-zero autocorrelation, aside from lag zero, is `\(\rho(12) = \theta / (1+\theta^2)\)`

---

# Seasonal ARMA Models - Identification

- Similarly, for the `\(AR(1)\)` model with seasonal lag `\((s = 12)\)`, we could calculate,

`\begin{eqnarray} \gamma(0) &=& \sigma^2 / (1 - \phi^2)\\ \gamma(12k) &=& \sigma^2 \phi^k /( 1 - \phi^2) \;\; \text{for } k = 1, 2, \ldots\\ \gamma(j) &=& 0, \;\; \text{otherwise} \end{eqnarray}`

- These results suggest that the ACF and PACF patterns of the seasonal models are analogous to those of the non-seasonal models, but arise at the seasonal lags

---

# Seasonal ARMA Models - Identification

- Could allow for mixed seasonal models in the general `\(ARMA(p,q)_s\)` framework,

`\begin{eqnarray} y_t = \phi y_{t-12} + \varepsilon_t + \theta \varepsilon_{t-1} \end{eqnarray}`

- While estimation would be straightforward, the identification of the structural form may be problematic

---

# Box-Jenkins methodology

- In a real-world application we would not know the functional form of the underlying data generating process
- The respective parameters in these models would then need to be estimated
- Thereafter, we could assess the model fit
- This procedure is encapsulated in the Box & Jenkins (1979) methodology
- Identification, Estimation, Diagnostic testing

---

# Box-Jenkins - Identification

- Examine the time plot of the data to
  - detect and correct for outliers, missing values and structural breaks (if possible)
  - detect nonstationarity in the form of a pronounced trend or prolonged meander (and possibly correct for it)
- If you are uncertain about the degree of stationarity then perform unit root tests
- Plot the ACF and PACF to consider the persistence in the data
  - when the ACF quickly returns to zero then there will be no unit root
- Alternatively, if you think that the data represent white noise then use the `\(Q\)`-statistic

---

# Box-Jenkins - Identification

- Calculate the `\(Q\)`-statistic to test whether a group of autocorrelations is different from zero
- Originally developed by Box-Pierce (1970); better small-sample performance is reported for the Ljung and Box (1978) version

`\begin{eqnarray} Q = T(T +2) \sum_{j=1}^{s}\rho_j^2 / (T-j) \end{eqnarray}`

- If the sample value of `\(Q\)` exceeds the critical value of the `\(\chi^2\)` distribution with `\(s\)` degrees of freedom, then at least one value of `\(\rho_j\)` is statistically different from zero at the specified significance level

---

# Box-Jenkins - Identification

- Examine the ACF and PACF functions more closely to try to identify the order of a potential `\(ARMA(p,q)\)`
- For the ACF and PACF functions that were provided previously we would consider an `\(ARMA(2,2)\)`, an `\(ARMA(1,2)\)`, an `\(ARMA(2,1)\)` or an `\(ARMA(1,1)\)`
- We would also think about using a `\(MA(2)\)` or an `\(AR(2)\)`, but not a `\(MA(1)\)` or an `\(AR(1)\)`
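---

# Box-Jenkins - The `\(Q\)`-statistic in R

A minimal sketch of the Ljung-Box version of the `\(Q\)`-statistic discussed above, using `Box.test` from R's `stats` package (the simulated series, the seed and the choice of `\(s = 12\)` are assumptions for the example).

```r
set.seed(1)
wn  <- rnorm(200)                                  # white noise
ar1 <- arima.sim(model = list(ar = 0.7), n = 200)  # persistent series

Box.test(wn,  lag = 12, type = "Ljung-Box")   # should not reject the null
Box.test(ar1, lag = 12, type = "Ljung-Box")   # should reject: autocorrelated
```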
---

# Box-Jenkins - Estimation Stage

Fit each of the candidate models and examine the various `\(\phi_i\)` and `\(\theta_i\)` coefficients according to:

- Parsimony:
  - Additional coefficients increase the fit but reduce the degrees of freedom
  - Parsimonious models often produce better out-of-sample fit
- Stationarity and Invertibility:
  - The distribution theory underlying the use of the sample ACF and PACF as approximations for those of the true DGP assumes that `\(y_t\)` is stationary
  - `\(t\)`-statistics and `\(Q\)`-statistics also presume that the data is stationary
  - Be suspicious if the estimated value of `\(|\phi_1|\)` is close to unity
  - The model must be invertible, since the ACF and PACF assume that `\(y_t\)` can be approximated by an autoregressive model

---

# Box-Jenkins - Estimation Stage

- To evaluate the different candidate models, consider the goodness-of-fit measures:
  - Look at the `\(R^2\)` and the average of the residual sum of squares
  - The AIC and BIC are more suitable criteria since they weigh up parsimony and goodness-of-fit
  - Smaller values of the AIC are better (or where AIC `\(< 0\)`, choose the model with the most negative statistic)

---

# Box-Jenkins estimation - AIC & BIC

- Adding additional lags will reduce the sum of squares of the estimated residuals (and will lead to a higher `\(R^2\)`)
- But you will also lose degrees of freedom (which may be essential)
- The Akaike and Bayesian Information Criteria test for goodness-of-fit, while prizing parsimony

`\begin{eqnarray} AIC = \log \hat{\sigma}^2_k+\frac{T+ 2k}{T}\\ BIC = \log \hat{\sigma}^2_k+\frac{k \log T}{T} \end{eqnarray}`

- where `\(k\)` is the number of estimated parameters and `\(T\)` is the number of observations
- `\(\hat{\sigma}^2_k = \frac{SSR_k}{T}\)` is the estimated residual variance, based on the residual sum of squares `\(SSR_k\)`

---

# Box-Jenkins - Estimation Stage

- Make sure `\(T\)` is fixed when comparing an `\(AR(1)\)` & an `\(AR(2)\)`
- Including an additional parameter must decrease `\(SSR_k\)` by enough to offset the penalty if the AIC or BIC is to decrease
- Since `\(\log T\)` is greater than `\(2\)` (once `\(T \geq 8\)`), the BIC penalises extra parameters more heavily and favours more parsimonious models

---

# Box-Jenkins - Diagnostic checking

- Plot the residuals to look for outliers or periods where the model does not fit the data
- Construct the ACF and PACF of the residuals
- Serial correlation in the residuals implies that a systematic movement in the `\(y_t\)` sequence is not accounted for by the `\(ARMA(p,q)\)` coefficients
- Such models should be eliminated or re-estimated
- Use the `\(Q\)`-statistic to determine whether any or all of the ACF or PACF coefficients of the residuals are significant
- When applying the `\(Q\)`-statistic to the residuals of an `\(ARMA(p,q)\)` model, use a `\(\chi^2\)` distribution with `\(s-p-q\)` degrees of freedom
- Ensure that the standard errors of the coefficient estimates are appropriate; if not, re-estimate the model

---

# Box-Jenkins - Diagnostics & Forecasts

- If possible, fit `\(ARMA(p,q)\)` models to subsamples to check the stability of the DGP

`\begin{eqnarray} F = \frac{(RSS - RSS_1 - RSS_2)/k}{(RSS_1 + RSS_2)/(T-2k)} \end{eqnarray}`

- where `\(k\)` is the number of parameters, i.e. `\(p + q + 1\)` (with a constant)
- If all the coefficients are equal across the subsamples, `\((RSS_1 + RSS_2)\)` should equal `\(RSS\)` and `\(F=0\)`
- You could then use the model for forecasting `\(y_{T+1}, y_{T+2}, \ldots\)` for out-of-sample comparison
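---

# Box-Jenkins - Estimation and diagnostics in R

A minimal sketch of the estimation and diagnostic steps using `arima`, `AIC`, `BIC` and `Box.test` from R's `stats` package (the simulated data and the candidate orders are assumptions for the example).

```r
set.seed(7)
y <- arima.sim(model = list(ar = 0.5, ma = 0.6), n = 500)

# Fit a few candidate models and compare the information criteria
candidates <- list(ar2    = arima(y, order = c(2, 0, 0)),
                   ma2    = arima(y, order = c(0, 0, 2)),
                   arma11 = arima(y, order = c(1, 0, 1)))

sapply(candidates, AIC)     # smaller is better
sapply(candidates, BIC)

# Ljung-Box test on the residuals of the preferred model; fitdf accounts
# for the p + q estimated ARMA coefficients
best <- candidates$arma11
Box.test(residuals(best), lag = 12, type = "Ljung-Box", fitdf = 2)
```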
---

# Structural Breaks - Chow's Breakpoint

- Model two sub-samples of the data and see whether there are significant differences in the parameters
- Test whether the null hypothesis of "no structural change" holds after constructing an `\(F\)` test statistic for the parameters
- Could construct a model for a change at date `\(\tau\)`

`\begin{eqnarray} y_t = x_t^{\top} \beta_t + \varepsilon_t \end{eqnarray}`

- where

`\begin{equation} \beta_{t} = \left\{ \begin{array}{lcl} \beta & \; & t \leq \tau \\ \beta + \delta & \; & t > \tau \\ \end{array}\right. \end{equation}`

- Or alternatively we could test for a change in all the model parameters with an `\(F\)` test

---

# Structural Breaks - Chow's Breakpoint

- A major drawback is that the change point must be known *a priori*
- Must ensure that each sub-sample has at least as many observations as the number of estimated parameters

---

# Structural Breaks - Quandt LR Test

- An extension of the Chow test, where an `\(F\)` test statistic is calculated for all potential breakpoints within an interval `\([\underline{i}, \overline{\imath}]\)`
- Reject the null hypothesis of no structural change if the absolute value of any of the test statistics is too large
- Takes the form of a sup `\(F\)` test
- The asymptotic properties of this statistic are non-standard, so use the critical values that are referenced in the notes

---
background-image: url(image/qlr.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 9: Quandt Likelihood Ratio Test - Breakpoint at observation 100 with `\(n=200\)` and `\(p =0.00\)`

---

# Structural Breaks - CUSUM Test

- The CUSUM test is based on the cumulative sum of the recursive residuals
- Plot the cumulative sum together with the 5% critical boundaries
- If the cumulative sum breaks either of the two boundaries there is parameter instability and a possible structural break
- Need to specify the model *a priori* to obtain the residuals

---
background-image: url(image/cusum.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 10: CUSUM Test - Breakpoints for change in coefficients

---

# Conclusion

- Relatively simple ARMA models can be used to describe stationary univariate time series
- They are easy to estimate, and the straightforward Box & Jenkins method can identify possible functional forms for the underlying data generating process
- It is possible to test the data generating process for structural breaks
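---

# Structural Breaks - Tests in R

As a closing illustration of the Quandt LR and CUSUM tests, the following is a minimal sketch that relies on the `strucchange` package (an assumed package choice, not prescribed in the notes); the simulated data, the break in the mean and the trimming interval are also assumptions.

```r
library(strucchange)

set.seed(99)
y <- c(arima.sim(model = list(ar = 0.5), n = 100),
       arima.sim(model = list(ar = 0.5), n = 100) + 2)   # mean shift halfway
dat <- data.frame(y = y[-1], y_lag = y[-length(y)])

# Quandt LR: sequence of F statistics over candidate breakpoints, sup-F test
fs <- Fstats(y ~ y_lag, data = dat, from = 0.15, to = 0.85)
sctest(fs, type = "supF")

# CUSUM of recursive residuals, plotted with its 5% boundaries
cus <- efp(y ~ y_lag, data = dat, type = "Rec-CUSUM")
plot(cus)
```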