class: center, middle, inverse, title-slide # Cointegration ### Kevin Kotzé --- <!-- layout: true --> <!-- background-image: url(image/logo.svg) --> <!-- background-position: 2% 98% --> <!-- background-size: 10% --> --- # Contents 1. Introduction 1. Defining Cointegration 1. Cointegration and Common Trends 1. Error Correction Models 1. Additional features of cointegration 1. Engle-Granger test 1. Johansen Procedure --- # Introduction - Most macroeconomic and many financial variables are nonstationary - They drift upwards over time and often exhibit characteristics of a stochastic trend - Engle & Granger proposed that a linear combination of several variables may have a lower order of integration than the individual variables - Allows for the analysis of nonstationary data, where we are able to retain information relating to the level - Facilitates an equilibrium analysis, where variables exhibit departures from a nonstationary trend (steady-state) --- # Introduction - Granger received a Nobel Prize for this idea in 2003 "**for methods of analysing economic time series with common trends**" - He had long been puzzling over the problem of integrated variables and at a seminar Hendry mentioned that the "**difference between integrated series might be stationary**" - Granger did not believe him: **My response was that it could be proved that he was wrong, but in attempting to do so, I showed that he was correct, and generalised it to cointegration...** --- # Cointegration defined - In most cases, when the variables `\(y_{1,t}\)` and `\(y_{2,t}\)` are nonstationary `\(I(1)\)` variables, a linear combination of these variables will also be nonstationary - However, in a few cases the linear combination of the variables can be stationary - This occurs when the variables share the same stochastic trend - The effect of the common stochastic trend may be eliminated when the variables are combined - In these cases, we say that the variables are cointegrated --- # Cointegration defined
- If both `\(y_{1,t}\)` and `\(y_{2,t}\)` are `\(I(1)\)`, they have stochastic trends - We could rearrange the linear regression model, such that `\begin{eqnarray} u_{t}=y_{1,t}-\beta_{1}y_{2,t} \end{eqnarray}` - If `\(u_{t}\)` is `\(I(0)\)`, then by definition the combination `\(y_{1,t}-\beta_{1}y_{2,t}\)` is also stationary - While both `\(y_{1,t}\)` and `\(y_{2,t}\)` have stochastic trends - We say that the variables `\(y_{1,t}\)` and `\(y_{2,t}\)` are cointegrated - These stochastic trends are related through the cointegrating parameter `\(\beta_1\)` - If `\(u_{t}\)` is `\(I(1)\)`, then `\(y_{1,t}\)` and `\(y_{2,t}\)` are not cointegrated and regressing `\(y_{2,t}\)` on `\(y_{1,t}\)` will yield a spurious result --- background-image: url(image/coint1.svg) background-position: top background-size: 90% 90% class: clear, center, bottom Figure: Unit roots, spurious relations & common stochastic trends `\(y_{1,t} = 0.1 + \mu_{y,1t} + \upsilon_{y1t}\)` and `\(y_{2,t} = 0.3 + \mu_{y,2t} + \upsilon_{y2t}\)` --- # Cointegration defined - Using matrix algebra, if we let `\(\boldsymbol{y}_{t}=(y_{1,t},y_{2,t})^{\prime}\)` denote a `\((2 \times 1)\)` vector of `\(I(1)\)` variables, and `\(\beta=(1,-\beta_{1})^{\prime}\)` - It follows that `\(\beta^{\prime}\boldsymbol{y}_{t}=y_{1,t}-\beta_{1}y_{2,t}\)` - Then the system is cointegrated when `\(\beta^{\prime}\boldsymbol{y}_{t}\sim I(0)\)` - In this case the variables in the `\(\boldsymbol{y}_{t}\)` vector share a common stochastic trend and will drift together in long-run equilibrium - The vector `\(\beta\)` is termed a cointegrating vector - When components of `\(\boldsymbol{y}_{t}\)` are integrated of order `\(d\)` and the reduction in the order of the combined variables is `\(b\)`, then we note that `\(\boldsymbol{y}_{t}\sim CI(d, b)\)` --- # Cointegration and Common Trends - Stock & Watson (1988) propose the following decomposition for the `\(I(1)\)` variables `\begin{eqnarray}
y_{1,t}=\mu_{y_{1,t}} + \upsilon_{y_{1,t}} & \; \; , \; \; & y_{2,t}=\mu_{y_{2,t}} + \upsilon_{y_{2,t}} \end{eqnarray}` where `\(\mu_{y_{i,t}}\)` is the random walk component representing the trend in variable `\(y_{i,t}\)`, and `\(\upsilon_{y_{i,t}}\)` is the stationary component - We are then able to multiply `\(y_{1,t}\)` by `\(\beta_1\)` and `\(y_{2,t}\)` by `\(\beta_2\)` to yield `\begin{eqnarray} \beta_1 y_{1,t}=\beta_1 \mu_{y_{1,t}} +\beta_1 \upsilon_{y_{1,t}} & \; \; , \; \; &\beta_2 y_{2,t}=\beta_2 \mu_{y_{2,t}} + \beta_2 \upsilon_{y_{2,t}} \end{eqnarray}` --- # Cointegration and Common Trends - If these variables are `\(CI(1,1)\)`, a linear combination of these variables yields: `\begin{eqnarray} \beta_{1}y_{1,t}+\beta_{2}y_{2,t}& = &\beta_{1}(\mu_{y_{1,t}}+\upsilon_{y_{1,t}})+\beta_{2}(\mu_{y_{2,t}}+\upsilon_{y_{2,t}})\\ & =& (\beta_{1}\mu_{y_{1,t}}+\beta_{2}\mu_{y_{2,t}})+(\beta_{1}\upsilon_{y_{1,t}}+\beta_{2}\upsilon_{y_{2,t}}) \end{eqnarray}` - If the errors, `\((\beta_{1}\upsilon_{y_{1,t}}+\beta_{2}\upsilon_{y_{2,t}})\)`, are stationary - And the linear combination of the variables, `\(\beta_{1}y_{1,t}+\beta_{2}y_{2,t}\)`, is stationary - Then the stochastic trends `\((\beta_{1}\mu_{y_{1,t}}+\beta_{2}\mu_{y_{2,t}})\)` must vanish - Hence, for `\(y_{1,t}\)` and `\(y_{2,t}\)` to be `\(CI(1,1)\)`, `\(\mu_{y_{1,t}} = \frac{-\beta_{2}\mu_{y_{2,t}}}{\beta_{1}}\)` - This implies that they must have the same stochastic trend up to the scalar `\(\frac{-\beta_{2}}{\beta_{1}}\)` --- # Cointegration and Common Trends - Stock & Watson's essential insight is that the parameters in the cointegrating vector must purge the trend from the linear combination - Such a cointegrating vector is unique up to the scalar `\(\frac{-\beta_{2}}{\beta_{1}}\)` - This can also be extended to the `\(n\)` variable case `\begin{equation} \boldsymbol{y}_{t}=\mu_{t} + \boldsymbol{u}_{t} \end{equation}` - where `\(\boldsymbol{y}_{t}\)` is a vector `\(\{y_{1_{t}},y_{2_{t}},
\ldots,y_{n_{t}}\}\)` containing various `\(I(1)\)` variables and `\(\mu_{t}\)` is a vector of stochastic trends `\((\mu_{1_{t}},\mu_{2_{t}}, \ldots, \mu_{n_{t}})\)`, and `\(\boldsymbol{u}_{t}\)` is an `\(n\times 1\)` vector of irregular components --- # Cointegration and Common Trends - If we can then express one trend as a linear combination of the other trends, it means that there exists a vector `\(\beta\)` such that `\begin{equation} \beta_{1}\mu_{1_{t}}+\beta_{2}\mu_{2_{t}}+ \ldots +\beta_{n}\mu_{n_{t}}=0 \end{equation}` - where we once again multiply through by `\(\beta^{\prime}\)` to get `\(\beta^{\prime} \boldsymbol{y}_{t}=\beta^{\prime}\mu_{t} + \beta^{\prime}\boldsymbol{u}_{t}\)` - since the linear combination of the trends `\(\beta^{\prime}\mu_{t}=0\)` - we are left with `\(\beta^{\prime} \boldsymbol{y}_{t} = \beta^{\prime} \boldsymbol{u}_{t}\)`, where both sides are stationary --- # Cointegration and Equilibrium - A cointegration model may make use of the term *equilibrium*, which refers to the existence of a long-run relationship - This can only occur if there is a common stochastic trend amongst the variables - i.e. two variables share a common equilibrium path - These variables will periodically move away from the equilibrium path, but the effect of this will not be permanent (i.e.
the errors are stationary) - The variables return towards the equilibrium path over time - The residuals from the cointegrated model are then described as equilibrium errors --- # Error Correction Models - A cointegrating relationship defines an equilibrium relationship - Time paths of cointegrated variables are influenced by the extent of any deviation from long-run equilibrium - In a cointegrated model variables return to the equilibrium value - A cointegrated system has an error correction representation - Engle & Granger formalised the connection between this dynamic response to the errors and cointegration in the Engle-Granger representation theorem --- # Error Correction Models - For example, if `\(P_1\)` and `\(P_2\)` are cointegrated share prices - Assume: - the gap between the prices is relatively large when compared to the long-run equilibrium values (i.e. some disequilibrium) - the low-priced share `\(P_2\)` must rise relative to the high-priced share `\(P_1\)`. - This may be accomplished by: - `\(\uparrow\)` in `\(P_2\)` or `\(\downarrow\)` in `\(P_1\)` - `\(\uparrow\)` in `\(P_1\)` with larger `\(\uparrow\)` in `\(P_2\)` - `\(\downarrow\)` in `\(P_1\)` with smaller `\(\downarrow\)` in `\(P_2\)` --- # Error Correction Models - The OLS regression then takes the form, `\begin{eqnarray} P_{1,t}=\beta_{1}P_{2,t}+u_{t} \end{eqnarray}` - When the errors are stationary they may be expressed as, `\begin{eqnarray} u_{t}=\phi_{1}u_{t-1}+\varepsilon_{t} \;\;\; \text{with } \; |\phi_{1}|<1 \end{eqnarray}` - Hence combining the two, where `\(u_{t} = P_{1,t}-\beta_{1}P_{2,t}\)`, `\begin{eqnarray} P_{1,t}- \beta_{1}P_{2,t} & = &\phi_{1}(P_{1,t-1}- \beta_{1}P_{2,t-1})+\varepsilon_{t}\\ P_{1,t} & = &\beta_{1}P_{2,t}+\phi_{1}(P_{1,t-1}- \beta_{1}P_{2,t-1})+\varepsilon_{t} \end{eqnarray}` --- # Error Correction Models - Subtracting `\(P_{1,t-1}\)` from both sides, and adding and subtracting `\(\beta_{1}P_{2,t-1}\)` on the right `\begin{eqnarray} \Delta P_{1,t}&=&-(1-\phi_{1})(P_{1,t-1}- \beta_{1}P_{2,t-1})+
(\beta_{1}\Delta P_{2,t} + \varepsilon_{t}) \\ &=&\alpha(P_{1,t-1}- \beta_{1}P_{2,t-1})+\varepsilon_{1,t} \end{eqnarray}` where `\(\alpha=-(1-\phi_{1})\)`, while `\(\Delta P_{2,t}\)` is stationary and `\(\varepsilon_{1,t} = \beta_{1}\Delta P_{2,t} + \varepsilon_{t}\)` - Note that large persistence in the autoregressive error would imply a slow speed of adjustment --- # Error Correction Models - To describe the manner in which the variables return to equilibrium, use an ECM - Illustrates how the variables are influenced by deviations from equilibrium - If we assume that both share prices are `\(I(1)\)` and, `\begin{eqnarray} \Delta P_{1,t} = \alpha_{1}(P_{2,t-1}-\beta_1 P_{1,t-1}) + \varepsilon_{1,t}\\ \Delta P_{2,t} = \alpha_{2}(P_{2,t-1}-\beta_1 P_{1,t-1}) + \varepsilon_{2,t} \end{eqnarray}` - Long-term equilibrium described by `\((P_2 - \beta_1 P_1)\)`, which is stationary when the variables are `\(CI(1,1)\)` - If `\(P_1\)` is `\(I(1)\)` then `\(\Delta P_{1,t}\)` is stationary - When the variables are `\(CI(1,1)\)`, the term `\(\varepsilon_{1,t}\)` is stationary --- # Error Correction Models - Hence the two share prices must be cointegrated with the vector `\((1, - \beta_1)^{\prime}\)`, when they are of the order `\(CI(1,1)\)` - The parameters `\(\alpha_1\)` and `\(\alpha_2\)` are speed of adjustment parameters that describe how changes to share prices react to past deviations from the equilibrium path in the respective share prices - Small values of `\(\alpha_i\)` imply a relatively unresponsive relationship, where it would take a long time to return to equilibrium --- # Error Correction Models - This can be generalised to include lagged changes in both equations, `\begin{eqnarray} \Delta y_{1,t}= \gamma_{0} + \alpha_{1} \left[ y_{1,t-1}-\beta_1 y_{2,t-1} \right] + \sum_{i=1}^{K} \zeta_{1,i} \Delta y_{1,t-i} + \sum_{j=1}^{L} \zeta_{2,j} \Delta y_{2,t-j} + \varepsilon_{y_1,{t}} \\ \Delta y_{2,t}= \eta_{0} + \alpha_{2} \left[ y_{1,t-1}-\beta_1 y_{2,t-1} \right]
+ \sum_{i=1}^{K} \xi_{1,i} \Delta y_{2,t-i} + \sum_{j=1}^{L} \xi_{2,j} \Delta y_{1,t-j} + \varepsilon_{y_2,{t}} \end{eqnarray}` - This is a representation of a VECM - If both `\(\alpha_1\)` and `\(\alpha_2\)` equal zero there is: - no equilibrium relationship - no error-correction - no cointegration --- # Autoregressive distributed lag model - The error correction model may also be represented by an autoregressive distributed lag (ARDL) model - Consider the example, `\begin{eqnarray} y_{1,t}=\phi_{1} y_{1,t-1}+\phi_{2} y_{2,t} + \varepsilon_{t} \end{eqnarray}` - Subtract `\(y_{1,t-1}\)` from both sides, and add and subtract `\(\phi_{2}y_{2,t-1}\)` on the right, `\begin{eqnarray} \Delta y_{1,t}&=&\phi_{2}\Delta y_{2,t}- (1-\phi_{1})y_{1,t-1}+\phi_{2}y_{2,t-1}+\varepsilon_{t} \\ &=& \phi_{2}\Delta y_{2,t}-(1-\phi_{1})(y_{1,t-1}-\frac{\phi_{2}}{1-\phi_{1}}y_{2,t-1})+\varepsilon_{t} \end{eqnarray}` --- # Autoregressive distributed lag model - Note that the long-term steady-state, with `\(y_{1,t}=y_{1,t-1}=\bar{y}_{1}\)`, may then be described as, `\begin{eqnarray} \bar{y}_{1}=\phi_{1}\bar{y}_{1}+\phi_{2}\bar{y}_{2} \end{eqnarray}` - Such that, `\begin{eqnarray} \bar{y}_{1}=\frac{\phi_{2}}{1-\phi_{1}}\bar{y}_{2} \end{eqnarray}` - Hence the relationship between `\(y_1\)` and `\(y_2\)` is described by `\(\frac{\phi_{2}}{1-\phi_{1}}\)` - This is equivalent to the ECM that we derived earlier --- # Worthwhile Noting ... - Cointegration refers to linear combinations of non-stationary variables - It's possible that there are nonlinear cointegrating relationships, but we don't know how to test for this.
- We can however model regime-switching cointegrating relationships (Balke & Fomby, 1997) - Cointegrating vectors are unique up to a scalar, for every `\(\beta_1 , \beta_2, \ldots\)` there exists `\(\lambda \beta_1, \lambda \beta_2, \ldots\)`, where `\(\lambda\)` is the scalar - All variables must be integrated of the same order - usually a set of `\(I(d)\)` variables are not cointegrated - when two variables are integrated of different orders they cannot be cointegrated - it is possible to have multicointegration (i.e. `\(b=2\)`) --- # Worthwhile Noting ... - If `\(\boldsymbol{y}_{t}\)` has `\(n\)` nonstationary components there may be up to `\(n-1\)` linearly independent cointegrating vectors - If `\(\boldsymbol{y}_{t}\)` has two variables then there can only be one cointegrating vector - The number of cointegrating vectors is called the cointegrating rank - If you have three variables and two are `\(I(2)\)` and one is `\(I(1)\)`: - The two `\(I(2)\)` variables may be `\(CI(2,1)\)` - The remaining `\(I(1)\)` variable may share a common stochastic trend with the other two variables, whereupon the system will be stationary - The chance of this occurring is very small - Most of the literature focuses on `\(CI(1,1)\)` cases but many other possibilities exist --- # The Engle-Granger Procedure - Step 1 - Suppose `\(y_{1,t}\)` and `\(y_{2,t}\)` are possibly `\(I(1)\)` and you want to know whether they are cointegrated - Pretest the variables for their order of integration - Use an Augmented Dickey-Fuller or similar test - If both are stationary there is no cointegrating vector - If the variables are of different orders there is no cointegrating vector - For three or more variables they can be of different orders - A group could be `\(CI(2,1)\)` and this group may be cointegrated with a further set of `\(I(1)\)` variables.
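---

# The Engle-Granger Procedure - Illustration

- The logic can be sketched numerically: simulate two `\(I(1)\)` series that share a stochastic trend, estimate the long-run relation by OLS, and inspect the residuals
- This is a minimal sketch in Python (numpy only); the simulated data, coefficient values and the helper `dickey_fuller_stat` are illustrative assumptions, and in practice the statistic should be compared against Engle-Granger critical values rather than standard Dickey-Fuller tables

```python
import numpy as np

def dickey_fuller_stat(u):
    # t-statistic for H0: pi1 = 0 in  Delta u_t = pi1 * u_{t-1} + e_t
    # (no constant and no augmentation lags, to keep the sketch short)
    du = np.diff(u)
    x = u[:-1]
    pi1 = (x @ du) / (x @ x)                       # OLS slope
    resid = du - pi1 * x
    se = np.sqrt((resid @ resid) / (len(du) - 1) / (x @ x))
    return pi1 / se

rng = np.random.default_rng(42)
T = 500
mu = np.cumsum(rng.normal(size=T))                 # common stochastic trend (random walk)
y2 = mu + rng.normal(size=T)                       # I(1)
y1 = 0.5 + 2.0 * mu + rng.normal(size=T)           # shares the trend, so cointegrated with y2

# Step 1: estimate the long-run relation y1 = b0 + b1 * y2 + u by OLS
X = np.column_stack([np.ones(T), y2])
b0, b1 = np.linalg.lstsq(X, y1, rcond=None)[0]
u_hat = y1 - b0 - b1 * y2

print("b1 estimate:", round(b1, 2))                # close to the true value of 2
print("DF statistic on residuals:", round(dickey_fuller_stat(u_hat), 2))
```

- A strongly negative statistic on the residuals is consistent with stationarity, and hence with cointegration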
--- # The Engle-Granger Procedure - Step 1 - If both are `\(I(1)\)`, estimate the long-run relationship, `\begin{eqnarray} y_{1,t} = \beta_0 + \beta_1 y_{2,t} + u_t \end{eqnarray}` - If the variables are cointegrated OLS yields a super-consistent estimate of `\(\beta_0\)` and `\(\beta_1\)` - The variables are cointegrated when `\(\hat{u}_t\)` is stationary - To test for stationarity construct an Augmented Dickey-Fuller test `\begin{eqnarray} \Delta \hat{u}_t = \pi_1 \hat{u}_{t-1} + \sum_{j=1}^k \gamma_{j} \Delta \hat{u}_{t-j} + \varepsilon_t \end{eqnarray}` where `\(\pi_1 = -(1-\phi)\)` - If we cannot reject `\(\pi_1 = 0\)` then they are not cointegrated - If we can reject `\(\pi_1 = 0\)` then they are cointegrated --- # The Engle-Granger Procedure - Step 1 - Use tables from Engle & Granger (1987) or Engle & Yoo (1987), rather than standard Dickey-Fuller tables, since the test is applied to estimated residuals - If the OLS regression includes a constant then do not include a constant in the ADF regression, and vice versa --- # The Engle-Granger Procedure - Step 2 - Estimate the error-correction model - if the variables are cointegrated, use the residuals to estimate the error correction model `\begin{eqnarray} \Delta y_{1,t}&=& \gamma_{0} + \alpha_{1} \left[ y_{1,t-1}-\beta_1 y_{2,t-1} \right] + \ldots \\ && \sum_{i=1}^{K} \zeta_{1,i} \Delta y_{1,t-i} + \sum_{j=1}^{L} \zeta_{2,j} \Delta y_{2,t-j} + \varepsilon_{y_1,{t}} \\ \Delta y_{2,t} &=& \eta_{0} + \alpha_{2} \left[ y_{1,t-1}-\beta_1 y_{2,t-1} \right] + \ldots \\ && \sum_{i=1}^{K} \xi_{1,i} \Delta y_{2,t-i} + \sum_{j=1}^{L} \xi_{2,j} \Delta y_{1,t-j} + \varepsilon_{y_2,{t}} \end{eqnarray}` - where `\(\beta_1\)` is the cointegrating parameter, `\(\varepsilon_{y_1,t}\)` and `\(\varepsilon_{y_2,t}\)` are white noise, while `\(\alpha_1\)` and `\(\alpha_2\)` are the speed of adjustment parameters - We can substitute `\((y_{1,t-1} - \beta_1 y_{2,t-1})\)` with `\(\hat{u}_{t-1}\)` - OLS provides efficient estimates, and `\(t\)` and `\(F\)` test statistics are appropriate --- # The Engle-Granger Procedure - Step 2 - To assess
model adequacy - Check the residuals `\(\varepsilon_{1,t}\)` and `\(\varepsilon_{2,t}\)` to make sure they are white noise - If there is a problem then consider changing the lag length - The speed of adjustment parameters `\(\alpha_1\)` and `\(\alpha_2\)` describe the dynamics - A large value of `\(\alpha_2\)` is associated with a large `\(\Delta y_{2,t}\)` - If `\(\alpha_2 = 0\)` and `\(\xi_{2,j}=0\)` then `\(\Delta y_{1,t}\)` cannot Granger-cause `\(\Delta y_{2,t}\)` - If both `\(\alpha_1 = 0\)` and `\(\alpha_2 = 0\)` then there is no cointegration or error correction - `\(\alpha_1\)` and `\(\alpha_2\)` should also not be too large, since the variables should converge to their long-run values over time (i.e. not immediately, and without over-correcting too drastically) --- # Problems with the Engle-Granger procedure - Has several important limitations - Need to specify the left-hand side variable up front `\begin{eqnarray} y_{1,t} = \beta_1 y_{2,t} + \upsilon_{y1,t} \\ y_{2,t} = \beta_2 y_{1,t} + \upsilon_{y2,t} \end{eqnarray}` - `\(\upsilon_{y1,t}\)` may be stationary while `\(\upsilon_{y2,t}\)` is not - reversing the order may give different results - Can't identify multiple cointegrating vectors (or their form) - Reliance on a two-step procedure - the first step generates residuals - the second step uses these in a regression to obtain `\(\alpha_1\)` in the ECM - an error in step 1 is carried over to step 2 (i.e.
we may incorrectly include a constant in step 1) --- # Cointegration in a multivariate setting - Multivariate cointegration makes use of a VAR structure - Consider a first-order `\(VAR(1)\)` for the `\(n\times1\)` vector `\(\boldsymbol{y}_{t}=[y_{1,t}, y_{2,t}, \ldots ,y_{n,t}]^{\prime}\)`, `\begin{eqnarray} \boldsymbol{y}_{t}=\mu+\Pi_{1} \boldsymbol{y}_{t-1}+\boldsymbol{u}_{t} \end{eqnarray}` - where `\(\mu=[\mu_{1} , \mu_{2}, \ldots ,\mu_{n}]^{\prime}\)` is a vector of constants - `\(\boldsymbol{u}_{t}=[ u_{1,t},u_{2,t}, \ldots , u_{n,t}]^{\prime}\)` is a vector of error terms - `\(\Pi_{1}\)` is an `\((n \times n)\)` matrix of coefficients - The stability of the VAR model is determined by the eigenvalues of `\(\Pi_{1}\)`, which are obtained by solving the characteristic equation `\begin{eqnarray} | \; \Pi_{1}- \lambda I \; |=0 \end{eqnarray}` - If all eigenvalues have modulus less than 1, the VAR is stable --- # Vector error correction model (VECM) - Now by writing the `\(VAR(1)\)` as a VECM, `\begin{eqnarray} \Delta \boldsymbol{y}_{t} & = & \mu+(\Pi_{1}-I)\boldsymbol{y}_{t-1}+\boldsymbol{u}_{t} \\ & = & \mu+\Pi \boldsymbol{y}_{t-1}+\boldsymbol{u}_{t} \;\;\; \text{where } \; \Pi=(\Pi_{1}-I) \end{eqnarray}` - With `\(n\)` variables, the number of linear combinations of the variables in `\(\boldsymbol{y}_{t}\)` that are stationary will provide us with information on the number of cointegration vectors --- # VECM model - Three possibilities exist: - `\(\Pi\)` has full rank `\(r=n\)`: the VAR must be stable as there is no instability in the system of equations. This model of stationary variables should be estimated in levels. - `\(\Pi\)` has rank `\(1\leq r\leq n-1\)`: the number of linear combinations is smaller than the number of variables. Hence, some of the variables must be unstable and at least one combination of the variables is stable. The number of cointegrating vectors is given by `\(r\)` - `\(\Pi\)` has rank `\(r=0\)` (i.e.
`\(\Pi=0)\)`: there is evidence of instability and no combination of the variables is stable. The unstable VAR cannot be cointegrated and should be estimated in first differences. --- # VECM model - As before, the VECM includes the coefficients `\(\alpha\)` and `\(\beta\)` - When cointegration exists, we can decompose `\(\Pi\)` as, `\begin{eqnarray} \Pi=\alpha\beta^{\prime} \end{eqnarray}` - where `\(\alpha\)` and `\(\beta\)` are both of dimension `\(n\times r\)` - `\(\beta\)` is the cointegration matrix, where the linear combination `\(\beta^{\prime} \boldsymbol{y}_{t}\)` is stationary - each of the `\(r\)` elements of `\(\beta^{\prime} \boldsymbol{y}_t\)` is a cointegrated (long-run) relation that induces stability - `\(\alpha\)` measures the speed of adjustment back to equilibrium --- # VECM model - Consider the bivariate case that was examined previously, `\begin{eqnarray} \left[\begin{array}{c} \Delta y_{1,t}\\ \Delta y_{2,t} \end{array}\right] = \left[\begin{array}{c} \mu_{1}\\ \mu_{2} \end{array}\right] + \left[\begin{array}{c} \alpha_{1}\\ \alpha_{2} \end{array}\right] \Big[\beta_{1} \;\; \beta_{2}\Big] \left[\begin{array}{c} y_{1,t-1}\\ y_{2,t-1} \end{array} \right] + \left[\begin{array}{c} u_{1,t}\\ u_{2,t} \end{array}\right] \end{eqnarray}` - which can be written as, `\begin{eqnarray} \Delta y_{1,t} & =\mu_{1}+\alpha_{1}(\beta_{1}y_{1,t-1}+\beta_{2}y_{2,t-1})+u_{1,t}\\ \Delta y_{2,t} & =\mu_{2}+\alpha_{2}(\beta_{1}y_{1,t-1}+\beta_{2}y_{2,t-1})+u_{2,t} \end{eqnarray}` - and the cointegration relationship `\(\beta^{\prime} \boldsymbol{y}_{t}\)` is given by, `\begin{eqnarray} \beta^{\prime} \boldsymbol{y}_{t}=\beta_{1}y_{1,t}+\beta_{2}y_{2,t}\sim I(0) \end{eqnarray}` - which is what we had previously --- # VECM model - Now in the three-variable case, we have `\(n=3\)` and `\(r=2\)` `\begin{eqnarray} \left[\begin{array}{c} \Delta y_{1,t}\\ \Delta y_{2,t}\\ \Delta y_{3,t} \end{array}\right] &=& \left[\begin{array}{c} \mu_{1}\\ \mu_{2}\\ \mu_{3} \end{array}\right] +
\left[\begin{array}{cc} \alpha_{11} & \alpha_{12}\\ \alpha_{21} & \alpha_{22}\\ \alpha_{31} & \alpha_{32} \end{array}\right] \ldots \\ && \left[\begin{array}{ccc} \beta_{11} & \beta_{21} & \beta_{31}\\ \beta_{12} & \beta_{22} & \beta_{32} \end{array}\right] \left[\begin{array}{c} y_{1,t-1}\\ y_{2,t-1}\\ y_{3,t-1} \end{array}\right] + \left[\begin{array}{c} u_{1,t}\\ u_{2,t}\\ u_{3,t} \end{array}\right] \end{eqnarray}` - with `\(r=2\)`, there are two cointegration relationships, which we denote `\(\beta_{1}^{\prime} \boldsymbol{y}_{t}\)` and `\(\beta_{2}^{\prime} \boldsymbol{y}_{t}\)`, `\begin{eqnarray} \beta_{1}^{\prime}\boldsymbol{y}_{t} & =\beta_{11}y_{1,t}+\beta_{21}y_{2,t}+\beta_{31}y_{3,t}\sim I(0)\\ \beta_{2}^{\prime}\boldsymbol{y}_{t} & =\beta_{12}y_{1,t}+\beta_{22}y_{2,t}+\beta_{32}y_{3,t}\sim I(0) \end{eqnarray}` - Hence, we have two linear relationships between the variables, `\(y_{1}, y_{2}\)` and `\(y_{3}\)`, which ensure that the combined system is stationary --- # VECM model - The `\(VAR(1)\)` model can be generalised to a `\(VAR(p)\)`, `\begin{eqnarray} \Delta \boldsymbol{y}_{t}=\mu+\alpha\beta^{\prime} \boldsymbol{y}_{t-1}+\Gamma_{1}\Delta \boldsymbol{y}_{t-1}+\Gamma_{2}\Delta \boldsymbol{y}_{t-2}+ \ldots \\ +\Gamma_{p-1}\Delta \boldsymbol{y}_{t-p+1} + \boldsymbol{u}_{t} \end{eqnarray}` - where we have added `\(p-1\)` lags of the differenced variables --- # Johansen Approach - Testing for cointegration in the multivariate case amounts to determining the rank of `\(\Pi\)` - Effectively, we need to determine the number of non-zero eigenvalues of `\(\Pi\)` - Johansen's maximum likelihood approach suggests that one should order the eigenvalues, `\(\hat{\lambda}_{1}>\hat{\lambda}_{2}> \ldots >\hat{\lambda}_{n}\)` - Then test the null hypothesis that there are at most `\(r\)` cointegrating vectors, where `\begin{eqnarray} H_{0}: \hat{\lambda}_{i}=0 \;\; \text{ for } \; i=r+1, \ldots ,n \end{eqnarray}` - For example, if `\(n=2\)` and `\(r=1\)` as in the first
example: - First eigenvalue, `\(\hat{\lambda}_{1} \ne 0\)` (able to reject null) - Second eigenvalue, `\(\hat{\lambda}_{2} = 0\)` (unable to reject null) --- # Johansen Approach - The trace statistic specifies the null hypothesis `\(H_{0}\)` of `\(r\)` cointegration relations as, `\begin{eqnarray} \lambda_{trace}=-T\sum_{i=r+1}^{n}\log(1-\hat{\lambda}_{i}) \;\;\; r=0,1,2, \ldots , n-1 \end{eqnarray}` - where the alternative hypothesis is that there are more than `\(r\)` cointegration relationships - The maximum eigenvalue statistic for the null hypothesis of at most `\(r\)` cointegration relations can be computed as, `\begin{eqnarray} \lambda_{max}=-T\log(1-\hat{\lambda}_{r+1}) \;\;\; r=0,1,2,\ldots, n-1 \end{eqnarray}` - where the alternative hypothesis is that there are `\(r+1\)` cointegration relations --- # Johansen Approach - For both tests, the asymptotic distribution is nonstandard and depends upon the deterministic components included (constant and trend) - Tabulated critical values can be found in Johansen (1988) or Osterwald-Lenum (1992) - Calculated values must be greater than the tabulated critical values to reject the null --- # Summary - Single-equation modelling of nonstationary variables gives spurious results, unless the variables are cointegrated - Cointegration implies that the variables share a common trend - Cointegration describes the long-run relationship between variables - Often such relationships are given by economic theory - To determine if variables share one or more cointegrating relationships we can do the following: - Determine whether the time series are stationary or nonstationary (graphical analysis, unit root testing) - If the series are stationary, proceed to estimate a dynamic model in the levels of the data - If the series are nonstationary, proceed to test for cointegration --- # Summary - To test for cointegration, two approaches can be used: - Single-equation approach: Engle-Granger two-step procedure - Multivariate approach: Johansen test - Since
cointegration is inherently a system property, the multivariate approach is usually preferred - If we are able to reject the null hypothesis of no cointegration, proceed to estimate an equilibrium correction model by either: - Single-equation model: ECM or ARDL - Multivariate model: VECM - If the null of no cointegration cannot be rejected, proceed to estimate the model in first differences
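---

# Summary - Illustration

- The reduced-rank structure `\(\Pi=\alpha\beta^{\prime}\)` can also be illustrated numerically
- A minimal sketch in Python (numpy only), with an assumed cointegrating vector `\((1,-1)^{\prime}\)` and illustrative adjustment parameters; a proper analysis would use the Johansen test rather than simply inspecting singular values

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
alpha = np.array([[-0.2], [0.1]])    # illustrative speed-of-adjustment coefficients
beta = np.array([[1.0], [-1.0]])     # illustrative cointegrating vector (1, -1)'
Pi = alpha @ beta.T                  # Pi = alpha * beta', so rank(Pi) = 1, not 2

# Simulate the VECM  Delta y_t = Pi y_{t-1} + u_t
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = y[t - 1] + Pi @ y[t - 1] + rng.normal(size=2)

# Estimate Pi by OLS of Delta y_t on y_{t-1}
dy = np.diff(y, axis=0)
y_lag = y[:-1]
Pi_hat = np.linalg.lstsq(y_lag, dy, rcond=None)[0].T

# The reduced rank shows up as one singular value close to zero
sv = np.linalg.svd(Pi_hat, compute_uv=False)
print("singular values of Pi_hat:", np.round(sv, 3))
```

- One singular value should be clearly non-zero while the other is near zero: the `\(r=1\)` case, with one cointegrating vector and one common stochastic trend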