class: center, middle, inverse, title-slide

# Introduction
### Kevin Kotzé

---

<!-- layout: true -->

<!-- background-image: url(image/logo.svg) -->
<!-- background-position: 2% 98% -->
<!-- background-size: 10% -->

---

# Contents

1. Overview
1. Introduction
1. Decomposing a time series
1. Popular Processes
1. Difference Equations & Lag Operators
1. Conditional & Unconditional Moments
1. Stationarity, autocorrelation & ergodicity
1. Impact multipliers & IRFs
1. Conclusion

---

# Time Series Approaches

- Time Domain
  - modelling current values on past values (i.e. reduced-form)
  - Box-Jenkins method
  - State-space method
- Frequency domain
  - modelling periodic or systematic variations
  - Includes Fourier analysis, power spectra & wavelet transforms
- These methodologies may be applied to VAR, SVAR, CVAR, GARCH, MVGARCH, SV, DFM, MSW, STAR, `\(\dots\)`

---

# Time Series Approaches

- Consider the simple linear regression model

`\begin{eqnarray} y_{t} = \underbrace{x_t^{\top} \beta}_\text{explained} + \underbrace{\varepsilon_{t}}_\text{unexplained}\; , & \hspace{1cm} & t=1, \ldots , T \end{eqnarray}`

- For least squares estimates of such a model, the errors should not be serially correlated:
  - `\(\mathbb{E}[\varepsilon_t] = \mathbb{E}[\varepsilon_t | \varepsilon_{t-1}, \varepsilon_{t-2},\dots] = 0\)` and
  - `\(\mathbb{E}\left[\varepsilon_t \varepsilon_{t-j}\right] = 0\)`, for `\(j \ne 0\)`
- If there is serial correlation in the errors then the estimates are inefficient

---

# Introduction

- Most economic and financial time series exhibit some form of serial correlation
- If economic output is large during the previous quarter then there is a good chance that it is going to be large in the current quarter
- A change that arises in this quarter may only impact on other variables at a distant future point
- A particular shock may affect variables over successive quarters
- Hence, we need to start thinking about the dynamic structure of the system that we are investigating
- Most time series models explicitly allow for these features, while adhering to the statistical properties mentioned above

---

# Introduction

- Modern-day time series analysis is concerned with interpreting data that is measured at discrete intervals
- Traditionally, a large part of time series analysis is concerned with forecasting
- Testing various hypotheses (theories) that may be used to describe the past behaviour of a time series variable
- These objectives are best achieved by identifying the *dynamic path* of a time series
- To do this we need to decompose the series into its constituent components

---

background-image: url(image/pic1.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 1: Decomposition of Time Series

---

# Basic time series model

- Consider the following example:

`\begin{eqnarray} Trend:& & T_t = 1+0.1t \\ Seasonal:& & S_t = 1.6 \sin(t \pi /6) \\ Irregular:& & I_t = 0.7 I_{t-1} + \varepsilon_t \end{eqnarray}`

where `\(\varepsilon_t \;\)` is a random disturbance

- The trend component is deterministic
- The seasonal is also deterministic and uses a sinusoid to impart cyclical behaviour
- The irregular component contains a stochastic term, which may be described by a statistical distribution
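---

# Basic time series model

- A minimal `R` sketch of the example above; the seed and sample size are illustrative assumptions

```r
# Simulate the trend, seasonal and irregular components and combine them
set.seed(123)                                                  # illustrative seed
n     <- 120                                                   # e.g. ten years of monthly data
t     <- 1:n
trend <- 1 + 0.1 * t                                           # deterministic trend
seas  <- 1.6 * sin(t * pi / 6)                                 # deterministic seasonal
irreg <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))  # stochastic AR(1) irregular
y     <- trend + seas + irreg                                  # observed series

plot(ts(cbind(y, trend, seas, irreg), frequency = 12),
     main = "Simulated decomposition")
```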
---

background-image: url(image/SA_data.svg)
background-position: top
background-size: 85% 85%
class: clear, center, bottom

Figure 2: Real World Data: South African Data

---

# Economic data

- Consider the information content that is contained in the series
- Most economic data contains trends, seasonals, and irregular components
- Most economic data is measured in discrete time (with relatively long intervals)
- Some financial data is measured at extremely high frequency
- This data may be expressed as rates, indices or totals
- Be cautious of using interpolation & moving to a lower frequency
- Many economic variables are subject to revision

---

# Economic data

- Take note of the frequency and the type of transformation that could (should) be applied
- Common transformations include: the calculation of growth rates `\([\log(GDP_{t}/GDP_{t-1})]\)`, the calculation of annualised rates `\([(1+(i_t/100))^{(1/12)}-1]\)`, etc.
- Most countries follow globally accepted measurement practices

---

# Financial data

- Data on stock prices and indices is overwhelming:
  - Does the data contain true trading prices, quotes, or proxies for trading prices?
  - Are we only interested in buyer-initiated (ask) or seller-initiated (bid) orders?
  - Do the prices include transaction costs, commissions & the effects of tax transfers?
  - Is the market sufficiently liquid?
  - Have the prices been adjusted for inflation, or have they been correctly discounted?

---

# Financial data

- At what frequency do you want to measure trading activity/returns?
- Is it feasible to use extremely high frequency data and what does it represent?
- Consider the implications of limiting data to a particular sub-sample
- Transformation to returns generally displays more stationary behaviour
- Returns represent a complete, scale-free summary of an investment opportunity

---

# Processes

- A time series is a collection of observations indexed by the date of each realisation
- Using notation that starts at time `\(t = 1\)` and ends at `\(t = T\)`,

`\begin{eqnarray} \left\{ y_{1}, y_{2}, y_{3}, \ldots , y_{T}\right\} \end{eqnarray}`

- The time index can be of any frequency (e.g. daily, quarterly, etc.)

---

# Deterministic & Stochastic processes

- A deterministic process will always produce the same output from a given starting condition or initial state
  - No element of randomness, i.e. `\(T_t = 1+0.1t\)`
- A stochastic process has some indeterminacy that relates to the future evolution of the process
  - Usually described by some form of statistical distribution
  - Examples include: white noise processes, random walks, Brownian motions, Markov chains, martingale difference sequences, etc.
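---

# Deterministic & Stochastic processes

- A minimal `R` sketch of this distinction (the seed and sample size are illustrative assumptions): the deterministic path is identical on every run, while the stochastic path changes with each draw of the disturbances

```r
set.seed(42)                        # illustrative seed
n <- 200
t <- 1:n

det_path   <- 1 + 0.1 * t           # deterministic: no randomness
stoch_path <- cumsum(rnorm(n))      # stochastic: a random walk

matplot(t, cbind(det_path, stoch_path), type = "l", lty = 1,
        xlab = "t", ylab = "value",
        main = "Deterministic trend vs. stochastic (random walk) process")
```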
---

# Stochastic processes: White noise

- Serially uncorrelated random variables with zero mean and finite variance
- Errors may follow a normal distribution
  - Gaussian white noise process
- A slightly stronger condition is that they are independent from one another

`\begin{eqnarray} \varepsilon_t \sim \mathsf{i.i.d.} & \mathcal{N}(0, \sigma_{\varepsilon_t}^2) \end{eqnarray}`

- Notice three implications of this assumption:
  - `\(\mathbb{E}[\varepsilon_t] = \mathbb{E}[\varepsilon_t | \varepsilon_{t-1}, \varepsilon_{t-2}, \dots ] =0\)`
  - `\(\mathbb{E}[\varepsilon_t \varepsilon_{t-j}] = \mathsf{cov}[\varepsilon_{t} \varepsilon_{t-j}] = 0\)`
  - `\(\mathsf{var}[\varepsilon_{t}] = \mathsf{cov}[\varepsilon_{t}\varepsilon_{t}] = \sigma_{\varepsilon_t}^2\)`

---

background-image: url(image/wn.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 3: Gaussian White Noise Process

<!-- # White Noise Process -->
<!-- No persistence -> looks like noise -->

---

# Random Walk

- A random walk would imply that the effect of a shock is permanent

`\begin{eqnarray} y_t = y_{t-1} + \varepsilon_t \end{eqnarray}`

- Could be represented as

`\begin{eqnarray} y_t = \sum_{j=1}^{t} \varepsilon_j \end{eqnarray}`

- Shocks have permanent effects

---

background-image: url(image/rw.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 4: Random Walk - Simulated Time Series

<!-- # Random Walk -->

---

background-image: url(image/rw_shock.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 5: Random Walk - Effect of Shock [ `\(y_{-1}=0, \varepsilon_0 = 1\)` and `\((\varepsilon_1, \dots) = 0\)` ]

<!-- # Random Walk -->

---

# Random Walk plus Drift

- Random walk plus a constant term

`\begin{eqnarray} y_t = \mu + y_{t-1} + \varepsilon_t \end{eqnarray}`

- This could be represented as

`\begin{eqnarray} y_t = \mu \cdot t + \sum_{j=1}^{t} \varepsilon_j \end{eqnarray}`

- Shocks have permanent effects and the path is influenced by the drift

---

background-image: url(image/rwd.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 6: Random Walk plus Drift - Simulated Time Series [Dotted: `\(\mu=1.2\)` & Solid: `\(\mu = 0.5\)`]

<!-- # Random Walk with Drift -->

---

background-image: url(image/rwd_shock.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 7: Random Walk with Drift - Effect of Shock [ `\(y_{-1} = 0\)`, `\(\mu = 1.2\)`, `\(\varepsilon_0 = 1\)` and `\((\varepsilon_1, \ldots) = 0\)`]

---

background-image: url(image/rw_m.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 8: Different Random Walks

---

# Autoregressive process

- An `\(AR(1)\)` process describes situations where the present value of a time series is a linear function of the previous observation

`\begin{eqnarray} y_{t}= \phi y_{t-1} + \varepsilon_{t} & \; & \varepsilon_t \sim \mathsf{i.i.d.} \; \mathcal{N}(0, \sigma_{\varepsilon_t}^2) \end{eqnarray}`

- We know something about the *conditional* distribution of `\(y_t\)` given `\(y_{t-1}\)`
- After repeated substitution it would take the form

`\begin{eqnarray} y_t = \sum^{t}_{j=1} \phi^{t-j} \varepsilon_j \end{eqnarray}`

- Could include several lags, as in an `\(AR(p)\)` model, and the distribution of the error term could take many forms
- Think about the implication for future values of `\(y_t\)` when `\(\phi=0.5,1,\)` or `\(1.5\)`? (see the sketch on the following slide)
- Think about the role of the constant in a random walk plus drift and in a stationary `\(AR(1)\)`?
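---

# Autoregressive process

- A minimal `R` sketch of the question above; the seed and sample size are illustrative assumptions

```r
# Simulate y_t = phi * y_{t-1} + e_t for phi = 0.5 (stable),
# 1 (random walk) and 1.5 (explosive), using the same sequence of shocks
set.seed(123)
n   <- 100
eps <- rnorm(n)

sim_ar1 <- function(phi, eps) {
  y <- numeric(length(eps))
  y[1] <- eps[1]
  for (t in 2:length(eps)) y[t] <- phi * y[t - 1] + eps[t]
  y
}

paths <- sapply(c(0.5, 1, 1.5), sim_ar1, eps = eps)
matplot(paths, type = "l", lty = 1, xlab = "t", ylab = expression(y[t]),
        main = "AR(1) paths for phi = 0.5, 1 and 1.5")
```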
---

background-image: url(image/ar.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 9: `\(AR(1)\)` - Simulated Time Series [ `\(\phi = 0.9\)`]

<!-- # Autoregressive process -->

---

background-image: url(image/ar_shock.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 10: `\(AR(1)\)` - Effect of Shock [ `\(\phi = 0.9\)`]

---

# Moving Average process

- The `\(MA(q)\)` model describes a time series by the weighted sum of the current and previous errors
- Consider examples where it takes a bit of time for the error (or "shock") to dissipate

`\begin{eqnarray} y_t = \varepsilon_t + \theta \varepsilon_{t-1} \end{eqnarray}`

- This type of expression may be used to describe a wide variety of stationary time series processes

---

background-image: url(image/ma.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 11: `\(MA(1)\)` - Simulated Time Series [ `\(\theta = 0.7\)`]

---

background-image: url(image/ma_shock.svg)
background-position: top
background-size: 90% 90%
class: clear, center, bottom

Figure 12: `\(MA(1)\)` - Effect of Shock [ `\(\theta = 0.7\)`]

---

# ARMA process

- A combination of these models is termed an `\(ARMA(1,1)\)`

`\begin{eqnarray} y_t = \phi y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1} \end{eqnarray}`

- where an `\(ARMA(p, q)\)` takes the form

`\begin{eqnarray} y_t &=& \phi_1 y_{t-1} + \phi_2 y_{t-2} + \ldots + \phi_p y_{t-p} + \ldots \\ && \ldots + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q} \end{eqnarray}`

- This model was popularised by Box & Jenkins, who developed a methodology that may be used to identify the terms that should be included in the model
- Note that the `\(AR(p)\)`, `\(MA(q)\)` and `\(ARMA(p,q)\)` models may provide some form of characterisation of the South African data that we saw previously

---

# Long Memory & Fractional Differencing

- Most `\(AR(p)\)`, `\(MA(q)\)` and `\(ARMA(p,q)\)` processes are often referred to as short-memory processes because the coefficients in the representation are dominated by exponential decay
- Long memory (or persistent) time series are considered intermediate compromises between the short-memory models and integrated nonstationary processes
- They display long periods during which observations tend to be at a high level and similar long periods during which observations tend to be at a low level

---

# Difference Equations

- A linear first-order difference equation may be expressed as

`\begin{eqnarray} \ y_{t}=\phi y_{t-1}+\varepsilon _{t}\qquad \end{eqnarray}`

- It relates `\(y\)` at time `\(t\)` to its previous value at time `\((t-1)\)`
- If `\(|\phi| <1\)`, we can show that the series will always return to its mean after a shock
- If `\(\phi =1\)` then the difference equation is a random walk
- Higher-order difference equations may include

`\begin{eqnarray} y_{t}=\phi_{1} y_{t-1}+ \phi_{2} y_{t-2}+\varepsilon _{t} \end{eqnarray}`

---

# Lag Operators

- Convenient tools to analyse difference equations

`\begin{equation} \ Ly_{t}=y_{t-1} \label{lag1} \end{equation}`

- Similarly, `\(L^{-1}y_{t}=y_{t+1}\)`
- The lag operator can be raised to arbitrary integer powers `\(k\)` such that

`\begin{eqnarray} L^{k}y_{t}=y_{t-k} \\ L^{-k}y_{t}=y_{t+k} \label{lag_k} \end{eqnarray}`

---

# Lag Operators

- The first difference of a series could then be written as

`\begin{equation} \ \left( 1-L\right) y_{t}=y_{t}-y_{t-1} \label{1st_dif} \end{equation}`

- The four-period difference would be defined as

`\begin{equation} \left( 1-L^{4}\right) y_{t}=y_{t}-y_{t-4} \end{equation}`

- The second difference of a time series is then

`\begin{equation} \left( 1-L\right) ^{2}y_{t} = \left( 1-L\right) y_{t} - \left( 1-L\right) y_{t-1} = (y_{t}-y_{t-1}) - (y_{t-1}-y_{t-2}) \label{2nd_dif} \end{equation}`
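---

# Lag Operators

- A minimal `R` sketch of the lag operator and the differences defined above, using base `R` (the simulated series is an illustrative assumption)

```r
set.seed(1)
y <- cumsum(rnorm(200))                 # an arbitrary series to difference

y_lag1     <- c(NA, y[-length(y)])      # L y_t         = y_{t-1}
first_dif  <- diff(y)                   # (1 - L) y_t   = y_t - y_{t-1}
four_dif   <- diff(y, lag = 4)          # (1 - L^4) y_t = y_t - y_{t-4}
second_dif <- diff(y, differences = 2)  # (1 - L)^2 y_t

head(cbind(y = y[-1], first_dif), 4)    # inspect the first few differences
```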
---

# Lag Operators

- Higher-order expressions make use of a polynomial of lag operators

`\begin{equation} \phi (L)=\left( 1-\phi _{1}L-\phi _{2}L^{2}- \ldots - \phi _{p}L^{p}\right) \end{equation}`

- where `\(\phi\)` is a vector of coefficients
- When applying a particular lag polynomial to a time series `\(y_{t}\)`, we could use the expression

`\begin{eqnarray} \phi (L)y_{t} &\equiv &\left( 1-\overset{p}{\underset{i=1}{\sum }}\phi_{i}L^{i}\right) y_{t} \\ &=&\left( 1-\phi _{1}L-\phi _{2}L^{2}- \ldots - \phi _{p}L^{p}\right) y_{t} \\ &=&y_{t}-\phi _{1}y_{t-1}-\phi _{2}y_{t-2} - \ldots -\phi _{p}y_{t-p} \end{eqnarray}`

---

# Moments of Distribution

- Distributions are often summarised by the first (mean) and second (variance) moments
- Higher-order moments may be of interest (skewness & kurtosis)
- It is always important to distinguish between the unconditional and conditional distributions

---

# Moments of Distribution

- The first moment of a stochastic process is the average of `\(y_{t}\)` over all possible realisations

`\begin{eqnarray} \bar{y} =\mathbb{E}\left[ y_{t}\right], \;\;\;\; t=1, \dots , T \end{eqnarray}`

- The second moment is defined as the variance

`\begin{eqnarray} \mathsf{var}[y_{t}]=\mathbb{E}\left\{ \left(y_{t}-\bar{y}\right)\left(y_{t}-\bar{y}\right)\right\} =\mathbb{E}\left\{ \left(y_{t}-\mathbb{E}\left[y_{t}\right]\right)^{2}\right\}, \;\;\;\; t=1, \dots , T \end{eqnarray}`

- And the covariance, for `\(j\)`:

`\begin{eqnarray} \mathsf{cov}[y_{t},y_{t-j}]&=& \mathbb{E}\left\{ \left(y_{t}-\bar{y}\right)\left(y_{t-j}-\bar{y}\right)\right\} \\ &=& \mathbb{E}\left\{\left(y_{t}-\mathbb{E}\left[y_{t}\right]\right) \left(y_{t-j}- \mathbb{E}\left[y_{t-j}\right]\right)\right\} , \;\; t=j+1, \dots , T \end{eqnarray}`

---

# Conditional Moments

- The conditional distribution is based on past realisations of a random variable
- For the `\(AR(1)\)` model

`\begin{equation} \ y_{t}=\phi y_{t-1}+\varepsilon _{t} \end{equation}`

- where `\(\varepsilon _{t}\sim \mathsf{i.i.d.} \mathcal{N}(0,\sigma ^{2})\)` is Gaussian white noise and `\(|\phi |{<}1\)`
- The conditional moments satisfy

`\begin{eqnarray} &&\mathbb{E}\left[ y_{t}|y_{t-1}\right] =\phi y_{t-1} \\ &&\mathsf{var}[y_{t}|y_{t-1}] =\mathbb{E}[\phi y_{t-1}+\varepsilon _{t}-\phi y_{t-1}]^{2}=\mathbb{E}[\varepsilon _{t}]^{2}=\sigma ^{2} \\ &&\mathsf{cov}\left[\left(y_{t}|y_{t-1}\right),\left(y_{t-j}|y_{t-j-1}\right)\right] =0 \;\; \text{for } j \geq 1 \label{cond_m} \end{eqnarray}`

---

# Conditional Moments

- Conditioning on `\(y_{t-2}\)` for `\(y_t\)`

`\begin{eqnarray} \mathbb{E}\left[ y_{t}|y_{t-2}\right] &=&\phi ^{2}y_{t-2} \\ \mathsf{var}[y_{t}|y_{t-2}] &=&(1+\phi ^{2})\sigma ^{2} \\ \mathsf{cov}\left[\left(y_{t}|y_{t-2}\right),\left(y_{t-j}|y_{t-j-2}\right)\right] &=&\phi \sigma ^{2} \;\; \text{for } j = 1 \\ \mathsf{cov}\left[\left(y_{t}|y_{t-2}\right),\left(y_{t-j}|y_{t-j-2}\right)\right] &=&0 \;\; \text{for } j > 1 \end{eqnarray}`
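---

# Conditional Moments

- A minimal `R` sketch that checks the conditional moments above by simulation; the values of `\(\phi\)`, `\(\sigma\)`, the conditioning value and the number of draws are illustrative assumptions

```r
set.seed(123)
phi   <- 0.8
sigma <- 1
y_lag <- 2                    # condition on y_{t-1} (or y_{t-2}) equal to 2
N     <- 1e5                  # number of simulated draws

# one step ahead: y_t = phi * y_{t-1} + e_t
y1 <- phi * y_lag + rnorm(N, sd = sigma)
c(sim_mean = mean(y1), theory = phi * y_lag)            # E[y_t | y_{t-1}]
c(sim_var  = var(y1),  theory = sigma^2)                # var[y_t | y_{t-1}]

# two steps ahead: y_t = phi^2 * y_{t-2} + phi * e_{t-1} + e_t
y2 <- phi^2 * y_lag + phi * rnorm(N, sd = sigma) + rnorm(N, sd = sigma)
c(sim_var  = var(y2),  theory = (1 + phi^2) * sigma^2)  # var[y_t | y_{t-2}]
```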
---

# Unconditional Moments

- The unconditional distribution has slightly different moments for the `\(AR(1)\)` model

`\begin{equation} \ y_{t}=\phi y_{t-1}+\varepsilon _{t} \end{equation}`

- where `\(\varepsilon _{t}\sim \mathsf{i.i.d.} \mathcal{N}(0,\sigma ^{2})\)` is Gaussian white noise and `\(|\phi |{<}1\)`
- The unconditional moments satisfy

`\begin{eqnarray} \mathbb{E}\left[ y_{t}\right] &=&0 \\ \mathsf{var}[y_{t}] &=&\frac{\sigma ^{2}}{1-\phi ^{2}} \\ \mathsf{cov}[y_{t}\;y_{t-j}] &=&\phi ^{j}\mathsf{var}(y_{t}) \label{uncon_m} \end{eqnarray}`

---

# Stationarity: Strictly stationary

- A time series is strictly stationary if for any values

`\begin{eqnarray} \{j_{1}, j_{2}, \dots , j_{n}\}, \end{eqnarray}`

- the joint distribution of

`\begin{eqnarray} \{ y_{t}, y_{t+j_{1}}, y_{t+j_{2}}, \dots , y_{t+j_{n}} \} \end{eqnarray}`

- depends only on the intervals separating the dates

`\begin{eqnarray} \{ j_{1}, j_{2}, \dots , j_{n}\} \end{eqnarray}`

- and not on the date itself, `\(t\)`

---

# Stationarity: Covariance stationary

- If neither the mean, `\(\bar{y}\)`, nor the covariance, `\(\mathsf{cov}(y_{t}\;y_{t-j})\)`, depends on the date, `\(t\)`
- Then the process for `\(y_{t}\)` is said to be covariance (weakly) stationary, where for all `\(t\)` and any `\(j\)`

`\begin{eqnarray} \mathbb{E}\left[ y_{t}\right] &=&\bar{y} \\ \mathbb{E}\left[ \left( y_{t}-\bar{y} \right) \left( y_{t-j}-\bar{y} \right) \right] &=&\mathsf{cov}(y_{t}\;y_{t-j}) \end{eqnarray}`

- When referring to stationarity in the remainder of the course we refer to covariance stationarity
- Note that the process `\(y_{t}=\alpha t+\varepsilon _{t}\)` would not be stationary, as the mean clearly depends on `\(t\)`
- In addition, we saw that the unconditional moments of the `\(AR(1)\)` with `\(|\phi|<1\)` had a mean and covariance that did not depend on time

---

# Autocorrelation function (ACF)

- For a stationary process we can plot the covariance of the process against a number of lags
- This makes use of the autocovariance function, which is denoted `\(\gamma \left(j\right) \equiv \mathsf{cov}\left( y_{t} \; y_{t-j}\right)\)` for `\(t=1,\ldots , T\)`
- The autocovariance function may be standardised by dividing by the variance to derive the ACF

`\begin{eqnarray} \rho \left(j\right) \equiv \frac{\gamma \left( j\right) }{\gamma \left( 0\right)} \end{eqnarray}`

- It is useful to make a plot of `\(\rho \left( j\right)\)` against (non-negative) `\(j\)` to learn about the properties of a time series

---

# Partial autocorrelation function (PACF)

- With an `\(AR(1)\)` process, `\(y_{t}=\phi y_{t-1}+\varepsilon _{t}\)`, the ACF would suggest `\(y_t\)` and `\(y_{t-2}\)` are correlated even though `\(y_{t-2}\)` does not appear in the model
- This is the result of pass-through via `\(y_{t-1}\)`, where `\(y_t\)` depends on `\(y_{t-2}\)` through `\(\phi^2\)`
- The PACF eliminates the effects of intervening values and focuses on the relationship between `\(y_t\)` and `\(y_{t-2}\)`
- Hence, the PACF `\((y_t,y_{t-j})\)` eliminates the effects of the intervening correlations between `\(y_{t-1}\)` and `\(y_{t-j+1}\)`

---

# Partial autocorrelation function (PACF)

- To construct a PACF one would usually make use of the following steps:
  - Demean the series `\((y_t^\star = y_t - \bar{y})\)`
  - Form the `\(AR(1)\)` equation, `\(y_t^\star = \phi_{11} y_{t-1}^\star + \upsilon_t\)`, where `\(\upsilon_t\)` may not be white noise. In this case, `\(\phi_{11}\)` is `\(\rho(1)\)` in the ACF. It is also equal to the first coefficient in the PACF, as there are no intervening values between `\(y_t\)` and `\(y_{t-1}\)`
  - Now form the second-order autoregression, `\(y_t^\star = \phi_{21} y_{t-1}^\star + \phi_{22} y_{t-2}^\star + \upsilon_t\)`. In this case, `\(\phi_{22}\)` is the PACF between `\(y_t\)` and `\(y_{t-2}\)`, since the effects of `\(y_{t-1}^\star\)` on `\(y_{t}^\star\)` are captured by `\(\phi_{21}\)`, which isolates the effects of `\(y_{t-1}\)` on `\(y_t\)`
  - etc.
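---

# Partial autocorrelation function (PACF)

- A minimal `R` sketch of the sample ACF and PACF for a simulated `\(AR(1)\)` process; the value of `\(\phi\)`, the sample size and the seed are illustrative assumptions

```r
set.seed(123)
y <- arima.sim(model = list(ar = 0.9), n = 500)  # simulated AR(1) with phi = 0.9

op <- par(mfrow = c(1, 2))
acf(y,  main = "ACF: geometric decay")           # rho(j) = phi^j
pacf(y, main = "PACF: cuts off after lag 1")     # only phi_11 is non-zero
par(op)
```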
---

# Q-statistic

- The Box-Ljung `\(Q\)`-statistic tests whether a series is white noise
- It tests whether a group of autocorrelations differ from zero
- It tests the "overall" randomness based on a number of lags

`\begin{eqnarray} Q(k) = T(T+2) \sum_{j=1}^{k} \frac{\rho_j^2}{T-j} \end{eqnarray}`

- where `\(\rho_j\)` refers to the residual autocorrelation at lag `\(j\)`
- In this case we would express `\(\rho_k\)` as

`\begin{eqnarray} \rho_k = \frac{\sum_{t=1}^{T-k} (\varepsilon_t - \bar{\varepsilon})(\varepsilon_{t+k} - \bar{\varepsilon} )}{\sum_{t=1}^{T} (\varepsilon_t - \bar{\varepsilon})^2} \end{eqnarray}`

- where `\(\bar{\varepsilon}\)` is the mean of the `\(T\)` residuals

---

background-image: url(image/q_stat1.svg)
background-position: top
background-size: 80% 80%
class: clear, center, bottom

Figure 13: Serial Correlation

---

# Ergodicity

- A covariance-stationary process is ergodic in the mean when the sample average, `\(\bar{y}\)` `\(\equiv \left( 1/T\right) \sum^{T}_{t=1} y_{t}\)`, converges in probability to the population mean `\(\mathbb{E}\left[ y_{t}\right]\)` as `\(T\rightarrow\)` `\(\infty\)`
- A similar statement could be made for a process that is ergodic in the variance (or autocovariance)
- When ergodicity holds, the sample average and variance provide consistent estimates of their population counterparts

---

# Impact multipliers

- We may wish to investigate the cause & effects of events
  - Estimate the response of GDP growth after an unexpected `\(1\)`% increase in demand
  - How will the exchange rate react to an unexpected increase in the interest rate?
- Assuming stationarity, any `\(AR(p)\)` process can be written as an infinite-order MA (later in the course)
- This implies that an `\(AR(1)\)` process may be written as

`\begin{equation} y_{t}=\varepsilon _{t}+\phi \varepsilon _{t-1}+\phi ^{2}\varepsilon_{t-2}+ \dots = \overset{\infty }{\underset{j=0}{\sum }}\phi^{j}\varepsilon_{t-j} \end{equation}`

- This suggests `\(y_{t}\)` can be described by past & present errors / shocks

---

# Impact multipliers

- Assume the dynamic simulation started `\(j\)` periods ago (at time `\(t-j\)`), taking `\(y_{t-(j+1)}\)` as given
- The effect of a change in the initial shock on `\(y_{t}\)` is then

`\begin{equation} \frac{\partial y_{t}}{\partial \varepsilon_{t-j}}=\phi ^{j} \end{equation}`

- This is termed the dynamic multiplier, which depends only on `\(j\)`, the interval separating `\(\varepsilon _{t-j}\)` and `\(y_{t}\)`
- The expression does not depend on `\(t\)` (the date of the observation)

---

# Impulse response functions (IRF)

- The cumulative effect of a temporary shock is then

`\begin{equation} \sum_{j=0}^{\infty} \frac{\partial y_{t}}{\partial \varepsilon_{t-j}} = 1+\phi +\phi ^{2}+ \dots =\frac{1}{\left( 1-\phi\right)} \end{equation}`

- Different values of `\(\phi\)` produce a variety of responses in `\(y_{t}\)`
  - When `\(|\phi| < 1\)`, the process decays geometrically towards zero
  - When `\(0<\phi<1\)`, there will be a smooth decay
  - When `\(-1<\phi<0\)`, there will be an oscillating decay
- We say that a system described in this way is stable
- Dynamic multipliers can be moved forward in time, such that `\(\frac{\partial y_{t+j}}{\partial \varepsilon _{t}}= \phi ^{j}\)`
- Hence, the dynamic multiplier for `\(j =1, \ldots , J\)` may be termed the IRF
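---

# Impulse response functions (IRF)

- A minimal `R` sketch of the `\(AR(1)\)` dynamic multipliers `\(\phi^{j}\)` discussed above; the values of `\(\phi\)` and the horizon are illustrative assumptions

```r
phi_vals <- c(0.8, -0.8)               # smooth decay vs. oscillating decay
horizon  <- 0:20

irf <- sapply(phi_vals, function(phi) phi^horizon)    # dynamic multipliers phi^j
matplot(horizon, irf, type = "h", lty = 1, lwd = 2,
        xlab = "j", ylab = expression(phi^j),
        main = "AR(1) impulse responses for phi = 0.8 and phi = -0.8")

sum(0.8^(0:200))                       # cumulative effect: approx. 1/(1 - 0.8) = 5
```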
---

# Conclusion

- Overview of fundamental concepts that we will apply throughout this course
- South African data may contain trends and varying degrees of persistence
- This provides a challenge for regression models, as serial correlation in the error term leads to inefficient estimates and unreliable standard errors
- Considered the statistical properties of many processes
  - Random walk: the effects of errors do not disappear `\(\Rightarrow\)` difficult to forecast
  - `\(AR(p)\)` and `\(MA(q)\)`: the effects of errors dissipate `\(\Rightarrow\)` over the (extremely) long term the mean would represent a reasonable forecast and we just need to model the time taken to revert to the mean
- Other essential tools include difference equations, lag operators, autocorrelation functions, impact multipliers and impulse response functions