
Decomposing Time Series

Kevin Kotzé

1 / 55

Contents

  1. Introduction
  2. Spectral analysis
  3. Statistical detrending methods
  4. Deterministic trends
  5. Stochastic filters
  6. Beveridge-Nelson decompositions
  7. Wavelet decompositions
2 / 55

Introduction to decompositions

  • Most time series exhibit repetitive or regular behaviour over time
  • Hence a great part of the study of this data is conducted in the time domain, with ARIMA or state-space models
  • Another important feature of time series is that they may be decomposed into periodic variations of the underlying phenomenon
  • Shumway & Stoffer (2011) suggest these frequency decompositions should be expressed as Fourier frequencies that are driven by sines and cosines
  • This follows the tradition of Joseph Fourier
3 / 55

Identifying the business cycle

  • These decompositions may be used to describe the stylised facts of the business cycle:
  • Business cycle refers to regular periods of expansion and contraction in major economic aggregate variables (Burns and Mitchell, 1946)
    • i.e. persistence of economic fluctuations and correlations (or lack thereof) across economic aggregates
  • We say that a turning point occurs when the business cycle reaches a local maximum (peak) or a local minimum (trough)
  • May also be used to identify the output gap (difference between potential and actual output)
  • It is important to note that when seeking to measure the business cycle, there are no unique periodicities that are of relevance
  • In addition, most economic time series are both fluctuating and growing, which makes the decomposition quite difficult
4 / 55

Introduction to decompositions

  • We may imagine a system responding to various driving frequencies by producing linear combinations of sine and cosine functions
  • The frequency domain may be considered as a regression of a time series on periodic sines and cosines
  • This lecture considers a few widely used methods that are used to decompose economic time series
5 / 55

Spectral Analysis

  • A time series could be considered as a weighted sum of underlying series that have different cyclical patterns
  • The total variation of an observed time series will be the sum of each underlying series, which may vary in different frequencies
  • Spectral analysis is a tool that can be used to decompose the variation of a time series into different frequency components
6 / 55

Spectral Analysis

  • Consider an example of three quarterly time series variables, y_{t}, x_{t} and \upsilon_{t}
  • Where the term \upsilon_{t} \sim i.i.d. \; N(0,\sigma)
  • If x_{t}=y_{t}\beta +\upsilon_{t}, then after regressing y_{t} on x_{t}, we would expect to find that the coefficient would be large and significant, provided that \sigma is not too large
  • The rationale for this is that x_{t} contains information about y_{t}, which is reflected by the coefficient value for \beta
  • From an intuitive perspective, frequency domain analysis involves regressing a time series variable, x_{t}, on a number of different periodic frequency components
  • This would allow us to identify which frequency component is contained in x_{t}
7 / 55

Spectral Analysis

  • To define the rate at which a series oscillates, we define a cycle as one complete period of a sine or cosine function, \begin{eqnarray} y_{t}=A\cos (2\pi \omega t+\phi ) \end{eqnarray}
  • for t=0,\pm 1,\pm 2,\ldots, where \omega is a frequency index, defined in cycles per unit of time, which may be expressed in terms of the sample size T
  • A determines the height or amplitude of the function, and the starting point of the cosine function is termed the phase, \phi
8 / 55

Spectral Analysis

  • When seeking to conduct some form of data analysis, it is usually easier to use a trigonometric identity of this expression, which may be written as, \begin{eqnarray} y_{t}=U_{1}\cos (2\pi \omega t)+U_{2}\sin (2\pi \omega t) \end{eqnarray}
  • where U_{1}=A\cos \phi and U_{2}=-A\sin \phi are often taken to be normally distributed random variables
  • The above random process is also a function of its frequency, defined by the parameter \omega
  • The frequency is measured in cycles per unit of time
  • For \omega =1, the series makes one cycle per time unit
  • For \omega =0.50, the series completes a cycle every two time units
9 / 55

Spectral Analysis

  • To see how the spectral techniques can be used to interpret the regular frequencies in the series, consider the following four periodic time series \begin{eqnarray} x_{1,t} &=& 2\cos (2\pi t \, 6/100)+3\sin (2\pi t \, 6/100) \\ x_{2,t} &=& 4\cos (2\pi t \, 30/100)+5\sin (2\pi t \, 30/100) \\ x_{3,t} &=& 6\cos (2\pi t \, 40/100)+7\sin (2\pi t \, 40/100) \\ y_{t} &=& x_{1,t}+x_{2,t}+x_{3,t} \end{eqnarray}
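As a rough illustration, a short Python sketch along these lines could generate the three components and their sum over t=1,\ldots,100:

```python
import numpy as np

t = np.arange(1, 101)  # 100 observations

# three periodic components with 6, 30 and 40 cycles per 100 time points
x1 = 2 * np.cos(2 * np.pi * t * 6 / 100) + 3 * np.sin(2 * np.pi * t * 6 / 100)
x2 = 4 * np.cos(2 * np.pi * t * 30 / 100) + 5 * np.sin(2 * np.pi * t * 30 / 100)
x3 = 6 * np.cos(2 * np.pi * t * 40 / 100) + 7 * np.sin(2 * np.pi * t * 40 / 100)

y = x1 + x2 + x3  # the observed series is the sum of the underlying components
```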
10 / 55

Figure : Different frequency components

11 / 55

Spectral Analysis

  • Sorting out of the essential frequency components in a time series, including their relative contributions, constitutes one of the main objectives of spectral analysis
  • One way to accomplish this objective is to regress the data on sinusoids that vary at the different fundamental frequencies
  • This is represented by the periodogram (or sample spectral density) and may be expressed as, \begin{eqnarray} P(j/n)=\left( \frac{2}{n}\sum_{t=1}^{n}y_{t}\cos (2\pi tj/n)\right)^{2}+\left( \frac{2}{n}\sum_{t=1}^{n}y_{t}\sin (2\pi tj/n)\right)^{2} \end{eqnarray}
  • It may be regarded as a measure of the squared correlation of the data with sinusoids oscillating at a frequency of \omega_{j}=j/n, or j cycles in n time points (see the sketch below)
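In practice the periodogram is usually computed with the fast Fourier transform rather than by running these regressions. A minimal sketch, continuing the artificial series y constructed earlier:

```python
import numpy as np

n = len(y)
dft = np.fft.fft(y) / n                 # discrete Fourier transform, scaled by n
P = 4 * np.abs(dft[: n // 2]) ** 2      # scaled periodogram P(j/n)
freqs = np.arange(n // 2) / n           # frequencies j/n for j = 0, ..., n/2 - 1

# peaks should appear at 6/100, 30/100 and 40/100, with heights of roughly 13, 41 and 85
```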
12 / 55

Figure : Periodogram for frequency components

13 / 55

Spectral Analysis

  • An interesting exercise would be to construct the x_{1} series from y_{t}, which may be regarded as actual data
  • To do so we need to filter out all components that lie outside the chosen frequency band of x_{1}
  • Such a filter could operate with the aid of a regression model that contains the information that relates to a particular frequency (although there are more convenient ways of going about this, as in the sketch below)
  • Hence, cycles with frequencies corresponding to x_{2} and x_{3} would be excluded, while cycles with the frequency corresponding to x_{1} will be maintained (i.e. can pass through the filter)
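One of the more convenient approaches is to work directly in the frequency domain: set the unwanted Fourier coefficients to zero and invert the transform. A rough sketch, continuing the earlier example:

```python
import numpy as np

fft_y = np.fft.fft(y)
mask = np.zeros(n, dtype=bool)
mask[6] = mask[n - 6] = True            # keep only the 6-cycle frequency and its mirror image

x1_hat = np.real(np.fft.ifft(np.where(mask, fft_y, 0)))
# x1_hat should closely reproduce the x1 component constructed earlier
```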
14 / 55

Figure : Filtered result for frequency components

15 / 55

Methods for decomposing a time series

  • Various detrending methods provide different estimates of the cycle
  • Appropriate transformation should depend on the underlying dynamic properties
    • Usually a good idea to consider whether a series has a stochastic trend (i.e. unit root)
  • Assume that an economic time series can be decomposed into trend, g_{t}, and cycle, c_{t}: \begin{eqnarray} y_{t}=g_{t}+c_{t} \end{eqnarray}
  • where we abstract from a noise and seasonal component
  • Estimates of g_{t} and c_{t} may be obtained from various univariate detrending methods
16 / 55

Deterministic trends & filters

  • Early methods to decompose economic variables assumed that the (natural) growth path for the economy was largely deterministic
  • Therefore, the trend cycle decomposition was described as, \begin{eqnarray} y_{t} &=& g_{t}+c_{t} \\ \hat{g}_{t} &=& \hat{\alpha}_{0}+\hat{\alpha}_{1}t+\hat{\alpha}_{2}t^{2}+\ldots \\ \hat{c}_{t} &=& y_{t}-\hat{g}_{t} \end{eqnarray}

    • where the trend, \hat{g}_{t}, is found by simple estimation techniques
    • cycle corresponds to the residual in the series
  • When we assume a linear trend, |\alpha_{1}|>0 and \alpha_{2}=0
  • For a quadratic trend, |\alpha_{1}|>0 and |\alpha_{2}|>0
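A deterministic trend of this kind can be estimated by ordinary least squares. A minimal sketch, assuming log output is stored in an array y_gdp (a hypothetical placeholder):

```python
import numpy as np

T = len(y_gdp)
t = np.arange(1, T + 1)

X = np.column_stack([np.ones(T), t, t ** 2])   # constant, linear and quadratic terms
alpha_hat = np.linalg.lstsq(X, y_gdp, rcond=None)[0]

g_hat = X @ alpha_hat                          # estimated deterministic trend
c_hat = y_gdp - g_hat                          # cycle as the residual
```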
17 / 55

Figure : Linear decomposition - SA output (1960Q1-2018Q4)

18 / 55

Deterministic trends & filters

  • Previous graph displays the logarithm of South African GDP with a linear trend and cycle
  • However, productivity growth has not been perfectly log-linear (i.e. a constant growth rate) and is far from smooth
  • In addition, there are several structural breaks, such as the oil price shock in 1973/1974 & the recent GFC
  • To allow for a possible structural break in the trend, we could estimate, \begin{eqnarray} \hat{g}_{t}=\hat{\alpha}_{0}+\hat{\alpha}_{1}t+\hat{\alpha}_{2}DS_{t}(j)+\hat{\alpha}_{3}DL_{t}(k)+\ldots \end{eqnarray}
  • where DS_{t}(j) and DL_{t}(k) are dummy variables that capture the change in the slope or the level of the trend in periods j and k
  • Hence, DS_{t}(j)=t-j if t>j and DL_{t}(k)=1 if t>k, while they would be zero otherwise
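Continuing the previous sketch, the break dummies could be added to the regressor matrix as follows, where the break dates j and k are purely illustrative:

```python
import numpy as np

j, k = 56, 196                                 # illustrative break dates (roughly 1973Q4 and 2008Q4)
DS = np.where(t > j, t - j, 0)                 # slope-change dummy
DL = np.where(t > k, 1.0, 0.0)                 # level-shift dummy

Xb = np.column_stack([np.ones(T), t, DS, DL])
alpha_b = np.linalg.lstsq(Xb, y_gdp, rcond=None)[0]
g_break = Xb @ alpha_b                         # trend with slope and level breaks
```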
19 / 55

Deterministic trends & filters

  • Identifying structural breaks could be problematic
  • Detrending an integrated process with a deterministic trend may result in the introduction of a spurious cycle
  • See Nelson and Kang (1981) for details
20 / 55

Stochastic trends & filters

  • Let g_{t} represent a moving average of observed y_{t}
  • We can extract the trend component, g_{t}, by applying \begin{eqnarray} g_{t}=\sum_{j=-m}^{n}\omega_{j}y_{t-j} \end{eqnarray}
  • where m and n are positive integers and \omega_{j} are weights in the G(L) polynomial \begin{eqnarray} G(L)=\sum_{j=-m}^{n}\omega_{j}L^{j} \end{eqnarray}
  • where L is defined so that L^{j}y_{t}=y_{t-j} for positive and negative values of j
21 / 55

Stochastic trends & filters

  • The cyclical component is the difference between y_{t} and g_{t} \begin{eqnarray} c_{t}=\left[ 1-G(L)\right] y_{t}\equiv C(L)\, y_{t} \end{eqnarray}
  • where C(L) and G(L) are linear filters
  • Weights are chosen to add up to one, \sum_{j=-m}^{n}\omega_{j}=1, so that the level of the series is maintained
  • The moving-average filter with weight 1/5 may be obtained by filtering over the moving window of five observations, as sketched below \begin{eqnarray} g_{t}=\frac{1}{5}\sum_{j=-2}^{2}y_{t-j}=\frac{1}{5}\left( y_{t-2}+y_{t-1}+y_{t}+y_{t+1}+y_{t+2}\right) \end{eqnarray}
  • Will produce a smooth stochastic trend if the underlying data has such a trend
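A centred five-period moving average of this kind could be sketched as follows, again using the placeholder series y_gdp:

```python
import numpy as np

weights = np.ones(5) / 5                       # equal weights of 1/5 over t-2, ..., t+2
g_ma = np.convolve(y_gdp, weights, mode="same")
g_ma[:2] = np.nan                              # the first and last two observations
g_ma[-2:] = np.nan                             # lack a full five-period window
c_ma = y_gdp - g_ma                            # cyclical component
```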
22 / 55

Hodrick-Prescott (HP) filter

  • The HP filter has been a widely used approach to extract cycles in economic data
  • Extracts a stochastic trend, g_{t}, for a given value of \lambda, which is the smoothing parameter
  • Seeks to emphasize true business cycle frequencies
  • The filter can be obtained as the solution to the following problem: \begin{eqnarray} \min_{\{g_{t}\}} \sum_{t=1}^{T}\left( y_{t}-g_{t}\right)^{2}+\lambda \sum_{t=2}^{T-1}\left[ \left( g_{t+1}-g_{t}\right) -\left( g_{t}-g_{t-1}\right) \right]^{2} \end{eqnarray}
  • This minimization problem has a unique solution, and the filtered series, g_{t}, has the same length as y_{t}
  • Termed a low-pass filter, as only the low-frequency components are passed to the trend
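In practice the HP trend and cycle are usually obtained from a canned routine. A minimal sketch using the hpfilter function in statsmodels, again with the placeholder series y_gdp:

```python
import pandas as pd
from statsmodels.tsa.filters.hp_filter import hpfilter

# lambda = 1600 is the conventional choice for quarterly data
cycle_hp, trend_hp = hpfilter(pd.Series(y_gdp), lamb=1600)
```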
23 / 55

Hodrick-Prescott (HP) filter

  • The smoothness is determined by \lambda, which penalizes the acceleration in the growth component
  • If \lambda \rightarrow \infty, the lowest minimum is achieved when variability in the trend is zero (as in the case of a linear trend)
  • If \lambda =0, all of the variation is attributed to the trend (which equals the observed series), such that there will be no cycle
  • Hodrick and Prescott argue that \lambda =1600 is a reasonable choice for U.S. quarterly data
  • However, it is not necessarily the case that this value can be universally applied to other variables or to the output of other economies
  • Specifies a typical cycle of between eight and ten years when using traditional values for \lambda
24 / 55

Hodrick-Prescott (HP) filter

  • Another concern is the end-of-sample problem:
    • trend is close to observed data at the beginning and end of the sample
    • problematic when we are at the peak of a cycle
    • some researchers use forecasts to generate additional data at the end of the series
  • King & Rebelo (1993) note that the HP-filtered cyclical component contains both forward and backward differences
  • As a result, the end of sample properties are poor when you do not have an observation for t+1 or t-1
  • Method is also criticised on the basis that the smoothness of the stochastic trend component has to be determined a priori
25 / 55

Figure : HP filter - SA output (1960Q1-2018Q4)

26 / 55

Band pass filters

  • Band pass filters introduced to economic data by Baxter & King (1999) and Christiano & Fitzgerald (2003)
  • Identify all components that correspond to the chosen frequency band that has an upper and lower limit
  • Need to determine the periodicity of the business cycles one wants to extract
  • This is usually expressed within the frequency domain
27 / 55

Frequency-domain

  • Consider a time series, \begin{eqnarray} y_{t}=A\cos (2\pi \omega t) \end{eqnarray}
  • where A is the amplitude (height) of the cycle
  • \omega is the frequency of oscillation (the number of occurrences of a repeating event per unit of time)
  • 2 \pi measures the period of the cycles
  • t is the time
  • Hence, if y_{t}=A\cos (2\pi t), we will observe one cycle over the data sample
  • By increasing \omega, we increase the number of cycles
28 / 55

Figure : Artificial Data

29 / 55

Band Pass Filters

  • An intuitive measure of frequency is the amount of time that elapses per cycle, \lambda \begin{eqnarray} \lambda =2\pi /\omega \end{eqnarray}
  • Where we have quarterly data, consider the frequency \omega_{h} corresponding to a cycle length of 1.5 years (6 quarters)
  • Set \lambda=6 quarters per cycle and solve for \omega_{h} in 6=2\pi /\omega_{h} \begin{eqnarray} \omega_{h}=2\pi /6=\pi /3 \end{eqnarray}
  • Similarly, the frequency corresponding to a low-frequency cycle length of 8 years (32 quarters) is: \begin{eqnarray} \omega_{l}=2\pi /32=\pi /16 \end{eqnarray}
30 / 55

Baxter and King (1999)

  • Baxter and King (1999) decompose a time series into three periodic components: trend, cycle, and irregular fluctuations
  • Business cycles were defined as periodic components whose frequencies lie between 1.5 and 8 years per cycle
  • Periodic components with lengths longer than 8 years were identified with the trend
  • Periodic components of less than 1.5 years were identified with the irregular component \begin{eqnarray} B(\omega ) &=&1\text{ for }\omega \in \lbrack \pi /16,\pi /3]\text{ or }[-\pi /3,-\pi /16 ]\\ &=&0\text{ otherwise} \end{eqnarray}
  • Hence, the interval B(\omega ) = [\pi /16,\pi /3] can be interpreted as the business cycle frequency
  • The interval [0,\pi /16] corresponds to the trend and [\pi /3,\pi ] defines irregular fluctuations
31 / 55

Band Pass Filters

  • While Baxter and King favour a 3-part decomposition, other economists prefer a two-part classification
  • This may be incorporated in this setup, where \begin{eqnarray} H(\omega ) &=&1\text{ for }\omega \in \lbrack \pi /16,\pi ]\text{ or }[\text{-}\pi ,-\pi /16] \\ &=&0\text{ otherwise} \end{eqnarray}
  • The trend component is still defined in terms of fluctuations lasting more than 8 years
  • Cyclical component now consists of all oscillations lasting 8 years or less
  • This is known as a high pass filter, as only higher frequency components are captured in H(\omega )
  • As with the HP filter one has to decide on the preferred frequencies for the cycles a priori
  • There are potential end-of-sample problems, but the estimates at the start and end of the sample are usually discarded
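As an illustration, the Baxter-King band pass filter is available in statsmodels. A minimal sketch with the usual 6 to 32 quarter band and K = 12 leads and lags, again using the placeholder series y_gdp:

```python
import pandas as pd
from statsmodels.tsa.filters.bk_filter import bkfilter

# pass cycles of 6 to 32 quarters; K = 12 observations are lost at each end of the sample
cycle_bk = bkfilter(pd.Series(y_gdp), low=6, high=32, K=12)
```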
32 / 55

Figure : BP filter - SA output (1960Q1-2018Q4)

33 / 55

Beveridge-Nelson Decomposition

  • Beveridge & Nelson (1981) model the trend as a random walk with drift, and the cycle is treated as a stationary process with zero mean
  • To perform this decomposition, let y_{t} be integrated of order one, so that its first difference, \Delta y_{t}, is stationary
  • Assume it has the following moving average representation \begin{eqnarray} (1-L)y_{t}=\Delta y_{t}= \mu +B(L)\varepsilon_{t} \end{eqnarray}
34 / 55

Beveridge-Nelson Decomposition

  • The BN decomposition explores the following
  • First, define the polynomial, \begin{eqnarray} B^{\ast }(L)=(1-L)^{-1}[B(L)-B(1)] \end{eqnarray}
  • where B(1)=\overset{\infty }{\underset{s=0}{\sum }}B_{s}
  • Rewriting this polynomial in terms of B(L), gives \begin{eqnarray} B(L)=[B(1)+(1-L)B^{\ast }(L)] \end{eqnarray}
  • and substituting into the above yields \begin{eqnarray} \Delta y_{t}=\mu +B(L)\varepsilon_{t}= \mu +[B(1)+(1-L)B^{\ast}(L)]\varepsilon_{t} \end{eqnarray}
35 / 55

Beveridge-Nelson Decomposition

  • For the decomposition, y_{t}=g_{t}+c_{t}, it follows that \Delta y_{t}=\Delta g_{t}+\Delta c_{t}
  • Therefore a change in the trend component of y_{t} equals \begin{eqnarray} \Delta g_{t}=\mu +B(1)\varepsilon_{t} \end{eqnarray}
  • and the change in the cyclical component is, \begin{eqnarray} \Delta c_{t}=(1-L)B^{\ast }(L)\varepsilon_{t} \end{eqnarray}
  • where we see that the trend follows a random walk with drift
  • This expression can be solved to yield \begin{eqnarray} g_{t}=g_{0}+\mu t+B(1)\overset{t}{\underset{s=1}{\sum }}\varepsilon_{s} \end{eqnarray}
36 / 55

Beveridge-Nelson Decomposition

  • As such, the trend consists of both a deterministic term \begin{eqnarray} g_{0}+\mu t \end{eqnarray} and a stochastic term \begin{eqnarray} B(1)\overset{t}{\underset{s=1}{\sum }}\varepsilon_{s} \end{eqnarray}
  • For B(1)=0, the trend reduces to a deterministic case
  • where for B(1)\neq 0, the stochastic part indicates the long-run impact of a shock \varepsilon_{t} on the level of y_{t}
37 / 55

Beveridge-Nelson Decomposition

  • The cyclical component is stationary and is given by \begin{eqnarray} c_{t}=B^{\ast }(L)\varepsilon_{t}=(1-L)^{-1}[B(L)-B(1)]\varepsilon_{t} \end{eqnarray}
  • Beveridge & Nelson (1981) showed that the stochastic trend could also be interpreted as the long-term forecast from a random walk plus drift model
  • Cycle is the stationary process that reflects the deviations from the trend
38 / 55

Beveridge-Nelson Decomposition

  • To estimate the BN decomposition in practice, assume an AR(1) process for the growth rate of output \begin{eqnarray} \Delta y_{t}=\phi \Delta y_{t-1}+\varepsilon_{t}, \end{eqnarray}
  • where we ignore the constant term
  • Assuming |\phi| <1, the AR(1) process can be written in terms of its infinite-order MA representation, where we find B(L), B(1) and B^{\ast }(L) as \begin{eqnarray} B(L) &=&\frac{1}{1-\phi L} \\ B(1) &=&\frac{1}{1-\phi } \\ B^{\ast }(L) &=&(1-L)^{-1}[B(L)-B(1)]=\frac{-\phi }{(1-\phi )(1-\phi L)} \end{eqnarray}
39 / 55

Beveridge-Nelson Decomposition

  • Solving in terms of y_{t} \begin{eqnarray} y_{t}=(1-L)^{-1}[B(1)+(1-L)B^{\ast }(L)]\varepsilon_{t} \end{eqnarray}
  • which can be rewritten as \begin{eqnarray} y_{t}=B(1)(1-L)^{-1}\varepsilon_{t}+(1-L)^{-1}[B(L)-B(1)]\varepsilon_{t} \end{eqnarray}
  • Substituting in for the AR(1) solution derived above, we have \begin{eqnarray} y_{t} &=&g_{t}+c_{t} \\ &\Downarrow & \\ y_{t} &=&\frac{1}{1-\phi }(1-L)^{-1}\varepsilon_{t}+\frac{-\phi }{(1-\phi L)(1-\phi )}\varepsilon_{t} \end{eqnarray}
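For the AR(1) case, the BN trend therefore reduces to the current level of the series plus \phi /(1-\phi ) times the deviation of current growth from its mean. A minimal sketch, again assuming log output in the placeholder array y_gdp:

```python
import numpy as np
import statsmodels.api as sm

dy = np.diff(y_gdp)                                # output growth

ar1 = sm.tsa.AutoReg(dy, lags=1, trend="c").fit()  # AR(1) with a constant
c0, phi = ar1.params                               # intercept and AR(1) coefficient
mu = c0 / (1 - phi)                                # implied mean growth rate

g_bn = y_gdp[1:] + (phi / (1 - phi)) * (dy - mu)   # permanent (trend) component
c_bn = y_gdp[1:] - g_bn                            # transitory (cycle) component
```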
40 / 55

Beveridge-Nelson Decomposition

  • The advantage of the Beveridge-Nelson method is that it is appropriate when a series is difference-stationary
  • It also allows the series to contain a unit root that can be highly volatile
  • However, it has the disadvantage of being rather time-consuming to compute
  • In addition, one has to choose between different ARMA(p,q) models that may give quite different results
  • Misrepresenting an I(2) process as an I(1) process may generate excess volatility in the trend
41 / 55

Figure : Evaluation of decompositions - SA output (1960Q1-2018Q4)

42 / 55

Figure : Leads and Lags - Correlation with GDP

43 / 55

Summary

  • Many economic and financial applications make use of decompositions for nonstationary time series, which are transformed into a permanent and a transitory component
  • Could use a linear filter where the trend is perturbed by transitory cyclical fluctuations
  • The Hodrick-Prescott (HP) filter is the most popular way to extract business cycles
  • The HP filter extracts a stochastic trend for a given value of the parameter \lambda
    • Trend moves smoothly over time and is uncorrelated with the cycle
    • Results are not robust to the value of the smoothness parameter
  • Another popular method used to measure the business cycle is the band pass (BP) filter
  • The filter removes (filters out) all the components in a series except those that correspond to the chosen frequency band
44 / 55

Summary

  • In the Beveridge and Nelson (BN) decomposition, the permanent component is shown to be a random walk with drift
  • The transitory component is a stationary process with zero mean, which is perfectly correlated with the permanent component
  • Different decompositions provide different results and should be interpreted with caution
  • Usually a good idea to consider different options before drawing conclusions
45 / 55

Wavelet transformations

  • Spectral decompositions define the rate at which the time series oscillates
  • Results in the loss of all time-based information
  • Assumes that the periodicity of all the components is consistent throughout the entire sample
  • This may not be the case:
    • Gabor (1946) developed the Short-Time Fourier Transform (STFT) technique
    • Involves the application of a number of Fourier transforms to different subsamples
    • Precision of the analysis is affected by the size of the subsample
  • Large subsamples are needed to identify changes at low frequencies
  • Small subsamples are needed to identify changes at higher frequencies
46 / 55

Wavelet transformations

  • Wavelet transformations capture features of time-series data across different frequencies that arise at different points in time
  • Wavelet functions are stretched and shifted to describe features that are localised in frequency and time
    • Could be expanded over a relatively long period of time when identifying low-frequency events
    • Could be relatively narrow when describing high frequency events
  • Involves shifting various wavelet functions with different amplitudes over the sample of data
  • One is then able to associate the components with specific time horizons that occur at different locations in time
  • Wavelets use scales rather than frequency bands, where the highest scale refers to the lowest frequency
47 / 55

Wavelet transformations

  • Early work with wavelet functions dates back to Haar (1910)
  • See, Hubbard (1998) and Heil (2006) for a detailed account of the history of wavelet analysis
  • For computation most studies currently employ the multiresolution decomposition of Mallat (1989) and Strang (1996)
  • Early applications of wavelet methods in economics include the work of Ramsey (1997), which made use of a wavelet decomposition of exchange rate data to describe the distribution of this data at different frequencies
48 / 55

Wavelet transformations

  • To describe this technique, consider a variable that is composed of a trend and a number of higher-frequency components
  • The trend may be represented by a father wavelet, \phi(t)
  • The mother wavelets, \psi(t), are used to describe information at lower scales (i.e. higher frequencies)
  • One could then describe variable x_t as \begin{eqnarray} x_t = \sum_k s_{0,k} \phi_{0,k} (t) + \sum_{j=1}^{J} \sum_k d_{j,k} \psi_{j,k} (t) \end{eqnarray}
  • where J refers to the number of scales, and k refers to the location of the wavelet in time
  • The s_{0,k} coefficients are termed smooth coefficients, since they represent the trend, and the d_{j,k} coefficients are termed the detail coefficients, since they represent finer details in the data
49 / 55

Wavelet transformations

  • Mother wavelet functions, \psi_{J,k} (t), \dots, \psi_{1,k} (t), are then generated by shifts in the location of the wavelet in time and scale \begin{eqnarray} \psi_{j,k} (t) = 2^{-j/2} \psi \left(\frac{t-2^{j}k}{2^j}\right), \;\; j=1,\dots,J \end{eqnarray}
  • where the shift parameter is represented by 2^{j}k and the scale parameter is 2^{j}
  • As depicted in the daublet wavelet functions, smaller values of j (which produce a smaller scale parameter 2^{j}), would provide the relatively tall and narrow wavelet function on the left
  • For larger values of j, the wavelet function is more spread out and of lower amplitude
  • After shifting this function by one period, we produce the function that is depicted on the right
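One way to obtain the daublet scaling and wavelet functions numerically is with the PyWavelets package. A minimal sketch:

```python
import pywt

wavelet = pywt.Wavelet("db4")
phi, psi, x = wavelet.wavefun(level=8)   # father (scaling) and mother (wavelet) functions on a fine grid
# shifted and rescaled versions psi_{j,k}(t) then follow from the formula above
```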
50 / 55

Figure : Daublet (4) wavelet functions - \psi_{1,0}(t) and \psi_{2,1}(t)

51 / 55

Wavelet transformations

  • Wavelet functions may be:
    • Smooth - decompose data into trend, cycle, noise (or various cycles)
    • Peaked - identify peak and trough of cycle
    • Square - identify structural breaks
  • Use smooth functions that include daublets, coiflets and symlets
  • Multiresolution techniques are used for computation, which includes the maximum overlap discrete wavelet transform (MODWT)
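As a rough illustration of such a multiresolution decomposition, the sketch below applies the ordinary discrete wavelet transform from PyWavelets with a daublet (4) function; infl is a hypothetical placeholder for the inflation series, and the MODWT itself would require a dedicated implementation (for example, the waveslim package in R):

```python
import numpy as np
import pywt

coeffs = pywt.wavedec(infl, "db4", level=4)      # [smooth, d4, d3, d2, d1]

# reconstruct the contribution of each detail level separately
details = []
for i in range(1, len(coeffs)):
    keep = [np.zeros_like(c) for c in coeffs]
    keep[i] = coeffs[i]
    details.append(pywt.waverec(keep, "db4")[: len(infl)])

# the smooth (trend) component from the remaining approximation coefficients
smooth = pywt.waverec([coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]], "db4")[: len(infl)]
```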
52 / 55

Figure : Daublet (4) wavelet decomposition - South African inflation

53 / 55

Figure : Daublet (4) wavelet decomposition - South African inflation

54 / 55

Wavelet transformations – Summary

  • Advantages:
    • Can be applied to data of any integration order
    • Has the benefits of spectral techniques without losing time support (very useful when identifying changes in the process at different frequencies)
    • Can include a number of bands, which are additive
55 / 55
