
Decomposing Time Series

Kevin Kotzé

1 / 55

Contents

  1. Introduction
  2. Spectral analysis
  3. Statistical detrending methods
  4. Deterministic trends
  5. Stochastic filters
  6. Beveridge-Nelson decompositions
  7. Wavelet decompositions
2 / 55

Introduction to decompositions

  • Most time series exhibit repetitive or regular behaviour over time
  • Hence a great part of the study of this data is conducted in the time domain, with ARIMA or state-space models
  • Another important feature of time series is that they may be decomposed into periodic variations of the underlying phenomenon
  • Shumway & Stoffer (2011) suggest these frequency decompositions should be expressed as Fourier frequencies that are driven by sines and cosines
  • This follows the tradition of Joseph Fourier
3 / 55

Identifying the business cycle

  • These decompositions may be used to describe the stylised facts of the business cycle:
  • Business cycle refers to regular periods of expansion and contraction in major economic aggregate variables (Burns and Mitchell, 1946)
    • i.e. persistence of economic fluctuations and correlations (or lack thereof) across economic aggregates
  • We say that a turning point occurs when the business cycle reaches a local maximum (peak) or a local minimum (trough)
  • May also be used to identify the output gap (difference between potential and actual output)
  • It is important to note that when seeking to measure the business cycle, there are no unique periodicities that are of relevance
  • In addition, most economic time series are both fluctuating and growing, which makes the decomposition quite difficult
4 / 55

Introduction to decompositions

  • We may imagine a system responding to various driving frequencies by producing linear combinations of sine and cosine functions
  • The frequency domain may be considered as a regression of a time series on periodic sines and cosines
  • This lecture considers a few widely used methods that are used to decompose economic time series
5 / 55

Spectral Analysis

  • A time series could be considered as a weighted sum of underlying series that have different cyclical patterns
  • The total variation of an observed time series will be the sum of each underlying series, which may vary in different frequencies
  • Spectral analysis is a tool that can be used to decompose the variation of a time series into different frequency components
6 / 55

Spectral Analysis

  • Consider an example of three quarterly time series variables, y_{t}, x_{t} and \upsilon_{t}
  • Where the term \upsilon_{t} \sim i.i.d. \; N(0,\sigma)
  • If x_{t}=y_{t}\beta +\upsilon_{t}, then after regressing y_{t} on x_{t}, we would expect to find that the coefficient would be large and significant, provided that \sigma is not too large
  • The rationale for this is that x_{t} contains information about y_{t}, which is reflected by the coefficient value for \beta
  • From an intuitive perspective, frequency domain analysis involves regressing a time series variable, x_{t}, on a number of different periodic frequency components
  • This would allow us to identify which frequency component is contained in x_{t}
7 / 55

Spectral Analysis

  • To define the rate at which a series oscillates, we define a cycle as one complete period of a sine or cosine function, \begin{eqnarray} y_{t}=A\cos (2\pi \omega t+\phi ) \end{eqnarray}
  • for t=0,\pm 1,\pm 2,\ldots, where \omega is a frequency index, defined in cycles per unit of time, which may be expressed in terms of the sample size T
  • A determines the height or amplitude of the function, and the starting point of the cosine function is termed the phase, \phi
8 / 55

Spectral Analysis

  • When seeking to conduct some form of data analysis, it is usually easier to use a trigonometric identity of this expression, which may be written as, \begin{eqnarray} y_{t}=U_{1}\cos (2\pi \omega t)+U_{2}\sin (2\pi \omega t) \end{eqnarray}
  • where U_{1}=A\cos \phi and U_{2}=-A\sin \phi are often taken to be normally distributed random variables
  • The above random process is also a function of its frequency, defined by the parameter \omega
  • The frequency is measured in cycles per unit of time
  • For \omega =1, the series makes one cycle per time unit
  • For \omega =0.50, the series completes a cycle every two time units
9 / 55

Spectral Analysis

  • To see how the spectral techniques can be used to interpret the regular frequencies in the series, consider the following four periodic time series \begin{eqnarray} x_{1,t} &=& 2\cos (2\pi t \, 6/100)+3\sin (2\pi t \, 6/100) \\ x_{2,t} &=& 4\cos (2\pi t \, 30/100)+5\sin (2\pi t \, 30/100) \\ x_{3,t} &=& 6\cos (2\pi t \, 40/100)+7\sin (2\pi t \, 40/100) \\ y_{t} &=& x_{1,t}+x_{2,t}+x_{3,t} \end{eqnarray}
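As a rough illustration, a short Python sketch along these lines could generate the three components and their sum over t=1,\ldots,100:

```python
import numpy as np

t = np.arange(1, 101)  # 100 observations

# three periodic components with 6, 30 and 40 cycles per 100 time points
x1 = 2 * np.cos(2 * np.pi * t * 6 / 100) + 3 * np.sin(2 * np.pi * t * 6 / 100)
x2 = 4 * np.cos(2 * np.pi * t * 30 / 100) + 5 * np.sin(2 * np.pi * t * 30 / 100)
x3 = 6 * np.cos(2 * np.pi * t * 40 / 100) + 7 * np.sin(2 * np.pi * t * 40 / 100)

y = x1 + x2 + x3  # the observed series is the sum of the underlying components
```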
10 / 55

Figure : Different frequency components

11 / 55

Spectral Analysis

  • Sorting out of the essential frequency components in a time series, including their relative contributions, constitutes one of the main objectives of spectral analysis
  • One way to accomplish this objective is to regress the data on sinusoids that vary at the different fundamental frequencies
  • This is represented by the periodogram (or sample spectral density) and may be expressed as, \begin{eqnarray} P(j/n)=\left( \frac{2}{n}\sum_{t=1}^{n}y_{t}\cos (2\pi tj/n)\right)^{2}+\left( \frac{2}{n}\sum_{t=1}^{n}y_{t}\sin (2\pi tj/n)\right)^{2} \end{eqnarray}
  • It may be regarded as a measure of the squared correlation of the data with sinusoids oscillating at a frequency of \omega_{j}=j/n, or j cycles in n time points (see the sketch below)
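In practice the periodogram is usually computed with the fast Fourier transform rather than by running these regressions. A minimal sketch, continuing the artificial series y constructed earlier:

```python
import numpy as np

n = len(y)
dft = np.fft.fft(y) / n                 # discrete Fourier transform, scaled by n
P = 4 * np.abs(dft[: n // 2]) ** 2      # scaled periodogram P(j/n)
freqs = np.arange(n // 2) / n           # frequencies j/n for j = 0, ..., n/2 - 1

# peaks should appear at 6/100, 30/100 and 40/100, with heights of roughly 13, 41 and 85
```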
12 / 55

Figure : Periodogram for frequency components

13 / 55

Spectral Analysis

  • An interesting exercise would be to construct the x_{1} series from y_{t}, which may be regarded as actual data
  • To do so we need to filter out all components that lie outside the chosen frequency band of x_{1}
  • Such a filter could operate with the aid of a regression model that contains the information that relates to a particular frequency (although there are more convenient ways of going about this, as in the sketch below)
  • Hence, cycles with frequencies corresponding to x_{2} and x_{3} would be excluded, while cycles with the frequency corresponding to x_{1} will be maintained (i.e. can pass through the filter)
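One of the more convenient approaches is to work directly in the frequency domain: set the unwanted Fourier coefficients to zero and invert the transform. A rough sketch, continuing the earlier example:

```python
import numpy as np

fft_y = np.fft.fft(y)
mask = np.zeros(n, dtype=bool)
mask[6] = mask[n - 6] = True            # keep only the 6-cycle frequency and its mirror image

x1_hat = np.real(np.fft.ifft(np.where(mask, fft_y, 0)))
# x1_hat should closely reproduce the x1 component constructed earlier
```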
14 / 55

Figure : Filtered result for frequency components

15 / 55

Methods for decomposing a time series

  • Various detrending methods provide different estimates of the cycle
  • Appropriate transformation should depend on the underlying dynamic properties
    • Usually a good idea to consider whether a series has a stochastic trend (i.e. unit root)
  • Assume that an economic time series can be decomposed into trend, g_{t}, and cycle, c_{t}: \begin{eqnarray} y_{t}=g_{t}+c_{t} \end{eqnarray}
  • where we abstract from a noise and seasonal component
  • Estimates of g_{t} and c_{t} may be obtained from various univariate detrending methods
16 / 55

Deterministic trends & filters

  • Early methods to decompose economic variables assumed that the (natural) growth path for the economy was largely deterministic
  • Therefore, the trend cycle decomposition was described as, \begin{eqnarray} y_{t} &=& g_{t}+c_{t} \\ \hat{g}_{t} &=& \hat{\alpha}_{0}+\hat{\alpha}_{1}t+\hat{\alpha}_{2}t^{2}+\ldots \\ \hat{c}_{t} &=& y_{t}-\hat{g}_{t} \end{eqnarray}

    • where the trend, \hat{g}_{t}, is found by simple estimation techniques
    • cycle corresponds to the residual in the series
  • When we assume a linear trend, |\alpha_{1}|>0 and \alpha_{2}=0
  • For a quadratic trend, |\alpha_{1}|>0 and |\alpha_{2}|>0
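A deterministic trend of this kind can be estimated by ordinary least squares. A minimal sketch, assuming log output is stored in an array y_gdp (a hypothetical placeholder):

```python
import numpy as np

T = len(y_gdp)
t = np.arange(1, T + 1)

X = np.column_stack([np.ones(T), t, t ** 2])   # constant, linear and quadratic terms
alpha_hat = np.linalg.lstsq(X, y_gdp, rcond=None)[0]

g_hat = X @ alpha_hat                          # estimated deterministic trend
c_hat = y_gdp - g_hat                          # cycle as the residual
```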
17 / 55

Figure : Linear decomposition - SA output (1960Q1-2018Q4)

18 / 55

Deterministic trends & filters

  • Previous graph displays the logarithm of South African GDP with a linear trend and cycle
  • However, productivity growth has not been perfectly log-linear (i.e. a constant growth rate) and is far from smooth
  • In addition, there are several structural breaks, such as the oil price shock in 1973/1974 & the recent GFC
  • To allow for a possible structural break in the trend, we could estimate, \begin{eqnarray} \hat{g}_{t}=\hat{\alpha}_{0}+\hat{\alpha}_{1}t+\hat{\alpha}_{2}DS_{t}(j)+\hat{\alpha}_{3}DL_{t}(k)+\ldots \end{eqnarray}
  • where DS_{t}(j) and DL_{t}(k) are dummy variables that capture the change in the slope or the level of the trend in periods j and k
  • Hence, DS_{t}(j)=t-j if t>j and DL_{t}(k)=1 if t>k, while they would be zero otherwise
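Continuing the previous sketch, the break dummies could be added to the regressor matrix as follows, where the break dates j and k are purely illustrative:

```python
import numpy as np

j, k = 56, 196                                 # illustrative break dates (roughly 1973Q4 and 2008Q4)
DS = np.where(t > j, t - j, 0)                 # slope-change dummy
DL = np.where(t > k, 1.0, 0.0)                 # level-shift dummy

Xb = np.column_stack([np.ones(T), t, DS, DL])
alpha_b = np.linalg.lstsq(Xb, y_gdp, rcond=None)[0]
g_break = Xb @ alpha_b                         # trend with slope and level breaks
```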
19 / 55

Deterministic trends & filters

  • Identifying structural breaks could be problematic
  • Detrending an integrated process with a deterministic trend may result in the introduction of a spurious cycle
  • See Nelson and Kang (1981) for details
20 / 55

Stochastic trends & filters

  • Let g_{t} represent a moving average of observed y_{t}
  • We can extract the trend component, g_{t}, by applying \begin{eqnarray} g_{t}=\sum_{j=-m}^{n}\omega_{j}y_{t-j} \end{eqnarray}
  • where m and n are positive integers and \omega_{j} are weights in the G(L) polynomial \begin{eqnarray} G(L)=\sum_{j=-m}^{n}\omega_{j}L^{j} \end{eqnarray}
  • where L is defined so that L^{j}y_{t}=y_{t-j} for positive and negative values of j
21 / 55

Stochastic trends & filters

  • The cyclical component is the difference between y_{t} and g_{t} \begin{eqnarray} c_{t}=\left[ 1-G(L)\right] y_{t}\equiv C(L)\, y_{t} \end{eqnarray}
  • where C(L) and G(L) are linear filters
  • Weights are chosen to add up to one, \sum_{j=-m}^{n}\omega_{j}=1, so that the level of the series is maintained
  • The moving-average filter with weight 1/5 may be obtained by filtering over the moving window of five observations, as sketched below \begin{eqnarray} g_{t}=\frac{1}{5}\sum_{j=-2}^{2}y_{t-j}=\frac{1}{5}\left( y_{t-2}+y_{t-1}+y_{t}+y_{t+1}+y_{t+2}\right) \end{eqnarray}
  • Will produce a smooth stochastic trend if the underlying data has such a trend
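A centred five-period moving average of this kind could be sketched as follows, again using the placeholder series y_gdp:

```python
import numpy as np

weights = np.ones(5) / 5                       # equal weights of 1/5 over t-2, ..., t+2
g_ma = np.convolve(y_gdp, weights, mode="same")
g_ma[:2] = np.nan                              # the first and last two observations
g_ma[-2:] = np.nan                             # lack a full five-period window
c_ma = y_gdp - g_ma                            # cyclical component
```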
22 / 55

Hodrick-Prescott (HP) filter

  • The HP filter has been a widely used approach to extract cycles in economic data
  • Extracts a stochastic trend, g_{t}, for a given value of \lambda, which is the smoothing parameter
  • Seeks to emphasize true business cycle frequencies
  • The filter can be obtained as the solution to the following problem: \begin{eqnarray} \min_{\{g_{t}\}} \sum_{t=1}^{T}\left( y_{t}-g_{t}\right)^{2}+\lambda \sum_{t=2}^{T-1}\left[ \left( g_{t+1}-g_{t}\right) -\left( g_{t}-g_{t-1}\right) \right]^{2} \end{eqnarray}
  • This minimization problem has a unique solution, and the filtered series, g_{t}, has the same length as y_{t}
  • Termed a low-pass filter, as only the low-frequency components are passed to the trend
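In practice the HP trend and cycle are usually obtained from a canned routine. A minimal sketch using the hpfilter function in statsmodels, again with the placeholder series y_gdp:

```python
import pandas as pd
from statsmodels.tsa.filters.hp_filter import hpfilter

# lambda = 1600 is the conventional choice for quarterly data
cycle_hp, trend_hp = hpfilter(pd.Series(y_gdp), lamb=1600)
```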
23 / 55

Hodrick-Prescott (HP) filter

  • The smoothness is determined by \lambda, which penalizes the acceleration in the growth component
  • If \lambda \rightarrow \infty, the lowest minimum is achieved when variability in the trend is zero (as in the case of a linear trend)
  • If \lambda =0, all of the variation is attributed to the trend (which equals the observed series), such that there will be no cycle
  • Hodrick and Prescott argue that \lambda =1600 is a reasonable choice for U.S. quarterly data
  • However, it is not necessarily the case that this value can be universally applied to other variables or to the output of other economies
  • Specifies a typical cycle of between eight and ten years when using traditional values for \lambda
24 / 55

Hodrick-Prescott (HP) filter

  • Another concern is the end-of-sample problem:
    • trend is close to observed data at the beginning and end of the sample
    • problematic when we are at the peak of a cycle
    • some researchers use forecasts to generate additional data at the end of the series
  • King & Rebelo (1993) note that the HP-filtered cyclical component contains both forward and backward differences
  • As a result, the end of sample properties are poor when you do not have an observation for t+1 or t-1
  • Method is also criticised on the basis that the smoothness of the stochastic trend component has to be determined a priori
25 / 55

Figure : HP filter - SA output (1960Q1-2018Q4)

26 / 55

Band pass filters

  • Band pass filters introduced to economic data by Baxter & King (1999) and Christiano & Fitzgerald (2003)
  • Identify all components that correspond to the chosen frequency band that has an upper and lower limit
  • Need to determine the periodicity of the business cycles one wants to extract
  • This is usually expressed within the frequency domain
27 / 55

Frequency-domain

  • Consider a time series, \begin{eqnarray} y_{t}=A\cos (2\pi \omega t) \end{eqnarray}
  • where A is the amplitude (height) of the cycle
  • \omega is the frequency of oscillation (the number of occurrences of a repeating event per unit of time)
  • 2 \pi measures the period of the cycles
  • t is the time
  • Hence, if y_{t}=A\cos (2\pi t), we will observe one cycle over the data sample
  • By increasing \omega, we increase the number of cycles
28 / 55

Figure : Artificial Data

29 / 55

Band Pass Filters

  • An intuitive measure of frequency is the amount of time that elapses per cycle, \lambda \begin{eqnarray} \lambda =2\pi /\omega \end{eqnarray}
  • Where we have quarterly data, consider the frequency \omega_{h} corresponding to a cycle length of 1.5 years (6 quarters)
  • Set \lambda=6 quarters per cycle and solve for \omega_{h} in 6=2\pi /\omega_{h} \begin{eqnarray} \omega_{h}=2\pi /6=\pi /3 \end{eqnarray}
  • Similarly, the frequency corresponding to a low-frequency cycle length of 8 years (32 quarters) is: \begin{eqnarray} \omega_{l}=2\pi /32=\pi /16 \end{eqnarray}
30 / 55

Baxter and King (1999)

  • Baxter and King (1999) decompose a time series into three periodic components: trend, cycle, and irregular fluctuations
  • Business cycles were defined as periodic components whose frequencies lie between 1.5 and 8 years per cycle
  • Periodic components with lengths longer than 8 years were identified with the trend
  • Periodic components of less than 1.5 years were identified with the irregular component \begin{eqnarray} B(\omega ) &=&1\text{ for }\omega \in \lbrack \pi /16,\pi /3]\text{ or }[-\pi /3,-\pi /16 ]\\ &=&0\text{ otherwise} \end{eqnarray}
  • Hence, the interval B(\omega ) = [\pi /16,\pi /3] can be interpreted as the business cycle frequency
  • The interval [0,\pi /16] corresponds to the trend and [\pi /3,\pi ] defines irregular fluctuations
31 / 55

Band Pass Filters

  • While Baxter and King favour a 3-part decomposition, other economists prefer a two-part classification
  • This may be incorporated in this setup, where \begin{eqnarray} H(\omega ) &=&1\text{ for }\omega \in \lbrack \pi /16,\pi ]\text{ or }[\text{-}\pi ,-\pi /16] \\ &=&0\text{ otherwise} \end{eqnarray}
  • The trend component is still defined in terms of fluctuations lasting more than 8 years
  • Cyclical component now consists of all oscillations lasting 8 years or less
  • This is known as a high pass filter, as only higher frequency components are captured in H(\omega )
  • As with the HP filter one has to decide on the preferred frequencies for the cycles a priori
  • There are potential end-of-sample problems, but the estimates at the start and end of the sample are usually discarded
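As an illustration, the Baxter-King band pass filter is available in statsmodels. A minimal sketch with the usual 6 to 32 quarter band and K = 12 leads and lags, again using the placeholder series y_gdp:

```python
import pandas as pd
from statsmodels.tsa.filters.bk_filter import bkfilter

# pass cycles of 6 to 32 quarters; K = 12 observations are lost at each end of the sample
cycle_bk = bkfilter(pd.Series(y_gdp), low=6, high=32, K=12)
```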
32 / 55

Figure : BP filter - SA output (1960Q1-2018Q4)

33 / 55

Beveridge-Nelson Decomposition

  • Beveridge & Nelson (1981) model the trend as a random walk with drift, and the cycle is treated as a stationary process with zero mean
  • To perform this decomposition, let y_{t} be integrated of order one, so that its first difference, \Delta y_{t}, is stationary
  • Assume it has the following moving average representation \begin{eqnarray} (1-L)y_{t}=\Delta y_{t}= \mu +B(L)\varepsilon_{t} \end{eqnarray}
34 / 55

Beveridge-Nelson Decomposition

  • The BN decomposition explores the following
  • First, define the polynomial, \begin{eqnarray} B^{\ast }(L)=(1-L)^{-1}[B(L)-B(1)] \end{eqnarray}
  • where B(1)=\overset{\infty }{\underset{s=0}{\sum }}B_{s}
  • Rewriting this polynomial in terms of B(L), gives \begin{eqnarray} B(L)=[B(1)+(1-L)B^{\ast }(L)] \end{eqnarray}
  • and substituting into the above yields \begin{eqnarray} \Delta y_{t}=\mu +B(L)\varepsilon_{t}= \mu +[B(1)+(1-L)B^{\ast}(L)]\varepsilon_{t} \end{eqnarray}
35 / 55

Beveridge-Nelson Decomposition

  • For the decomposition, y_{t}=g_{t}+c_{t}, it follows that \Delta y_{t}=\Delta g_{t}+\Delta c_{t}
  • Therefore a change in the trend component of y_{t} equals \begin{eqnarray} \Delta g_{t}=\mu +B(1)\varepsilon_{t} \end{eqnarray}
  • and the change in the cyclical component is, \begin{eqnarray} \Delta c_{t}=(1-L)B^{\ast }(L)\varepsilon_{t} \end{eqnarray}
  • where we see that the trend follows a random walk with drift
  • This expression can be solved to yield \begin{eqnarray} g_{t}=g_{0}+\mu t+B(1)\overset{t}{\underset{s=1}{\sum }}\varepsilon_{s} \end{eqnarray}
36 / 55

Beveridge-Nelson Decomposition

  • As such, the trend consists of both a deterministic term \begin{eqnarray} g_{0}+\mu t \end{eqnarray} and a stochastic term \begin{eqnarray} B(1)\overset{t}{\underset{s=1}{\sum }}\varepsilon_{s} \end{eqnarray}
  • For B(1)=0, the trend reduces to a deterministic case
  • where for B(1)\neq 0, the stochastic part indicates the long-run impact of a shock \varepsilon_{t} on the level of y_{t}
37 / 55

Beveridge-Nelson Decomposition

  • The cyclical component is stationary and is given by \begin{eqnarray} c_{t}=B^{\ast }(L)\varepsilon_{t}=(1-L)^{-1}[B(L)-B(1)]\varepsilon_{t} \end{eqnarray}
  • Beveridge & Nelson (1981) showed that the stochastic trend could also be interpreted as the long-term forecast from a random walk plus drift model
  • Cycle is the stationary process that reflects the deviations from the trend
38 / 55

Beveridge-Nelson Decomposition

  • To estimate the BN decomposition in practice, assume an AR(1) process for the growth rate of output \begin{eqnarray} \Delta y_{t}=\phi \Delta y_{t-1}+\varepsilon_{t}, \end{eqnarray}
  • where we ignore the constant term
  • Assuming |\phi| <1, the AR(1) process can be written in terms of its infinite-order MA representation, where we find B(L), B(1) and B^{\ast }(L) as \begin{eqnarray} B(L) &=&\frac{1}{1-\phi L} \\ B(1) &=&\frac{1}{1-\phi } \\ B^{\ast }(L) &=&(1-L)^{-1}[B(L)-B(1)]=\frac{-\phi }{(1-\phi )(1-\phi L)} \end{eqnarray}
39 / 55

Beveridge-Nelson Decomposition

  • Solving in terms of y_{t} \begin{eqnarray} y_{t}=(1-L)^{-1}[B(1)+(1-L)B^{\ast }(L)]\varepsilon_{t} \end{eqnarray}
  • which can be rewritten as \begin{eqnarray} y_{t}=B(1)(1-L)^{-1}\varepsilon_{t}+(1-L)^{-1}[B(L)-B(1)]\varepsilon_{t} \end{eqnarray}
  • Substituting in for the AR(1) solution derived above, we have \begin{eqnarray} y_{t} &=&g_{t}+c_{t} \\ &\Downarrow & \\ y_{t} &=&\frac{1}{1-\phi }(1-L)^{-1}\varepsilon_{t}+\frac{-\phi }{(1-\phi L)(1-\phi )}\varepsilon_{t} \end{eqnarray}
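For the AR(1) case, the BN trend therefore reduces to the current level of the series plus \phi /(1-\phi ) times the deviation of current growth from its mean. A minimal sketch, again assuming log output in the placeholder array y_gdp:

```python
import numpy as np
import statsmodels.api as sm

dy = np.diff(y_gdp)                                # output growth

ar1 = sm.tsa.AutoReg(dy, lags=1, trend="c").fit()  # AR(1) with a constant
c0, phi = ar1.params                               # intercept and AR(1) coefficient
mu = c0 / (1 - phi)                                # implied mean growth rate

g_bn = y_gdp[1:] + (phi / (1 - phi)) * (dy - mu)   # permanent (trend) component
c_bn = y_gdp[1:] - g_bn                            # transitory (cycle) component
```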
40 / 55

Beveridge-Nelson Decomposition

  • The advantage of the Beveridge-Nelson method is that it is appropriate when a series is difference-stationary
  • It also allows the series to contain a unit root that can be highly volatile
  • However, it has the disadvantage of being rather time-consuming to compute
  • In addition, one has to choose between different ARMA(p,q) models that may give quite different results
  • Misrepresenting an I(2) process as an I(1) process may generate excess volatility in the trend
41 / 55

Figure : Evaluation of decompositions - SA output (1960Q1-2018Q4)

42 / 55

Figure : Leads and Lags - Correlation with GDP

43 / 55

Summary

  • Many economic and financial applications make use of decompositions for nonstationary time series, which are transformed into a permanent and a transitory component
  • Could use a linear filter where the trend is perturbed by transitory cyclical fluctuations
  • The Hodrick-Prescott (HP) filter is the most popular way to extract business cycles
  • The HP filter extracts a stochastic trend for a given value of the parameter \lambda
    • Trend moves smoothly over time and is uncorrelated with the cycle
    • Results are not robust to the value of the smoothness parameter
  • Another popular method used to measure the business cycle is the band pass (BP) filter
  • The filter removes (filters out) all the components in a series except those that correspond to the chosen frequency band
44 / 55

Summary

  • In the Beveridge and Nelson (BN) decomposition, the permanent component is shown to be a random walk with drift
  • The transitory component is a stationary process with zero mean, which is perfectly correlated with the permanent component
  • Different decompositions provide different results and should be interpreted with caution
  • Usually a good idea to consider different options before drawing conclusions
45 / 55

Wavelet transformations

  • Spectral decompositions define the rate at which the time series oscillates
  • Results in the loss of all time-based information
  • Assumes that the periodicity of all the components is consistent throughout the entire sample
  • This may not be the case:
    • Gabor (1946) developed the Short-Time Fourier Transform (STFT) technique
    • Involves the application of a number of Fourier transforms to different subsamples
    • Precision of the analysis is affected by the size of the subsample
  • Large subsamples are needed to identify changes at low frequencies
  • Small subsamples are needed to identify changes at higher frequencies
46 / 55

Wavelet transformations

  • Wavelet transformations capture features of time-series data across different frequencies that arise at different points in time
  • Wavelet functions are stretched and shifted to describe features that are localised in frequency and time
    • Could be expanded over a relatively long period of time when identifying low-frequency events
    • Could be relatively narrow when describing high frequency events
  • Involves shifting various wavelet functions with different amplitudes over the sample of data
  • One is then able to associate the components with specific time horizons that occur at different locations in time
  • Wavelets use scales rather than frequency bands, where the highest scale refers to the lowest frequency
47 / 55

Wavelet transformations

  • Early work with wavelet functions dates back to Haar (1910)
  • See, Hubbard (1998) and Heil (2006) for a detailed account of the history of wavelet analysis
  • For computation most studies currently employ the multiresolution decomposition of Mallat (1989) and Strang (1996)
  • Early applications of wavelet methods in economics include the work of Ramsey (1997), which made use of a wavelet decomposition of exchange rate data to describe the distribution of this data at different frequencies
48 / 55

Wavelet transformations

  • To describe this technique, consider a variable that is composed of a trend and a number of higher-frequency components
  • The trend may be represented by a father wavelet, \phi(t)
  • The mother wavelets, \psi(t), are used to describe information at lower scales (i.e. higher frequencies)
  • One could then describe variable x_t as \begin{eqnarray} x_t = \sum_k s_{0,k} \phi_{0,k} (t) + \sum_{j=1}^{J} \sum_k d_{j,k} \psi_{j,k} (t) \end{eqnarray}
  • where J refers to the number of scales, and k refers to the location of the wavelet in time
  • The s_{0,k} coefficients are termed smooth coefficients, since they represent the trend, and the d_{j,k} coefficients are termed the detail coefficients, since they represent finer details in the data
49 / 55

Wavelet transformations

  • Mother wavelet functions, \psi_{J,k} (t), \dots, \psi_{1,k} (t), are then generated by shifts in the location of the wavelet in time and scale \begin{eqnarray} \psi_{j,k} (t) = 2^{-j/2} \psi \left(\frac{t-2^{j}k}{2^j}\right), \;\; j=1,\dots,J \end{eqnarray}
  • where the shift parameter is represented by 2^{j}k and the scale parameter is 2^{j}
  • As depicted in the daublet wavelet functions, smaller values of j (which produce a smaller scale parameter 2^{j}), would provide the relatively tall and narrow wavelet function on the left
  • For larger values of j, the wavelet function is more spread out and of lower amplitude
  • After shifting this function by one period, we produce the function that is depicted on the right
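One way to obtain the daublet scaling and wavelet functions numerically is with the PyWavelets package. A minimal sketch:

```python
import pywt

wavelet = pywt.Wavelet("db4")
phi, psi, x = wavelet.wavefun(level=8)   # father (scaling) and mother (wavelet) functions on a fine grid
# shifted and rescaled versions psi_{j,k}(t) then follow from the formula above
```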
50 / 55

Figure : Daublet (4) wavelet functions - \psi_{1,0}(t) and \psi_{2,1}(t)

51 / 55

Wavelet transformations

  • Wavelet functions may be:
    • Smooth - decompose data into trend, cycle, noise (or various cycles)
    • Peaked - identify peak and trough of cycle
    • Square - identify structural breaks
  • Use smooth functions that include daublets, coiflets and symlets
  • Multiresolution techniques are used for computation, which includes the maximum overlap discrete wavelet transform (MODWT)
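As a rough illustration of such a multiresolution decomposition, the sketch below applies the ordinary discrete wavelet transform from PyWavelets with a daublet (4) function; infl is a hypothetical placeholder for the inflation series, and the MODWT itself would require a dedicated implementation (for example, the waveslim package in R):

```python
import numpy as np
import pywt

coeffs = pywt.wavedec(infl, "db4", level=4)      # [smooth, d4, d3, d2, d1]

# reconstruct the contribution of each detail level separately
details = []
for i in range(1, len(coeffs)):
    keep = [np.zeros_like(c) for c in coeffs]
    keep[i] = coeffs[i]
    details.append(pywt.waverec(keep, "db4")[: len(infl)])

# the smooth (trend) component from the remaining approximation coefficients
smooth = pywt.waverec([coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]], "db4")[: len(infl)]
```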
52 / 55

Figure : Daublet (4) wavelet decomposition - South African inflation

53 / 55

Figure : Daublet (4) wavelet decomposition - South African inflation

54 / 55

Wavelet transformations – Summary

  • Advantages:
    • Can be applied to data of any integration order
    • Has the benefits of spectral techniques without losing time support (very useful when identifying changes in the process at different frequencies)
    • Can include a number of bands, which are additive
55 / 55
