Bhundia and Ricci (2005) note that between end-April and end-August in 1998, the South African rand depreciated by 28% in nominal terms against the U.S. dollar. This was accompanied by increases of around 700 basis points in short-term interest rates and long-term bond yields, while sovereign U.S. dollar-denominated bond spreads increased by about 400 basis points. At the same time share prices fell by 40% and output contracted during the third quarter of 1998 (quarter-on-quarter). Once again, in 2001, the rand depreciated by 26% in nominal terms against the U.S. dollar between end-September and end-December, but short-term interest rates remained stable, long-term bond yields increased by less than 100 basis points and sovereign U.S. dollar-denominated bond spreads narrowed by about 40 basis points. Share prices rose by 28%, and real GDP increased.

What drives such extraordinary changes in relative currency valuations, and can we predict their direction and magnitude? On the one hand, the answer to this question must be yes, since financial institutions devote substantial resources to producing forecasts for their clients, and forecasting firms successfully market currency forecasts. However, the answer may be no, since economic models often fail to explain exchange rate movements after the fact.

Corporations use currency forecasts in a variety of contexts: quantifying foreign exchange risk, setting prices for their products in foreign markets, valuing foreign projects, developing international operational strategies, and managing working capital. International portfolio managers use exchange rate forecasts to evaluate the desirability of investing in particular foreign equity and bond markets and whether to hedge the associated currency risks.

Should managers purchase currency forecasts? If markets are relatively efficient, it should be difficult to produce better short-term forecasts than forward exchange rates suggest or better long-term forecasts than uncovered interest rate parity predicts. Yet, we have seen evidence that would suggest that these parity conditions do not always hold, especially in the short run. Therefore, currency forecasts are potentially valuable.

In the section on exchange rate determination we suggested that some macroeconomic variables (fundamentals) may influence the behaviour of the exchange rate, while the exchange rate may in turn influence certain macroeconomic variables. Hence, we could envision a true model of the two economies (domestic and foreign) that includes all of these variables and incorporates full information and expectations. In the context of economic optimisation and random events, such a model would generate the time path of the exchange rate between the two currencies.

Forecasting can be thought of as the formal process for generating expectations through the use of economic and financial theory, as well as all available mathematical and statistical techniques. When expectations for future economic variables are derived, we have an implicit forecast of the variable in question, the exchange rate. The rational expectations theory says that people form expectations of future values of the exchange rate and other variables in the same way that the true model of the economy generates these variables. Forecasting is very common and necessary in our times. People take forecasting into consideration when they make economic decisions. These decisions then influence the direction in which the economy will move. Cash flows of all international transactions are affected by the expected value of the exchange rates; therefore, forecasting exchange rate movements is very important for businesses, investors, and policy makers.

Multinational corporations (MNCs) need forecasts of exchange rates for their hedging decisions. Firms face the decision of whether or not to hedge future payables and receivables that are denominated in foreign currencies. Short-term financial and investment decisions require exchange rate forecasts to determine the ideal currency for borrowing and for holding cash, so as to maximise the return on an investment. Capital budgeting decisions also make use of forecasts for exchange rates to determine the expected cash flows from foreign investments. In addition, long-term financial decisions require these forecasts to decide where to borrow money (borrowing costs fall if the currency of the loan depreciates) and whether it is better to issue a bond denominated in foreign currency. Furthermore, earnings assessments need forecasts for the foreign currency in which the earnings will be derived, to decide whether earnings should be remitted back to the parent company or invested abroad.

For all of these reasons we are going to consider the use of different forecasting techniques in what follows.

Technical forecasting involves the use of historical exchange rate data to predict future values. It is sometimes conducted in a judgemental manner, without statistical analysis. From a corporate point of view, the use of technical forecasting may be limited because it focuses on the near future, which is not very helpful in developing corporate policies. Many researchers represent the exchange rate as the general solution to a linear stochastic difference equation, which may consist of four distinct parts:

\[\begin{eqnarray} s_t = trend + cyclical + seasonal + irregular \tag{1.1} \end{eqnarray}\]

This specification for the exchange rate suggests that it has no obvious tendency for mean reversion. A critical task for econometricians is to develop simple stochastic difference equation models that can mimic the behaviour of trending variables. The key feature of the trend, cyclical and seasonal components is that they have permanent effects on the time series variable. Since the irregular component is stationary, the effects of any irregular components will dissipate over time, while the other elements will continue to influence the long-term forecasts.

In addition to generating forecasts for the expected mean of an exchange rate, or the change in the exchange rate, one may also wish to generate a forecast for the volatility of the exchange rate. One approach to forecasting this volatility is to explicitly introduce an independent variable that helps to predict it. Consider the simplest case in which,

\[\begin{eqnarray} S_{t+1} = \varepsilon_{t+1} X_{t} \tag{1.2} \end{eqnarray}\]

where, \(S_{t+1}\) is the spot exchange rate (the variable of interest), \(\varepsilon_{t+1}\) is a white-noise disturbance term with variance \(\sigma^2\), and \(X_{t}\) is an independent variable that can be observed at period \(t\). If \(X_{t} = X_{t-1} = X_{t-2} = \ldots\) = constant, the \(\{S_t\}\) sequence is the familiar white-noise process with a constant variance. If the realisations of the \(\{X_{t}\}\) sequence are not all equal, the variance of \(S_{t+1}\) conditional on the observable value of \(X_{t}\) is

\[\begin{eqnarray} \mathsf{var}(S_{t+1} | X_{t} ) = X_{t}^2 \sigma^2 \tag{1.3} \end{eqnarray}\]

Such a process could be modelled with the aid of a conditional heteroskedastic model, the most common of which makes use of the Generalised Autoregressive Conditional Heteroskedastic (GARCH) framework. In what follows we focus our attention on forecasts for the expected mean value of the exchange rate.
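The conditional variance in eq. (1.3) can be checked with a small simulation. The sketch below is illustrative only: the function names, the Gaussian assumption for \(\varepsilon_{t+1}\), and the values chosen for \(\sigma\) and \(X_t\) are assumptions that do not come from the text.

```python
import random
import statistics


def conditional_variance(x_t, sigma):
    """Variance of S_{t+1} given X_t, as in eq. (1.3): X_t^2 * sigma^2."""
    return (x_t ** 2) * (sigma ** 2)


def simulate_next_spot(x_t, sigma, n_draws, seed=0):
    """Draw S_{t+1} = eps_{t+1} * X_t, with Gaussian white noise (an assumption)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) * x_t for _ in range(n_draws)]


# Hypothetical values: sigma = 0.5 and two different observed values of X_t.
for x_t in (1.0, 3.0):
    draws = simulate_next_spot(x_t, sigma=0.5, n_draws=50_000)
    # Theoretical conditional variance next to its simulated counterpart.
    print(x_t, conditional_variance(x_t, 0.5), statistics.pvariance(draws))
```

A larger observed \(X_t\) scales up the dispersion of \(S_{t+1}\), which is the sense in which \(X_t\) helps to predict volatility.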

One of the basic characteristics of \(S_t\) that can be described relatively easily is its long-term growth path. Despite the difficulties that may be incurred with forecasting the short-run upward and downward movements, it is possible that \(S_t\) might exhibit a clear long-term trend. Such a trend may be deterministic, where we are able to derive the future value of the trending process with absolute certainty, or it could be stochastic, where the future value of the trending process incorporates a random element that would be associated with certain probabilistic assumptions.

There are many models that describe this deterministic trend and they can be used to forecast or extrapolate future predicted values for \(S_t\). Such models would include the following:

**(a) Linear time trend:**

\[\begin{eqnarray} S_t = \alpha_0 + \alpha_1 t + \varepsilon_{t} \tag{1.4} \end{eqnarray}\]

**(b) Log linear time trend:**

\[\begin{eqnarray} s_t = \beta_0+ \beta_1t + \varepsilon_{t} \tag{1.5} \end{eqnarray}\]

**(c) Quadratic time trend:**

\[\begin{eqnarray} s_t = \gamma_0 + \gamma_1 t + \gamma_2 t^2 + \varepsilon_{t} \tag{1.6} \end{eqnarray}\]

**(d) Polynomial time trend:**

\[\begin{eqnarray} s_t = \delta_0 + \delta_1 t + \delta_2 t^2 + \ldots + \delta_n t^n + \varepsilon_{t} \tag{1.7} \end{eqnarray}\]

Where, \(S_t\) is the spot exchange rate, \(t = \{1, 2, 3, \ldots \}\) is the time trend, \(n\) is the \(n^{\text{th}}\)-degree polynomial, and \(s_t = \log S_t\) (lower-case letters are the natural logarithms of the upper-case counterparts).
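As a minimal sketch of how the linear time trend in eq. (1.4) could be estimated and extrapolated, the code below uses ordinary least squares written out by hand. The function names and the sample observations are hypothetical and serve only to illustrate the mechanics.

```python
def fit_linear_trend(series):
    """OLS estimates (alpha_0, alpha_1) for S_t = alpha_0 + alpha_1 * t + e_t, t = 1..T."""
    n = len(series)
    t_values = range(1, n + 1)
    t_mean = sum(t_values) / n
    s_mean = sum(series) / n
    cov = sum((t - t_mean) * (s - s_mean) for t, s in zip(t_values, series))
    var = sum((t - t_mean) ** 2 for t in t_values)
    slope = cov / var
    intercept = s_mean - slope * t_mean
    return intercept, slope


def extrapolate_trend(intercept, slope, n_obs, horizon):
    """Deterministic trend forecast for period T + h."""
    return intercept + slope * (n_obs + horizon)


# Hypothetical spot-rate observations (not real data).
spot = [10.0, 10.4, 10.9, 11.1, 11.6, 12.0]
a0, a1 = fit_linear_trend(spot)
print(extrapolate_trend(a0, a1, len(spot), horizon=2))
```

The log-linear trend in eq. (1.5) could be handled in the same way by passing the natural logarithm of each observation to `fit_linear_trend`.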

When the trend incorporates a random component we are no longer able to predict the future value of the trend with absolute certainty. These trends are stochastic, where some of these models could take the following form:

**(a) The random walk model:**

The random walk model is a special case of the AR(1) process, which takes the form:

\[\begin{eqnarray} s_t = \alpha_0 + \alpha_1 s_{t-1} + \varepsilon_{t} \tag{1.8} \end{eqnarray}\]

where \(\alpha_0 = 0\) and \(\alpha_1 = 1\). Therefore, the random walk is usually expressed as,

\[\begin{eqnarray} s_t = s_{t-1} + \varepsilon_{t} \tag{1.9} \end{eqnarray}\]

where, \(s_t - s_{t-1} = \Delta s_t = \varepsilon_{t}\)

The conditional expected mean of \(s_{t+h}\), for any \(h > 0\), is

\[\begin{eqnarray} \mathbb{E}_t s_{t+h} = s_t + \mathbb{E}_t \sum_{i=1}^{h} \varepsilon_{t+i} = s_t \tag{1.10} \end{eqnarray}\]

In this case the variance is time dependent,

\[\begin{eqnarray} \mathsf{var} (s_t ) = \mathsf{var} ( \varepsilon_{t} + \varepsilon_{t-1} + \ldots + \varepsilon_1 ) = t \sigma^2 \tag{1.11} \end{eqnarray}\]

and since the variance is not constant, the random walk process is nonstationary, such that as \(t \rightarrow \infty\), it will also be the case that \(\mathsf{var}(s_t ) \rightarrow \infty\). The forecast function for this model will be,

\[\begin{eqnarray} \mathbb{E}_t s_{t+h} = s_t \tag{1.12} \end{eqnarray}\]

Hence, the constant value of \(s_t\) is the unbiased estimator of all future values of \(s_{t+h}\) for all \(h > 0\).

**(b) The random walk plus drift model:**

The random walk plus drift model augments the random walk model by adding a constant \(\alpha_0\). In this case, the variable \(s_t\) may incorporate both deterministic and stochastic characteristics:

\[\begin{eqnarray} s_t = s_{t-1} + \alpha_0 + \varepsilon_{t} \tag{1.13} \end{eqnarray}\]

For a given initial condition for the exchange rate, \(s_{0}\), the solution for \(s_t\) is obtained by repeated substitution:

\[\begin{eqnarray} s_t = s_{0} + \alpha_0 t + \sum_{i=1}^{t} \varepsilon_i \tag{1.14} \end{eqnarray}\]

such that the forecast function from the initial condition would be

\[\begin{eqnarray} \mathbb{E}_0 s_{t+h} = s_{0} + \alpha_0 (t + h) \tag{1.15} \end{eqnarray}\]

This expression could be generalised to provide the forecast function for \(h\) periods ahead, where:

\[\begin{eqnarray} \mathbb{E}_t s_{t+h} = s_{t} + \alpha_0 h \tag{1.16} \end{eqnarray}\]
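The forecast functions in eqs. (1.12) and (1.16) are simple enough to sketch directly. In the example below the drift \(\alpha_0\) is estimated as the sample mean of the first differences, which is one common choice and an assumption here. The function names are hypothetical.

```python
def estimate_drift(log_spot):
    """Sample mean of first differences, an estimate of alpha_0 in eq. (1.13)."""
    diffs = [b - a for a, b in zip(log_spot, log_spot[1:])]
    return sum(diffs) / len(diffs)


def random_walk_forecast(log_spot, horizon, with_drift=True):
    """h-step forecast s_t + alpha_0 * h (eq. 1.16); zero drift gives eq. (1.12)."""
    drift = estimate_drift(log_spot) if with_drift else 0.0
    return log_spot[-1] + drift * horizon


# Hypothetical log exchange rates (not real data).
s = [2.30, 2.32, 2.31, 2.35, 2.38]
print(random_walk_forecast(s, horizon=3, with_drift=False))  # last observed value
print(random_walk_forecast(s, horizon=3, with_drift=True))   # last value plus 3 * drift
```

Without drift the forecast is flat at the last observation for every horizon, while the drift term adds a linear extrapolation.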

**(c) The local linear trend model:**

The local linear trend model is constructed from several random walk plus noise processes, where one process is used for the stochastic level and the other is used for the stochastic trend. In this case we assume that \(\{\eta_{t} \}\), \(\{\varepsilon_{t} \}\), and \(\{\upsilon_t \}\) are three independent and identically distributed Gaussian white-noise processes. The local linear trend model could then be represented by the equations,

\[\begin{eqnarray} \nonumber s_t &=& \mu_{t} + \eta_{t}\\ \nonumber \mu_{t} &=& \mu_{t-1} + \alpha_t + \varepsilon_{t}\\ \alpha_t &=& \alpha_{t-1} + \upsilon_t \tag{1.17} \end{eqnarray}\]

The local linear trend model consists of the stochastic level, \(\mu_{t}\), and the stochastic slope, \(\alpha_t\). What is interesting about the model is that the other models are special cases of the local linear trend model. To observe the dynamics that are present in this process, consider the case where \(\alpha_1 = 1\). When \(\upsilon_1\), \(\varepsilon_1\) and \(\mu_0\) are equal to zero, then \(\mu_1\) will equal \(1\). Now if both \(\upsilon_2\) and \(\varepsilon_2\) are again equal to zero, then \(\mu_2\) will equal \(2\). This implies that the \(\alpha_t\) process influences the slope of the trend in \(s_t\) while the \(\mu_t\) process influences the level. In this case, both of these elements are stochastic and would influence the forecast function of \(s_{t+h}\).
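The dynamics described above can be traced by iterating eq. (1.17) directly. The following is a minimal sketch in which the function name and argument layout are assumptions. The shocks are supplied as lists, so that setting them all to zero reproduces the numerical example in the text.

```python
def local_linear_trend(mu_0, alpha_0, eps, ups, eta):
    """Iterate eq. (1.17) and return a list of (s_t, mu_t, alpha_t) for t = 1..T."""
    mu, alpha = mu_0, alpha_0
    path = []
    for e, u, n in zip(eps, ups, eta):
        alpha = alpha + u      # alpha_t = alpha_{t-1} + upsilon_t
        mu = mu + alpha + e    # mu_t = mu_{t-1} + alpha_t + eps_t
        s = mu + n             # s_t = mu_t + eta_t
        path.append((s, mu, alpha))
    return path


# All shocks are zero, mu_0 = 0, and the slope starts at 1 (so that alpha_1 = 1).
zeros = [0.0, 0.0, 0.0]
for s_t, mu_t, alpha_t in local_linear_trend(0.0, 1.0, zeros, zeros, zeros):
    print(s_t, mu_t, alpha_t)
```

With no shocks the level rises by one unit per period. A non-zero \(\upsilon_t\) would permanently change the slope, while a non-zero \(\varepsilon_t\) would permanently shift the level.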

In this section, we discuss a number of traditional time series models that may be used to provide a forecast for the exchange rate. The objective is to develop models that explain the movement of the \(s_t\) variable over time, where we are looking to explain the relationship that \(s_t\) has with its previous values and how it is related to the values of past shocks that have affected the evolution of this variable.

In the autoregressive process of order \(p\), it is assumed that the current observation \(s_t\) is generated by a weighted average of past observations of the exchange rate, going back \(p\) periods. Shocks to this process take the form of a random disturbance. This model may be denoted by the AR(\(p\)) representation that is written as:

\[\begin{eqnarray} s_t = \phi_0 + \phi_1 s_{t-1} + \phi_2 s_{t-2} + \ldots + \phi_p s_{t-p} + \varepsilon_{t} \tag{1.18} \end{eqnarray}\]

where, \(\phi_0\) is a constant term, which is used to describe the mean of the stochastic process. The first-order process AR(\(1\)) may then be expressed as,

\[\begin{eqnarray} s_t = \phi_0 + \phi_1 s_{t-1} + \varepsilon_{t} \tag{1.19} \end{eqnarray}\]

The mean value for \(s_t\) is then given by \(\mu = \phi_0 / (1-\phi_1)\), and the process is stationary if \(|\phi_1 | < 1\). Given the recursive structure of the model it is relatively straightforward to generate the forecast function. If we assume that \(\phi_0\) takes on a value of zero, then after updating by one period we obtain,

\[\begin{eqnarray} s_{t+1} = \phi_1 s_t + \varepsilon_{t+1} \tag{1.20} \end{eqnarray}\]

while after updating the process for two periods we have

\[\begin{eqnarray*} s_{t+2} &=& \phi_1 s_{t+1} + \varepsilon_{t+2} \\ &=& \phi_1 \left( \phi_1 s_t + \varepsilon_{t+1} \right) + \varepsilon_{t+2} \\ &=& \phi_1^2 s_{t} + \phi_1 \varepsilon_{t+1} + \varepsilon_{t+2} \\ \end{eqnarray*}\]

Since the expected value for \(\mathbb{E}_t[\varepsilon_{t+1}] = \mathbb{E}_t[\varepsilon_{t+2}] = 0\), we can derive the forecast function for \(s_{t+h}\) conditioned on the information that is available at period \(t\), which would be:

\[\begin{eqnarray} \mathbb{E}_t \left[ s_{t+h} | I_t \right] = \hat{\phi_1}^h s_t \tag{1.21} \end{eqnarray}\]

where, \(\mathbb{E}_t \left[ s_{t+h} \right]\) is the forecast that is generated over a horizon of \(h\) periods for \(s_{t+h}\), \(s_t\) is the current spot rate, and \(\hat{\phi_1}\) is the estimated coefficient that would have been generated from the data that was available up until period \(t\).

In the same way the AR(\(p\)) model can be used to forecast the spot rate, where more than a single lag would be used in the estimation and forecast function.
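To illustrate eq. (1.21), the sketch below estimates \(\phi_1\) by regressing \(s_t\) on \(s_{t-1}\) without a constant (consistent with the assumption that \(\phi_0 = 0\)) and then applies the forecast function. The function names and the sample data are hypothetical.

```python
def estimate_ar1(series):
    """OLS slope of s_t on s_{t-1} without a constant (phi_0 assumed to be zero)."""
    num = sum(prev * curr for prev, curr in zip(series, series[1:]))
    den = sum(prev ** 2 for prev in series[:-1])
    return num / den


def ar1_forecast(s_t, phi_1, horizon):
    """Eq. (1.21): E_t[s_{t+h}] = phi_1^h * s_t."""
    return (phi_1 ** horizon) * s_t


# Hypothetical demeaned log exchange rates (not real data).
s = [0.40, 0.31, 0.27, 0.18, 0.16, 0.11]
phi_hat = estimate_ar1(s)
for h in (1, 2, 6):
    print(h, ar1_forecast(s[-1], phi_hat, h))
```

When \(|\hat{\phi_1}| < 1\) the forecasts decay geometrically towards the mean of the process as the horizon lengthens.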

In the general moving average model that has an order of \(q\), each observation for \(s_t\) is generated by a weighted average of random disturbances going back \(q\) periods. We denote this process as MA(\(q\)) and its expression is written as,

\[\begin{eqnarray} s_t = \mu + \varepsilon_{t} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q} \tag{1.22} \end{eqnarray}\]

where the parameters \(\theta_1, \ldots , \theta_q\) may be positive or negative. The first-order moving-average model, or MA(\(1\)) would then take the form:

\[\begin{eqnarray} s_t = \mu + \varepsilon_{t} + \theta_1 \varepsilon_{t-1} \tag{1.23} \end{eqnarray}\]

As with the AR(\(1\)) model, if we assume that \(\mu = 0\) and after updating this expression by one period, we obtain,

\[\begin{eqnarray} s_{t+1} = \varepsilon_{t+1} + \theta_1 \varepsilon_{t} \tag{1.24} \end{eqnarray}\]

while after updating the process for two periods we have

\[\begin{eqnarray*} s_{t+2} &=& \varepsilon_{t+2} + \theta_1 \varepsilon_{t+1} \end{eqnarray*}\]

Once again, since the expected value for \(\mathbb{E}_t[\varepsilon_{t+1}] = \mathbb{E}_t[\varepsilon_{t+2}] = 0\), we can derive the forecast function for \(s_{t+h}\) conditioned on the information that is available at period \(t\), which would be:

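\[\begin{eqnarray} \mathbb{E}_t \left[ s_{t+h} | I_t \right] = \begin{cases} \hat{\theta_1} \varepsilon_{t} & \text{if } h = 1 \\ 0 & \text{if } h > 1 \end{cases} \tag{1.25} \end{eqnarray}\]

where, \(\mathbb{E}_t \left[ s_{t+h} \right]\) is the forecast for \(s_{t+h}\) that is made at period \(t\) over a horizon of \(h\) periods, and \(\hat{\theta_1}\) is the estimated coefficient. Only the one-step-ahead forecast depends on \(\varepsilon_{t}\): for \(h > 1\) none of the shocks in \(s_{t+h}\) are known at period \(t\), so the forecast reverts to the mean of the process (which is zero here, since \(\mu = 0\)). As in the case of the autoregressive model, the coefficient estimate is conditioned on the information available at time period \(t\).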

In the same way, the MA(\(q\)) model can be used to forecast values for the spot exchange rate, where more than a single lag may be used in the estimation and forecast function.
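Because \(s_{t+2} = \varepsilon_{t+2} + \theta_1 \varepsilon_{t+1}\) contains no shock dated \(t\) or earlier, only the one-step-ahead MA(\(1\)) forecast is non-zero when \(\mu = 0\). The sketch below assumes that \(\theta_1\) has already been estimated and recovers the unobserved shocks recursively from an assumed pre-sample value of \(\varepsilon_0 = 0\). The function names and the data are hypothetical.

```python
def ma1_residuals(series, theta_1, eps_0=0.0):
    """Recover shocks recursively: eps_t = s_t - theta_1 * eps_{t-1} (mu assumed zero)."""
    residuals = []
    prev = eps_0
    for s in series:
        prev = s - theta_1 * prev
        residuals.append(prev)
    return residuals


def ma1_forecast(series, theta_1, horizon):
    """One-step forecast is theta_1 * eps_t; beyond one step it is the mean (zero)."""
    if horizon == 1:
        return theta_1 * ma1_residuals(series, theta_1)[-1]
    return 0.0


# Hypothetical demeaned observations and an assumed estimate of theta_1.
s = [0.20, -0.10, 0.30, 0.05]
print(ma1_forecast(s, theta_1=0.4, horizon=1))
print(ma1_forecast(s, theta_1=0.4, horizon=2))
```

The short memory of the moving-average process is what limits its usefulness for forecasts beyond \(q\) periods.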

Many stationary random processes cannot be modelled as purely autoregressive or moving average variables, since they may include features that are typical of both processes. The logical extension of these models is to make use of a model that incorporates both autoregressive terms of order \(p\) and moving-average terms of order \(q\). This would typically provide a more parsimonious model structure than what could be provided by either individual autoregressive or moving average models and is represented by the following expression:

\[\begin{eqnarray} s_t = \mu + \phi_1 s_{t-1} + \ldots + \phi_p s_{t-p} + \varepsilon_{t} + \theta_1 \varepsilon_{t-1} + \ldots + \theta_q \varepsilon_{t-q} \tag{1.26} \end{eqnarray}\]

or

\[\begin{eqnarray*} s_t = \mu + \sum_{i=1}^{p} \phi_{i} s_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_{t} \end{eqnarray*}\]

where the mean of this process would be given by: \(\mu / (1- \phi_1 - \ldots - \phi_p)\). The ARMA(\(1, 1\)) process would then take the form,

\[\begin{eqnarray} s_t = \mu + \phi_1 s_{t-1} + \theta_1 \varepsilon_{t-1} + \varepsilon_{t} \tag{1.27} \end{eqnarray}\]

After estimating the coefficients in the above regression, we can use them to forecast \(h\) periods ahead. The moving-average term enters the one-step-ahead forecast only, after which it is carried forward by the autoregressive coefficient. Hence, after assuming that \(\mu = 0\), the forecast function for \(\mathbb{E}_t \left[ s_{t+h} \right]\) would be given by:

\[\begin{eqnarray} \mathbb{E}_t \left[ s_{t+h} | I_t \right] = \hat{\phi_1}^h s_t + \hat{\phi_1}^{h-1} \hat{\theta_1} \varepsilon_{t} \tag{1.28} \end{eqnarray}\]

Hence, the ARMA(\(p, q\)) process may be used to forecast future values for the spot rate over a horizon of \(h\) periods, based on past realised values of the exchange rate and error terms:

\[\begin{eqnarray} \mathbb{E}_t \left[ s_{t+h} | I_t \right] = \mathbb{E}_t \left( s_{t+h} | s_t , s_{t-1} , \ldots , s_{t-p} , \varepsilon_{t} , \varepsilon_{t-1} , \ldots , \varepsilon_{t-q} \right) \tag{1.29} \end{eqnarray}\]
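Iterating eq. (1.27) forward with \(\mu = 0\) gives a recursive way to build the ARMA(\(1, 1\)) forecast path: the one-step forecast uses both \(s_t\) and \(\varepsilon_t\), and each later step multiplies the previous forecast by \(\phi_1\). The sketch below assumes that the coefficients and the current shock are already available. All names and values are hypothetical.

```python
def arma11_forecast(s_t, eps_t, phi_1, theta_1, horizon):
    """Forecast path for h = 1..horizon from an ARMA(1,1) with mu = 0."""
    forecast = phi_1 * s_t + theta_1 * eps_t   # one-step-ahead forecast
    path = [forecast]
    for _ in range(horizon - 1):
        forecast = phi_1 * forecast            # future shocks have zero expectation
        path.append(forecast)
    return path


# Assumed values for the last observation, last shock and coefficient estimates.
print(arma11_forecast(s_t=2.0, eps_t=1.0, phi_1=0.5, theta_1=0.4, horizon=3))
```

The same recursion extends to higher orders by carrying more lags of the series and of the shocks through the loop.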

An autoregressive integrated moving-average model is a generalisation of an autoregressive moving-average model and may be applied when the data contains a unit root. For example, where the exchange rate data is difference stationary, one would need to take the first-difference of the data to remove the nonstationary stochastic trend. Such a model is generally termed an ARIMA(\(p, d, q\)) model, where \(p\), \(d\), and \(q\) are non-negative integers that refer to the respective order of the autoregressive, integrated, and moving-average parts of the model. These models form an important part of the Box-Jenkins approach to time-series modelling, which is described in Box and Jenkins (1979). It makes use of the following expression:

\[\begin{eqnarray} \Delta^d s_t = \mu + \sum_{i=1}^{p} \phi_{i} \Delta^d s_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_{t} \tag{1.30} \end{eqnarray}\]

If \(s_t\) is stationary then \(d=0\) and we can use the ARMA(\(p\), \(q\)) model for \(s_t\). However, if one or more characteristic roots are greater than or equal to unity, then \(d\) would take on a value that is equivalent to the number of times that it would need to be differenced before it is stationary. Therefore, by way of example, an ARIMA (\(0, 1, 2\)) model would take the form:

\[\begin{eqnarray} s_t - s_{t-1} = \mu + \varepsilon_{t} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} \tag{1.31} \end{eqnarray}\]

In this case, \(\Delta^d\) may be used to denote differencing, such that \(\Delta^1 s_t = s_t - s_{t-1}\) and \(\Delta^2 s_t = \Delta^1 s_t - \Delta^1 s_{t-1}\), etc. Note that after differencing we may want to retrieve the level, after we have generated the forecasts for the variable of interest. We would then need to sum the differences and the starting value. For example, if it is assumed that we have a starting value of zero then, \(w_t = \Delta^d s_t\) and \(s_t = \sum^d w_t\).

After we have differenced the time series variable \(s_t\) to produce the stationary series \(w_t\), we can model \(w_t\) as an ARMA process. If \(w_t = \Delta^d s_t\), and \(w_t\) is an ARMA(\(p, q\)) process, then we say that \(s_t\) is an integrated autoregressive moving-average process of order (\(p, d, q\)), or simply an ARIMA(\(p, d, q\)) process.

Using the above expressions, suppose that \(\mu = 0\) and that an ARIMA(\(1,1,1\)) is deemed to be an appropriate forecasting model. We could then firstly derive the forecast function for \(w_t\),

\[\begin{eqnarray} \mathbb{E}_t \left[ w_{t+h} | I_t \right] = \hat{\phi_1}^h w_t + \hat{\phi_1}^{h-1} \hat{\theta_1} \varepsilon_{t} \tag{1.32} \end{eqnarray}\]

before recovering the forecast for the level by adding the cumulated forecasts of the differences to the current value:

\[\begin{eqnarray} \mathbb{E}_t \left[ s_{t+h} | I_t \right] = s_t + \sum_{j=1}^{h} \mathbb{E}_t \left[ w_{t+j} | I_t \right] \tag{1.33} \end{eqnarray}\]
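To recover level forecasts from forecasts of the first-differenced series (the case where \(d = 1\)), the forecast differences are cumulated onto the last observed level. The helper below is a minimal sketch with a hypothetical name. The forecasts of the differences are taken as given and would come from the fitted ARMA model for \(w_t\).

```python
def integrate_forecasts(last_level, diff_forecasts):
    """Cumulate forecasts of w_{t+1}, ..., w_{t+h} onto s_t to obtain level forecasts."""
    levels = []
    level = last_level
    for w in diff_forecasts:
        level += w
        levels.append(level)
    return levels


# Assumed last level s_t and assumed forecasts for the next three differences.
print(integrate_forecasts(10.0, [0.5, 0.25, 0.125]))
```

For \(d = 2\) the same function could be applied twice, first to recover the forecasts of the first differences and then to recover the level.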

Fundamental forecasting exercises are based on fundamental relationships between various economic variables and exchange rates. This implies that all the theories relating to exchange rate determination could be used to forecast the value of the exchange rate over subsequent periods of time. This type of analysis is called fundamental analysis, due to the economic fundamentals that are used in the forecasting process. Thus, fundamental forecasting is the practice of using fundamental analysis to predict future exchange rates. This involves looking at all quantitative and qualitative aspects that might affect exchange rates, including various macroeconomic indicators and political factors. Critics suggest that the applicability of fundamental forecasting exercises is relatively limited, as some of the data that should be included in the model is difficult to quantify. In addition, since there is a relatively large degree of uncertainty about our ability to explain past exchange rate behaviour with these fundamental factors, we should not expect that they would provide accurate forecasts.

The monetary model that was discussed during the section on exchange rate determination could be used to generate fundamental forecasts. Such a model is a descendant of the original models of Bilson (1978), and Neely and Sarno (2002). It starts with the conventional money demand functions for both the domestic and foreign economies and incorporates the purchasing power parity condition to determine the current spot exchange rate by using historic values of certain variables. This model may be expressed as:

\[\begin{eqnarray} \nonumber s_t &=& \mu + \psi \left( m_{t-1} - m^{\star}_{t-1} \right) + \beta \left( y_{t-1} - y^{\star}_{t-1} \right) + \gamma \left( i_{t-1} - i^{\star}_{t-1} \right) \ldots \\ \nonumber && + \delta \left( w_{t-1} - w^{\star}_{t-1} \right) + \zeta \left( ca_{t-1} - ca^{\star}_{t-1} \right) + \theta \left( nd_{t-1} - nd^{\star}_{t-1} \right) \ldots \\ && + \lambda \left( I_{t-1} - I^{\star}_{t-1} \right) + \phi \left( p_{t-1} - p^{\star}_{t-1} \right) + \ldots + \varepsilon_{t} \tag{2.1} \end{eqnarray}\]

where, \(s_t\) refers to the current spot rate, \(m_{t}\) is the domestic money supply, \(y_{t}\) is real income, \(i_{t}\) is the domestic short-term interest rate, \(w_{t}\) is the wage rate, \(ca_{t}\) is the current account, \(nd_{t}\) is the national debt, \(I_{t}\) is investment, and \(p_{t}\) is the domestic price level, while \(\varepsilon_{t}\) is the error term. With the exception of the wage rate and the interest rate, all the other variables are expressed in terms of natural logarithms and the star (i.e. \(\star\)) is used to denote the foreign country. Note that in addition to the variables that we have mentioned, one could include a number of other fundamental variables that could potentially explain the exchange rate.

To make use of this regression model, we would need to estimate values for the following parameters: \(\{ \hat{\mu}, \hat{\psi}, \hat{\beta}, \hat{\gamma}, \hat{\delta}, \hat{\zeta} , \hat{\theta}, \hat{\lambda}, \hat{\phi}, \ldots \}\). After we have obtained these parameter estimates, we could then make use of the forecast function \(\mathbb{E}_t \left[ s_{t+1} \right]\) by using the current values of the independent variables times the coefficient estimates:

\[\begin{eqnarray} \nonumber \mathbb{E}_t \left[ s_{t+1} | I_t \right] &=& \hat{\mu} + \hat{\psi} \left( m_{t} - m^{\star}_{t} \right) + \hat{\beta} \left( y_{t} - y^{\star}_{t} \right) + \hat{\gamma} \left( i_{t} - i^{\star}_{t} \right) \ldots \\ \nonumber && + \hat{\delta} \left( w_{t} - w^{\star}_{t} \right) + \hat{\zeta} \left( ca_{t} - ca^{\star}_{t} \right) + \hat{\theta} \left( nd_{t} - nd^{\star}_{t} \right) \ldots \\ && + \hat{\lambda} \left( I_{t} - I^{\star}_{t} \right) + \hat{\phi} \left( p_{t} - p^{\star}_{t} \right) + \ldots \tag{2.2} \end{eqnarray}\]
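Once the coefficients have been estimated, the one-step forecast in eq. (2.2) is simply the intercept plus each coefficient multiplied by the current cross-country differential. The sketch below illustrates this for a subset of the variables. The function name, the dictionary keys and all of the numerical values are hypothetical and are not estimates from any data.

```python
def fundamental_forecast(intercept, coefficients, domestic, foreign):
    """Eq. (2.2): intercept plus coefficient times (domestic - foreign) differential."""
    return intercept + sum(
        coef * (domestic[name] - foreign[name])
        for name, coef in coefficients.items()
    )


# Made-up coefficient estimates and current values, for illustration only.
coefs = {"m": 0.8, "y": -0.5, "i": 0.02, "p": 1.0}
home = {"m": 7.9, "y": 8.4, "i": 6.5, "p": 4.7}
abroad = {"m": 9.6, "y": 9.9, "i": 2.0, "p": 4.6}
print(fundamental_forecast(2.0, coefs, home, abroad))
```

The less constrained specification with separate domestic and foreign elasticities could be handled with two coefficient dictionaries in place of one.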

The above equation is a constrained one in the sense that we make use of a single coefficient for the elasticity that is associated with the cross-country difference for each variable. Alternative specifications, which are less constrained, may allow for different elasticities for the domestic and foreign variables:

\[\begin{eqnarray} \nonumber s_t &=& \mu + \psi_1 m_{t-1} + \psi_2 m^{\star}_{t-1} + \beta_1 y_{t-1} + \beta_2 y^{\star}_{t-1} + \gamma_1 i_{t-1} + \gamma_2 i^{\star}_{t-1} \ldots \\ \nonumber && + \delta_1 w_{t-1} + \delta_2 w^{\star}_{t-1} + \zeta_1 ca_{t-1} + \zeta_2 ca^{\star}_{t-1} + \theta_1 nd_{t-1} + \theta_2 nd^{\star}_{t-1} \ldots \\ && + \lambda_1 I_{t-1} + \lambda_2 I^{\star}_{t-1} + \phi_1 p_{t-1} + \phi_2 p^{\star}_{t-1} + \ldots + \varepsilon_{t} \tag{2.3} \end{eqnarray}\]

By taking the values of the estimated coefficients, \(\{ \hat{\mu}, \hat{\psi_1}, \hat{\psi_2} , \ldots \}\), and the current values of the independent variables, we would be able to generate a value for \(\mathbb{E}_t \left[ s_{t+1} | I_t \right]\), following the method that was applied in eq. (2.2).

To provide a forecast for the exchange rate, \(\mathbb{E}_t \left[ s_{t+h} | I_t\right]\), with the aid of a regression model, we would want to make use of variables that could explain future movements in \(s_t\), where such variables are not themselves perfectly correlated with one another. Let us suppose that the best regression model contains the following independent variables:

\[\begin{eqnarray} \nonumber s_t &=& \mu + f \left[ \left( m_{t-1} - m^{\star}_{t-1} \right), \left(y_{t-1} - y_{t-1}^{\star} \right), \left(i_{t-1} - i_{t-1}^{\star} \right), \ldots \right. \\ && \left. \left(p_{t-1} - p_{t-1}^{\star} \right), \left(tb_{t-1} - tb_{t-1}^{\star} \right), \left(bd_{t-1} - bd_{t-1}^{\star} \right) \right] + \upsilon_{t} \tag{2.4} \end{eqnarray}\]

This equation has an implicit additive error term, \(\upsilon_{t}\), that accounts for unexplained variation in \(s_t\). This process may incorporate a certain degree of serial correlation and as such it would be a good idea to investigate the properties of the residuals for the above regression. For example, where the fitted values for the regression are expressed as \(\hat{s_t}\), the residuals could be calculated from, \(s_t - \hat{s}_t = \upsilon_t\). If these residuals are serially correlated then we would need to substitute an ARIMA (\(p,d,q\)) model for the implicit error term in the above regression. This would hopefully reduce the size of the unexplained variation in the regression model. An example of such a combined regression and univariate time series model could be expressed as follows, where it is assumed that the residuals from eq. (2.4) are stationary:

\[\begin{eqnarray} s_t = \mu + A X_{t-1} + \sum_{i=1}^{p} \phi_{i} s_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_{t} \tag{2.5} \end{eqnarray}\]

where, \(X_{t}\) represents the independent explanatory variables that are contained in eq. (2.4). In this case it would be interesting to take note of the properties of \(\varepsilon_{t}\), which would hopefully have a smaller variance than \(\upsilon_{t}\).

This model may produce better forecasts than either the fundamental regression in eq. (2.4) or any one of the univariate time series models, since it includes a structural (economic) explanation for that part of the variation in \(s_t\) that can be explained by fundamental factors and a time series explanation for that part of the variation in \(s_t\) that does not have a structural interpretation. Equation (2.5) is referred to as a transfer function model or a multivariate autoregressive moving-average model (MARMA). Despite the attractive features of this model, it may not provide results that are clearly superior to other models as it is highly parameterised and may include coefficients that adversely affect the forecasting performance, particularly when these parameters are imprecisely estimated.

One of the most fertile areas of contemporary time series research concerns the application of multi-equation time series models, as many economic systems exhibit a certain degree of feedback among variables in a particular system. One form of analysis that treats all variables symmetrically without making reference to the dependence or independence of variables, makes use of a vector autoregression (VAR) model.

When we are not confident that a variable is actually exogenous, a natural extension of the transfer function models is to treat each variable symmetrically. In the two variable case, we could assume that the time path of the exchange rate is affected by current and past realisations of another variable, which we denote \(f_t\), while the time path of \(f_t\) is affected by current and past realisations of \(s_t\).

To model this dependence we could use a VAR model that represents a system of equations, where each endogenous variable is treated as a function of its own past and the past of the other endogenous variables in the system. These models, which were initially presented in Sims (1972) and Sims (1980), have provided a number of successful forecasts for various interrelated time series variables. In addition, many extensions to this framework have been proposed in the literature, and in what follows we allow for the inclusion of exogenous variables that may improve the forecasts for the endogenous variables. The simplest exogenous variable could take the form of a time trend, seasonal or dummy variable.

In what follows, we consider the construction of a model that incorporates both the spot, \(s_t\), and forward, \(f_t\), exchange rates in a VAR framework that incorporates a deterministic time trend,

\[\begin{eqnarray} \nonumber s_t &=& \alpha_{10} + A_{11} (L) s_{t-1} + A_{12} (L) f_{t -1} + A_{13}t + \varepsilon_{s,t} \\ f_t &=& \alpha_{20} + A_{21} (L) s_{t-1} + A_{22} (L) f_{t -1} + A_{23} t + \varepsilon_{f,t} \tag{2.6} \end{eqnarray}\]

where \(\alpha_{10}\) and \(\alpha_{20}\) are constants, \(A_{11}(L)\), \(A_{12}(L)\), \(A_{21}(L)\) and \(A_{22}(L)\) are polynomials in the lag operator \(L\), and \(A_{13}\) and \(A_{23}\) are the coefficients on the time trend \(t\). Alternative exogenous variables may include the short-term interest rates that are used to execute monetary policy in the two economies (i.e. \(i_{t}\) and \(i_{t}^\star\)), where we assume that the influence of interest rates on exchange rates is exogenous.

The solution to eq. (2.6) can be used to examine the interaction between the two variables, \(s_t\) and \(f_t\), as they respond to orthogonal shocks. For example, the coefficients from eq. (2.6) can be used to consider the effects of either \(\varepsilon_{s,t}\) or \(\varepsilon_{f,t}\) on the subsequent time paths of \(s_t\) and \(f_t\). Such an analysis would make use of the following partial derivatives for different values of \(j\).

\[\begin{eqnarray*} \frac{\partial s_{t+j}}{\partial \varepsilon_{s,t}}, \;\;\; \frac{\partial s_{t+j}}{\partial \varepsilon_{f,t}}, \;\;\; \frac{\partial f_{t+j}}{\partial \varepsilon_{s,t}}, \;\;\; \frac{\partial f_{t+j}}{\partial \varepsilon_{f,t}} \end{eqnarray*}\]

Plotting these impulse response functions is a practical way to visually represent the response of the variables to the various shocks.
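To make the link between the VAR coefficients and these partial derivatives concrete, the following is a minimal sketch rather than a definitive implementation. It assumes a single lag (so that each \(A_{ij}(L)\) reduces to one coefficient), estimates the system by ordinary least squares on simulated data, and orthogonalises the shocks with a Cholesky factorisation in which the first variable is ordered first. The function names `fit_var1` and `orthogonal_irf`, the simulated series and the coefficient values in `A_true` are all hypothetical.

```python
import numpy as np

def fit_var1(y):
    """OLS estimate of y_t = c + A y_{t-1} + b*t + e_t for a (T x k) array y."""
    T, k = y.shape
    trend = np.arange(1, T)                       # deterministic time trend
    X = np.column_stack([np.ones(T - 1), y[:-1], trend])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)   # shape (k + 2, k)
    resid = y[1:] - X @ coef
    A = coef[1:1 + k].T                           # row i holds equation i
    sigma = resid.T @ resid / (T - 1 - X.shape[1])     # residual covariance
    return A, sigma

def orthogonal_irf(A, sigma, horizon):
    """Responses to one-standard-deviation orthogonal shocks, j = 0..horizon."""
    P = np.linalg.cholesky(sigma)                 # lower triangular, sigma = P P'
    out = np.empty((horizon + 1,) + A.shape)
    power = np.eye(A.shape[0])
    for j in range(horizon + 1):
        out[j] = power @ P                        # out[j, i, m]: variable i, shock m
        power = A @ power
    return out

# Simulated data with assumed (hypothetical) coefficients
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.2], [0.1, 0.6]])
y = np.zeros((500, 2))
for t in range(1, 500):
    y[t] = A_true @ y[t - 1] + rng.normal(size=2)

A_hat, sigma_hat = fit_var1(y)
irf = orthogonal_irf(A_hat, sigma_hat, horizon=12)
```

Under these assumptions, `irf[j, 0, 1]` approximates the response of the first variable \(j\) periods after a shock to the second. The Cholesky ordering forces that response to zero on impact, and this identifying choice should be justified in any application.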

Market indicators could be used to predict the future values of the exchange rate. For example, if we maintain that the current spot price reflects the future expected value of the exchange rate, then it could be used as a forecast. Alternatively, when we need a one-month-ahead forecast for the spot rate, \(s_t\), we could use the one-month forward rate, \(f_t\). This type of forecast would be classified as a market-based forecast.

The current spot exchange rate, \(s_t\), can be used to forecast the spot rate during subsequent periods of time. If rational market participants expect that the domestic currency will depreciate against the foreign currency in the near future, speculators will buy foreign currency with domestic currency. This excess demand for foreign currency will immediately increase the value of foreign currency relative to domestic currency. In addition, these conditions would also give rise to an excess supply of domestic currency. Thus, the current value of the domestic currency would decrease.

This behaviour would suggest that current spot rates could be used to predict future spot rates and if the foreign exchange market is efficient in the sense that current prices reflect all available information, then such forecasts may be reasonably accurate. The reason for this is that the current price would also summarise all the available information about the expected future value of a currency, i.e. \(S_{t+h}^e\). These forecasts could be generated by a random walk or a random walk plus drift (when the process contains evidence of trending behaviour).
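As an illustration of these two forecasts, the sketch below treats the log spot rate as a random walk, optionally with drift. It assumes that the drift is estimated by the average one-period change in the sample. The function name `random_walk_forecast` and the monthly rates are made up for the example.

```python
import numpy as np

def random_walk_forecast(s, h, drift=False):
    """h-step-ahead forecast of the (log) spot rate from the last observation."""
    s = np.asarray(s, dtype=float)
    mu = np.mean(np.diff(s)) if drift else 0.0    # average one-period change
    return s[-1] + h * mu

log_s = np.log([16.0, 16.4, 16.2, 16.8, 17.0])   # hypothetical monthly spot rates
no_drift = random_walk_forecast(log_s, h=3)       # equals the last observed value
with_drift = random_walk_forecast(log_s, h=3, drift=True)
```

Without drift the forecast is the current spot rate at every horizon. With drift the forecast moves away from the current rate linearly in \(h\).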

Forward rates are quoted for a specific date in the future. For most of the major developed-world currencies, these may include one-, three-, six-, twelve- and even sixty-month forwards, while a one-month forward would usually be available for most emerging market currencies. We could use these forward rates to provide an estimate for the spot rate forecast over an equivalent time horizon. For example, a one-month forward (i.e. \(F_1\)) could be used to forecast the spot rate in the next period, when using monthly data for the spot rate. Therefore, \[\begin{eqnarray} F_j = \mathbb{E} \left( s_{t+j} | I_t \right) \tag{3.1} \end{eqnarray}\]

where, \(F_j\) is the current forward rate quoted for \(j\)-months ahead and \(s_{t+j}\) is the spot rate that is expected to be realised \(j\) months from the current date.

The forward rate may provide a reasonable estimate for the future spot rate, since market participants make use of these instruments for speculating and hedging purposes. For example, if the one-month forward for the USDZAR is currently quoted at \(F_1 = 18.00\) R \(/\) $ and the current spot rate is R\(16.00\) \(/\) $, then the market expects that the US dollar will appreciate and the South African rand will depreciate. This would encourage market participants to start buying dollars and selling rands, which would result in an appreciation of the dollar at the expense of the rand. Thus, the participants' actions may give rise to a self-fulfilling response and, as a result, the forward rate may be deemed a good predictor of the future spot rate.
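Using the quotes in this example, and taking the forward rate as the market's forecast, the implied change in the rand price of a dollar over the month is

\[\begin{eqnarray*} \frac{F_1 - S_t}{S_t} = \frac{18.00 - 16.00}{16.00} = 0.125 \end{eqnarray*}\]

that is, an expected rise of 12.5% in the rand price of a dollar.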

Note that the long-term forward rates may involve relatively large bid/ask spreads as there is limited trading volume for such instruments.

This type of forecasting is based on a similar practice that is employed when using the market beta of a financial asset. Currency betas measure the responsiveness of a particular currency to a market index of foreign currencies. To estimate currency betas, we may use the following equation:

\[\begin{eqnarray} \dot{s}_t = \alpha + \beta \dot{e}_{M,t-1} + \varepsilon_{t} \tag{3.2} \end{eqnarray}\]

where, \(\dot{e}_{M,t}\) is the percentage change of a market index for foreign currencies, as a percentage per annum, \(\alpha\) is the intercept, \(\beta\) is the sensitivity (responsiveness) of the exchange rate to the currency index (slope of the line), and \(\varepsilon_{t}\) is the error term. The left-hand side variable, \(\dot{s}_t\), is the percentage change of the spot exchange rate, as a percentage per annum. To calculate this value we may proceed as follows:

\[\begin{eqnarray*} \% \Delta S_t = \frac{S_t - S_{t-1}}{S_{t-1}} \times \frac{12}{n} \times 100 \approx \left(s_t - s_{t-1} \right) \times \frac{12}{n} \times 100 \end{eqnarray*}\]

We are then able to find estimates for the \(\hat{\alpha}\) and \(\hat{\beta}\) parameters for the specific exchange rate with respect to the market index of foreign currencies.
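The estimation step can be sketched as follows. This is a hedged illustration on simulated data, not a prescribed procedure. It assumes that \(n\) is the number of months between observations, and the function names `annualised_pct_change` and `currency_beta` and all numerical values are hypothetical. The regression follows eq. (3.2) in using the lagged change in the market index.

```python
import numpy as np

def annualised_pct_change(levels, n=1):
    """Percentage change per annum from levels observed every n months."""
    levels = np.asarray(levels, dtype=float)
    return (levels[1:] - levels[:-1]) / levels[:-1] * (12 / n) * 100

def currency_beta(s_dot, e_dot):
    """OLS of s_dot[t] on a constant and e_dot[t-1]; returns (alpha, beta)."""
    y = s_dot[1:]
    X = np.column_stack([np.ones(len(y)), e_dot[:-1]])
    (alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)
    return alpha, beta

# Simulated annualised changes with an assumed true beta of 0.8
rng = np.random.default_rng(1)
e_dot = rng.normal(0, 10, size=120)               # market index changes
s_dot = np.empty(120)
s_dot[0] = 0.0                                    # first value is not used in the fit
s_dot[1:] = 1.0 + 0.8 * e_dot[:-1] + rng.normal(0, 2, size=119)

alpha_hat, beta_hat = currency_beta(s_dot, e_dot)
forecast = alpha_hat + beta_hat * e_dot[-1]       # next period's expected change
```

Because the regressor is lagged, the most recent index change `e_dot[-1]` is already known when the forecast for the next period is formed.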

Some forecasts are superior to the others, but no one knows with certainty which forecast is going to provide the best result. Therefore, it has been suggested that to avoid large forecasting errors one should combine the results that are produced by a number of forecasting techniques. For example, we could make use of a technical, a fundamental, a market-based forecast, and the currency beta method, and combine them by taking the average of these forecasts or we could assign different weights to each to derive the weighted average value for the future spot rate.
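A combined forecast of this kind is simple to compute. The sketch below assumes four hypothetical forecasts for the rand per dollar rate and an arbitrary set of weights. Neither the numbers nor the function name `combine_forecasts` come from the text, and how the weights should be chosen is left open.

```python
import numpy as np

def combine_forecasts(forecasts, weights=None):
    """Simple average, or weighted average when weights are supplied."""
    forecasts = np.asarray(forecasts, dtype=float)
    if weights is None:
        return forecasts.mean()
    weights = np.asarray(weights, dtype=float)
    return np.dot(weights, forecasts) / weights.sum()   # weights are normalised

# Hypothetical forecasts: technical, fundamental, market-based, currency beta
candidates = [17.2, 16.5, 18.0, 17.1]
equal_weighted = combine_forecasts(candidates)
weighted = combine_forecasts(candidates, weights=[0.1, 0.2, 0.5, 0.2])
```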

Forecast errors for exchange rates arise because the interaction between all global economies incorporates behaviour that is extremely complex. In addition, our information is limited and our ability to process all available information is severely constrained. Hence, the models we use are usually only relatively poor approximations of reality. To consider the potential sources of forecast errors, suppose the true model is given by,

\[\begin{eqnarray*} s_t = x_t \beta + \varepsilon_{t} \end{eqnarray*}\]

where, \(\beta\) is a vector of parameters, and \(\varepsilon_{t}\) is an independent and identically distributed random variable with zero mean and fixed variance. Of course, the true model that is responsible for generating values for \(s_t\) is not known, but we could make use of a set of variables, which are contained in the \(x_t\) vector, to obtain estimates for the coefficients, \(\hat{\beta}\), of the unknown parameters (i.e. \(\beta\)). Then, setting the error term equal to its mean value (zero), the forecasts for the exchange rate, which we denote \(\hat{s}_t\) are obtained as follows:

\[\begin{eqnarray} \hat{s}_t = x_t \hat{\beta} \tag{5.1} \end{eqnarray}\]

The forecast error, \(e_t\), is the difference between the actual and the forecast value,

\[\begin{eqnarray} e_t = s_t - x_t \hat{\beta} \tag{5.2} \end{eqnarray}\]

Given the above characterisation, there are three potential sources for a forecast error: (a) residual or innovation uncertainty, (b) coefficient uncertainty and (c) model uncertainty.

**Residual or Innovation Uncertainty**. This first source of errors arises because the innovations \(\varepsilon_{t}\) in the equation are unknown for the forecast period and are replaced with their expectations. While the expected mean value for the residuals is zero, the individual values are non-zero. Note that the larger the variation in the individual errors, the greater the overall error in the forecasts.

**Coefficient Uncertainty**. The second source of forecast error is coefficient uncertainty. The estimated coefficients \(\hat{\beta}\) in the model would deviate from the true coefficients \(\beta\). The standard error of an estimated coefficient, which is reported with the regression output, measures the precision with which the estimate represents the true coefficient value.

**Model Uncertainty**. Since we do not know which variables to include in the model, and its functional form is also unknown, the probability of using an incorrect model specification is particularly large. In addition, the use of an incorrect model would also contribute towards coefficient uncertainty.
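The first two sources can be separated algebraically. Assuming for the moment that the model is correctly specified, so that model uncertainty is absent, substituting the true model into eq. (5.2) gives

\[\begin{eqnarray*} e_t = s_t - x_t \hat{\beta} = x_t \beta + \varepsilon_{t} - x_t \hat{\beta} = x_t \left( \beta - \hat{\beta} \right) + \varepsilon_{t} \end{eqnarray*}\]

where the term \(x_t ( \beta - \hat{\beta} )\) reflects coefficient uncertainty and \(\varepsilon_{t}\) reflects innovation uncertainty.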

To evaluate the accuracy of the forecast we could make use of the following example. Suppose that today is time \(t\), and we are forecasting over an \(h\)-period horizon (say \(h\) months). Let \(S_{t+h}\) be the actual exchange rate at time \(t+h\), and let \(\mathbb{E}_t \left[S_{t+h}|I_t\right]\) be the forecast that is conditional on information that is available at time \(t\). The closer \(\mathbb{E}_t \left[S_{t+h}|I_t\right]\) is to \(S_{t+h}\), the more accurate the forecast, and the smaller the forecast error:

\[\begin{eqnarray*} e_{t+h} = S_{t+h}-\mathbb{E}_t \left[S_{t+h}|I_t\right] \end{eqnarray*}\]

Of course, we cannot judge a forecaster by just one forecast and as such we should calculate a number of forecasts over different periods of time, for which we need to make use of successive forecasts and realisations to allow for an informed statistical analysis. This would usually involve an extensive out-of-sample forecasting analysis, where we would generate successive forecasts over a period of time for a consistent forecasting horizon before evaluating all the forecasting errors. In addition, we also cannot judge the accuracy of the forecasting record by simply taking the average forecast error because large errors with opposite signs would negate one another.

Therefore, to evaluate the out-of-sample forecasting ability of a model, we would usually split the sample into two parts, which relate to the:

- in-sample (training) portion, consisting of observations from \(1\) to \(R\)
- out-of-sample (testing) portion, consisting of observations \(R+h\) to \(T+h\)

This provides a sequence of \(P = T - R + 1\) \(h\)-step-ahead out-of-sample forecast errors. Under both the recursive and the rolling window forecast schemes, the model parameters are re-estimated progressively as the end of the in-sample portion moves from \(R\) towards \(T\): the recursive scheme makes use of all available observations from the first onwards, while the rolling window scheme keeps the in-sample size constant. The forecasting ability of the model is measured by a loss function. Common loss functions include the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE), which may be summarised as follows:

\[\begin{eqnarray*} MAE &\equiv& \frac{1}{P} \sum_{t=R}^{T} \left| e_{t+h} \right| \\ RMSE &\equiv& \sqrt{ \frac{1}{P} \sum_{t=R}^{T} \left[ e_{t+h} \right]^2 } \end{eqnarray*}\]

where \(T\) is the total number of available observations. The MAE is the average of the absolute values of the forecast errors. The RMSE is the square root of the average squared forecast errors. It has the same units as the standard deviation of exchange rate changes and may be used for comparative purposes.
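The evaluation procedure can be sketched in a few lines. The code below is an illustrative outline under stated assumptions: the series holds \(T+h\) observations, `forecaster` is any user-supplied function that maps an estimation window and a horizon to a forecast, and the names `oos_errors`, `mae`, `rmse` and `random_walk`, together with the data, are hypothetical.

```python
import numpy as np

def oos_errors(s, R, h, forecaster, rolling=False):
    """h-step-ahead out-of-sample forecast errors for forecast origins R..T."""
    s = np.asarray(s, dtype=float)
    errors = []
    for t in range(R, len(s) - h + 1):            # t observations are available
        start = t - R if rolling else 0           # rolling keeps R observations
        window = s[start:t]
        errors.append(s[t + h - 1] - forecaster(window, h))
    return np.array(errors)

def mae(e):
    """Mean absolute error."""
    return np.mean(np.abs(e))

def rmse(e):
    """Root mean squared error."""
    return np.sqrt(np.mean(np.square(e)))

def random_walk(window, h):
    """Benchmark forecast: the last observed value at every horizon."""
    return window[-1]

s = np.array([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])   # made-up series
e = oos_errors(s, R=3, h=1, forecaster=random_walk)
print(mae(e), rmse(e))
```

Passing `rolling=True` switches from the recursive to the rolling window scheme. Running the same function with a competing `forecaster` gives error sequences that can be compared with those of the benchmark.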

When comparing forecasts, a number of obvious benchmarks come to mind. For example, we could simply replace the forecast with the current exchange rate or with the current forward rate for maturity \(h\). We hope that a forecaster’s MAE or RMSE is smaller than such simple forecasts. If it weren’t, why would we need to pay money for it?

In a famous article, Meese and Rogoff (1983) analyse the forecasting power of fundamental models of exchange rate determination. The models link the current spot rate to relative money supplies, interest differentials, relative industrial production, inflation differentials, and the difference in cumulated trade balances, which represents the level of net foreign assets. They estimate the parameters of these models and use them to predict future exchange rate values. Since the fundamental information is not known when the forecast is made, these predictions would normally necessitate forecasting the fundamentals first, so that the forecast is truly “out-of-sample”.

However, Meese and Rogoff use actual values for the future fundamentals combined with the parameters to predict the exchange rate. This approach gives the fundamental models an advantage relative to the other models considered, which use only current information to predict future exchange rates. As benchmarks, they considered several alternative models, including the random walk \([\hat{S}_{t+h}=S_t]\), a model for the unbiasedness hypothesis \([\hat{S}_{t+h}=F_{t, h}]\) and several statistical models that link the current exchange rate to past exchange rates and past values of other variables.

Computing the root mean squared error (RMSE) for the predictions at various horizons, Meese and Rogoff found that the random walk model beat all the other models in the majority of the cases considered. Particularly surprising was that the fundamental models did not even perform better at longer horizons. This result has been confirmed by a large number of researchers over the years and continues to puzzle international economists (see Rogoff (2009)).

Recent research by Meese and Prins (2011) points to the importance of order flow in the short-run determination of exchange rates and market fundamentals in the longer run. They find that market fundamentals do a poor job of explaining the time series movements of exchange rates, especially at short horizons, whereas fundamentals perform better cross-sectionally and at longer horizons. Given the poor performance of fundamental models in forecasting exchange rates, we have provided only a cursory overview of the major models. However, fundamental models still provide useful insights, and, as we will see, it may not be so surprising that they are beaten by a random walk model in forecasting exchange rates.

In addition, as we have previously noted, market-based forecasts are also not as promising as we might have expected. For example, when discussing the unbiasedness hypothesis we noted that the forward rate provides a biased estimate of the future spot rate, which would suggest that we would make systematic errors when using this method to forecast the exchange rate. Hence, while our ability to generate accurate forecasts for the exchange rate is significantly impaired, this would not necessarily imply that the need for reasonable forecasts will wane.

This chapter considers a number of different techniques that may be used to generate a forecast for the future exchange rate. Multinational corporations and all the other professionals involved in international finance make use of predictions for the exchange rate to make decisions on their investment, their financing, capital budgeting, on hedging payables and receivables, and other short-term and long-term financial decisions. Some of the most common forecasting techniques include: technical forecasting, fundamental forecasting, market-based forecasting, forecasting with the use of currency betas, and combined forecasts. Of course there are a number of other techniques that have also been used to forecast exchange rates and this discussion should not be regarded as complete.

In addition we also considered a few methods for evaluating forecast accuracy. Forecast accuracy is economically meaningful in a number of settings. For example, suppose we need to evaluate a foreign investment project that will generate foreign currency profits. This would require a forecast for the future rand values of the cash flows generated by the project (by converting future foreign currency profits into future rand values) that would then be discounted at an appropriate discount rate to determine whether the investment project will be profitable. If these calculations lead to the acceptance of the project and a currency crisis erupts in the country in which we invested (resulting in a significant currency depreciation), then the currency crisis will depress the company’s rand earnings, if local competition prevents us from passing through the currency loss in the form of higher local prices. In this case accuracy matters, as the investment decision would have been a disaster. A more accurate assessment of the future would have led us to forgo the investment.

Even if the foreign currency appreciates after the investment is made and the investment decision looks good, forecasting accuracy still matters. A better exchange rate forecast might have caused the firm to invest more in the foreign country. Pricing decisions and long-term strategic planning are other examples in which the accuracy of exchange rate forecasts matters a great deal. For these reasons exchange rate forecasts remain important.

Bhundia, A. J., and L. A. Ricci. 2005. “Post-Apartheid South Africa: The First Ten Years.” In, edited by M. Nowak and L. A. Ricci, 156–73. Washington: International Monetary Fund.

Bilson, John F. 1978. “The Economics of Exchange Rates.” In, edited by Jacob A. Frenkel and Harry G. Johnson, 75–96. Reading, MA: Addison-Wesley.

Box, George, and Gwilym Jenkins. 1979. *Time Series Analysis: Forecasting and Control*. New York: Wiley.

Meese, Richard A., and Kenneth Rogoff. 1983. “Empirical Exchange Rate Models of the Seventies: Do They Fit Out of Sample?” *Journal of International Economics* 14 (1-2): 3–24.

Meese, Richard, and John Prins. 2011. “On the Natural Limits of Exchange Rate Predictability by Fundamentals.” Manuscript. Princeton University.

Neely, Christopher J., and Lucio Sarno. 2002. “How Well Do Monetary Fundamentals Forecast Exchange Rates?” *Federal Reserve Bank of St. Louis Review*, September, 51–72.

Rogoff, Kenneth. 2009. “Exchange Rates in the Modern Floating Era: What Do We Really Know?” *Review of World Economics* 145: 1–12.

Sims, Christopher A. 1972. “Money, Income, and Causality.” *American Economic Review* 62 (4): 540–52.

———. 1980. “Macroeconomics and Reality.” *Econometrica* 48 (2): 1–49.