Price Forecasting Through Multivariate Spectral Analysis: Evidence for Commodities of BM&Fbovespa

This study aimed to forecast the prices of a group of commodities through the multivariate spectral analysis model and to compare them with those obtained by classical forecasting and neural network models. Ethanol, cattle, corn, coffee and soy were chosen because of their prominence in exports in 2013. The multivariate spectral model proved suitable when compared with the others, delivering better predictive performance. The results obtained in the out-of-sample period, assessed through error measures and a statistical test, confirm this. This research may help market professionals in formulating and implementing policies targeted to the agricultural sector, given the relevance of price forecasting as a planning instrument and as a means of analyzing financial market behavior for those who need protection against price fluctuations.

Since a pure moving-average process is an unintuitive representation of the behavior of a particular time series, whereas autoregressive models are commonly applied in different fields of knowledge, autoregressive and moving-average terms can be used simultaneously to improve the model.
Thus, this combination characterizes the model defined by the literature as Autoregressive Moving Average Model (ARMA).
Another possibility is to make the time series stationary through differencing, that is, by taking successive differences of the original time series. This gives rise to the Autoregressive Integrated Moving Average (ARIMA) model, which is built by adjusting it to the probabilistic properties of the series.
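As a minimal sketch of what taking successive differences means in practice, the Python snippet below applies differences of order d to a series. The function name `difference` and the toy data are illustrative assumptions, not part of the original study.

```python
import numpy as np

def difference(y, d=1):
    """Return the d-th order successive differences of the series y."""
    return np.diff(np.asarray(y, dtype=float), n=d)

# Toy series with a quadratic trend (illustrative data only).
series = [1, 4, 9, 16, 25]
first = difference(series, d=1)   # removes a linear trend, here still trending
second = difference(series, d=2)  # removes the quadratic trend
print(first, second)
```

Each round of differencing shortens the series by one observation, which is worth keeping in mind when the sample is small.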
In some situations, time series exhibit periodic fluctuations. Meteorological phenomena evaluated on a quarterly basis, for example, often show higher correlations at lags that are multiples of four, in accordance with the seasons of the year, while monthly economic data call for lags that are multiples of twelve, in accordance with the months of the year (ESQUIVEL, 2012). It is therefore appropriate to consider a stochastic periodicity when evaluating the behavior of time series. When the ARIMA model takes this periodicity into account, it is known as the Seasonal Autoregressive Integrated Moving Average (SARIMA) model.
Given the restriction of the ARIMA model that the error variance remain constant over time, Engle (1982) suggested an alternative model for forecasting. Known as the Autoregressive Conditional Heteroskedasticity (ARCH) model, it introduces a conditional variance of the error determined by the lagged squared errors. The idea is to measure the persistence of shocks to the variance through a coefficient: the closer it is to unity, the longer the impacts of shocks on prices take to dissipate. Another possibility is the generalized ARCH model, or Generalized Autoregressive Conditional Heteroskedasticity (GARCH), proposed by Bollerslev (1987).
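The role of the persistence coefficient can be illustrated with a short Python sketch of the conditional variance recursion in its common GARCH(1,1) textbook form, where persistence is measured by alpha + beta. The function name, the parameter values and the choice of starting the recursion at the unconditional variance are assumptions made for illustration, not the specification estimated in this study.

```python
import numpy as np

def garch11_variance(errors, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) process.

    sigma2[t] = omega + alpha * errors[t-1]**2 + beta * sigma2[t-1]
    Assumes alpha + beta < 1 so the unconditional variance exists.
    """
    errors = np.asarray(errors, dtype=float)
    sigma2 = np.empty(len(errors))
    # Assumed initialisation: the unconditional variance.
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(errors)):
        sigma2[t] = omega + alpha * errors[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# A single large shock followed by calm periods (illustrative data).
shock = [3.0] + [0.0] * 9
low_persistence = garch11_variance(shock, omega=0.1, alpha=0.2, beta=0.3)
high_persistence = garch11_variance(shock, omega=0.1, alpha=0.2, beta=0.75)
print(low_persistence.round(3))
print(high_persistence.round(3))
```

With alpha + beta close to one, the variance stays elevated for many periods after the shock, which is the slow dissipation described above.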
Whereas in the classical models described above the signal (trend and periodicity) of the time series is studied in units of time, in spectral models the information is extracted from the time series in units of frequency. Spectral models rest on the fact that any function of time can be described by the superposition of sine waves of different frequencies. According to Marques and Antunes (2009), spectral models, when compared to classical models, decompose the time series into several components with simpler periodicity, which has the advantage of eliminating noise from the original series.
In addition to these predictive models, another that does not require time series parameters is the Artificial Neural Network (ANN) model, which learns automatically to approximate equations without having to derive them. Besides not requiring series parameters, the model differs from the classical and exponential smoothing forecasting models in that it operates with a learning algorithm. Such an algorithm seeks to imitate the interconnection structure of the human brain in order to incorporate the behavior pattern of a time series and thereby efficiently provide future values (TURBAN, 1993).
The construction of the ANN model ranges from the appropriate modeling of the neural network to the transformations used to feed data to the network and the methods used to interpret the results. These aspects, namely modeling, processing and interpretation, are critical when using the model for forecasting.
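The idea of describing a series by a superposition of sine waves can be sketched in Python with a discrete Fourier transform. This is only an illustration of the frequency-domain view, not the spectral model used in the study; the function name `dominant_frequencies` and the synthetic signal are assumptions.

```python
import numpy as np

def dominant_frequencies(y, sample_rate, k):
    """Return the k strongest (frequency, amplitude) pairs, sorted by frequency."""
    y = np.asarray(y, dtype=float)
    spectrum = np.fft.rfft(y - y.mean())
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sample_rate)
    amplitude = 2.0 * np.abs(spectrum) / len(y)
    top = np.argsort(amplitude)[::-1][:k]
    return sorted(zip(freqs[top], amplitude[top]))

# Synthetic signal: two sine waves with different frequencies and amplitudes.
t = np.arange(200) / 200.0
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
for freq, amp in dominant_frequencies(signal, sample_rate=200, k=2):
    print(freq, amp)
```

Keeping only the strongest components and discarding the rest is one simple way to see how a frequency-domain decomposition can separate signal from noise.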

FORECASTING PERFORMANCE OF CLASSIC AND SPECTRAL MODELS
In the research conducted by Hassani (2007), which compared forecasting results between the univariate spectral analysis model and some classical models, the spectral model presented better performance. In addition to the spectral method, the author uses the moving averages model, the ARIMA model and the HW seasonal algorithm, which were also employed by Brockwell and Davis (2002), to forecast the accidental death time series in the United States in the 1970s. The research revealed that the spectral model generated more accurate predictions than those obtained by the classical models.
Still on the spectral model, Menezes et al. (2014), when comparing forecast results for the electricity consumption of a distributor that serves part of the state of Rio de Janeiro, confirmed improved performance of spectral analysis in relation to the ARMA model and the HW seasonal algorithm. Esquivel (2012), using meteorological and financial time series with different characteristics, concluded that the univariate spectral model produced forecast results as good as or superior to those obtained by the SARIMA model and the HW seasonal algorithm.

METHODOLOGY
In order to assess the contribution of the M-SSA model to price forecasting, in addition to exploring its application to time series of agricultural commodities, the research will use the HW seasonal algorithm, the SARIMA model, the ARIMA-GARCH model and the ANN model. To identify the characteristics of the time series, the Anderson-Darling (AD) and Shapiro-Wilk (SW) statistical tests for normality will be applied, in addition to the Doornik-Hansen Omnibus (DHO) test for multivariate normality. The research also uses the tests by McLeod and Li (1983) and Tsay (1986), as well as the test by Diebold and Mariano (1995), referred to in this study as the DM test.
Below, we describe the forecasting models we used, in addition to the error measurements.

M-SSA MODEL
Early research was conducted using atmospheric data. For this purpose, time series were associated with the climate and represented by localities or regions on a map (KEPPENNE; GHIL, 1993; PLAUT; VAUTARD, 1994). Similar to the spectral model of univariate analysis, the M-SSA model is defined in two stages: decomposition and reconstruction. The decomposition stage consists of two steps: embedding and singular value decomposition. The embedding can be considered a mapping that transfers a set of one-dimensional time series into matrices whose columns are lagged vectors. The result of the embedding, as described by Hassani and Mahmoudvand (2013), is the formation of a block trajectory matrix $X_V$, according to:
In the second step, the singular value decomposition, the trajectory matrix is decomposed into a sum of elementary matrices, given according to:
And the matrix $W$ is represented by:
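The two stages of the model can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the trajectory matrices of the individual series are stacked vertically, all series have the same length, the first r singular components are kept, and reconstruction uses diagonal averaging. The function names, the window length `L` and the number of components `r` are hypothetical choices, not the settings used in the study.

```python
import numpy as np

def trajectory_matrix(y, L):
    """Embedding: L x K matrix whose columns are lagged vectors of y."""
    K = len(y) - L + 1
    return np.column_stack([y[j:j + L] for j in range(K)])

def hankelize(X):
    """Diagonal averaging: turn an L x K matrix back into a series."""
    L, K = X.shape
    total = np.zeros(L + K - 1)
    count = np.zeros(L + K - 1)
    for i in range(L):
        for j in range(K):
            total[i + j] += X[i, j]
            count[i + j] += 1
    return total / count

def mssa_reconstruct(series, L, r):
    """Decompose the stacked trajectory matrix and rebuild each series
    from the first r singular components."""
    blocks = [trajectory_matrix(np.asarray(y, dtype=float), L) for y in series]
    X_v = np.vstack(blocks)                          # block trajectory matrix
    U, s, Vt = np.linalg.svd(X_v, full_matrices=False)
    X_r = (U[:, :r] * s[:r]) @ Vt[:r]                # sum of r elementary matrices
    return [hankelize(X_r[m * L:(m + 1) * L]) for m in range(len(series))]

# Two noisy series sharing the same cycle (synthetic data).
rng = np.random.default_rng(0)
t = np.arange(60)
y1 = np.sin(2 * np.pi * t / 12) + 0.2 * rng.normal(size=60)
y2 = 2 * np.cos(2 * np.pi * t / 12) + 0.2 * rng.normal(size=60)
smooth1, smooth2 = mssa_reconstruct([y1, y2], L=12, r=2)
```

Because the singular value decomposition is computed on the stacked matrix, the retained components reflect structure common to all series, which is the feature that distinguishes the multivariate from the univariate approach.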

HW SEASONAL ALGORITHM
The incorporation of seasonality in the HW seasonal algorithm can be performed by two different approaches, depending on the seasonality pattern identified in the series: multiplicative and additive seasonality. When considering multiplicative seasonality, Morettin and Toloi (2006) explain that the time series can be defined by: with $N_t$ the level of the series, $S_t$ the seasonal factor, $m_t$ the trend component, $\varepsilon_t$ the random error in period $t$, and $t = 1, \ldots, N$.
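A minimal Python sketch of Holt-Winters smoothing with multiplicative seasonality is given below, using the standard textbook recursions for level, trend and seasonal factor with three smoothing constants. The initialisation (first-season mean for the level, difference of the first two season means for the trend) and all names are assumptions for illustration and may differ in detail from the formulation adopted in this research.

```python
import numpy as np

def holt_winters_multiplicative(y, m, alpha, beta, gamma, h):
    """Forecast h steps ahead; m is the season length, len(y) >= 2 * m."""
    y = np.asarray(y, dtype=float)
    # Assumed initialisation from the first two seasons.
    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    seasonal = list(y[:m] / level)
    for t in range(m, len(y)):
        s_old = seasonal[t - m]
        new_level = alpha * y[t] / s_old + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal.append(gamma * y[t] / new_level + (1 - gamma) * s_old)
        level = new_level
    n = len(y)
    # Each forecast uses the most recent seasonal factor of its own period.
    return np.array([(level + k * trend) * seasonal[n - m + (k - 1) % m]
                     for k in range(1, h + 1)])

# Quarterly-style toy series with a stable seasonal pattern.
pattern = np.array([0.8, 1.0, 1.2, 1.0])
history = 100 * np.tile(pattern, 3)
print(holt_winters_multiplicative(history, m=4, alpha=0.3, beta=0.1, gamma=0.2, h=4))
```

In the additive variant the divisions by the seasonal factor become subtractions and the final product becomes a sum, with the same three smoothing constants.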
In this research, the recurrence form for the multiplicative approach is denoted HWm, with the multiplicative seasonal factor represented by equations involving the three smoothing constants, α, β and γ, according to:
The forecasts of future values take into account the number of steps ahead, $h$; thus, in each equation the seasonal factor considers the corresponding period, according to the following equations:
For the multiplicative seasonal approach, the error correction $e_t$ is given by:
The other form of the method, denoted HWa in this research, is applied when the series features additive seasonality. According to Morettin and Toloi (2006), with an additive seasonal factor the time series is represented by the sum of all the components, according to:
With additive seasonality, the recurrence form is given by the equation:
with the same conditions on the smoothing constants as in the multiplicative approach. The future values are predicted through the equations:
The error correction procedure for this type of seasonality is given by:

SARIMA MODEL
The stationary seasonal autoregressive polynomial of order $P$ is given by:
The invertible seasonal moving average polynomial of order $Q$ is given by:
with the seasonal difference operator of order $D$ represented by:
In general, the first seasonal difference is enough to remove the seasonality of the time series (ESQUIVEL, 2012).
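The effect of the seasonal difference operator of order D can be illustrated with a short Python sketch that subtracts from each observation the value observed s periods earlier, D times. The function name `seasonal_difference` and the synthetic series are assumptions for illustration only.

```python
import numpy as np

def seasonal_difference(y, s, D=1):
    """Apply the lag-s seasonal difference D times to the series y."""
    y = np.asarray(y, dtype=float)
    for _ in range(D):
        y = y[s:] - y[:-s]
    return y

# Synthetic series: repeating seasonal pattern plus a linear trend.
t = np.arange(12)
series = np.tile([10.0, 20.0, 30.0, 40.0], 3) + 0.5 * t
print(seasonal_difference(series, s=4))
```

For this toy series a single seasonal difference removes the repeating pattern and leaves only the constant contribution of the trend over one season, which is consistent with the remark that the first seasonal difference usually suffices.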

GARCH MODEL
The estimation of a model to represent a time series and its forecast may require a different treatment from that given by classical time series models such as the ARMA model, since these do not reproduce stylized facts such as conditional and unconditional non-normality and non-constant variance over time (MORETTIN; TOLOI, 2006).
Proposed by Bollerslev (1987) for the modeling of volatility, the model AR(p)-ARCH(q) can be represented by the equation

ANN MODEL
The model is adaptable to the time series and differs from classical forecasting models in being a nonparametric model that involves learning algorithms (LIMA et al., 2010).
Simply put, a neural network is a computational structure inspired by the architecture of the human brain. For Pasquotto (2010), each artificial neuron functions as an autonomous unit whose goal is to convert an input signal into an output signal. Because neurons are active in the network, the intensity of these signals is amplified or damped according to the parameters assigned to the synapses, known as synaptic weights or simply weights.

The artificial neuron model
Thus, we can mathematically describe the neuron $g$ by:
The activation function may be: i) a threshold function, which is a discontinuous and binary function; ii) a sigmoid function, which is a continuous S-shaped function varying from 0 to 1; and iii) a hyperbolic tangent function, which is a continuous function differentiable at all its points.
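The three activation functions listed above can be sketched in Python together with a single artificial neuron in the usual weighted-sum form, in which the output is the activation applied to the weighted inputs plus a bias. The weighted-sum form, the convention that the threshold function returns 1 at zero, and all names are assumptions for illustration.

```python
import numpy as np

def threshold(v):
    """Discontinuous, binary activation (assumed convention: 1 when v >= 0)."""
    return np.where(v >= 0, 1.0, 0.0)

def sigmoid(v):
    """Continuous S-shaped activation with values between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-v))

def neuron_output(x, w, b, activation):
    """Output of one neuron: activation of the weighted input sum plus bias."""
    return activation(np.dot(w, x) + b)

inputs = np.array([1.0, 2.0])
weights = np.array([0.5, -0.25])   # synaptic weights (illustrative values)
for f in (threshold, sigmoid, np.tanh):
    print(f.__name__, neuron_output(inputs, weights, 0.0, f))
```

The hyperbolic tangent is taken directly from NumPy; unlike the sigmoid, its output ranges from -1 to 1.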

Architecture of neural networks
The architecture of the ANN model varies according to its purpose.

Types of training
Training an ANN consists of adjusting the parameters of the network iteratively, through the following sequence of events: i) stimulation of the neural network by the environment, ii) changes in the weights due to the stimuli and iii) a different response of the network to the environment as a result of those changes. For the neural network, there are two learning patterns (PASQUOTTO, 2010).

Supervised learning
This type of learning works by indicating, at the network output, the correct answer for each situation. A set of input data is presented to the neural network as examples, generating a network output that is compared with the expected output to obtain the corresponding error.
Considering the neuron $g$ at the output of a network at instant $t$, the corresponding error $e_g(t)$ is defined by:
$$e_g(t) = d_g(t) - y_g(t)$$
with $d_g(t)$ as the desired response signal of neuron $g$ at instant $t$ and $y_g(t)$ as the output signal of neuron $g$ at instant $t$. The error is used as an iterative parameter for weight adjustment, with the aim of gradually reducing it to a minimum acceptable value. The back-propagation algorithm is widely used for supervised learning (LIMA et al., 2010).

Basic back-propagation algorithm
The back-propagation algorithm traverses the error function at the network output looking for a minimum point. The synaptic weights are adjusted in two stages: i) forward propagation and ii) back-propagation. In the first stage, the signal is propagated along the network, starting from the first layer, until it generates the error in the last layer. In the second stage, the error is corrected layer by layer, by changing the weights in the reverse direction.
In the back-propagation algorithm, the input-output pairs are presented to the neural network one at a time, and there are two ways to apply the error correction. In the first, defined as incremental change, the weights are changed whenever a new input-
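The two stages can be sketched in Python for a small network with one hidden layer of sigmoid neurons, updating the weights after every training pair. The network size, learning rate, number of epochs, random initialisation and function names are assumptions chosen for illustration; they are not the configuration used in this research.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_incremental(X, d, hidden=3, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer network; return the mean squared error per epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, target in zip(X, d):
            # Stage i) forward propagation up to the output error.
            h = sigmoid(x @ W1 + b1)
            y = sigmoid(h @ W2 + b2)
            e = target - y
            squared_error += e ** 2
            # Stage ii) back-propagation: local gradients, output layer first.
            delta_out = e * y * (1 - y)
            delta_hid = delta_out * W2 * h * (1 - h)
            # Weight changes applied in the reverse direction.
            W2 += lr * delta_out * h
            b2 += lr * delta_out
            W1 += lr * np.outer(x, delta_hid)
            b1 += lr * delta_hid
        history.append(squared_error / len(X))
    return history

# Toy problem: the logical OR of two inputs.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
d = np.array([0.0, 1.0, 1.0, 1.0])
errors = train_incremental(X, d)
print(errors[0], errors[-1])
```

Tracking the error per epoch makes it easy to check that the adjustments are in fact driving the error toward a minimum.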

PERFORMANCE OF FORECASTING EVALUATION
Following the application of the M-SSA, SARIMA, ARIMA-GARCH and ANN models, in addition to the HW seasonal algorithm, it is necessary to evaluate the performance of the forecasts obtained. As forecasts may contain errors regardless of the adopted model, it is common to evaluate the estimates by comparing the forecast values with those of the original time series and measuring performance with a particular metric. Thus, in this research, the forecasts will be compared with the 12 weeks following the final week of the sample.
For this purpose, the performance evaluation makes use of the MSE measure, defined by:
$$MSE = \frac{1}{h} \sum_{j=1}^{h} \left( Y_j - \hat{Y}_j \right)^2$$
with $Y_j$ representing the value of the original series, $\hat{Y}_j$ the value of the forecast and $h$ the number of observations reserved for evaluation. In addition to this measure, the research uses the methodology proposed by Goyal and Welch (2003), given by the difference between the accumulated squared prediction errors of the best-performing model and those of the next-best model, considering the CRMS given by: Whenever this difference is positive, the next-best model outperforms the best-performing one.
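Both evaluation measures can be sketched in Python as follows. The MSE follows the definition above; the cumulative difference is written as the accumulated squared errors of the best model minus those of the next-best model, an ordering inferred from the text and therefore an assumption. Function names and data are illustrative.

```python
import numpy as np

def mse(actual, forecast):
    """Mean squared error over the h observations reserved for evaluation."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean((actual - forecast) ** 2))

def cumulative_sq_error_difference(actual, forecast_best, forecast_next):
    """Accumulated squared errors of the best model minus the next-best model.

    Positive values mean the next-best model has accumulated less error
    up to that point (assumed sign convention).
    """
    actual = np.asarray(actual, dtype=float)
    err_best = (actual - np.asarray(forecast_best, dtype=float)) ** 2
    err_next = (actual - np.asarray(forecast_next, dtype=float)) ** 2
    return np.cumsum(err_best) - np.cumsum(err_next)

# Illustrative values only, not results from the study.
observed = [1.0, 2.0, 3.0]
print(mse(observed, [1.0, 2.0, 5.0]))
print(cumulative_sq_error_difference(observed, [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Plotting the cumulative difference over the out-of-sample weeks shows when, rather than only whether, one model gains an advantage over the other.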
Considering two forecasts of a time series $Y_t$, and defining $e_{it}$ and $e_{jt}$ as the respective prediction errors, the losses associated with each of these forecasts can be analyzed through the DM test, which uses a loss function to measure the forecast error; that is, the loss is calculated from the actual and predicted values of the variable in question. The test thus verifies whether the loss differential between the two forecasts is statistically significant.
To test whether the sample data originate from a population with a specific distribution, the AD and SW tests are applied. Using the two tests together allows for a more comprehensive overview of the results. As can be seen from the results shown in Table 1, at the 5% significance level the time series are not normally distributed. Next, to evaluate the normality of the data set, we use the DHO test, a multivariate normality test applied to pairs of time series. The results presented in Table 2 indicate strong evidence of multivariate normality at the 5% significance level for the pair CATT/COFF. Exceptions occur for other pairs. In this study, the test is used to characterize the analyzed time series, since the M-SSA does not require the normality assumption.
In addition to the parameters used in the M-SSA, SARIMA and ARIMA-GARCH models, we can observe in Table 5, in accordance with the study by Esquivel (2012) (23). The exceptions to this are given in the steps ahead h (6, 9 and 12 weeks) on the time series COFF.
In Figure 3 we present, through graphs for commodity prices in the weekly period,

CONCLUSIONS AND SUGGESTIONS
The analysis of agricultural commodity prices is of singular importance to market participants because of the relevance of information on price behavior. However, research on commodity price forecasting typically models price behavior using data only for the commodity being studied. As the dynamics of agricultural commodity time series change over time, we must ensure that the forecast model is not sensitive to these changes. The motivation for using the M-SSA model in this research is its ability to capture structures that represent price behavior more comprehensively, taking into account the joint effects of the set of time series.
Lima et al. (2010) and Ceretta, Righi and Schlender (2010) investigated the behavior of soybean prices, the former based on the ARIMA-GARCH and ANN models and the latter comparing the ARIMA model with ANN. For those authors, the forecast results were favorable to the ARIMA-GARCH and ANN models. In contrast to those studies, the results of this research indicate predictive superiority of the M-SSA model.
For the commodities soybeans, cattle and corn, the study by Ferreira et al. (2011) highlights the possibility of using neural networks as a pricing strategy due to the favorable results. Also in relation to these commodities, the study developed by Lima, Góis and Ulises (2007) indicates that the integrated autoregressive model showed better predictive power.
These results are not confirmed in this research, since for the same commodities the predictive performance indicates superiority of the M-SSA model over autoregressive models and neural networks.
In the evaluation of forecasts for coffee, Miranda, Coronel and Vieira (2013) concluded that the ANN model, when compared with the ARMA model, proved effective in coffee price forecasting, since the predicted prices were close to those observed. In this research, the predictive performance results for coffee were favorable to the exponential smoothing model.
This research may help market professionals in formulating and implementing policies directed to the agricultural sector, on account of the relevance of price forecasting as an instrument for planning and for analyzing the trend in commodity prices.
For future research, we suggest the use of other databases, such as prices in international markets, the inclusion of other commodities, the adoption of other periods of analysis, and the use of other variables that may increase the explanatory power of the M-SSA model given its multivariate character.