Trend Estimation - Data As Trend Plus Noise

To analyse a (time) series of data, we assume that it may be represented as trend plus noise:

$$y_t = a t + b + e_t$$

where $a$ and $b$ are unknown constants and the $e_t$ are randomly distributed errors. If one can reject the null hypothesis that the errors are non-stationary, then the non-stationary series $\{y_t\}$ is called trend stationary. The least squares method assumes the errors to be independently distributed with a normal distribution. If this is not the case, hypothesis tests about the unknown parameters $a$ and $b$ may be inaccurate. It is simplest if the $e_t$ all have the same distribution, but if not (if some have higher variance, meaning that those data points are effectively less certain), this can be taken into account during the least squares fitting by weighting each point by the inverse of the variance of that point.
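The inverse-variance weighting described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `weighted_linear_fit`, the sample data and the chosen variances are all hypothetical, and the per-point variances are assumed to be known in advance.

```python
def weighted_linear_fit(t, y, variances):
    """Fit y = a*t + b by least squares, weighting each point by 1/variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    # Weighted means of t and y
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products about the means
    s_tt = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    s_ty = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    a = s_ty / s_tt
    b = y_bar - a * t_bar
    return a, b


# Hypothetical data: the line y = 2t + 1, except that the last point is off by +10.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 19.0]

# Equal variances: ordinary least squares, pulled towards the outlying point.
equal = weighted_linear_fit(t, y, [1.0] * 5)

# A very large variance on the last point makes it count for almost nothing.
down = weighted_linear_fit(t, y, [1.0, 1.0, 1.0, 1.0, 1e6])

print(equal)
print(down)
```

With equal variances the outlying point drags the fitted slope well away from 2. Declaring that point highly uncertain brings the fit back close to the line through the other four points.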

In most cases, where only a single time series exists to be analysed, the variance of the errors $e_t$ is estimated by fitting a trend to obtain the estimated parameter values $\hat{a}$ and $\hat{b}$. The fitted values $\hat{a} t + \hat{b}$ can then be subtracted from the data $y_t$ (thus detrending the data), leaving the residuals $\hat{e}_t$ as the detrended data, and the variance of the $e_t$ is estimated from the residuals. This is often the only way of estimating the variance of the $e_t$.
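The detrending step can be illustrated with a short Python sketch. It assumes equally weighted points and independent errors. The helper names (`fit_line`, `detrend`) and the simulated series are hypothetical and serve only to show the mechanics.

```python
import random


def fit_line(t, y):
    """Ordinary least squares fit of y = a*t + b; returns (a, b)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    s_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    a = s_ty / s_tt
    return a, y_bar - a * t_bar


def detrend(t, y):
    """Subtract the fitted line and estimate the error variance from the residuals."""
    a, b = fit_line(t, y)
    residuals = [yi - (a * ti + b) for ti, yi in zip(t, y)]
    # Two parameters were estimated, so divide by n - 2 rather than n.
    variance = sum(r * r for r in residuals) / (len(t) - 2)
    return a, b, residuals, variance


# Simulated series (an assumption for illustration): slope 0.5, intercept 3,
# Gaussian noise with standard deviation 2, i.e. true error variance 4.
rng = random.Random(0)
t = list(range(500))
y = [0.5 * ti + 3.0 + rng.gauss(0.0, 2.0) for ti in t]

a_hat, b_hat, residuals, var_hat = detrend(t, y)
print(a_hat, b_hat, var_hat)
```

The residuals returned here are the detrended data, and `var_hat` is the variance estimate that a subsequent significance test would use.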

One particular special case of great interest, the (global) temperature time series, is known not to be homogeneous in time: apart from anything else, the number of weather observations has (generally) increased with time, and thus the error associated with estimating the global temperature from a limited set of observations has decreased with time. In fitting a trend to these data, this can be taken into account by inverse-variance weighting, as described above. Though many people do attempt to fit a "trend" to climate data, the climate trend is clearly not a straight line, and attributing a straight line to it is not mathematically justified, because the assumptions of the method are not valid in this context.

Once we know the "noise" of the series, we can assess the significance of the trend by taking as the null hypothesis that the trend, $a$, is not different from 0. From the above discussion of trends in random data with known variance, we know the distribution of trends to be expected from random (trendless) data. If the estimated trend, $\hat{a}$, is larger in magnitude than the critical value for a chosen significance level, then the trend is deemed significantly different from zero at that significance level.
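A minimal Python sketch of such a test follows. It rests on several assumptions: the errors are independent with equal variance, the series is long enough for a normal approximation to the t distribution, and the test is two-sided. The function name `trend_significance` and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def trend_significance(t, y, alpha=0.05):
    """Return (slope, standard error, test statistic, significant?) for a linear trend."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    s_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    a = s_ty / s_tt
    b = y_bar - a * t_bar
    # Noise variance estimated from the residuals of the fitted line
    residual_ss = sum((yi - (a * ti + b)) ** 2 for ti, yi in zip(t, y))
    variance = residual_ss / (n - 2)
    # Standard error of the slope under the null hypothesis of trendless noise
    se_a = math.sqrt(variance / s_tt)
    z = a / se_a
    # Two-sided critical value from the normal approximation (assumes large n)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return a, se_a, z, abs(z) > critical


# Simulated series (an assumption for illustration): slope 0.05, unit-variance noise.
rng = random.Random(1)
t = list(range(200))
y = [0.05 * ti + rng.gauss(0.0, 1.0) for ti in t]

a_hat, se_a, z, significant = trend_significance(t, y)
print(a_hat, se_a, z, significant)
```

For short series, the critical value would more properly come from a t distribution with n - 2 degrees of freedom. If the errors are autocorrelated, the standard error computed here is too small and the test overstates significance.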

The use of a linear trend line has been the subject of criticism, leading to a search for alternative approaches to avoid its use in model estimation. One of the alternative approaches involves unit root tests and the cointegration technique in econometric studies.

The estimated coefficient associated with a linear time trend variable is interpreted as a measure of the impact of a number of unknown or known but unmeasurable factors on the dependent variable over one unit of time. Strictly speaking, that interpretation is applicable for the estimation time frame only. Outside that time frame, one does not know how those unmeasurable factors behave both qualitatively and quantitatively. Furthermore, the linearity of the time trend poses many questions:

(i) Why should it be linear?

(ii) If the trend is non-linear then under what conditions does its inclusion influence the magnitude as well as the statistical significance of the estimates of other parameters in the model?

(iii) The inclusion of a linear time trend in a model precludes by assumption the presence of fluctuations in the tendencies of the dependent variable over time; is this necessarily valid in a particular context?

(iv) And, does a spurious relationship exist in the model because an underlying causative variable is itself time-trending?
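Question (iv) can be illustrated numerically. The Python sketch below uses simulated data and hypothetical helper names, and makes no claim about any real data set. It generates two series that share nothing except a dependence on time, then compares their correlation before and after each is detrended.

```python
import math
import random


def fit_line(t, y):
    """Ordinary least squares fit of y = a*t + b; returns (a, b)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    s_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    a = s_ty / s_tt
    return a, y_bar - a * t_bar


def detrended(t, y):
    """Residuals of y after subtracting its fitted linear trend."""
    a, b = fit_line(t, y)
    return [yi - (a * ti + b) for ti, yi in zip(t, y)]


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Two series with independent noise; their only common feature is a time trend.
rng = random.Random(42)
t = list(range(200))
x = [0.3 * ti + rng.gauss(0.0, 5.0) for ti in t]
y = [0.7 * ti + rng.gauss(0.0, 5.0) for ti in t]

raw_corr = correlation(x, y)
detrended_corr = correlation(detrended(t, x), detrended(t, y))
print(raw_corr, detrended_corr)
```

The raw correlation is high purely because both series trend upwards, and it largely disappears once the trends are removed. This is the kind of spurious relationship the question warns about.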

Research results of mathematicians, statisticians, econometricians, and economists have been published in response to those questions. For example, detailed notes on the meaning of linear time trends in regression models are given in Cameron (2005); Granger, Engle and many other econometricians have written on stationarity, unit root testing, co-integration and related issues (a summary of some of the works in this area can be found in an information paper by the Royal Swedish Academy of Sciences (2003)); and Ho-Trieu & Tucker (1990) have written on logarithmic time trends, with results indicating that linear time trends are special cases of business cycles.
