Directional Statistics - The Fundamental Difference Between Linear and Circular Statistics

A simple way to calculate the mean of a set of angles (in the interval [0°, 360°)) is to calculate the mean of the sines and the mean of the cosines of the angles, and then obtain the mean angle from the inverse tangent of their ratio. Consider the following three angles as an example: 10, 20, and 30 degrees. Intuitively, calculating the mean would involve adding these three angles together and dividing by 3, which in this case does give the correct mean angle of 20 degrees. Subtracting 15 degrees from each angle (that is, rotating the reference direction anticlockwise through 15 degrees) gives 355 degrees, 5 degrees and 15 degrees. The naive arithmetic mean is now (355 + 5 + 15)/3 = 125 degrees, which is the wrong answer: the mean direction should have shifted by the same 15 degrees, to 5 degrees. The vector mean can be calculated in the following way, using the mean sine and the mean cosine:


\bar s = \frac{1}{3} \left( \sin (355^\circ) + \sin (5^\circ) + \sin (15^\circ) \right)
= \frac{1}{3} \left( -0.087 + 0.087 + 0.259 \right)
\approx 0.086

\bar c = \frac{1}{3} \left( \cos (355^\circ) + \cos (5^\circ) + \cos (15^\circ) \right)
= \frac{1}{3} \left( 0.996 + 0.996 + 0.966 \right)
\approx 0.986

\bar \theta =
\begin{cases}
\arctan \left( \frac{\bar s}{\bar c} \right) & \bar s > 0 ,\ \bar c > 0 \\
\arctan \left( \frac{\bar s}{\bar c} \right) + 180^\circ & \bar c < 0 \\
\arctan \left( \frac{\bar s}{\bar c} \right) + 360^\circ & \bar s < 0 ,\ \bar c > 0
\end{cases}
= \arctan \left( \frac{0.086}{0.986} \right)
= \arctan (0.087) \approx 5^\circ.
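
The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name circular_mean_deg is hypothetical, and the two-argument arctangent math.atan2 is used in place of the explicit case analysis on the signs of the mean sine and mean cosine, since it selects the quadrant directly.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of a list of angles in degrees, returned in [0, 360).

    Hypothetical helper for illustration: averages the sines and cosines,
    then recovers the angle with the two-argument arctangent.
    """
    n = len(angles_deg)
    s_bar = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c_bar = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    # atan2 returns an angle in (-180, 180] degrees; the modulo maps it
    # into [0, 360), which covers the +180 and +360 branches of the cases.
    return math.degrees(math.atan2(s_bar, c_bar)) % 360.0

angles = [355.0, 5.0, 15.0]
naive_mean = sum(angles) / len(angles)      # 125.0, the misleading answer
vector_mean = circular_mean_deg(angles)     # approximately 5.0

print(naive_mean, round(vector_mean, 6))
```

Note that when the mean sine and mean cosine are both (near) zero, the mean direction is undefined, and this sketch makes no attempt to detect that case.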

This may be more succinctly stated by realizing that directional data are in fact vectors of unit length. In the case of one-dimensional data, these data points can be represented conveniently as complex numbers of unit magnitude z_n = \cos(\theta_n) + i\,\sin(\theta_n) = e^{i\theta_n}, where \theta_n is the n-th measured angle. The mean resultant vector for the sample is then:


\overline{\mathbf{\rho}}=\frac{1}{N}\sum_{n=1}^N z_n.

The sample mean angle is then the argument of the mean resultant:


\overline{\theta}=\mathrm{Arg}(\overline{\mathbf{\rho}}).

The length of the sample mean resultant vector is:


\overline{R}=|\overline{\mathbf{\rho}}|

and will have a value between 0 and 1. Thus the sample mean resultant vector can be represented as:


\overline{\mathbf{\rho}}=\overline{R}\,e^{i\overline{\theta}}.
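
The complex-number formulation maps directly onto code. The following is a minimal sketch using Python's standard cmath module; the function name mean_resultant is an assumption made for illustration, and the mean angle is reduced to [0, 360) degrees to match the interval used in the worked example.

```python
import cmath
import math

def mean_resultant(angles_deg):
    """Return (rho, R, theta_deg) for a list of angles in degrees.

    Hypothetical helper for illustration:
    rho is the mean of the unit complex numbers z_n = exp(i * theta_n),
    R = |rho| is the mean resultant length, and theta_deg = Arg(rho).
    """
    z = [cmath.exp(1j * math.radians(a)) for a in angles_deg]
    rho = sum(z) / len(z)
    r_bar = abs(rho)
    theta_deg = math.degrees(cmath.phase(rho)) % 360.0
    return rho, r_bar, theta_deg

rho, r_bar, theta_deg = mean_resultant([355.0, 5.0, 15.0])
print(round(r_bar, 4), round(theta_deg, 4))

# Two opposite directions cancel, so the resultant length is close to 0.
_, r_opposite, _ = mean_resultant([0.0, 180.0])
print(r_opposite < 1e-12)
```

A resultant length near 1 indicates tightly clustered directions, while a value near 0 indicates that the directions largely cancel, in which case the mean angle carries little information.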
