Directional Statistics - The Fundamental Difference Between Linear and Circular Statistics

A simple way to calculate the mean of a series of angles (in the interval [0°, 360°)) is to calculate the mean of the cosines and the mean of the sines of the angles, and then obtain the mean angle from the inverse tangent of their ratio. Consider the following three angles as an example: 10, 20, and 30 degrees. Intuitively, calculating the mean would involve adding these three angles together and dividing by 3, which in this case does give the correct mean angle of 20 degrees. Rotating this system by 15 degrees, so that every angle decreases by 15 degrees (modulo 360), turns the three angles into 355, 5, and 15 degrees. The naive mean is now (355 + 5 + 15) / 3 = 125 degrees, which is the wrong answer: it should be 5 degrees, the original mean of 20 degrees shifted by the same 15 degrees. The vector mean can be calculated in the following way, using the mean sine and the mean cosine:


\bar s = \frac{1}{3} \left( \sin (355^\circ) + \sin (5^\circ) + \sin (15^\circ) \right)
= \frac{1}{3} \left( -0.087 + 0.087 + 0.259 \right)
\approx 0.086

\bar c = \frac{1}{3} \left( \cos (355^\circ) + \cos (5^\circ) + \cos (15^\circ) \right)
= \frac{1}{3} \left( 0.996 + 0.996 + 0.966 \right)
\approx 0.986

\bar \theta =
\left.
\begin{cases}
\arctan \left( \frac{\bar s}{\bar c} \right) & \bar s > 0 ,\ \bar c > 0 \\
\arctan \left( \frac{\bar s}{\bar c} \right) + 180^\circ & \bar c < 0 \\
\arctan \left( \frac{\bar s}{\bar c} \right) + 360^\circ & \bar s < 0 ,\ \bar c > 0
\end{cases}
\right\}
= \arctan \left( \frac{0.086}{0.986} \right)
= \arctan (0.087) = 5^\circ.
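
The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `circular_mean_deg` is hypothetical, and the two-argument `math.atan2` is used in place of the piecewise arctangent because it selects the quadrant from the signs of the mean sine and mean cosine.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in [0, 360), from the mean sine and mean cosine.

    Hypothetical helper for illustration; assumes a non-empty list of angles.
    """
    n = len(angles_deg)
    s_bar = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c_bar = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    # atan2 handles the quadrant cases of the piecewise arctan formula.
    # If s_bar and c_bar are both zero the mean direction is undefined;
    # atan2(0, 0) returns 0, which should not be read as a meaningful angle.
    return math.degrees(math.atan2(s_bar, c_bar)) % 360.0

angles = [355, 5, 15]
naive_mean = sum(angles) / len(angles)    # arithmetic mean, 125.0
vector_mean = circular_mean_deg(angles)   # close to 5 degrees
print(naive_mean, vector_mean)
```

For the unrotated angles 10, 20, and 30 degrees the same function gives a value close to 20 degrees, in agreement with the arithmetic mean.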

This may be more succinctly stated by realizing that directional data are in fact vectors of unit length. In the case of one-dimensional data, these data points can be represented conveniently as complex numbers of unit magnitude z = \cos\theta + i \sin\theta = e^{i\theta}, where \theta is the measured angle. The mean resultant vector for the sample is then:


\overline{\mathbf{\rho}}=\frac{1}{N}\sum_{n=1}^N z_n.

The sample mean angle is then the argument of the mean resultant:


\overline{\theta}=\mathrm{Arg}(\overline{\mathbf{\rho}}).

The length of the sample mean resultant vector is:


\overline{R}=|\overline{\mathbf{\rho}}|

and will have a value between 0 and 1. Thus the sample mean resultant vector can be represented as:


\overline{\mathbf{\rho}}=\overline{R}\,e^{i\overline{\theta}}.
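
The complex-number formulation can be sketched as follows, using Python's standard `cmath` module. The function name `mean_resultant` is an assumption made for illustration; the sketch represents each angle as a unit complex number, averages them, and reads off the argument and modulus of the result.

```python
import cmath
import math

def mean_resultant(angles_deg):
    """Return (rho, theta_bar_deg, r_bar) for a non-empty list of angles in degrees.

    Hypothetical helper for illustration only.
    """
    # Each angle becomes a unit complex number z_n = exp(i * theta_n).
    z = [cmath.exp(1j * math.radians(a)) for a in angles_deg]
    rho = sum(z) / len(z)                               # mean resultant vector
    theta_bar = math.degrees(cmath.phase(rho)) % 360.0  # Arg(rho), mapped to [0, 360)
    r_bar = abs(rho)                                    # mean resultant length
    return rho, theta_bar, r_bar

rho, theta_bar, r_bar = mean_resultant([355, 5, 15])
print(theta_bar, r_bar)
```

For the example angles 355, 5, and 15 degrees, the mean angle comes out close to 5 degrees and the mean resultant length is about 0.99, reflecting how tightly the three directions cluster. Two opposite directions, such as 0 and 180 degrees, give a length near 0, in which case the mean angle carries no useful information.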
