Sample-based Statistics

Standard deviations of non-overlapping (X ∩ Y = ∅) sub-samples can be aggregated as follows if the size and mean of each are known. Because the sub-samples do not overlap, the combined size is N_{X ∪ Y} = N_X + N_Y:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y\right)\\ \sigma_{X \cup Y} &= \sqrt{\frac{1}{N_{X \cup Y} - 1}\left(\left[N_X - 1\right]\sigma_X^2 + N_X\mu_X^2 + \left[N_Y - 1\right]\sigma_Y^2 + N_Y\mu_Y^2 - \left[N_X + N_Y\right]\mu_{X \cup Y}^2\right) }
\end{align}
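
As a minimal sketch of the two-sample case, the Python function below combines the summary statistics of two disjoint samples and compares the result with a direct computation on the pooled data. The function name `combine_disjoint` and the example values are illustrative assumptions, and the standard deviations are assumed to be sample standard deviations (divisor N − 1).

```python
import math
import statistics


def combine_disjoint(n_x, mean_x, sd_x, n_y, mean_y, sd_y):
    """Return (size, mean, sd) of the union of two non-overlapping samples."""
    n = n_x + n_y
    mean = (n_x * mean_x + n_y * mean_y) / n
    # (N - 1) * sd^2 + N * mean^2 recovers each sample's sum of squared values.
    sum_sq = ((n_x - 1) * sd_x ** 2 + n_x * mean_x ** 2
              + (n_y - 1) * sd_y ** 2 + n_y * mean_y ** 2)
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Hypothetical example data, used only to check the formula.
x = [2.0, 4.0, 4.0, 5.0]
y = [1.0, 7.0, 9.0]

n, mean, sd = combine_disjoint(
    len(x), statistics.mean(x), statistics.stdev(x),
    len(y), statistics.mean(y), statistics.stdev(y),
)
direct_sd = statistics.stdev(x + y)
print(n, mean, sd, direct_sd)
```

The combined value `sd` and the directly computed `direct_sd` should agree up to floating-point rounding.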

For the more general case of M non-overlapping data sets, X_1 through X_M, and the aggregate data set X = ⋃_i X_i:

\begin{align} \mu_X &= \frac{1}{\sum_i N_{X_i}} \left(\sum_i N_{X_i} \mu_{X_i}\right)\\ \sigma_X &= \sqrt{\frac{1}{\left(\sum_i N_{X_i}\right) - 1} \left( \sum_i \left[\left(N_{X_i} - 1\right)\sigma_{X_i}^2 + N_{X_i}\mu_{X_i}^2\right] - \left[\sum_i N_{X_i}\right]\mu_X^2 \right) }
\end{align}

where the data sets are pairwise disjoint, that is, X_i ∩ X_j = ∅ for all i < j.
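
The general case of M disjoint data sets can be sketched the same way. The helper `combine_many` and the example groups below are hypothetical; each summary is assumed to be a `(size, mean, sample standard deviation)` tuple for one of the pairwise disjoint data sets.

```python
import math
import statistics


def combine_many(summaries):
    """Combine (size, mean, sd) tuples of pairwise disjoint samples."""
    summaries = list(summaries)
    n_total = sum(n for n, _, _ in summaries)
    mean = sum(n * m for n, m, _ in summaries) / n_total
    # Sum over all data sets of (N_i - 1) * sd_i^2 + N_i * mean_i^2.
    sum_sq = sum((n - 1) * s ** 2 + n * m ** 2 for n, m, s in summaries)
    var = (sum_sq - n_total * mean ** 2) / (n_total - 1)
    # Guard against a tiny negative value caused by rounding.
    return n_total, mean, math.sqrt(max(var, 0.0))


# Hypothetical example data: three disjoint groups of observations.
groups = [[1.0, 2.0, 3.0], [10.0, 12.0], [5.0, 5.0, 6.0, 8.0]]
summaries = [(len(g), statistics.mean(g), statistics.stdev(g)) for g in groups]

n_total, mean_total, sd_total = combine_many(summaries)
pooled = [value for g in groups for value in g]
print(sd_total, statistics.stdev(pooled))
```

Only the three summary numbers per data set are needed, so the raw observations do not have to be kept once each group has been summarized.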

If the size, mean, and standard deviation are known for two overlapping samples and for their intersection, then the standard deviation of the aggregated sample can still be calculated. In general:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y - N_{X\cap Y}\mu_{X\cap Y}\right)\\ \sigma_{X \cup Y} &= \sqrt{ \frac{1}{N_{X \cup Y} - 1}\left(\left[N_X - 1\right]\sigma_X^2 + N_X\mu_X^2 + \left[N_Y - 1\right]\sigma_Y^2 + N_Y\mu_Y^2 - \left[N_{X \cap Y} - 1\right]\sigma_{X \cap Y}^2 - N_{X \cap Y}\mu_{X \cap Y}^2 - \left[N_X + N_Y - N_{X \cap Y}\right]\mu_{X \cup Y}^2\right) }
\end{align}
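
A minimal sketch of the overlapping case follows, assuming the overlap is defined by shared observations (not merely equal values) and that all standard deviations use the N − 1 divisor. The names `summarize` and `combine_overlapping` and the example data are illustrative assumptions.

```python
import math
import statistics


def summarize(values):
    """Return (size, mean, sample standard deviation) of a list of values."""
    return len(values), statistics.mean(values), statistics.stdev(values)


def sum_of_squares(n, mean, sd):
    """Recover the sum of squared values from the summary statistics."""
    return (n - 1) * sd ** 2 + n * mean ** 2


def combine_overlapping(x, y, xy):
    """Combine summaries of X, Y and their intersection X ∩ Y."""
    n = x[0] + y[0] - xy[0]
    mean = (x[0] * x[1] + y[0] * y[1] - xy[0] * xy[1]) / n
    # Observations in the intersection are counted twice, so subtract once.
    total_sq = sum_of_squares(*x) + sum_of_squares(*y) - sum_of_squares(*xy)
    var = (total_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Hypothetical example data: 'shared' observations belong to both samples.
only_x = [1.0, 2.0]
shared = [4.0, 6.0, 9.0]
only_y = [7.0, 11.0, 3.0]

n_union, mean_union, sd_union = combine_overlapping(
    summarize(only_x + shared), summarize(shared + only_y), summarize(shared)
)
union = only_x + shared + only_y
print(sd_union, statistics.stdev(union))
```

Setting the intersection's contribution to zero reduces this to the non-overlapping formula, which is a quick consistency check on the signs of the subtracted terms.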
