Sample Standard Deviation - Combining Standard Deviations - Sample-based Statistics

Sample-based Statistics

Standard deviations of non-overlapping (X ∩ Y = ∅) sub-samples can be aggregated as follows if the size and mean of each are known:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y\right)\\ \sigma_{X \cup Y} &= \sqrt{\frac{1}{N_{X \cup Y} - 1}\left([N_X - 1]\sigma_X^2 + N_X\mu_X^2 + [N_Y - 1]\sigma_Y^2 + N_Y\mu_Y^2 - [N_X + N_Y]\mu_{X \cup Y}^2\right) }
\end{align}
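
As a minimal sketch of the two-sample case, the Python function below recovers each sub-sample's sum of squared values as (N − 1)σ² + Nμ² and then rebuilds the pooled sample standard deviation. The function name `combine_two` and the example data are illustrative assumptions, not part of the original text; standard deviations are assumed to use the N − 1 denominator.

```python
import math
import statistics


def combine_two(n_x, mean_x, sd_x, n_y, mean_y, sd_y):
    """Pool two disjoint samples from their sizes, means, and sample SDs.

    Hypothetical helper for illustration; SDs are assumed to be sample
    standard deviations (N - 1 denominator).
    """
    n = n_x + n_y
    mean = (n_x * mean_x + n_y * mean_y) / n
    # Sum of squared values of each sub-sample: (N - 1) * sd^2 + N * mean^2
    sum_sq = ((n_x - 1) * sd_x ** 2 + n_x * mean_x ** 2
              + (n_y - 1) * sd_y ** 2 + n_y * mean_y ** 2)
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Made-up data, used only to compare against a direct computation.
x = [2.0, 4.0, 4.0, 5.0]
y = [7.0, 9.0, 10.0]
n, mean, sd = combine_two(len(x), statistics.mean(x), statistics.stdev(x),
                          len(y), statistics.mean(y), statistics.stdev(y))
print(n, mean, sd)
```

The result can be checked against `statistics.stdev(x + y)`, which computes the same quantity directly from the raw values.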

For the more general case of M non-overlapping data sets, X_1 through X_M, and the aggregate data set X = ⋃_i X_i:

\begin{align} \mu_X &= \frac{1}{\sum_i N_{X_i}} \left(\sum_i N_{X_i} \mu_{X_i}\right)\\ \sigma_X &= \sqrt{\frac{1}{\left(\sum_i N_{X_i}\right) - 1} \left( \sum_i \left[ (N_{X_i} - 1)\sigma_{X_i}^2 + N_{X_i}\mu_{X_i}^2 \right] - \left[\sum_i N_{X_i}\right]\mu_X^2 \right) }
\end{align}

where the data sets are pairwise disjoint: X_i ∩ X_j = ∅ for all i < j.
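
The M-sample case can be sketched the same way: sum the per-sample contributions (N_i − 1)σ_i² + N_i μ_i², subtract the total size times the squared aggregate mean, and divide by the total size minus one. The name `combine_many`, the `(size, mean, sd)` tuple layout, and the data below are assumptions made for this example.

```python
import math
import statistics


def combine_many(summaries):
    """Aggregate pairwise disjoint samples.

    `summaries` is an iterable of (size, mean, sample_sd) tuples; this
    layout is a hypothetical convention chosen for the sketch.
    """
    summaries = list(summaries)
    n = sum(size for size, _, _ in summaries)
    mean = sum(size * m for size, m, _ in summaries) / n
    # Each term is the sum of squared values of one sub-sample.
    sum_sq = sum((size - 1) * sd ** 2 + size * m ** 2
                 for size, m, sd in summaries)
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Made-up disjoint groups, each with at least two values so stdev is defined.
groups = [[1.0, 3.0, 5.0], [2.0, 2.0, 8.0, 9.0], [4.0, 6.0]]
summaries = [(len(g), statistics.mean(g), statistics.stdev(g)) for g in groups]
total_n, total_mean, total_sd = combine_many(summaries)

all_values = [v for g in groups for v in g]
print(total_n, total_mean, total_sd)
```

Only the three summary numbers per group are needed, so the raw values never have to be held in one place; `all_values` is built here purely to allow a comparison with `statistics.stdev(all_values)`.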

If the size, mean, and standard deviation are known for two overlapping samples and for their intersection, the standard deviation of the aggregated sample can still be calculated. In general:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y - N_{X\cap Y}\mu_{X\cap Y}\right)\\ \sigma_{X \cup Y} &= \sqrt{ \frac{1}{N_{X \cup Y} - 1}\left([N_X - 1]\sigma_X^2 + N_X\mu_X^2 + [N_Y - 1]\sigma_Y^2 + N_Y\mu_Y^2 - [N_{X \cap Y} - 1]\sigma_{X \cap Y}^2 - N_{X \cap Y}\mu_{X \cap Y}^2 - [N_X + N_Y - N_{X \cap Y}]\mu_{X \cup Y}^2\right) }
\end{align}
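
For overlapping samples, a rough sketch under the same assumptions is to apply inclusion-exclusion to the sizes, the sums, and the sums of squares: add the contributions of X and Y and subtract those of X ∩ Y, which would otherwise be counted twice. The helper names `sum_of_squares` and `combine_overlapping` and the example values are hypothetical.

```python
import math
import statistics


def sum_of_squares(n, mean, sd):
    """Sum of squared values implied by a size, mean, and sample SD."""
    return (n - 1) * sd ** 2 + n * mean ** 2


def combine_overlapping(x, y, xy):
    """Aggregate two overlapping samples.

    Each argument is a (size, mean, sample_sd) tuple for X, Y, and their
    intersection; the tuple layout is an assumption of this sketch.
    """
    n = x[0] + y[0] - xy[0]
    mean = (x[0] * x[1] + y[0] * y[1] - xy[0] * xy[1]) / n
    # Observations in the intersection are counted once, not twice.
    sum_sq = sum_of_squares(*x) + sum_of_squares(*y) - sum_of_squares(*xy)
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n, mean, math.sqrt(max(var, 0.0))


def summarize(values):
    return len(values), statistics.mean(values), statistics.stdev(values)


# Made-up observations: the values 4.0 and 5.0 belong to both samples.
sample_x = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_y = [4.0, 5.0, 6.0, 7.0]
shared = [4.0, 5.0]
union = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

u_n, u_mean, u_sd = combine_overlapping(
    summarize(sample_x), summarize(sample_y), summarize(shared))
print(u_n, u_mean, u_sd)
```

The sketch assumes the intersection has at least two observations so that its sample standard deviation is defined; for a single shared observation, a standard deviation of zero can be passed instead.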
