Sample Standard Deviation - Combining Standard Deviations - Sample-based Statistics

Sample-based Statistics

Standard deviations of non-overlapping (X ∩ Y = ∅) sub-samples can be aggregated as follows if the size and mean of each are known:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y\right)\\ \sigma_{X \cup Y} &= \sqrt{\frac{1}{N_{X \cup Y} - 1}\left([N_X - 1]\sigma_X^2 + N_X\mu_X^2 + [N_Y - 1]\sigma_Y^2 + N_Y\mu _Y^2 - [N_X + N_Y]\mu_{X \cup Y}^2\right) }
\end{align}
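The aggregation for two non-overlapping sub-samples can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `combine_disjoint` and the sample data are hypothetical, and the code assumes each standard deviation is a sample standard deviation (divisor N - 1), so that the sum of squares of a sub-sample equals (N - 1)σ² + Nμ².

```python
import math
import statistics


def combine_disjoint(n_x, mean_x, sd_x, n_y, mean_y, sd_y):
    """Combine size, mean and sample SD of two disjoint sub-samples.

    Hypothetical helper: assumes X and Y share no observations and that
    sd_x, sd_y were computed with the N - 1 divisor.
    """
    n = n_x + n_y
    if n < 2:
        raise ValueError("need at least two observations in total")
    mean = (n_x * mean_x + n_y * mean_y) / n
    # Recover each sub-sample's sum of squares: (N - 1) * sd^2 + N * mean^2.
    sum_sq = ((n_x - 1) * sd_x ** 2 + n_x * mean_x ** 2
              + (n_y - 1) * sd_y ** 2 + n_y * mean_y ** 2)
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0]
n, mean, sd = combine_disjoint(
    len(x), statistics.mean(x), statistics.stdev(x),
    len(y), statistics.mean(y), statistics.stdev(y),
)
print(n, mean, sd)
print(statistics.stdev(x + y))  # direct computation on the pooled data
```

The two printed standard deviations should agree up to rounding, since the pooled list `x + y` is exactly the union of the two disjoint sub-samples.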

For the more general case of M non-overlapping data sets, X_1 through X_M, and the aggregate data set X = ⋃_i X_i:

\begin{align} \mu_X &= \frac{1}{\sum_i { N_{X_i}}} \left(\sum_i { N_{X_i} \mu_{X_i}}\right)\\ \sigma_X &= \sqrt{\frac{1}{\sum_i {N_{X_i}} - 1} \left( \sum_i { \left[(N_{X_i} - 1) \sigma_{X_i}^2 + N_{X_i} \mu_{X_i}^2\right] } - \left[\sum_i {N_{X_i}}\right]\mu_X^2 \right) }
\end{align}

where X_i ∩ X_j = ∅ for all i ≠ j.
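The general case of M pairwise disjoint data sets follows the same pattern: sum the recovered sums of squares, then subtract the total size times the squared aggregate mean. The sketch below is illustrative only. The name `combine_many` and the `(n, mean, sd)` tuple layout are assumptions made for this example, and sample standard deviations (divisor N - 1) are assumed throughout.

```python
import math
import statistics


def combine_many(parts):
    """Combine (n, mean, sd) summaries of pairwise disjoint data sets.

    Hypothetical helper: `parts` is an iterable of (size, mean, sample SD)
    tuples, one per data set X_i.
    """
    parts = list(parts)
    n_total = sum(n for n, _, _ in parts)
    if n_total < 2:
        raise ValueError("need at least two observations in total")
    mean = sum(n * m for n, m, _ in parts) / n_total
    # Sum over i of (N_i - 1) * sd_i^2 + N_i * mean_i^2.
    sum_sq = sum((n - 1) * sd ** 2 + n * m ** 2 for n, m, sd in parts)
    var = (sum_sq - n_total * mean ** 2) / (n_total - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n_total, mean, math.sqrt(max(var, 0.0))


# Illustrative data only: three disjoint groups.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [10.0, 20.0]]
summaries = [(len(g), statistics.mean(g), statistics.stdev(g)) for g in groups]
n_all, mean_all, sd_all = combine_many(summaries)

pooled = [value for g in groups for value in g]
print(sd_all, statistics.stdev(pooled))
```

Because only the three summary numbers per data set are needed, the raw observations never have to be held in one place at the same time.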

If the size, mean, and standard deviation are known for two overlapping samples and for their intersection, the standard deviation of the aggregated sample can still be calculated. In general:

\begin{align} \mu_{X \cup Y} &= \frac{1}{N_{X \cup Y}}\left(N_X\mu_X + N_Y\mu_Y - N_{X\cap Y}\mu_{X\cap Y}\right)\\ \sigma_{X \cup Y} &= \sqrt{ \frac{1}{N_{X \cup Y} - 1}\left([N_X - 1]\sigma_X^2 + N_X\mu_X^2 + [N_Y - 1]\sigma_Y^2 + N_Y\mu _Y^2 - [N_{X \cap Y} - 1]\sigma_{X \cap Y}^2 - N_{X \cap Y}\mu_{X \cap Y}^2 - [N_X + N_Y - N_{X \cap Y}]\mu_{X \cup Y}^2\right) }
\end{align}
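The overlapping case works by inclusion-exclusion: observations in the intersection are counted once in X and once in Y, so their contribution to the size, the sum, and the sum of squares is subtracted once. The following sketch assumes sample standard deviations (divisor N - 1) and an intersection of at least two observations so that its standard deviation is defined. The name `combine_overlapping` and the data are hypothetical.

```python
import math
import statistics


def combine_overlapping(n_x, mean_x, sd_x, n_y, mean_y, sd_y,
                        n_xy, mean_xy, sd_xy):
    """Combine two overlapping samples given their intersection's summary.

    Hypothetical helper: the *_xy arguments describe the observations
    shared by X and Y; all SDs use the N - 1 divisor.
    """
    n = n_x + n_y - n_xy
    if n < 2:
        raise ValueError("need at least two observations in the union")
    mean = (n_x * mean_x + n_y * mean_y - n_xy * mean_xy) / n

    def sum_of_squares(size, mu, sd):
        # Sum of x^2 recovered from a sample's size, mean and sample SD.
        return (size - 1) * sd ** 2 + size * mu ** 2

    sum_sq = (sum_of_squares(n_x, mean_x, sd_x)
              + sum_of_squares(n_y, mean_y, sd_y)
              - sum_of_squares(n_xy, mean_xy, sd_xy))
    var = (sum_sq - n * mean ** 2) / (n - 1)
    # Guard against a tiny negative value caused by floating-point rounding.
    return n, mean, math.sqrt(max(var, 0.0))


# Illustrative data only: X and Y share the observations 4 and 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [4.0, 5.0, 6.0, 7.0]
shared = [4.0, 5.0]
union = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

n_u, mean_u, sd_u = combine_overlapping(
    len(xs), statistics.mean(xs), statistics.stdev(xs),
    len(ys), statistics.mean(ys), statistics.stdev(ys),
    len(shared), statistics.mean(shared), statistics.stdev(shared),
)
print(sd_u, statistics.stdev(union))
```

Setting the intersection size to zero would reduce this to the non-overlapping formula, although the sketch as written expects a well-defined standard deviation for the shared observations.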
