Comparison of standard deviations - The F-test:

When comparing one sample against another, or a sample against what we would expect to see given a certain population distribution, we are interested in whether the spread, or dispersion, of the two sets of data is comparable. Technically, we wish to compare the variances of the two (where the variance is the square of the standard deviation). Doing so allows us to answer questions such as:

  • Is the precision of one set of values better or worse than the other?
  • Is a set of replicate measurements representative of the population of expected results, or is it possible that it derives from some other population?

Such questions are relevant when evaluating whether a given sample is representative, whether a new method of analysis performs comparably to an existing one, and whether individual analysts and laboratories employing standard methods are technically proficient.

The F-test provides the means for performing such comparisons, which are a necessary precursor to employing the t-test for the comparison of two sample means.

Comparison of variance:

Whether we are comparing two sample variances, or a sample and a population variance, we need to define our null and alternate hypotheses first.

Obviously, if the two variances were exactly equal, then their ratio would be 1; if they differed, the ratio would be greater than or less than 1. For convenience, the Fisher F-test is defined so that the larger variance is always the numerator and the smaller the denominator.

That is, we test the null hypothesis

H0: σ₁² = σ₂²

against the appropriate alternate hypothesis

H1: σ₁² > σ₂² or σ₁² < σ₂²

We therefore calculate the Fisher F-value as:

F = s₁² / s₂²

where s₁² ≥ s₂², so that F ≥ 1.

The degrees of freedom for the numerator and denominator are n1-1 and n2-1, respectively. Note that it is not necessary to have exactly the same number of replicate values in each set.

As with the t-test, we can either compare Fcalc to a tabulated critical value Ftab, or calculate the probability of obtaining such an F-value if the null hypothesis were true, to decide whether to accept or reject the null hypothesis. We can also perform 1-tailed or 2-tailed F-tests. The following two examples illustrate the use of such tests.
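Below is a minimal sketch of this calculation in Python, using the F distribution from scipy.stats; the helper name f_test and its argument order are illustrative assumptions, not a standard API.

    from scipy import stats

    def f_test(s1, n1, s2, n2, tails=1):
        """Return (F_calc, dfn, dfd, p) for the Fisher F-test on two sample
        standard deviations s1 and s2 obtained from n1 and n2 replicates."""
        v1, v2 = s1 ** 2, s2 ** 2
        # The larger variance always goes in the numerator, so that F >= 1.
        if v1 >= v2:
            f_calc, dfn, dfd = v1 / v2, n1 - 1, n2 - 1
        else:
            f_calc, dfn, dfd = v2 / v1, n2 - 1, n1 - 1
        # Probability of an F at least this large if H0 were true (upper tail);
        # for a 2-tailed test the tail area is doubled.
        p = stats.f.sf(f_calc, dfn, dfd)
        if tails == 2:
            p = min(2 * p, 1.0)
        return f_calc, dfn, dfd, p

Comparing the returned p against the chosen significance level (e.g. 0.05) is equivalent to comparing Fcalc against the tabulated critical value.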


Example 1:

As an example, assume we want to see if a method (Method A) for determining the arsenic concentration in soil is significantly more precise than a second method (Method B). Each method was tested ten times, yielding the following values:

Method    Mean (ppm)    Standard Deviation (ppm)
A         6.7           0.8
B         8.2           1.2

A method is more precise if its standard deviation is lower than that of the other method. Taking Method A as data set 1 and Method B as data set 2, we therefore test the null hypothesis H0: σ₂² = σ₁² against the alternate hypothesis HA: σ₂² > σ₁².

Since s₂ > s₁, Fcalc = s₂²/s₁² = 1.2²/0.8² = 2.25. The tabulated value for ν₁ = ν₂ = 9 degrees of freedom and a 1-tailed test at the 95% confidence level is F₉,₉ = 3.179. In this case, Fcalc < F₉,₉, so we accept the null hypothesis that the two standard deviations are equal, and we are 95% confident that any difference between the sample standard deviations is due to random error. We use a 1-tailed test here because the only question of interest is whether Method A is more precise than Method B.
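The numbers above can be checked with a short Python sketch (assuming scipy.stats is available; ppf gives the tabulated critical value):

    from scipy import stats

    s_A, s_B, n = 0.8, 1.2, 10               # standard deviations (ppm), 10 replicates each
    f_calc = s_B ** 2 / s_A ** 2             # larger variance on top: 1.44 / 0.64 = 2.25
    f_tab = stats.f.ppf(0.95, n - 1, n - 1)  # 1-tailed, 95% critical value: F(9,9) = 3.179
    print(f_calc < f_tab)                    # True -> retain H0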

Example 2:

If we are not interested in whether one method is better than another, but simply want to determine whether two variances are the same or different, we need a 2-tailed test. For instance, assume we made two sets of measurements of the ethanol concentration in a sample of vodka using the same instrument, but on two different days. On the first day we found a standard deviation of s₁ = 9 ppm, and on the next day s₂ = 2 ppm. Both datasets comprised 6 measurements. We want to know whether we can combine the two datasets, or whether there is a significant difference between them, in which case we should discard one.

As usual, we begin by defining the null hypothesis, H0: σ₁² = σ₂², and the alternate hypothesis, HA: σ₁² ≠ σ₂². The "≠" sign indicates that this is a 2-tailed test, because we are interested in both cases: σ₁² > σ₂² and σ₁² < σ₂². For the F-test, you can perform a 2-tailed test by doubling the P value of a 1-tailed table, so using a table for a 1-tailed test at P = 0.05, we would be performing a 2-tailed test at P = 0.10, i.e. at the 90% confidence level.

For this dataset, s₁ > s₂, so Fcalc = s₁²/s₂² = 9²/2² = 20.25. The tabulated value for ν₁ = ν₂ = 5 at 90% confidence is F₅,₅ = 5.050. Since Fcalc > F₅,₅, we reject the null hypothesis, and can say with 90% certainty that there is a significant difference between the standard deviations of the two datasets.
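A corresponding Python sketch (again assuming scipy.stats) reproduces this 2-tailed comparison, using the 1-tailed 95% point as the 2-tailed 90% critical value:

    from scipy import stats

    s1, s2, n = 9.0, 2.0, 6                  # ppm; 6 measurements on each day
    f_calc = s1 ** 2 / s2 ** 2               # 81 / 4 = 20.25
    f_tab = stats.f.ppf(0.95, n - 1, n - 1)  # F(5,5) = 5.050
    print(f_calc > f_tab)                    # True -> reject H0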

Tables for other confidence levels can be found in most statistics or analytical chemistry textbooks. When using these tables, be careful to note whether the table is for a 1-tailed or a 2-tailed test. In most cases, tables are given for 2-tailed tests, so you can divide the stated P value by 2 to obtain the corresponding 1-tailed value. For the F-test, always ensure that the larger standard deviation is in the numerator, so that F ≥ 1.
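If no printed table is at hand, the critical values can also be generated directly. The sketch below (assuming scipy.stats) prints a small grid of 1-tailed critical F values at P = 0.05, with numerator degrees of freedom across the columns and denominator degrees of freedom down the rows; the particular degrees of freedom shown are arbitrary.

    from scipy import stats

    dof = [3, 5, 9, 19]                      # degrees of freedom to tabulate
    print("   " + "".join(f"{d:>8}" for d in dof))
    for dfd in dof:
        row = [stats.f.ppf(0.95, dfn, dfd) for dfn in dof]
        print(f"{dfd:>3}" + "".join(f"{v:8.3f}" for v in row))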