A t-test is a way of using Student’s t-distribution to decide whether there are statistically significant differences between data sets. The t-test in Excel is a two-sample t-test that compares the means of two samples. This article explains what statistical significance means and shows how to perform a t-test in Excel.
The instructions in this article apply to Excel 2019, 2016, 2013, 2010, 2007; Excel for Microsoft 365 and Excel Online.
What is statistical significance?
Imagine you want to know which of two dice gives the better score. You roll the first die and get a 2; you roll the second die and get a 6. Does that tell you that the second die usually gives a better score? If you answered “Of course not,” you already have some understanding of statistical significance. You understand that the difference is due to the random variation in score with each roll of a die. Because the sample was very small (a single roll), it showed nothing significant.
Now imagine that you roll each die 6 times:
- The first die is 3, 6, 6, 4, 3, 3; Average = 4.17
- The second die is 5, 6, 2, 5, 2, 4; Average = 4.00
Now, is this proof that the first die gives a higher score than the second? Probably not. A small sample with a relatively small difference between the means makes it likely that the difference is still due to random variation. As the number of dice rolls increases, it becomes more difficult to give a common-sense answer to the following question: Is the difference between the scores the result of random variation, or is one die actually more likely to give a higher score than the other?
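The averages quoted for the six rolls can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative.

```python
from statistics import mean

# The six rolls of each die, as listed above.
die_1 = [3, 6, 6, 4, 3, 3]
die_2 = [5, 6, 2, 5, 2, 4]

mean_1 = mean(die_1)
mean_2 = mean(die_2)

print(f"Die 1 average: {mean_1:.2f}")
print(f"Die 2 average: {mean_2:.2f}")
print(f"Difference between the averages: {mean_1 - mean_2:.2f}")
```

The difference between the averages is small compared with the spread of the individual rolls, which is why common sense says it proves nothing.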
Significance is the probability that an observed difference between samples is due to random variation. Significance is often referred to as the alpha level or simply “α”. The confidence level, or simply “c”, is the probability that the difference between the samples is not due to random variation; in other words, that there is a difference between the underlying populations. Therefore: c = 1 – α
We can set “α” at whatever level we need to feel confident that we have demonstrated significance. Very often α = 5% is used (95% confidence), but if we really want to be sure that the differences are not due to random variation, we can apply a higher confidence level by using α = 1% or even α = 0.1%.
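As a small illustration of the relationship c = 1 – α, the sketch below prints the confidence level for the three alpha levels mentioned above. The function name confidence_level is made up for this example.

```python
def confidence_level(alpha):
    """Return the confidence level c for a significance level alpha (c = 1 - alpha)."""
    return 1 - alpha

# The three alpha levels mentioned in the text.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha:.1%}  ->  confidence = {confidence_level(alpha):.1%}")
```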
Different statistical tests are used to calculate significance in different situations. T-tests are used to determine whether the means of two populations are different, and F-tests are used to determine whether the variances are different.
Why a statistical significance test?
When we compare different things, we need to use significance tests to see if one is better than the other. This applies to many areas, for example:
- In the business world, people need to compare different products and marketing methods.
- In sports, people need to compare different devices, techniques, and competitors.
- In engineering, people need to compare different designs and different parameters.
If you want to check whether one thing performs better than another in any area, you need to perform a statistical significance test.
What is Student’s t-distribution?
Student’s t-distribution resembles a normal (or Gaussian) distribution. Both distributions are bell-shaped, with most results close to the mean, but a few rare events lie quite far from the mean in both directions; these regions are called the tails of the distribution.
The exact shape of Student’s t-distribution depends on the sample size. For samples with more than 30 observations, it is very similar to the normal distribution. As the sample size is reduced, the tails get larger, representing the increased uncertainty that comes from drawing conclusions based on a small sample.
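The effect of sample size on the tails can be seen by comparing the two density curves at a point far from the mean. The sketch below uses the standard textbook density formulas and only Python's math module. The function names and the choice of x = 3 are illustrative, and the shape is expressed through the degrees of freedom (the sample size minus one).

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # Work in logs so that large df values do not overflow the gamma function.
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

x = 3.0  # a point well out in the tail
print(f"normal:        {normal_pdf(x):.5f}")
for df in (2, 5, 30, 1000):
    print(f"t, df = {df:<5}: {t_pdf(x, df):.5f}")
```

The density in the tail shrinks toward the normal value as the degrees of freedom grow, which is the "larger tails for smaller samples" behavior described above.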
Before you can apply a t-test to determine whether there is a statistically significant difference between the means of two samples, you must first perform an F-test, because the calculations performed for the t-test differ depending on whether or not there is a significant difference between the variances.
You need the Analysis ToolPak add-in enabled to perform this analysis.
Check and load the Analysis ToolPak add-in
Follow the steps below to check and activate the Analysis ToolPak:
- Select the File tab > Options.
- In the Options dialog box, select Add-ins from the tabs on the left.
- At the bottom of the window, select the Manage drop-down menu and then select Excel Add-ins. Select Go.
- Make sure the check box next to Analysis ToolPak is selected, then select OK.
- The Analysis ToolPak is now active and you can apply F-tests and t-tests.
Perform F-Test and T-Test in Excel
- Enter two data sets in a spreadsheet. In this case, we’re looking at the sales of two products over one week. The average daily sales value of each product is also calculated, along with its standard deviation.
- Select the Data tab > Data Analysis.
- Choose F-Test Two-Sample for Variances from the list, and then select OK.
The F-test is very sensitive to non-normality. It may therefore be safer to use a Welch test, but this is more difficult in Excel.
- Select the Variable 1 Range and the Variable 2 Range; set the Alpha (0.05 gives a 95% confidence level); select a cell for the top left corner of the output, bearing in mind that the output fills 3 columns and 10 rows. Select OK.
For the Variable 1 Range, the sample with the largest standard deviation (or variance) should be selected.
- Review the F-test results to see if there is a significant difference between the variances. The results provide three important values:
- F: The ratio between the variances.
- P(F<=f) one-tail: The probability that variable 1 does not actually have a greater variance than variable 2. If this is greater than alpha, which is typically 0.05, then there is no significant difference between the variances.
- F Critical one-tail: The value of F that would be needed to give P(F<=f)=α. If this value is greater than F, this also indicates that there is no significant difference between the variances.
P(F<=f) can also be calculated using the FDIST function with F and the degrees of freedom of each sample as inputs. The degrees of freedom are simply the number of observations in a sample minus one.
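To make the F value and the degrees of freedom concrete, here is a minimal Python sketch of the same arithmetic. It reuses the dice rolls from earlier in this article rather than the sales data, and the function name f_statistic is made up for the example. It places the larger variance in the numerator, matching the advice for Variable 1 above, and it does not compute the p-value, which still requires the F distribution.

```python
from statistics import variance

# Dice rolls from earlier in the article, reused here as the two samples.
die_1 = [3, 6, 6, 4, 3, 3]
die_2 = [5, 6, 2, 5, 2, 4]

def f_statistic(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = variance(sample_a)  # sample variance (divides by n - 1)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

f_value, df_num, df_den = f_statistic(die_1, die_2)
print(f"F = {f_value:.4f}, degrees of freedom = ({df_num}, {df_den})")
```

The F value and the two degrees of freedom are the three inputs the function mentioned above needs.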
- Now that you know whether there is a difference between the variances, you can choose the appropriate t-test. Select the Data tab > Data Analysis, then choose either t-Test: Two-Sample Assuming Equal Variances or t-Test: Two-Sample Assuming Unequal Variances.
- Regardless of the option you chose in the previous step, you will be presented with the same dialog box for entering the analysis details. First, select the ranges that contain the samples for Variable 1 Range and Variable 2 Range.
- Assuming you want to test whether there is no difference between the means, set the Hypothesized Mean Difference to zero.
- Set the Alpha significance level (0.05 gives 95% confidence) and select a cell for the top left corner of the output, bearing in mind that the output fills 3 columns and 14 rows. Select OK.
- Examine the results to decide if there is a significant difference between the means.
Just as with the F-test, there is no significant difference when the p-value, in this case P(T<=t), is greater than alpha. Here, however, two p-values are reported, one for a one-tailed test and the other for a two-tailed test. Use the two-tailed value, because either variable having a higher mean would count as a significant difference.
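For readers who want to see the arithmetic behind the two t-test options, the sketch below computes the t statistic and degrees of freedom with the standard textbook formulas, with a hypothesized mean difference of zero. It is an illustration only: it assumes these formulas correspond to the equal-variance (pooled) and unequal-variance (Welch) options, it reuses the dice rolls from earlier in the article, the function names are made up, and it stops short of the p-value, which comes from the t-distribution.

```python
import math
from statistics import mean, variance

# Dice rolls from earlier in the article, reused here as the two samples.
die_1 = [3, 6, 6, 4, 3, 3]
die_2 = [5, 6, 2, 5, 2, 4]

def pooled_t(a, b):
    """t statistic and degrees of freedom assuming equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df

def welch_t(a, b):
    """t statistic and approximate degrees of freedom assuming unequal variances."""
    n_a, n_b = len(a), len(b)
    term_a = variance(a) / n_a
    term_b = variance(b) / n_b
    t_value = (mean(a) - mean(b)) / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (term_a + term_b) ** 2 / (term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1))
    return t_value, df

t_equal, df_equal = pooled_t(die_1, die_2)
t_unequal, df_unequal = welch_t(die_1, die_2)
print(f"Equal variances:   t = {t_equal:.4f}, df = {df_equal}")
print(f"Unequal variances: t = {t_unequal:.4f}, df = {df_unequal:.2f}")
```

Because both samples have the same size, the two t values coincide here and only the degrees of freedom differ. A t value this close to zero is consistent with the common-sense conclusion that six rolls per die show no significant difference.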