Fisher's exact test

Fisher's exact test is a statistical significance test used in the analysis of categorical data when sample sizes are small. The test was introduced by R. A. Fisher and is one of a class of exact tests. Fisher devised it in response to a claim by Muriel Bristol, who said she could tell whether the tea or the milk had been poured into her cup first.

The test is used to determine the significance of the association between two variables in a 2 x 2 contingency table. The p-value of the test can be computed when the margins of the 2 x 2 table are fixed. In the tea-tasting example, for instance, Bristol knows the number of cups prepared in each way (tea first or milk first) and can therefore give guesses with the correct number in each category. As Fisher pointed out, under a null hypothesis of independence this leads to the hypergeometric distribution for a given count in the table.

With large samples, a chi-squared test can be used. However, that test is not suitable when the expected values in any of the cells of the table are below 10 and there is only one degree of freedom: the sampling distribution of the calculated test statistic is only approximately equal to the theoretical chi-squared distribution, and the approximation is inadequate in these conditions (which arise when sample sizes are small, or when the data are very unequally distributed among the cells of the table). The Fisher test is, as its name states, exact, and it can therefore be used regardless of the sample characteristics. It becomes difficult to calculate with large samples or well-balanced tables, but fortunately these are exactly the conditions in which the chi-squared test is available.

The need for the Fisher test arises when we have data that are divided into two categories in two separate ways. For example, a sample of teenagers might be divided into male and female on the one hand, and those that are and are not currently dieting on the other. We hypothesise, perhaps, that the proportion of dieting individuals is higher among the women than among the men, and we want to test whether any difference of proportions that we observe is significant. The data might look like this:

	men	women	total
dieting	1	9	10
not dieting	11	3	14
totals	12	12	24

These data would not be suitable for analysis by a chi-squared test, because the expected values in the table are all below 10, and in a 2 x 2 contingency table, the number of degrees of freedom is always 1.
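
As a quick check of the statement about expected values, the following minimal sketch computes the expected count of each cell under independence (row total times column total, divided by the grand total). The function name `expected_counts` is an illustrative choice, not part of any library.

```python
# Observed 2 x 2 table from the example:
# rows = dieting / not dieting, columns = men / women.
observed = [[1, 9],
            [11, 3]]

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
print(expected)  # [[5.0, 5.0], [7.0, 7.0]]
```

Every expected count (5 and 7) is below 10, which is the condition the text refers to.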

To proceed with the Fisher test, we have to introduce some notation. We represent the cells by the letters a, b, c and d, call the totals across rows and columns marginal totals, and represent the grand total by n. So the table now looks like this:

	men	women	total
dieting	a	b	a+b
not dieting	c	d	c+d
totals	a+c	b+d	n

Fisher showed that, with the marginal totals held fixed, the probability of obtaining any such set of values is given by the hypergeometric distribution, and that it equals:

p={\frac {(a+b)!(c+d)!(a+c)!(b+d)!}{n!a!b!c!d!}}

where the symbol ! indicates the factorial, i.e. 1 multiplied by 2 multiplied by 3 etc, up to the number whose factorial is required.
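
The formula can be evaluated directly for the observed table (a = 1, b = 9, c = 11, d = 3). The sketch below is a literal transcription using Python's exact integer factorials; the function name `table_probability` is an illustrative choice.

```python
from math import factorial

def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table with fixed margins, from the factorial formula."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    # Python integers are exact, so only the final division is rounded.
    return numerator / denominator

p_observed = table_probability(1, 9, 11, 3)
print(round(p_observed, 6))  # 0.001346
```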

This formula gives the exact probability of observing this particular arrangement of the data under the null hypothesis that the proportions of dieters and non-dieters among men and women are equal in the population from which our sample was drawn. However, this is not the required significance of the difference of proportions in the table. As usual in significance testing, we also have to consider possible results that are more extreme than the one we observed. Fisher showed that we only have to consider cases where the marginal totals are the same as in the observed table. In the example, there is only one such more extreme table; it looks like this:

	men	women	total
dieting	0	10	10
not dieting	12	2	14
totals	12	12	24

In order to calculate the significance of the observed data, i.e. the total probability of observing data as extreme or more extreme if the null hypothesis is true, we have to calculate the values of p for both these tables and add them together. This gives a one-tailed test; for a two-tailed test we must also consider tables that are equally extreme but in the opposite direction. Unlike most statistical tests, it is not always the case that the two-tailed significance level is exactly twice the one-tailed significance level. In the example above, the one-tailed significance level is 0.0014; calculation of the two-tailed significance level is left as an exercise for the reader.
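
The procedure just described can be sketched as follows. The code enumerates every table that shares the observed marginal totals and sums the relevant probabilities. Two things here are assumptions rather than statements from the text: the one-tailed value is taken over tables whose top-left cell is as small as or smaller than the observed one (the direction of the example), and the two-tailed value uses one common convention, namely summing all tables whose probability does not exceed that of the observed table. The function names are illustrative.

```python
from math import comb

def hypergeom_prob(a, row1, col1, n):
    """Probability that the top-left cell equals a, given the fixed margins."""
    return comb(row1, a) * comb(n - row1, col1 - a) / comb(n, col1)

def fisher_p_values(a, b, c, d):
    """Return (one-tailed lower p, two-tailed p) for a 2 x 2 table."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, row1 + col1 - n)   # smallest feasible top-left cell
    highest = min(row1, col1)          # largest feasible top-left cell
    probs = {k: hypergeom_prob(k, row1, col1, n)
             for k in range(lowest, highest + 1)}
    p_obs = probs[a]
    # One-tailed: the observed table plus those more extreme in the same direction.
    one_tailed = sum(p for k, p in probs.items() if k <= a)
    # Two-tailed (assumed convention): every table no more probable than the
    # observed one; the small tolerance guards against floating-point ties.
    two_tailed = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return one_tailed, two_tailed

one_tailed, two_tailed = fisher_p_values(1, 9, 11, 3)
print(round(one_tailed, 4))  # 0.0014
print(round(two_tailed, 4))
```

Because both column totals equal 12 in this example, the set of tables is symmetric, which is a special case; with unequal margins the two tails generally differ.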

Calculating significance values for the Fisher exact test is slow and requires care even with the aid of a computer, because the factorial terms quickly become very large, and with larger samples, the number of possible tables more extreme than that observed quickly becomes substantial. Even for small samples (which fortunately is where the test is usually needed), the calculations are tedious, but published tables are available; they are bulky, because the grand total and two of the four cell sizes have to be specified. Given these data, the table then gives the critical value of the third cell size for specified significance levels. The observed table may have to be re-arranged (for example by rearranging the rows or the columns) to make it compatible with the way the significance levels are tabulated. Most modern statistical packages will calculate the significance of Fisher tests, in some cases even where the chi-squared approximation would also be acceptable.
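
One common way to keep the factorial terms manageable in software is to work with logarithms of factorials and exponentiate only at the end. The sketch below uses the standard-library `math.lgamma` function, relying on the identity ln(k!) = lgamma(k + 1). This is an illustration of the general idea, not a description of how any particular statistical package works, and the function names are illustrative.

```python
from math import exp, lgamma

def log_factorial(k):
    """Natural log of k!, using ln(k!) = lgamma(k + 1)."""
    return lgamma(k + 1)

def table_probability_log(a, b, c, d):
    """Probability of one 2 x 2 table, computed in log space."""
    n = a + b + c + d
    log_p = (log_factorial(a + b) + log_factorial(c + d)
             + log_factorial(a + c) + log_factorial(b + d)
             - log_factorial(n) - log_factorial(a) - log_factorial(b)
             - log_factorial(c) - log_factorial(d))
    return exp(log_p)

# Small example from the text: agrees with the direct factorial formula.
p_small = table_probability_log(1, 9, 11, 3)
print(round(p_small, 6))  # 0.001346

# A much larger, hypothetical table is handled without forming huge factorials.
p_large = table_probability_log(300, 200, 250, 250)
```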