Hypergeometric test
Example
Let's assume you've sampled gene expression for 10,000 genes in two different conditions X and Y and found 300 genes differentially over-expressed in condition X. In the entire gene set, 2,000 are known to be associated with a particular biological function F. You've noticed that there are quite a few of these F-associated genes in your list of differentially expressed genes, 60 to be precise.
The hypergeometric test might help you to assess whether your observation is indeed statistically significant, i.e. whether function F is enriched in condition X beyond what might be expected by chance.
From our little story above we can extract the following numbers to feed into the test:
Population size: | 10,000 | (total number of genes) |
Number of successes in population: | 2,000 | (all F-associated genes) |
Sample size: | 300 | (over-expressed genes) |
Number of successes in sample: | 60 | (F-associated genes in condition X) |
This should result in a probability of p ~ 0.52 of drawing 60 or more F-associated genes among 300 genes selected at random from the full set, which is not significant at all. That is unsurprising: 20% of all genes are F-associated, so about 60 of any 300 randomly chosen genes would be expected to be F-associated by chance alone.
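To check the numbers from the example yourself, here is a minimal sketch in plain Python. The function name `hypergeom_upper_tail` and its parameter names are just illustrative choices, not taken from any of the tools referenced here; it simply sums the hypergeometric probabilities for 60 or more successes.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """Return P(X >= k) for n draws without replacement from a
    population of N items, K of which are successes."""
    total = comb(N, n)                 # all ways to draw n items from N
    upper = min(n, K)                  # cannot draw more successes than exist
    favorable = sum(
        comb(K, i) * comb(N - K, n - i)  # i successes and n - i failures
        for i in range(max(k, 0), upper + 1)
    )
    return favorable / total           # exact integers, converted to float once


# Numbers from the example: 10,000 genes, 2,000 F-associated,
# 300 over-expressed genes, 60 of them F-associated.
p_value = hypergeom_upper_tail(N=10_000, K=2_000, n=300, k=60)
print(f"p = {p_value:.2f}")
```

If SciPy is available, `scipy.stats.hypergeom.sf(59, 10000, 2000, 300)` should give the same upper tail. Note the 59 rather than 60: `sf(k)` is P(X > k), so "60 or more" has to be written as "more than 59".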
References:
https://www.geneprof.org/GeneProf/tools/hypergeometric.jsp
http://mengnote.blogspot.com/2012/12/calculate-correct-hypergeometric-p.html