Hypergeometric test

Example

Let's assume you've measured gene expression for 10,000 genes in two different conditions, X and Y, and found 300 genes to be differentially over-expressed in condition X. In the entire gene set, 2,000 are known to be associated with a particular biological function F. You've noticed that there are quite a few of these F-associated genes in your list of differentially expressed genes: 60, to be precise.
The hypergeometric test can help you assess whether this observation is statistically significant, i.e. whether function F is enriched among the genes over-expressed in condition X beyond what would be expected by chance.

From our little story above we can extract the following numbers to feed into the test:
Population size: 10,000 (total number of genes)
Number of successes in population: 2,000 (all F-associated genes)
Sample size: 300 (over-expressed genes)
Number of successes in sample: 60 (F-associated genes in condition X)
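These numbers give a probability of p ~ 0.52 of drawing 60 or more F-associated genes when 300 genes are selected at random from the full set of 10,000. That is not significant at all: since 20% of all genes are F-associated, about 300 x 0.2 = 60 such genes are expected in the sample by chance alone.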
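If you'd like to check the number yourself, here is a minimal sketch in Python that sums the hypergeometric probabilities for 60, 61, ..., 300 successes using only the standard library. The function name `hypergeom_upper_tail` and its parameter names are made up for this illustration; they are not taken from the GeneProf tool.

```python
from math import comb


def hypergeom_upper_tail(population, successes_in_population,
                         sample_size, successes_in_sample):
    """Return P(X >= successes_in_sample) when drawing sample_size items
    without replacement from a population that contains
    successes_in_population "success" items.

    Hypothetical helper written for this example only.
    """
    failures = population - successes_in_population
    # X can never exceed the sample size or the number of successes available.
    upper = min(sample_size, successes_in_population)
    # Count the samples that contain k successes, for every k in the tail.
    favourable = sum(
        comb(successes_in_population, k) * comb(failures, sample_size - k)
        for k in range(successes_in_sample, upper + 1)
    )
    # Divide by the total number of possible samples.
    return favourable / comb(population, sample_size)


# Numbers from the example: 10,000 genes, 2,000 F-associated,
# 300 over-expressed, 60 of those F-associated.
p_value = hypergeom_upper_tail(10_000, 2_000, 300, 60)
print(f"P(X >= 60) = {p_value:.3f}")  # should land close to the p ~ 0.52 quoted above
```

Note that the test asks for "60 or more", so the sum starts at 60 itself. If you use SciPy instead, the equivalent call should be `scipy.stats.hypergeom.sf(59, 10000, 2000, 300)`: the survival function gives P(X > k), hence the 59 rather than 60.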



Reference:
https://www.geneprof.org/GeneProf/tools/hypergeometric.jsp

http://mengnote.blogspot.com/2012/12/calculate-correct-hypergeometric-p.html

