Posts

Showing posts from January, 2017

(Not from me) Decode 23andme for MTHFR genes

To find out which MTHFR mutation you might have using your 23andme data, you will need to dig down into the raw data. MTHFR is involved in folate metabolism. Note that for 23andme, MTHFR mutations are shown on the positive strand (+). Log in to your 23andme account in order to access the links below.

MTHFR A1298C: rs1801131
https://www.23andme.com/you/explorer/snp/?snp_name=rs1801131
The risk allele is G*.
GG = homozygous for MTHFR A1298C (+/+)
GT = heterozygous for MTHFR A1298C (+/-)
TT = no SNP of MTHFR A1298C (-/-)

MTHFR C677T: rs1801133
https://www.23andme.com/you/explorer/snp/?snp_name=rs1801133
The risk allele is A.
AA = homozygous for MTHFR C677T (+/+)
AG = heterozygous for MTHFR C677T (+/-)
GG = no SNP of MTHFR C677T (-/-)

Compound Heterozygous
A1298C=GT + C677T=AG is referred to as “Compound Heterozygous” meaning one has a single copy of each mutation. The compound heterozygous MTHFR mutation oc
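The genotype tables above can be turned into a small lookup. This is only a minimal sketch in Python: the function names and the dictionary layout are my own (hypothetical), and it assumes the two-letter plus-strand genotypes exactly as listed in the post. It is not a medical tool.

```python
# Risk alleles on the plus strand, taken from the tables in the post.
# The names RISK_ALLELES, classify and is_compound_heterozygous are hypothetical.
RISK_ALLELES = {
    "rs1801131": ("A1298C", "G"),
    "rs1801133": ("C677T", "A"),
}


def classify(rsid, genotype):
    """Return (variant name, zygosity) for a two-letter genotype such as 'GT'."""
    name, risk = RISK_ALLELES[rsid]
    genotype = genotype.upper()
    # Reject no-calls such as '--' and anything that is not two bases.
    if len(genotype) != 2 or any(base not in "ACGT" for base in genotype):
        raise ValueError("unexpected genotype: %r" % genotype)
    copies = genotype.count(risk)
    zygosity = {0: "-/-", 1: "+/-", 2: "+/+"}[copies]
    return name, zygosity


def is_compound_heterozygous(genotypes):
    """True when both SNPs carry exactly one copy of their risk allele."""
    return all(
        classify(rsid, genotypes[rsid])[1] == "+/-" for rsid in RISK_ALLELES
    )
```

For example, classify("rs1801133", "AG") gives the C677T variant with one risk copy, and a dictionary holding GT for rs1801131 and AG for rs1801133 matches the compound heterozygous case described above.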

"Edit plan settings" opens every time I click enter

Problem: I made the mistake of letting my 1-year-old granddaughter play with my computer for about one minute. Now every time I press Enter, a window with the following path pops up: Control Panel > Power Options > Edit Plan Settings

How to solve: I got this same problem. I suspected a key was stuck down, so I tried all the keys to get it unstuck. Along the way I happened to try the combination Windows key + X, which launches the Mobility Center, and after that the Enter key stopped opening the wrong panel. So the quick fix for me was to press Windows key + X.

References: https://answers.microsoft.com/en-us/windows/forum/windows_vista-update/edit-plan-settings-opens-every-time-i-click-enter/62ce1c5f-a3ea-4eac-b327-ca94158c199f

Update the TOC (table of contents) of MS Word .docx documents with Python

Here is a snippet to update the TOC of a Word 2013 .docx document that includes only one table of contents (e.g. just a TOC of headings, no TOC of figures etc.). If the script update_toc.py is run from the command prompt (Windows 10, command prompt not "running as admin") using python update_toc.py, the system installation of Python opens the file doc_with_toc.docx in the same directory, updates the TOC (in my case the headings) and saves the changes into the same file. The document must not be open in another instance of Word 2013 and must not be write-protected. Be aware that this script does not do the same as selecting the whole document content and pressing the F9 key.

Content of update_toc.py:

    import win32com.client
    import inspect, os

    def update_toc(docx_file):
        word = win32com.client.DispatchEx("Word.Application")
        doc = word.Documents.Open(docx_file)
        doc.TablesOfContents(1).Update()
        doc.Close(S
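Since the snippet above is cut off in this listing, here is a separate minimal sketch of the same idea. It is not the original script: the helper resolve_docx_path is a hypothetical name, the explicit Save followed by Close(False) is my own choice rather than whatever the truncated line did, and it assumes Windows with Word and the pywin32 package installed.

```python
import os


def resolve_docx_path(docx_file):
    """Return an absolute path; Word resolves relative paths against its own
    working directory, so passing an absolute path is the safer assumption."""
    return os.path.abspath(docx_file)


def update_toc(docx_file):
    """Update the first table of contents of docx_file and save it in place."""
    # Imported lazily so this module still loads where pywin32 is missing.
    import win32com.client

    word = win32com.client.DispatchEx("Word.Application")
    try:
        doc = word.Documents.Open(resolve_docx_path(docx_file))
        try:
            doc.TablesOfContents(1).Update()
            doc.Save()
        finally:
            # Changes were saved explicitly above, so close without prompting.
            doc.Close(False)
    finally:
        # Always shut down the Word instance started by DispatchEx.
        word.Quit()
```

Calling update_toc("doc_with_toc.docx") from the directory that holds the document would mirror the usage described in the post. The try/finally blocks are there so a failed update does not leave a hidden Word process holding the file open.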

(Copied) IGV problem: How to turn off the mismatch or SNP color on coverage view

Question: I would like to learn how to turn off the mismatch or SNP colors that show on the coverage view (I have attached an image to show what I mean). I tried to change the settings in "Preferences" and "Color Legends", but it did not work. Could anyone who knows how to do it give me some advice? Many thanks in advance!

Answer: You can right click on the coverage track, select "Set allele frequency threshold..." and set it to 1.0. This sets the highest threshold for when to display the colored bar, and you should only see them where 100% of the reads are high-quality mismatches. You can also compute the read coverage from the bam file (using the igvtools count command) and load the resulting file explicitly as a separate track. The separate track will not show any mismatches. Helga

References: https://groups.google.com/forum/#!topic/igv-help/aukP9W765j

My concerns about rRNA-depleted RNA-seq data

Recently I did some analysis on rRNA-depleted RNA-seq data. I found that there were a lot of reads from intronic and intergenic regions. Here I show one example from Illumina commercial samples:

How you can get the figure above: Go to basespace.illumina.com, log in and go to the Public Data tab. Select HiSeq 2000: TruSeq Stranded Total RNA (MAQC) and import the Project into your account. Select RNA-Seq alignment and scroll down to alignment distribution. You can see that ~50% intronic and intergenic is typical with commercial samples in our internal workflow.

What are the possible sources of these intronic and intergenic reads?

"Some reads are definitely expected as this kit sequences both coding and noncoding RNA. The library prep uses an rRNA depletion, so everything else should be present aside from small RNA <200 nt or so. This means the reads we sequenced could be from mRNA, pre-mRNA, nascent RNA and/or degraded mRNA, which is also the reason that w

A nice video on how to understand PCA

A nice video on how to understand PCA: https://www.youtube.com/watch?v=_UVHneBUBW0&t=1s
StatQuest: https://statquest.org/suggestion-box/

Applied statistical tests

Test outliers: Grubbs outlier test
Correlation: Spearman, Pearson
Normality tests: Shapiro-Wilk test
Kurtosis: flatness of distribution
Skewness: coefficient of asymmetry
Parametric tests:
    analysis of variance (ANOVA) to assess significant effects of multiple independent variables on a single dependent variable
    multivariate analysis of variance (MANOVA) to assess significant effects of multiple independent variables on multiple dependent variables
    t test (paired, unpaired, one- or two-tailed) to assess significant effects of a single independent variable on a single dependent variable
Nonparametric tests:
    Wilcoxon signed rank test: paired test
    Mann-Whitney (U test): unpaired test
    Kruskal-Wallis test: unpaired test for three or more groups
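To make the descriptive measures in the list above a bit more concrete, here is a minimal sketch in plain Python (standard library only). The function names are my own, the skewness and kurtosis use the simple population-moment definitions (statistics packages often apply bias corrections, so their numbers can differ), and the Grubbs function only computes the test statistic G, not the critical value needed for an actual decision.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment coefficient of skewness: 0 for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def grubbs_statistic(values):
    """G = largest absolute deviation from the mean / sample standard deviation."""
    mean = sum(values) / len(values)
    sample_sd = math.sqrt(
        sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    )
    return max(abs(v - mean) for v in values) / sample_sd
```

On symmetric data such as [1, 2, 3, 4, 5] the skewness is zero and the excess kurtosis is negative (flatter than a normal distribution), while a sample like [1, 1, 1, 10] gives a positive skewness and a larger G because of the single high value.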

How to change the default download folder for SRA data when using prefetch?

1. Go to the bin directory, for example: /home/sratoolkit.2.5.5-centos_linux64/bin
2. Run this in the terminal: ./vdb-config -i
3. Change the folder under Workspace Name to a location on a large hard disk.
4. By the way, use Tab to move between fields.
References: https://www.biostars.org/p/175096/

Cornell Center for Comparative and Population Genomics

http://www.cornell.edu/video/contributor/center-for-comparative-and-population-genomics