(By Cyriac Kandoth) Tutorial: Working with MAF files (Mutation Annotation Format) from the TCGA (The Cancer Genome Atlas)
Update (2/8/2017): This tutorial applies to TCGA MAFs in the GDC Legacy Archive. Most of this tutorial is still valid, but I'll need to update some notes and broken links.

Purpose

For folks familiar with the VCF format, TCGA's MAF files can be quite a pain to work with. You might just download the latest MAFs, pull loci and alleles for each variant, and redo annotations with ANNOVAR, snpEff, or Ensembl's VEP. Problem solved, right? Nope. You don't know the half of it! There are lots of caveats you should know about, and I try to document them below. Most of these caveats are handled with safe solutions in the MAFs at this page, and the specificity of variant calls is made more comparable across MAFs at this page.

How TCGA MAFs are made

Tumor-specific Analysis Working Groups (AWGs) take the auto-generated variant calls from the Genome Sequencing Centers (GSCs), and remove false-positive variants or recover those missed by the GSCs. This i