Posts

How to understand Bowtie2 mapping statistics

  16182999 reads; of these:
    16182999 (100.00%) were paired; of these:
      5731231 (35.42%) aligned concordantly 0 times
      4522376 (27.95%) aligned concordantly exactly 1 time
      5929392 (36.64%) aligned concordantly >1 times
      ----
      5731231 pairs aligned concordantly 0 times; of these:
        2381431 (41.55%) aligned discordantly 1 time
      ----
      3349800 pairs aligned 0 times concordantly or discordantly; of these:
        6699600 mates make up the pairs; of these:
          3814736 (56.94%) aligned 0 times
          1883429 (28.11%) aligned exactly 1 time
          1001435 (14.95%) aligned >1 times
  88.21% overall alignment rate

The Bowtie2 result summary is divided into 3 sections:

Concordant alignment - In your data, 4522376 + 5929392 = 10451768 read pairs align concordantly, which is 64.58% of all pairs.

Discordant alignment - That leaves 5731231 pairs, which is 35.42% (100 - 64.58). Of these, 2381431 pairs align discordantly. That is to say, of the non-concordant fraction, 41.55% of pairs (2381431 pairs) align discordantly.

The rest - Now, remember that align
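As a check on the arithmetic, the 88.21% overall alignment rate can be reproduced from the counts in the summary. The sketch below assumes the rate is computed per mate: every concordant or discordant pair contributes two aligned mates, and mates from the remaining pairs that aligned on their own are added on top. The function name and argument names are made up for illustration.

```python
def overall_alignment_rate(total_pairs, concordant_once, concordant_multi,
                           discordant_once, mates_once, mates_multi):
    """Return the fraction of mates that aligned at least once."""
    # Each aligned pair (concordant or discordant) contributes two mates.
    aligned_pairs = concordant_once + concordant_multi + discordant_once
    # Mates from otherwise unaligned pairs that aligned individually.
    aligned_single_mates = mates_once + mates_multi
    aligned_mates = 2 * aligned_pairs + aligned_single_mates
    return aligned_mates / (2 * total_pairs)


# Counts copied from the Bowtie2 summary in this post.
rate = overall_alignment_rate(
    total_pairs=16182999,
    concordant_once=4522376,
    concordant_multi=5929392,
    discordant_once=2381431,
    mates_once=1883429,
    mates_multi=1001435,
)
print(f"{rate * 100:.2f}% overall alignment rate")
```

Equivalently, the rate is 1 minus the 3814736 mates that aligned 0 times divided by the 2 x 16182999 mates in total.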

Extract notes from Google Slides using the Google Slides API in Python

This article follows the previous post ( https://omicsacademy.blogspot.com/2021/03/save-all-google-presentation-slides-as.html ). In this post, you are going to see how to extract the speaker notes from a Google Slides presentation using the Google Slides API in Python. The first part of the code is the same as in the previous post.

    import urllib.request
    import json
    import os.path
    from googleapiclient.discovery import build
    from google_auth_oauthlib.flow import InstalledAppFlow
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials

    # If modifying these scopes, delete the file token.json.
    SCOPES = ['https://www.googleapis.com/auth/presentations.readonly']

    # The ID of a sample presentation.
    PRESENTATION_ID = '1-aTBNXcSIqlMRzn-FHnRmRPbGlh5eY8MgZNaBwo15IM'

    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # ti
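The excerpt stops before the note-extraction step, so the following is only a minimal sketch of how that step could look, not the post's own code. It assumes the presentation has already been fetched as a dictionary (for example with `service.presentations().get(presentationId=PRESENTATION_ID).execute()`), and that each slide's notes page names its speaker-notes shape through `notesProperties.speakerNotesObjectId`. The function name `extract_speaker_notes` is hypothetical.

```python
def extract_speaker_notes(presentation):
    """Return one speaker-notes string per slide ('' when a slide has none)."""
    notes = []
    for slide in presentation.get("slides", []):
        notes_page = slide.get("slideProperties", {}).get("notesPage", {})
        # ID of the shape on the notes page that holds the speaker notes.
        notes_id = notes_page.get("notesProperties", {}).get("speakerNotesObjectId")
        text = ""
        if notes_id is not None:
            for element in notes_page.get("pageElements", []):
                if element.get("objectId") != notes_id:
                    continue
                text_elements = (
                    element.get("shape", {}).get("text", {}).get("textElements", [])
                )
                # Only textRun elements carry actual text content.
                text = "".join(
                    te["textRun"]["content"]
                    for te in text_elements
                    if "textRun" in te
                )
        notes.append(text.strip())
    return notes
```

Working on the plain dictionary keeps the parsing separate from authentication, so it can be tried on a saved JSON response without calling the API.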

Save all Google presentation slides as images using python

Prerequisites

Step 1: Turn on the Google Slides API
Go to the link here: https://developers.google.com/slides/quickstart/python
In the "Step 1: Turn on the Google Slides API" section, click the button to create a new Cloud Platform project and automatically enable the Google Slides API. In the resulting dialog, click DOWNLOAD CLIENT CONFIGURATION and save the file credentials.json to your working directory.

Step 2: Install the Google Client Library

    pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib

I have a Google Slides presentation. See the link here: https://docs.google.com/presentation/d/1-aTBNXcSIqlMRzn-FHnRmRPbGlh5eY8MgZNaBwo15IM/edit?usp=sharing
"1-aTBNXcSIqlMRzn-FHnRmRPbGlh5eY8MgZNaBwo15IM" is the presentation ID.

Python code:
Using the code above, you can export the slides to PNG images using Python.

Reference
https://developers.google.com/slides/quickstart/python
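The Python code itself does not appear in this excerpt. As a rough sketch of one way to do the export, the function below asks the Slides API for a PNG thumbnail of every slide. It assumes `service` is an authorized client built with `build('slides', 'v1', credentials=creds)`; the function name `slide_thumbnail_urls` is hypothetical, and this is not necessarily the approach the post used.

```python
def slide_thumbnail_urls(service, presentation_id, size="LARGE"):
    """Return (slide object ID, thumbnail URL) pairs for every slide."""
    presentation = (
        service.presentations().get(presentationId=presentation_id).execute()
    )
    urls = []
    for slide in presentation.get("slides", []):
        page_id = slide["objectId"]
        # Request a rendered PNG thumbnail of this slide.
        thumbnail = (
            service.presentations()
            .pages()
            .getThumbnail(
                presentationId=presentation_id,
                pageObjectId=page_id,
                thumbnailProperties_mimeType="PNG",
                thumbnailProperties_thumbnailSize=size,
            )
            .execute()
        )
        urls.append((page_id, thumbnail["contentUrl"]))
    return urls
```

Each returned URL could then be saved locally, for example with `urllib.request.urlretrieve(url, page_id + ".png")`.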

Miniconda installation problem: concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

We got the error message shown below:

    [/home/omicsacademy/miniconda3] >>>
    PREFIX=/home/omicsacademy/miniconda3
    Unpacking payload ...
    concurrent.futures.process._RemoteTraceback:
    '''
    Traceback (most recent call last):
      File "concurrent/futures/process.py", line 368, in _queue_management_worker
      File "multiprocessing/connection.py", line 251, in recv
    TypeError: __init__() missing 1 required positional argument: 'msg'
    '''
    The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
      File "entry_point.py", line 69, in <module>
      File "concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
      File "concurrent/futures/_base.py", line 611, in result_iterator
      File "concurrent/futures/_base.py", line 439, in result
      File "concurrent/futures/_base.py", line 388, in __get_result
    concurrent.futures.process.BrokenProc

How To Merge Two Fastq.Gz Files?

With gzip files, you can simply concatenate the files using `cat`. List the inputs in the same order for R1 and R2 so that the mates stay in sync:

    cat sample1_R1.gz sample2_R1.gz sample3_R1.gz > sample_merge_R1.gz
    cat sample1_R2.gz sample2_R2.gz sample3_R2.gz > sample_merge_R2.gz

You can also decompress and recompress the stream:

    zcat sample1_R1.gz sample2_R1.gz sample3_R1.gz | gzip - > sample_merge_R1.gz
    zcat sample1_R2.gz sample2_R2.gz sample3_R2.gz | gzip - > sample_merge_R2.gz
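Plain concatenation works because a gzip file may contain several compressed members back to back, and decompressors read them as one stream. The same merge can be sketched in Python; the function name `merge_gz` and the file names in the usage comment are made up for illustration.

```python
import gzip
import shutil


def merge_gz(inputs, output):
    """Concatenate gzip files byte for byte, like `cat a.gz b.gz > out.gz`."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                # Copy the compressed bytes as-is; nothing is decompressed.
                shutil.copyfileobj(src, out)


def read_gz_text(path):
    """Read a (possibly multi-member) gzip file back as text."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


# Example usage (hypothetical file names):
# merge_gz(["sample1_R1.gz", "sample2_R1.gz"], "sample_merge_R1.gz")
```

Since the bytes are copied without recompression, this is usually much faster than the decompress-and-recompress variant.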

Repost: Getting promoter, exon, intron, and UTR completely straight

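 • Promoter: a DNA sequence that RNA polymerase specifically recognizes and binds.
 • A promoter is neither an intron nor an exon; it is a noncoding sequence.
 • Noncoding RNA is one of the current research hotspots. The familiar miRNA, siRNA, and antisense RNA technology all fall within the scope of ncRNA. If you ask one step further, "where do these RNAs come from?", you get part of the answer: they are related to DNA sequences that appear to have nothing to do with protein coding. This DNA is collectively called junk DNA (or redundant DNA), and the RNA it encodes is ncRNA. RNAi is the most classic example of ncRNA function to date.
 • An exon is a sequence of DNA that is expressed (transcribed) into RNA and then often, but with many noteworthy exceptions[1], translated into protein. Adjacent exons may be separated by an intron, which is later removed from the RNA transcript via the splicing mechanism. (From Wikipedia)
 • UTR (Untranslated Regions): the noncoding segments at the two ends of a messenger RNA (mRNA) molecule.
 • The 5'-UTR extends from the methylated guanine nucleotide cap at the start of the mRNA to the AUG start codon; the 3'-UTR extends from the stop codon at the end of the coding region to the end of the poly-A tail.

Transcription is the transfer of genetic information from DNA to RNA: using one strand of the double-stranded DNA as a template and the four nucleoside triphosphates ATP, CTP, GTP, and UTP as substrates, RNA is synthesized under the catalysis of RNA polymerase.

Frequently asked questions:
Q: Does a promoter count as an intron or an exon in the DNA sequence?
A: Neither. It belongs to noncoding sequen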
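To make the UTR definition concrete, here is a toy sketch that splits a mature mRNA string into 5'-UTR, coding region, and 3'-UTR. It makes simplifying assumptions that are not from the post: translation starts at the first AUG, the first in-frame stop codon ends the coding region, and the stop codon is counted as part of the coding region. The function name `split_mrna` and the example sequence are made up.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA into (5'-UTR, coding region, 3'-UTR), or None if no ORF."""
    start = mrna.find("AUG")  # Assumption: the first AUG is the start codon.
    if start == -1:
        return None
    # Walk codon by codon, in frame with the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # Assumption: the stop codon belongs to the coding region.
            return mrna[:start], mrna[start:end], mrna[end:]
    return None


five_utr, cds, three_utr = split_mrna("GGCAUGAAAUAGCCAAAA")
print(five_utr, cds, three_utr)  # GGC AUGAAAUAG CCAAAA
```

Real transcripts need annotation rather than string search, since the first AUG is not always the one used for translation.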

The Proportion of Variability in Y Accounted for by the Linear Relationship Between X and Y

http://www.angelfire.com/wv/bwhomedir/notes/coefficient_of_determination.pdf
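The quantity named in the title is the coefficient of determination, r^2. As a small illustration (not taken from the linked notes), it can be computed for a least-squares line as 1 - SS_res / SS_tot; the function name `r_squared` and the example data are made up.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variability left unexplained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variability of Y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


print(round(r_squared([1, 2, 3, 4], [1, 3, 2, 4]), 2))  # 0.64
```

In this example the correlation between X and Y is 0.8, so 0.8^2 = 0.64 of the variability in Y is accounted for by the linear relationship.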