poltpipe.blogg.se - Fastqc illumina universal adapter

This is a bulk RNA-Seq project on human data. The first batch contained 3 extra samples that were not part of the analysis, so only 32 samples from the first batch were included. All 5 samples from the second batch were included, giving 37 samples in total for this analysis. Although we received the samples in 2 batches, they were actually sequenced in 7 batches, which raises the possibility of batch effects.

A metadata sheet was provided that labels the sample groups for differential expression (DE) analysis. Some samples were not given a group label (NA) and can be removed from the DE analysis. The table below shows the 37 samples used in the analysis; the color highlights mean:

1. Samples 18-240 and 19-59 were originally labelled as BAD_DATA by Pat's group.
2. Samples 18-141, 18-218 and 19-224 are poorly mapped by the STAR aligner and have high rRNA contamination (discussed in detail in QC and Mapping).
3. Sample 19-255 does not have status information for both genes SF3B1 and BAP1.

In total, the remaining 31 samples were considered for differential expression analysis.

The first batch of data was received on HTC in the folder `/bgfs/uchandran/shared/uchandran_psm25`. The second batch was downloaded via DNAnexus links provided in an Excel sheet. The analysis folder on HTC is `/bgfs/uchandran/Pat_Murphy`, and the raw data can be found in `/bgfs/uchandran/Pat_Murphy/Fastq/Raw`.

Since we planned to implement the GDC pipeline for RNA-Seq analysis in this project, all the reference files were downloaded from GDC. They were downloaded to `/bgfs/genomics/refs/refs/GDC_Refs/RNA-Seq`.
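
The sample exclusions described above (37 samples, minus 2 labelled BAD_DATA, 3 poorly mapped and 1 without gene status, leaving 31) can be sketched in Python as follows. This is only an illustration of the bookkeeping, not the project's actual code: the function name `filter_samples` and the `placeholder-NN` IDs are made up here, and the real sample IDs would come from the metadata sheet, which is not reproduced in this text.

```python
# Flagged sample IDs and reasons, taken from the list in the text above.
EXCLUDED = {
    "18-240": "labelled BAD_DATA",
    "19-59": "labelled BAD_DATA",
    "18-141": "poor STAR mapping, high rRNA contamination",
    "18-218": "poor STAR mapping, high rRNA contamination",
    "19-224": "poor STAR mapping, high rRNA contamination",
    "19-255": "no SF3B1/BAP1 status information",
}


def filter_samples(sample_ids):
    """Split sample IDs into those kept for DE analysis and those dropped.

    Hypothetical helper: returns (kept_ids, {dropped_id: reason}).
    """
    kept = [s for s in sample_ids if s not in EXCLUDED]
    dropped = {s: EXCLUDED[s] for s in sample_ids if s in EXCLUDED}
    return kept, dropped


# Placeholder sheet: the 6 flagged IDs plus 31 made-up IDs, 37 in total.
demo_ids = list(EXCLUDED) + [f"placeholder-{i:02d}" for i in range(1, 32)]
kept, dropped = filter_samples(demo_ids)
print(len(demo_ids), len(dropped), len(kept))
```

Keeping the reason next to each excluded ID makes it easy to report why a sample was dropped. Samples with an NA group label are not handled here, since the text does not say which samples those are.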