Inspecting raw data using FastQC

Download the raw data files from http://www.well.ox.ac.uk/bioinformatics/training/RNASeq_Data_Analysis/Monday_am/practical/rawdata
Download FastQC.
Run the software by double-clicking the "run_fastqc.bat" file on Windows or the "fastqc" file on other systems.
From the "File" menu, click "Open" and select the downloaded files.
Once loaded, inspect both files and compare the results.
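
FastQC can also be run without the graphical interface. The sketch below is not part of the original practical: it builds a FastQC command line from Python and only runs it if a "fastqc" executable is found on your PATH. The file names "sample_1.fastq" and "sample_2.fastq" and the output directory "fastqc_reports" are placeholders, not files supplied with this course.

```python
import shutil
import subprocess
from pathlib import Path


def build_fastqc_command(fastq_files, out_dir):
    """Return the argument list for a non-interactive FastQC run."""
    # "--outdir" tells FastQC where to write its HTML and zip reports.
    return ["fastqc", "--outdir", str(out_dir)] + [str(f) for f in fastq_files]


def run_fastqc(fastq_files, out_dir):
    """Run FastQC if it is installed; return True if it was run."""
    if shutil.which("fastqc") is None:
        print("fastqc not found on PATH; nothing was run")
        return False
    # FastQC does not create the output directory itself.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(fastq_files, out_dir), check=True)
    return True


if __name__ == "__main__":
    # Placeholder names: replace them with the files you downloaded.
    command = build_fastqc_command(["sample_1.fastq", "sample_2.fastq"], "fastqc_reports")
    print(" ".join(command))
```

Calling run_fastqc with the same arguments would write one HTML report per input file into the output directory, which you can open in a browser and compare side by side.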

Open both of the example dataset files provided and identify the attributes of good and bad data.

Which file contains better quality data?
For each plot, try to identify a cutoff or pattern that separates good data from bad.
Do you think the raw data metrics for RNA will differ from those for DNA? If so, which of them would you expect to change, and how?
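
To make the idea of a quality cutoff more concrete, the following sketch computes the mean Phred score at each read position, which is roughly the quantity summarised in a per-base quality plot. It is an illustration only and assumes four-line FASTQ records with Phred+33 quality encoding; the two reads in the example are made up.

```python
from collections import defaultdict


def mean_quality_per_position(fastq_lines, offset=33):
    """Return the mean Phred score at each read position (0-based list)."""
    totals = defaultdict(int)   # position -> sum of quality scores
    counts = defaultdict(int)   # position -> number of reads covering it
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:          # the fourth line of each record holds the qualities
            continue
        for pos, char in enumerate(line.rstrip("\n")):
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [totals[p] / counts[p] for p in sorted(counts)]


if __name__ == "__main__":
    # Two made-up reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
    example = [
        "@read1", "ACGT", "+", "IIII",
        "@read2", "ACGT", "+", "II##",
    ]
    print(mean_quality_per_position(example))  # [40.0, 40.0, 21.0, 21.0]
```

A drop in the mean towards the end of the reads, as in the last two positions here, is the kind of pattern worth looking for in the plots.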

Inspecting raw data using PrinSeq

Compare an Illumina dataset against a 454 dataset. Can you see the patterns that differentiate the sequencing technologies from each other?
Can you interpret the results of the "Tag Sequence Check" plot?
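
One property that often differs between platforms is the read length distribution: untrimmed Illumina reads from a run usually share a single length, whereas 454 read lengths vary. The sketch below is only a hint in that direction, not a description of how PrinSeq computes its statistics. It tallies read lengths from four-line FASTQ records, and both example datasets are made up.

```python
from collections import Counter


def read_length_distribution(fastq_lines):
    """Count how many reads have each length (4-line FASTQ records assumed)."""
    lengths = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:                      # second line of each record is the sequence
            lengths[len(line.strip())] += 1
    return lengths


if __name__ == "__main__":
    # Made-up records: one set with a fixed read length, one with varying lengths.
    fixed = ["@a", "ACGTACGT", "+", "IIIIIIII", "@b", "TTGACCAA", "+", "IIIIIIII"]
    variable = ["@c", "ACGTAC", "+", "IIIIII", "@d", "ACGTACGTACGT", "+", "IIIIIIIIIIII"]
    print(sorted(read_length_distribution(fixed).items()))     # [(8, 2)]
    print(sorted(read_length_distribution(variable).items()))  # [(6, 1), (12, 1)]
```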

Inspecting alignments using IGV

Download the alignment files from http://www.well.ox.ac.uk/bioinformatics/training/RNASeq_Data_Analysis/Monday_am/practical/alignment/
Go to the IGV download page.
Download the version that is appropriate for your operating system.
Open IGV by following the instructions provided on the download page.
Once opened, from the top left dropdown menu, choose "Human hg19".
Then, from the "File" menu, click "Load from file" and select the recently downloaded BAM files and click "Open".
Explore the tool by navigating to different regions of the genome and zooming in. You can also type the name of a gene into the text box in the toolbar to jump straight to it.
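
The loading steps above can also be scripted. As a sketch that is not part of the original practical, the function below writes an IGV batch script using the batch commands "new", "genome", "load" and "goto"; such a script can be run from the IGV "Tools" menu with "Run Batch Script". The BAM file names are placeholders for the files you downloaded.

```python
def igv_batch_script(bam_files, locus, genome="hg19"):
    """Return the text of an IGV batch script that loads BAM files and jumps to a locus."""
    lines = ["new", f"genome {genome}"]            # start a fresh session on the chosen genome
    lines += [f"load {bam}" for bam in bam_files]  # one "load" command per alignment file
    lines.append(f"goto {locus}")                  # a gene name or a range such as chr1:100-200
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder file names: replace them with the BAM files you downloaded.
    script = igv_batch_script(["sample_1.bam", "sample_2.bam"], "GNB1")
    print(script, end="")
```

Saving the printed text to a file gives a repeatable way to reopen the same view later.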

What do the colors of the reads mean?
Why are some reads connected by a line?
How could you quickly compare the expression levels of the two loaded samples in a particular region without scrolling to count the reads?
Find the gene "GNB1" and compare the expression levels in each sample. Can you see any differentially expressed regions?
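
If you want a number to go with the visual impression, reads overlapping a region can be counted from SAM-formatted text, for example the output of "samtools view" on one of the BAM files. The sketch below is a minimal illustration under that assumption, not a replacement for a proper quantification tool. The example records and the region are made up and do not correspond to GNB1.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_end(pos, cigar):
    """Return the last 1-based reference position spanned by an alignment."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def count_reads_in_region(sam_lines, chrom, start, end):
    """Count mapped reads whose alignment span overlaps chrom:start-end (1-based, inclusive)."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 4 or rname != chrom or cigar == "*":
            continue                      # unmapped read or different chromosome
        # Note: a spliced read counts even if only its intron gap (N) covers the region.
        if pos <= end and reference_end(pos, cigar) >= start:
            count += 1
    return count


if __name__ == "__main__":
    # Made-up SAM records: r1 spans 100-149, r2 spans 300-349, r3 is unmapped.
    sam = [
        "@HD\tVN:1.0",
        "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
        "r2\t0\tchr1\t300\t60\t50M\t*\t0\t0\t*\t*",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    ]
    print(count_reads_in_region(sam, "chr1", 120, 200))  # 1
```

Raw counts from two samples are only comparable after accounting for differences in sequencing depth.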