Version: 0.1
| Summary | Value |
|---|---|
| Samples | 12 |
| Genome | None |
| Genes | 124370106 |
| Detected genes | 15771 |
Poor-quality samples often have a smaller number of genes represented than the rest, or a disproportionately large number of reads mapping to only a subset of genes.
These curves show how many of the total reads map to the most-expressed genes for each sample. Samples that start at a high value, or follow a markedly different curve to the rest, are likely to be unreliable.
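As a rough illustration of what such a curve measures, the fraction of reads taken up by the N most-expressed genes can be computed per sample as sketched below. The sample names and counts are hypothetical, not taken from this report, and the function name `top_gene_fractions` is introduced only for this example.

```python
def top_gene_fractions(counts, top_ns):
    """Fraction of a sample's reads assigned to its N most-expressed genes."""
    ordered = sorted(counts, reverse=True)  # highest-count genes first
    total = sum(ordered)
    if total == 0:
        return {n: 0.0 for n in top_ns}
    return {n: sum(ordered[:n]) / total for n in top_ns}


# Hypothetical per-gene read counts for two samples (illustrative only).
samples = {
    "balanced": [120, 110, 100, 95, 90, 85, 80, 75, 70, 65],
    "dominated": [900, 20, 15, 12, 11, 10, 10, 9, 7, 6],
}

fractions = {
    name: top_gene_fractions(counts, top_ns=(1, 3, 10))
    for name, counts in samples.items()
}
for name, result in fractions.items():
    print(name, {n: round(v, 3) for n, v in result.items()})
```

In this toy data the "dominated" sample already has 90% of its reads in a single gene, which is the kind of curve that starts at a high value.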
These curves show, for each sample, how the number of detected genes changes as the counts-per-million threshold is varied.
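A minimal sketch of the underlying calculation follows. It assumes that a gene counts as detected when its counts-per-million (CPM) value is strictly above the threshold; the exact rule used for this report is not stated, and the counts below are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that the sample sums to one million."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c * 1_000_000 / total for c in counts]


def detected_genes(counts, thresholds):
    """Number of genes whose CPM is strictly above each threshold (assumed rule)."""
    cpm = counts_per_million(counts)
    return {t: sum(1 for v in cpm if v > t) for t in thresholds}


# Hypothetical raw counts for one sample; they sum to 100,000 reads.
sample_counts = [0, 1, 5, 50, 500, 4444, 95000]
detected = detected_genes(sample_counts, thresholds=(0, 10, 100, 1000))
print(detected)  # {0: 6, 10: 5, 100: 4, 1000: 3}
```

Plotting the returned values against the thresholds for each sample would give one curve per sample.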
For the following plots, the gene counts are normalised using a variance-stabilising technique.[1] In some projects with poor-quality data, the normalisation is less effective than is ideal. Ideally, the red trend line is close to horizontal. For projects where it is not, the rest of the QC may not be reliable.
Each cell is coloured depending on how many genes fall within it.
Inter-sample similarity can be used to confirm that different experimental groups form clusters in the data. It can also reveal unexpected features, such as batch effects, that may need to be adjusted for in downstream analysis.
Two methods of measuring this similarity are given here: Euclidean distance and Spearman’s rank correlation. The former is more sensitive to outlier samples but typically provides more detail as to clustering.
The heatmaps show sample-to-sample similarity scores. The dendrograms show how samples cluster in a binary tree.
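The two similarity measures can be sketched in a few lines. This is an illustrative implementation using only the Python standard library, not the code used to produce this report; the expression values are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two samples' expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical normalised expression values for three samples (five genes each).
s1 = [1.0, 2.0, 3.0, 4.0, 5.0]
s2 = [1.1, 2.1, 3.1, 4.1, 50.0]  # same gene ordering as s1, one extreme value
s3 = [5.0, 4.0, 3.0, 2.0, 1.0]   # gene ordering reversed relative to s1

print(euclidean_distance(s1, s2), spearman(s1, s2))
print(euclidean_distance(s1, s3), spearman(s1, s3))
```

In this toy data a single extreme value makes `s2` far from `s1` in Euclidean terms while leaving the rank correlation at 1, whereas `s3` is closer in distance but perfectly anti-correlated by rank.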
PCA aims to simplify the display of a high-dimensional data set in such a way as to make any clustering immediately apparent. Each of the most significant components is shown here as one of the axes in a plot. Often, only two components are significant and hence only a single plot is required.
The amount of the total variation accounted for by each component is shown.
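To illustrate how the share of variation per component can be obtained, the sketch below centres each gene across samples and extracts the leading eigenvalues of the sample-by-sample inner-product matrix by power iteration. This is a teaching example under stated assumptions (hypothetical data, standard library only, helper names invented here), not the routine used for this report.

```python
import math


def centre_columns(rows):
    """Subtract each gene's (column's) mean across samples (rows)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def gram_matrix(rows):
    """Sample-by-sample inner products; shares its nonzero eigenvalues
    with the gene-by-gene scatter matrix of the same centred data."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: largest eigenvalue and a matching unit eigenvector."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary start vector
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in product]
    product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    return sum(v * p for v, p in zip(vec, product)), vec  # Rayleigh quotient


def variance_explained(rows, n_components=2):
    """Fraction of total variation captured by each leading component."""
    gram = gram_matrix(centre_columns(rows))
    size = len(gram)
    total = sum(gram[i][i] for i in range(size))  # trace = total variation
    fractions = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(gram)
        fractions.append(value / total if total else 0.0)
        # Deflate: remove the component just found before finding the next.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(size)]
                for i in range(size)]
    return fractions


# Hypothetical data: four samples (rows) measured on two genes (columns).
expression = [[2.0, 5.0], [8.0, 5.0], [5.0, 4.0], [5.0, 6.0]]
explained = variance_explained(expression, n_components=2)
print([round(f, 3) for f in explained])
```

For this toy data the first component accounts for 90% of the variation and the second for the remaining 10%, so a single plot of the two would show everything there is to see.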
Ideally, each experimental group will form its own cluster on at least one of these plots.
Each sample is represented by its name.
Each grouping category is shown separately, if more than one is given. Each dot is one sample.
Broad Institute (2019). “Picard Toolkit.” GitHub repository. https://broadinstitute.github.io/picard/.
R Core Team (2021). “R: A language and environment for statistical computing.” R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Allaire J, Xie Y, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R (2022). “rmarkdown: Dynamic Documents for R” https://github.com/rstudio/rmarkdown
Galili T (2015). “dendextend: an R package for visualizing, adjusting, and comparing trees of hierarchical clustering.” Bioinformatics https://doi.org/10.1093/bioinformatics/btv428
Huber W, von Heydebreck A, Sueltmann H, Poustka A, Vingron M (2002). “Variance Stabilization Applied to Microarray Data Calibration and to the Quantification of Differential Expression.” Bioinformatics, 18 Suppl. 1, S96-S104.
Love MI, Huber W, Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biology, 15, 550. doi: 10.1186/s13059-014-0550-8
Morgan M, Obenchain V, Hester J, Pagès H (2023). “SummarizedExperiment: SummarizedExperiment container” https://doi.org/10.18129/B9.bioc.SummarizedExperiment
Wickham H (2007). “Reshaping Data with the reshape Package.” Journal of Statistical Software, 21(12), 1–20. https://www.jstatsoft.org/v21/i12/.
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Wickham H, Henry L, Pedersen T, Luciani T, Decorde M, Lise V (2023). “svglite: An ‘SVG’ Graphics Device” https://CRAN.R-project.org/package=svglite.
Xie Y, Allaire J, Grolemund G (2018). R Markdown: The Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9781138359338, https://bookdown.org/yihui/rmarkdown.
Xie Y (2019). “TinyTeX: A lightweight, cross-platform, and easy-to-maintain LaTeX distribution based on TeX Live.” TUGboat 40(1), 30-32. https://tug.org/TUGboat/Contents/contents40-1.html.
Xie Y, Dervieux C, Riederer E (2020). R Markdown Cookbook. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9780367563837, https://bookdown.org/yihui/rmarkdown-cookbook.
Xie Y (2024). “tinytex: Helper Functions to Install and Maintain TeX Live, and Compile LaTeX Documents” https://github.com/rstudio/tinytex.
[1] The details can be found in the DESeq2 documentation.