ContrApption allows users to easily create an interactive JavaScript widget that accepts user input via a searchable drop-down menu and produces boxplot visualizations of targets in RNA-Seq and similarly structured datasets. This permits easy, visual inspection of results from a single function call in R and, because ContrApption is written in JavaScript and runs in a browser, sharing of those results without hosting a Shiny server. For the most basic usage, ContrApption simply requires a dataset of numeric values (such as counts) and a table of sample annotations.
Note that the row names of the annotation data must match the column names of the numeric data. This reflects the requirements of the RNA-Seq experiments for which ContrApption was designed. An intuitive visualization of this data structure can be found on page 5 of the Beginner’s guide to using the DESeq2 package. However, nothing prevents users from other domains with data in the same structure from using the tool, as the inputs required are simply strings and numbers.
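The matching requirement above can be checked before building a widget. The following is a minimal base R sketch on made-up data; the object names (toyCounts, toyAnno) are hypothetical and not part of ContrApption.

```r
# Toy numeric data: targets as rows, samples as columns (values are invented)
toyCounts <- data.frame(
  s1 = c(10, 0),
  s2 = c(12, 3),
  s3 = c(30, 1),
  row.names = c("geneA", "geneB")
)

# Toy annotation: samples as rows, deliberately in a different order
toyAnno <- data.frame(
  group = c("A", "B", "A"),
  row.names = c("s1", "s3", "s2")
)

# Every sample must appear in both objects
sameSamples <- setequal(colnames(toyCounts), rownames(toyAnno))

# Reorder the columns of the numeric data to follow the annotation rows
toyCounts <- toyCounts[, rownames(toyAnno)]
sameOrder <- identical(colnames(toyCounts), rownames(toyAnno))

sameSamples
sameOrder
```

Checking set equality first and reordering second keeps the two failure modes (missing samples versus shuffled samples) easy to tell apart.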
For more complex usage, the user may create a shared data object from the counts dataframe so it can be visualized in ContrApption and in the DT datatables widget simultaneously. Selections in one widget will update the other. Furthermore, the user may provide differential expression data and counts data so that selections in a table of differential expression results can be plotted live in ContrApption as they are explored.
In this vignette we will walk through simple and complex usage of ContrApption with the pasilla dataset used in the DESeq2 vignette.
We load the required libraries:
library(ContrApption)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.1 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# libraries for optional integration
library(DT)
library(crosstalk)
suppressMessages(
{
library(DESeq2)
library(edgeR)
library(limma)
library(pasilla)
}
)
## Warning: package 'GenomeInfoDb' was built under R version 4.0.5
load("ContrApptionQuickStart.RData")
ContrApption provides functions for several styles of boxplot-based widgets for the interactive exploration of RNA-Seq or similarly structured data. The three primary modes are 1) boxplot, 2) boxplot and counts table, and 3) boxplot and differential expression table. We provide the createContrApption function, which takes counts, annotation, and optional differential expression results. Based on the inputs and their SharedData status, ContrApption will infer the mode and produce widgets appropriately. Additionally, we expose the core ContrApption function for users who want to further customize the way ContrApption interacts with the DT widget.
The most basic form of ContrApption produces a standalone boxplot from (non-shared) counts and annotations.
# the basic usage of ContrApption
createContrApption(counts = normCountsSig, annotation = coldata)
Substituting a crosstalk::SharedData object in place of the counts data.frame will produce a paired widget that displays the counts in an interactive table as well as a ContrApption widget.
createContrApption(counts = normCountsSigShared, annotation = coldata)
Passing a differential expression table (in the form of a crosstalk::SharedData object) with basic, non-shared counts will produce a paired widget that displays the differential expression results in an interactive table in addition to the boxplot.
createContrApption(deResults = resShared, counts = normCountsSig, annotation = coldata)
The ContrApption (“Contrast App”) vignette makes use of DESeq2 and the pasilla dataset to demonstrate the intended use of ContrApption. The following is a modified version of the “Count matrix input” section of the DESeq2 vignette.
The following snippet retrieves the column metadata for the Pasilla experiment.
# loads sample annotation from the pasilla package
pasAnno <- system.file("extdata", "pasilla_sample_annotation.csv",
package = "pasilla", mustWork = TRUE)
# read in the sample data
coldata <- read.csv(pasAnno, row.names = 1)
# select relevant columns
coldata <- data.frame(coldata[ , c("condition","type")])
# remove un-needed characters
rownames(coldata) <- gsub("fb", "", rownames(coldata))
# visualize our metadata
coldata
## condition type
## treated1 treated single-read
## treated2 treated paired-end
## treated3 treated paired-end
## untreated1 untreated single-read
## untreated2 untreated single-read
## untreated3 untreated paired-end
## untreated4 untreated paired-end
This data frame will be our annotation input. Here the rownames are the sample names of the experiment.
The Pasilla dataset contains per-exon and per-gene read counts of RNA-seq samples, which will be our data input. The exons/genes are the rows, and the samples in the experiment are the columns. The following code extracts the counts from the package.
# loads counts file from the pasilla package
pasCts <- system.file("extdata", "pasilla_gene_counts.tsv",
package = "pasilla", mustWork = TRUE)
# reads counts of genes
cts <- read.csv(pasCts, sep = "\t", row.names = "gene_id")
# make sure the order is correct
cts <- cts[, rownames(coldata)]
# examine the counts
head(cts)
## treated1 treated2 treated3 untreated1 untreated2 untreated3
## FBgn0000003 0 0 1 0 0 0
## FBgn0000008 140 88 70 92 161 76
## FBgn0000014 4 0 0 5 1 0
## FBgn0000015 1 0 0 0 2 1
## FBgn0000017 6205 3072 3334 4664 8714 3564
## FBgn0000018 722 299 308 583 761 245
## untreated4
## FBgn0000003 0
## FBgn0000008 70
## FBgn0000014 0
## FBgn0000015 2
## FBgn0000017 3150
## FBgn0000018 310
We now have all we need to run DESeq2 and thus to run ContrApption.
We can create a DESeq2 dataset, run DESeq2 on it, and then extract the normalized counts for visualization.
# create a DESeq2 dataset from the metadata and counts
dds <- DESeq(DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition))
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
# extract the normalized counts
normCounts <- counts(dds, normalized = TRUE)
head(normCounts)
## treated1 treated2 treated3 untreated1 untreated2
## FBgn0000003 0.0000000 0.0000 1.200981 0.000000 0.0000000
## FBgn0000008 85.5968034 115.5963 84.068670 80.824908 89.7936241
## FBgn0000014 2.4456230 0.0000 0.000000 4.392658 0.5577244
## FBgn0000015 0.6114057 0.0000 0.000000 0.000000 1.1154487
## FBgn0000017 3793.7726082 4035.3632 4004.070676 4097.471407 4860.0101913
## FBgn0000018 441.4349433 392.7648 369.902150 512.183926 424.4282483
## untreated3 untreated4
## FBgn0000003 0.000000 0.000000
## FBgn0000008 117.004615 93.123591
## FBgn0000014 0.000000 0.000000
## FBgn0000015 1.539534 2.660674
## FBgn0000017 5486.900612 4190.561607
## FBgn0000018 377.185929 412.404476
Given that embedding multiple widgets with 14599 rows of data may lead to performance issues, the user may want to focus on genes of interest as determined by the differential expression experiment:
# extract significant results
res <- results(dds, tidy = TRUE) %>%
data.frame %>%
filter(padj < 0.05)
# get the counts for significant transcripts
normCountsSig <- normCounts %>%
data.frame %>%
filter(rownames(.) %in% res$row) %>%
format(digits = 2)
normCountsSig %>% nrow
## [1] 845
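If the significant set is still larger than is comfortable for an embedded widget, one option is to cap it at the targets with the smallest adjusted p-values. The following is a minimal base R sketch on a made-up results table; it assumes a tidy layout with row and padj columns as above, and topN is a hypothetical cap chosen for illustration.

```r
# Toy stand-in for a tidy results table (values are invented)
toyRes <- data.frame(
  row  = c("geneA", "geneB", "geneC", "geneD"),
  padj = c(0.04, 0.001, NA, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical cap on the number of targets to keep
topN <- 2

# Keep significant rows, dropping targets whose adjusted p-value is NA
sig <- toyRes[!is.na(toyRes$padj) & toyRes$padj < 0.05, ]

# Order by adjusted p-value and keep at most topN target names
sig <- sig[order(sig$padj), ]
keep <- head(sig$row, topN)
keep
```

The resulting vector of names can then be used to subset the rows of the normalized counts in the same way as res$row.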
When used without crosstalk, ContrApption will look for rownames to determine the list of targets. If there is an existing column the user would like to use instead, they can change this behavior by specifying a targetCol argument. Similarly, the sample IDs are assumed to be the rownames of the annotation data, but the user may specify a column using the sampleCol argument.
We’ll demonstrate the most straightforward use cases of ContrApption: visualizing count data and using pre-defined functions to leverage crosstalk compatibility.
The most basic usage of ContrApption is to pass it counts and annotations. This results in a widget with a searchable drop-down menu of experimental features and boxplots of the input data grouped as directed.
# the basic usage of ContrApption
createContrApption(counts = normCountsSig, annotation = coldata)
Users may customize the title and y-axis label of the plot by setting the plotName and yAxisName arguments. The showLegend argument turns on the plot legend.
createContrApption(
counts = normCountsSig,
annotation = coldata,
plotName = "DESeq2 normalized counts by sample types",
yAxisName = "norm. counts",
showLegend = TRUE
)
ContrApption operates on experimental factors with an arbitrary number of levels. We can leverage this to explore the interaction of multiple factors of interest by merging the columns of the annotation data.
# derive a new variable - every combination of condition and type
coldata$interaction <- paste0(coldata$condition, ":", coldata$type)
head(coldata)
## condition type interaction
## treated1 treated single-read treated:single-read
## treated2 treated paired-end treated:paired-end
## treated3 treated paired-end treated:paired-end
## untreated1 untreated single-read untreated:single-read
## untreated2 untreated single-read untreated:single-read
## untreated3 untreated paired-end untreated:paired-end
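Before plotting a derived factor, it can help to see how many samples fall into each combined level, since a level with a single sample cannot show any spread. The following base R sketch rebuilds the annotation table printed earlier as a standalone object (coldataToy and levelCounts are hypothetical names) and tabulates the new column.

```r
# Standalone copy of the sample annotation shown earlier in this vignette
coldataToy <- data.frame(
  condition = c("treated", "treated", "treated",
                "untreated", "untreated", "untreated", "untreated"),
  type = c("single-read", "paired-end", "paired-end",
           "single-read", "single-read", "paired-end", "paired-end"),
  row.names = c("treated1", "treated2", "treated3",
                "untreated1", "untreated2", "untreated3", "untreated4"),
  stringsAsFactors = FALSE
)

# Same derivation as above: every combination of condition and type
coldataToy$interaction <- paste0(coldataToy$condition, ":", coldataToy$type)

# Number of samples in each combined level
levelCounts <- table(coldataToy$interaction)
levelCounts
```

With this dataset the combined factor has four levels, and the treated:single-read level contains only one sample.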
We can then use the function normally, and we’ll see an extra option in the dropdown menu that allows us to explore our new variable.
# pass our new variable
createContrApption(counts = normCountsSig, annotation = coldata)
Note that users interested in this sort of experiment would generally re-run DESeq2 with a different experimental design:
>DESeq(DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition * type))
ContrApption is crosstalk compatible for use with DT. Users may leverage this by creating a shared data object and passing it to the datatables and ContrApption functions to produce interactive widgets that react to inputs in one another. Selecting a row in the datatable will result in a call to visualize the selected row in ContrApption, and selecting a target in ContrApption will filter the datatable to that row. ContrApption will not change an object once it becomes a shared data object, and shared data objects do not have rownames, so when using crosstalk, create the “gene” column before creating the shared data object.
# create a column for our targets
normCountsSig$gene <- rownames(normCountsSig)
# create a shared data object
normCountsSigShared <- SharedData$new(normCountsSig)
We’re now ready to use the crosstalk-enabled functionality of ContrApption.
The createContrApption function will create a side-by-side pair of widgets: one showing the counts and one showing the ContrApption visualization. To use it in this manner, simply pass a shared counts object to the counts argument and the column data to annotation.
createContrApption(counts = normCountsSigShared, annotation = coldata)
Users may also be interested in viewing the results of tests for differential expression with the visualizations provided. Note again that the gene column (or other target-name column) must be created before passing the data to SharedData$new.
normCountsSig <- normCounts %>%
data.frame %>%
filter(rownames(.) %in% res$row)
normCountsSig$gene <- rownames(normCountsSig)
# get the significant targets from the DESeq2 experiment and format them
res <- results(dds) %>%
data.frame %>%
dplyr::filter(padj < 0.05) %>%
format(digits = 2)
resShared <- SharedData$new(res)
The createContrApption function can also be used with differential expression results. To use the function in this manner, pass the differential expression results table as a shared object to the deResults argument, the non-shared counts to counts, and the “coldata” to annotation.
createContrApption(deResults = resShared, counts = normCountsSig, annotation = coldata)
This section of the vignette demonstrates the methods used in the ContrApption internals to leverage datatables, for users interested in creating customized widgets.
Users interested in customizing the paired widgets are encouraged to visit the documentation for DT and DataTables, the JavaScript library it wraps, to learn of the numerous options it provides and how they can be passed to crosstalk for use with ContrApption. A full tutorial on customizing the table is beyond the scope of this document; however, users should be aware of the options available to them through DT and how those are passed to ContrApption.
This block demonstrates how the different components of the createContrApption function work. A customized datatables widget is made and passed to crosstalk’s bscols function along with a ContrApption widget to create a pair.
# first up, a fairly extensively customized datatables object
dtCountsWidget <- datatable(
normCountsSigShared, # Our shared data
extensions = "Scroller", # Various tweaks to the table described in the DT docs
style = "bootstrap",
class = "compact",
width = "100%",
selection = 'single', # ContrApption works on one transcript at a time, so we only need one
# This is a list of options that are passed to the DataTables, described in the DataTables docs
options = list(
deferRender = TRUE,
scroller = TRUE,
scrollY = 300,
sScrollX = "100%",
pagingType = "simple",
# This option allows the user to pass raw, arbitrary JavaScript to be executed when the table
# is rendered. proceed with caution.
initComplete = JS(
"function(settings, json) {",
"$(this.api().table().header()).css({'font-size': '75%'});",
"}"
)
)
# This is a convenience function to add style tweaks to the table. In this case,
# the fontsize of columns is changed on columns listed by number (all of them here)
) %>% formatStyle(columns = seq(1, ncol(normCountsSigShared$origData())), fontSize = '75%')
# ContrApption, used in a fairly basic way, scaled down 50%
cntrCountsWidget <- ContrApption(
data = normCountsSigShared,
annotation = coldata,
scaleWidth = 0.5 # makes room for other widgets
)
# creates a side-by-side pair of widgets using bscols
createContrApptionsCounts <- bscols(dtCountsWidget, cntrCountsWidget)
createContrApptionsCounts
This example shows the same as above, with minor modifications to incorporate differential expression results. Most notably, the shared object is created from the differential expression results, and counts are passed as supplemental information using the countsData argument. This can be achieved by creating a shared data object from the results table and passing "diff-expression" to the mode argument of ContrApption. This disables the dropdown for the target column in ContrApption, as it is no longer needed when the user has a searchable table.
# create a datatables widget
dtDEwidget <- datatable(
resShared,
extensions = "Scroller",
style = "bootstrap",
class = "compact",
width = "100%",
selection = 'single',
options = list(
deferRender = TRUE,
scroller = TRUE,
scrollY = 300,
sScrollX = "100%",
pagingType = "simple",
initComplete = JS(
"function(settings, json) {",
"$(this.api().table().header()).css({'font-size': '75%'});",
"}"
)
)
) %>% formatStyle(columns = seq(1, ncol(resShared$origData())), fontSize = '75%')
# create a ContrApption widget
cDEwidget <- ContrApption(
data = resShared,
countsData = normCountsSig,
annotation = coldata,
mode = "diff-expression",
scaleWidth = 0.5
)
createContrApptionsDE <- bscols(dtDEwidget, cDEwidget)
createContrApptionsDE
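The two datatables definitions above differ only in the data they receive, so the shared settings can be collected in a small wrapper. The following is a sketch only: makeCompactTable is a hypothetical helper name, not a function from ContrApption or DT, and it reuses exactly the DT options shown above. It assumes the DT package is installed.

```r
library(DT)

# Hypothetical helper: builds a compact, scrollable table with the options used above.
# Accepts either a plain data frame or a crosstalk SharedData object.
makeCompactTable <- function(x) {
  # SharedData objects expose their underlying data frame through origData()
  df <- if (inherits(x, "SharedData")) x$origData() else x
  tbl <- datatable(
    x,
    extensions = "Scroller",
    style = "bootstrap",
    class = "compact",
    width = "100%",
    selection = "single",
    options = list(
      deferRender = TRUE,
      scroller = TRUE,
      scrollY = 300,
      sScrollX = "100%",
      pagingType = "simple",
      # shrink the header font once the table has been rendered
      initComplete = JS(
        "function(settings, json) {",
        "$(this.api().table().header()).css({'font-size': '75%'});",
        "}"
      )
    )
  )
  # shrink the font of every data column to match the header
  formatStyle(tbl, columns = seq_len(ncol(df)), fontSize = "75%")
}

# Usage on a small, made-up data frame
toy <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
toyTable <- makeCompactTable(toy)
class(toyTable)
```

A call such as makeCompactTable(resShared) could then stand in for the longer definition, leaving only the ContrApption call and bscols to write out for each pair.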
This section demonstrates additional examples in slightly different contexts and the use of accessory arguments.
This section shows a basic limma-voom differential expression analysis that ends in exploration of the results using ContrApption.
We conduct a basic limma-voom analysis:
# the experimental design matrix
design <- model.matrix(~1 + condition, coldata)
# edgeR differential expression list
dge <- DGEList(counts = as.matrix(cts))
# create an index of sufficiently expressed genes to keep
toKeep <- filterByExpr(dge, design)
# filter out non-expressed
dge <- dge[toKeep, keep.lib.sizes = FALSE]
# calculate normalization factors
dge <- calcNormFactors(dge)
# add voom adjustment
voomObj <- voom(dge, design, plot = FALSE)
# fit the model
fit <- lmFit(voomObj, design)
# calculate test statistics
fit <- eBayes(fit)
# return the results
resLV <- topTable(fit, n = Inf, coef = NULL) %>% filter(adj.P.Val < 0.05) %>%
format(digits = 2)
## Removing intercept from test coefficients
# "normalized counts" https://support.bioconductor.org/p/103747/#103769
normCountsLV <- cpm(dge, log = TRUE, prior.count = 3) %>%
format(digits = 2)
normCountsSigLV <- normCountsLV %>%
data.frame %>%
dplyr::filter(rownames(.) %in% rownames(resLV))
We pass the results to ContrApption:
ContrApption(data = normCountsSigLV, annotation = coldata)
An example with the interactive counts, customized explicitly:
normCountsSigLV$gene <- rownames(normCountsSigLV)
normCountsSigLVShared <- SharedData$new(normCountsSigLV)
dtDEwidgetLV <- datatable(
normCountsSigLVShared,
extensions = "Scroller",
style = "bootstrap",
class = "compact",
width = "100%",
selection = 'single',
options = list(
deferRender = TRUE,
scroller = TRUE,
scrollY = 300,
sScrollX = "100%",
pagingType = "simple",
initComplete = JS(
"function(settings, json) {",
"$(this.api().table().header()).css({'font-size': '75%'});",
"}"
)
)
) %>% formatStyle(columns = seq(1, ncol(normCountsSigLVShared$origData())), fontSize = '75%')
cntrDEwidgetLV <- ContrApption(
data = normCountsSigLVShared, # new shared results
annotation = coldata,
scaleWidth = 0.5
)
createContrApptionsDELV <- bscols(dtDEwidgetLV, cntrDEwidgetLV)
createContrApptionsDELV
An example with the interactive differential expression, customized explicitly:
resLVShared <- SharedData$new(resLV)
dDiffExWidgetLV <- datatable(
resLVShared,
extensions = "Scroller",
style = "bootstrap",
class = "compact",
width = "100%",
selection = 'single',
options = list(
deferRender = TRUE,
scroller = TRUE,
scrollY = 300,
sScrollX = "100%",
pagingType = "simple",
initComplete = JS(
"function(settings, json) {",
"$(this.api().table().header()).css({'font-size': '75%'});",
"}"
)
)
) %>% formatStyle(columns = seq(1, ncol(resLVShared$origData())), fontSize = '75%')
cntrDiffExWidgetLV <- ContrApption(
data = resLVShared,
countsData = normCountsSigLV,
annotation = coldata,
mode = "diff-expression",
scaleWidth = 0.5
)
createContrApptionsDELV <- bscols(dDiffExWidgetLV, cntrDiffExWidgetLV)
createContrApptionsDELV
targetCol
Using a user-specified column instead of the implicit rownames.
# remove gene column
normCountsSig$gene <- NULL
# make our own
normCountsSig$ArbitraryString <- rownames(normCountsSig)
# stash rownames so we can add them back for next example
rowNames <- rownames(normCountsSig)
rownames(normCountsSig) <- NULL
createContrApption(counts = normCountsSig, annotation = coldata, targetCol = "ArbitraryString")
sampleCol
Using a user-specified column instead of the implicit rownames.
# restore rownames
rownames(normCountsSig) <- rowNames
# remove the custom target column from the previous example
normCountsSig$ArbitraryString <- NULL
# make a column in the annotation matrix
coldata$MyCustomID <- rownames(coldata)
# remove rownames
rownames(coldata) <- NULL
createContrApption(counts = normCountsSig, annotation = coldata, sampleCol = "MyCustomID")
Given the simple inputs used by ContrApption, it can be used for any dataset that shares a structure with RNA-Seq experiments.
arbData <- data.frame(one = c(1,5,3), two = c(1,5,3), three = c(5,1,3), four = c(5,1,3))
arbData
## one two three four
## 1 1 1 5 5
## 2 5 5 1 1
## 3 3 3 3 3
arbSamples <- data.frame(sampleNames = c("one", "two", "three", "four"), group = c("A", "A", "B", "B"))
arbSamples
## sampleNames group
## 1 one A
## 2 two A
## 3 three B
## 4 four B
createContrApption(counts = arbData, annotation = arbSamples, sampleCol = "sampleNames")
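Data from other domains often arrives in long format, with one row per measurement, rather than in the targets-by-samples layout used above. The following base R sketch shows one way such a table might be reshaped; the column names (target, sample, value) and the values are invented for illustration.

```r
# Toy long-format data: one row per (target, sample) measurement
longData <- data.frame(
  target = rep(c("t1", "t2"), each = 4),
  sample = rep(c("one", "two", "three", "four"), times = 2),
  value  = c(1, 1, 5, 5, 5, 5, 1, 1),
  stringsAsFactors = FALSE
)

# Pivot to a targets-by-samples matrix; each cell holds exactly one value here,
# so summing simply carries that value across
wide <- tapply(longData$value, list(longData$target, longData$sample), FUN = sum)
wide <- as.data.frame(wide)

# tapply orders columns alphabetically, so restore the intended sample order
wide <- wide[, c("one", "two", "three", "four")]
wide
```

The result has targets as rownames and samples as columns, which is the same shape as arbData above.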
sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] pasilla_1.18.1 edgeR_3.32.1
## [3] limma_3.46.0 DESeq2_1.30.1
## [5] SummarizedExperiment_1.20.0 Biobase_2.50.0
## [7] MatrixGenerics_1.2.1 matrixStats_0.58.0
## [9] GenomicRanges_1.42.0 GenomeInfoDb_1.26.7
## [11] IRanges_2.24.1 S4Vectors_0.28.1
## [13] BiocGenerics_0.36.0 crosstalk_1.1.1
## [15] DT_0.18 forcats_0.5.1
## [17] stringr_1.4.0 dplyr_1.0.5
## [19] purrr_0.3.4 readr_1.4.0
## [21] tidyr_1.1.3 tibble_3.1.1
## [23] ggplot2_3.3.3 tidyverse_1.3.0
## [25] ContrApption_0.0.4.9000
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-0 ellipsis_0.3.1 XVector_0.30.0
## [4] fs_1.5.0 rstudioapi_0.13 bit64_4.0.5
## [7] AnnotationDbi_1.52.0 fansi_0.4.2 lubridate_1.7.10
## [10] xml2_1.3.2 splines_4.0.4 cachem_1.0.4
## [13] geneplotter_1.68.0 knitr_1.32 jsonlite_1.7.2
## [16] broom_0.7.6 annotate_1.68.0 dbplyr_2.1.1
## [19] shiny_1.6.0 compiler_4.0.4 httr_1.4.2
## [22] backports_1.2.1 assertthat_0.2.1 Matrix_1.3-2
## [25] fastmap_1.1.0 cli_2.4.0 later_1.1.0.1
## [28] htmltools_0.5.1.1 tools_4.0.4 gtable_0.3.0
## [31] glue_1.4.2 GenomeInfoDbData_1.2.4 Rcpp_1.0.6
## [34] cellranger_1.1.0 jquerylib_0.1.3 vctrs_0.3.7
## [37] xfun_0.22 rvest_1.0.0 mime_0.10
## [40] lifecycle_1.0.0 XML_3.99-0.6 zlibbioc_1.36.0
## [43] scales_1.1.1 hms_1.0.0 promises_1.2.0.1
## [46] RColorBrewer_1.1-2 yaml_2.2.1 memoise_2.0.0
## [49] sass_0.3.1 stringi_1.5.3 RSQLite_2.2.6
## [52] genefilter_1.72.1 BiocParallel_1.24.1 rlang_0.4.10
## [55] pkgconfig_2.0.3 bitops_1.0-6 evaluate_0.14
## [58] lattice_0.20-41 htmlwidgets_1.5.3 bit_4.0.4
## [61] tidyselect_1.1.0 magrittr_2.0.1 R6_2.5.0
## [64] generics_0.1.0 DelayedArray_0.16.3 DBI_1.1.1
## [67] pillar_1.6.0 haven_2.4.0 withr_2.4.1
## [70] survival_3.2-10 RCurl_1.98-1.3 modelr_0.1.8
## [73] crayon_1.4.1 utf8_1.2.1 rmarkdown_2.7
## [76] locfit_1.5-9.4 grid_4.0.4 readxl_1.3.1
## [79] blob_1.2.1 reprex_2.0.0 digest_0.6.27
## [82] xtable_1.8-4 httpuv_1.5.5 munsell_0.5.0
## [85] bslib_0.2.4