ContrApption allows users to easily create an interactive JavaScript widget that accepts user input via a searchable drop-down menu and produces boxplot visualizations of targets in RNA-Seq and similarly structured datasets. This permits easy, visual inspection of results from a single function call in R and, because ContrApption is written in JavaScript and runs in a browser, sharing of those results without hosting a Shiny server. For the most basic usage, ContrApption simply requires:

  • A data.frame of numeric data to visualize where the columns are the samples in the experiment (“numeric data”).
  • A data.frame of metadata describing the experimental features and denoting which samples belong to which experimental conditions (“annotation data”).

Note that the row names of the annotation data must match the column names of the numeric data. This reflects the requirements of the RNA-Seq experiments for which ContrApption was designed. An intuitive visualization of this data structure can be found on page 5 of the Beginner’s guide to using the DESeq2 package. However, there is no reason users from other domains with data in the same structure cannot use the tool, as the inputs required are simply strings and numbers.
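
The matching requirement above can be checked before any widget is built. The following is a minimal sketch in base R, not part of ContrApption: the toy inputs and the checkInputs helper are hypothetical names introduced here for illustration only.

```r
# Toy inputs (hypothetical): three targets measured in four samples
numericData <- data.frame(
  s1 = c(10, 0, 5), s2 = c(12, 1, 4), s3 = c(30, 0, 6), s4 = c(28, 2, 5),
  row.names = c("geneA", "geneB", "geneC")
)
annotationData <- data.frame(
  condition = c("treated", "treated", "untreated", "untreated"),
  row.names = c("s1", "s2", "s3", "s4")
)

# checkInputs is a hypothetical helper, not a ContrApption function
checkInputs <- function(numeric, annotation) {
  # samples present in one input but not the other
  missingAnno <- setdiff(colnames(numeric), rownames(annotation))
  missingData <- setdiff(rownames(annotation), colnames(numeric))
  if (length(missingAnno) > 0 || length(missingData) > 0) {
    stop("Sample names differ between inputs: ",
         paste(c(missingAnno, missingData), collapse = ", "))
  }
  # reorder the numeric columns to follow the annotation rows
  numeric[, rownames(annotation), drop = FALSE]
}

aligned <- checkInputs(numericData, annotationData)
```

The helper stops with an error when the two sets of sample names differ, and otherwise returns the numeric data with its columns in the same order as the annotation rows.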

For more complex usage, the user may create a shared data object from the counts dataframe so it can be visualized in ContrApption and in the DT datatables widget simultaneously. Selections in one widget will update the other. Furthermore, the user may provide differential expression data and counts data so that selections in a table of differential expression results can be plotted live in ContrApption as they are explored.

In this vignette we will walk through simple and complex usage of ContrApption with the pasilla dataset used in the DESeq2 vignette.

Vignette dependencies

We load the required libraries:

library(ContrApption)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.1     ✓ dplyr   1.0.5
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# libraries for optional integration
library(DT)
library(crosstalk)

suppressMessages(
  {
    library(DESeq2)
    library(edgeR)
    library(limma)
    library(pasilla)
  }
)
## Warning: package 'GenomeInfoDb' was built under R version 4.0.5

Quick start

load("ContrApptionQuickStart.RData")

ContrApption provides functions for several styles of boxplot-based widgets for the interactive exploration of RNA-Seq or similarly structured data. The three primary modes are 1) boxplot, 2) boxplot and counts table, and 3) boxplot and differential expression table. We provide the createContrApption function, which takes counts, annotation, and optional differential expression results. Based on the inputs and their SharedData status, ContrApption will infer the mode and produce the appropriate widgets. Additionally, we expose the core ContrApption function for users who want to further customize the way ContrApption interacts with the DT widget.

Counts boxplot

The most basic form of ContrApption produces a standalone boxplot from (non-shared) counts and annotations.

# the basic usage of ContrApption
createContrApption(counts = normCountsSig, annotation = coldata)

Counts table and boxplot

Substituting a crosstalk::SharedData object in place of the counts data.frame will produce a paired widget that displays the counts in an interactive table as well as a ContrApption widget.

createContrApption(counts = normCountsSigShared, annotation = coldata)

Differential expression table & boxplot

Passing a differential expression table (in the form of a crosstalk::SharedData object) with basic, non-shared counts will produce a paired widget that displays the differential expression results in an interactive table in addition to the boxplot.

createContrApption(deResults = resShared, counts = normCountsSig, annotation = coldata)

Dataset

The ContrApption (“Contrast App”) vignette makes use of DESeq2 and the pasilla dataset to demonstrate the intended use of ContrApption. The following is a modified version of the “Count matrix input” section of the DESeq2 vignette.

Annotation data and groups

The following snippet retrieves the column metadata for the Pasilla experiment.

# loads sample annotation from the pasilla package
pasAnno <- system.file("extdata", "pasilla_sample_annotation.csv",
                       package = "pasilla", mustWork = TRUE)

# read in the sample data
coldata <- read.csv(pasAnno, row.names = 1)

# select relevant columns
coldata <- data.frame(coldata[ , c("condition","type")])

# remove un-needed characters
rownames(coldata) <- gsub("fb", "", rownames(coldata))

# visualize our metadata
coldata
##            condition        type
## treated1     treated single-read
## treated2     treated  paired-end
## treated3     treated  paired-end
## untreated1 untreated single-read
## untreated2 untreated single-read
## untreated3 untreated  paired-end
## untreated4 untreated  paired-end

This data.frame will be our annotation input. Here the row names are the sample names of the experiment.

Numeric data

The Pasilla dataset contains per-exon and per-gene read counts of RNA-seq samples which will be our data input. The exons/genes are the rows, and the samples in the experiment are the columns. The following code extracts the counts from the package.

# loads counts file from the pasilla package
pasCts <- system.file("extdata", "pasilla_gene_counts.tsv",
                      package = "pasilla", mustWork = TRUE)

# reads counts of genes
cts <- read.csv(pasCts, sep = "\t", row.names = "gene_id")

# make sure the order is correct
cts <- cts[, rownames(coldata)]

# examine the counts
head(cts)
##             treated1 treated2 treated3 untreated1 untreated2 untreated3
## FBgn0000003        0        0        1          0          0          0
## FBgn0000008      140       88       70         92        161         76
## FBgn0000014        4        0        0          5          1          0
## FBgn0000015        1        0        0          0          2          1
## FBgn0000017     6205     3072     3334       4664       8714       3564
## FBgn0000018      722      299      308        583        761        245
##             untreated4
## FBgn0000003          0
## FBgn0000008         70
## FBgn0000014          0
## FBgn0000015          2
## FBgn0000017       3150
## FBgn0000018        310

We now have all we need to run DESeq2 and thus to run ContrApption.

DESeq2 normalized counts example

We can create a DESeq2 dataset, run DESeq2 on it, and then extract the normalized counts for visualization.

# create a DESeq2 dataset from the metadata and counts
dds <- DESeq(DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition))
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
# extract the normalized counts
normCounts <- counts(dds, normalized = TRUE)

head(normCounts)
##                 treated1  treated2    treated3  untreated1   untreated2
## FBgn0000003    0.0000000    0.0000    1.200981    0.000000    0.0000000
## FBgn0000008   85.5968034  115.5963   84.068670   80.824908   89.7936241
## FBgn0000014    2.4456230    0.0000    0.000000    4.392658    0.5577244
## FBgn0000015    0.6114057    0.0000    0.000000    0.000000    1.1154487
## FBgn0000017 3793.7726082 4035.3632 4004.070676 4097.471407 4860.0101913
## FBgn0000018  441.4349433  392.7648  369.902150  512.183926  424.4282483
##              untreated3  untreated4
## FBgn0000003    0.000000    0.000000
## FBgn0000008  117.004615   93.123591
## FBgn0000014    0.000000    0.000000
## FBgn0000015    1.539534    2.660674
## FBgn0000017 5486.900612 4190.561607
## FBgn0000018  377.185929  412.404476

Given that embedding multiple widgets with 14599 rows of data may lead to performance issues, the user may want to focus on genes of interest as determined by the differential expression experiment:

# extract significant results
res <- results(dds, tidy = TRUE) %>% 
  data.frame %>% 
  filter(padj < 0.05)

# get the counts for significant transcripts
normCountsSig <- normCounts %>%
  data.frame %>% 
  filter(rownames(.) %in% res$row) %>% 
  format(digits = 2)

normCountsSig %>% nrow
## [1] 845
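
If the significant set is still too large for comfortable browsing, one option is to cap it at a fixed number of targets ranked by adjusted p-value. The following is a minimal base R sketch under that assumption: toyRes holds made-up values in the same shape as the tidy results table (a row column and a padj column), and keepTop is a hypothetical helper, not part of ContrApption or DESeq2.

```r
# Toy results table (made-up values) with the row and padj columns used above
toyRes <- data.frame(
  row = c("geneA", "geneB", "geneC", "geneD"),
  padj = c(0.04, 0.001, NA, 0.2),
  stringsAsFactors = FALSE
)

# keepTop is a hypothetical helper: significant targets, capped at n rows
keepTop <- function(res, alpha = 0.05, n = 500) {
  # drop untested targets (NA padj) and those above the threshold
  sig <- res[!is.na(res$padj) & res$padj < alpha, , drop = FALSE]
  # most significant first
  sig <- sig[order(sig$padj), , drop = FALSE]
  head(sig, n)
}

top <- keepTop(toyRes, n = 1)
top$row
```

The returned row column can then be used to subset the normalized counts in the same way as res$row.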

When used without crosstalk, ContrApption will look for row names to determine the list of targets. If there is an existing column the user would like to use instead, they can change this behavior by specifying the targetCol argument. Similarly, the sample IDs are assumed to be the row names of the annotation file, but the user may specify a column using the sampleCol argument.

Basic usage

We’ll demonstrate the most straightforward use cases of ContrApption - visualizing count data and using pre-defined functions to leverage crosstalk compatibility.

Basic counts usage

The most basic usage of ContrApption is to pass counts and annotations to it. We pass the dataset and the annotation file to ContrApption, resulting in a widget with a searchable drop-down menu of experimental features and boxplots of the input data grouped as directed.

# the basic usage of ContrApption
createContrApption(counts = normCountsSig, annotation = coldata)

Users may customize the title and y-axis labels of the plot by setting the plotName and yAxisName arguments. The showLegend argument turns on the plot legend.

createContrApption(
  counts = normCountsSig,
  annotation = coldata,
  plotName = "DESeq2 normalized counts by sample types",
  yAxisName = "norm. counts", 
  showLegend = TRUE
)

Interaction-style example

ContrApption operates on experimental factors with an arbitrary number of levels. We can leverage this to explore the interaction of multiple factors of interest by merging the columns of the annotation file.

# derive a new variable - every combination of condition and type
coldata$interaction <- paste0(coldata$condition, ":", coldata$type)

head(coldata)
##            condition        type           interaction
## treated1     treated single-read   treated:single-read
## treated2     treated  paired-end    treated:paired-end
## treated3     treated  paired-end    treated:paired-end
## untreated1 untreated single-read untreated:single-read
## untreated2 untreated single-read untreated:single-read
## untreated3 untreated  paired-end  untreated:paired-end
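
As an aside, base R's interaction() function builds the same kind of combined labels. The following is a small sketch on a toy annotation table (toyAnno is a hypothetical name with made-up rows), shown only to illustrate that the two approaches yield identical strings.

```r
# Toy annotation table (hypothetical) with two experimental factors
toyAnno <- data.frame(
  condition = c("treated", "treated", "untreated"),
  type = c("single-read", "paired-end", "paired-end"),
  stringsAsFactors = FALSE
)

# interaction() returns a factor; as.character() gives plain labels
toyAnno$interaction <- as.character(
  interaction(toyAnno$condition, toyAnno$type, sep = ":")
)

# compare against the paste0() approach
sameLabels <- identical(
  toyAnno$interaction,
  paste0(toyAnno$condition, ":", toyAnno$type)
)
```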

We can then use the function normally, and we’ll see an extra option in the dropdown menu that allows us to explore our new variable.

# pass our new variable
createContrApption(counts = normCountsSig, annotation = coldata)

Note that users interested in this sort of experiment would generally re-run DESeq2 with a different experimental design:

DESeq(DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition * type))

Crosstalk integration

Counts

ContrApption is crosstalk compatible for usage with DT. Users may leverage this by creating a shared data object and passing it to the datatables and ContrApption functions to produce interactive widgets that react to inputs in one another. Selecting a row in the datatable will result in a call to visualize the selected row in ContrApption, and selecting a target in ContrApption will filter the datatable to that row. ContrApption will not change an object once it becomes a shared data object, and shared data objects do not have row names, so when using crosstalk, create the “gene” column before creating the shared data object.

# create a column for our targets
normCountsSig$gene <- rownames(normCountsSig)

# create a shared data object
normCountsSigShared <- SharedData$new(normCountsSig)
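
To confirm that the target column travels with the shared object, the wrapped data.frame can be inspected with the origData() method of crosstalk's SharedData class. The following is a minimal sketch on toy data; toyCounts and toyShared are hypothetical names with made-up values.

```r
library(crosstalk)

# Toy counts (made-up values): two targets, two samples
toyCounts <- data.frame(s1 = c(10, 0), s2 = c(12, 1),
                        row.names = c("geneA", "geneB"))

# add the target column before wrapping the data
toyCounts$gene <- rownames(toyCounts)
toyShared <- SharedData$new(toyCounts)

# origData() returns the wrapped data.frame, including the new column
hasGene <- "gene" %in% colnames(toyShared$origData())
```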

We’re now ready to use the crosstalk-enabled functionality of ContrApption.

Paired-widget counts function

The createContrApption function will create a side-by-side pair of widgets: one showing the counts and one showing the ContrApption visualization. To use it in this manner, simply pass a shared counts object to the counts argument and the column data to annotation.

createContrApption(counts = normCountsSigShared, annotation = coldata)

Differential expression data

Users may also be interested in viewing the results of tests for differential expression alongside the visualizations provided. Note that the gene (or other target name) column must be created before passing the data to SharedData$new.

normCountsSig <- normCounts %>%
  data.frame %>% 
  filter(rownames(.) %in% res$row)

normCountsSig$gene <- rownames(normCountsSig)

# get the significant targets from the DESeq2 experiment and format them
res <- results(dds) %>% 
  data.frame %>% 
  dplyr::filter(padj < 0.05) %>% 
  format(digits = 2)

resShared <- SharedData$new(res)

Paired-widget differential expression function

The createContrApption function can also be used with differential expression results data. To use the function in this manner, pass the differential expression results as a shared object to the deResults argument, the non-shared counts to counts, and the “coldata” to annotation.

createContrApption(deResults = resShared, counts = normCountsSig, annotation = coldata)

Advanced usage

This section of the vignette demonstrates the methods used in the ContrApption internals to leverage datatables for users interested in creating customized widgets.

Customizing the table

Users interested in customizing the paired widgets are encouraged to visit the documentation for DT and DataTables, the JavaScript library it wraps, to learn about the numerous options it provides and how they can be passed to crosstalk for use with ContrApption. A full tutorial on customizing the table is beyond the scope of this document; however, users should be aware of the options available to them through DT and how those are passed to ContrApption.

Counts

This block demonstrates how the different components of the createContrApption function work. A customized datatables widget is made and passed to crosstalk’s bscols function along with a ContrApption widget to create a pair.

# first up, a fairly extensively customized datatables object
dtCountsWidget <-  datatable(
  normCountsSigShared,     # Our shared data
  extensions = "Scroller", # Various tweaks to the table described in the DT docs 
  style = "bootstrap",     
  class = "compact",
  width = "100%",
  selection = 'single', # ContrApption works on one transcript at a time, so we only need one
  # This is a list of options that are passed to the DataTables, described in the DataTables docs
  options = list(
    deferRender = TRUE,
    scroller = TRUE,
    scrollY = 300,
    sScrollX = "100%",
    pagingType = "simple",
    # This option allows the user to pass raw, arbitrary JavaScript to be executed when the table
    # is rendered. proceed with caution.
    initComplete = JS(
      "function(settings, json) {",
        "$(this.api().table().header()).css({'font-size': '75%'});",
      "}"
    )
  )
  # This is a convenience function to add style tweaks to the table. In this case, 
  # the fontsize of columns is changed on columns listed by number (all of them here)
) %>% formatStyle(columns = seq(1, ncol(normCountsSigShared$origData())), fontSize = '75%')

# ContrApption, used in a fairly basic way, scaled down 50%
cntrCountsWidget <- ContrApption(
  data = normCountsSigShared,
  annotation = coldata,
  scaleWidth = 0.5 # makes room for other widgets
)

# creates a side-by-side pair of widgets using bscols
createContrApptionsCounts <- bscols(dtCountsWidget, cntrCountsWidget)

createContrApptionsCounts

Differential expression

This example shows the same approach as above, with minor modifications to incorporate differential expression results. Most notably, the shared object is created from the differential expression results, and the counts are passed as supplemental information using the countsData argument. This can be achieved by creating a shared data object from the results table and passing "diff-expression" to the mode argument of ContrApption. This disables the dropdown for the target column in ContrApption, as it is no longer needed when the user has a searchable table.

# create a datatables widget
dtDEwidget <- datatable(
  resShared,
  extensions = "Scroller",
  style = "bootstrap",
  class = "compact",
  width = "100%",
  selection = 'single',
  options = list(
    deferRender = TRUE,
    scroller = TRUE,
    scrollY = 300,
    sScrollX = "100%",
    pagingType = "simple",
    initComplete = JS(
      "function(settings, json) {",
        "$(this.api().table().header()).css({'font-size': '75%'});",
      "}"
    )
  )
) %>% formatStyle(columns = seq(1, ncol(resShared$origData())), fontSize = '75%')

# create a ContrApption widget
cDEwidget <- ContrApption(
  data = resShared,
  countsData = normCountsSig,
  annotation = coldata,
  mode = "diff-expression",
  scaleWidth = 0.5
)

createContrApptionsDE <- bscols(dtDEwidget, cDEwidget)

createContrApptionsDE

Misc

This section demonstrates additional examples in slightly different contexts and the use of accessory arguments.

Example with limma-voom results

This section shows a basic limma-voom differential expression analysis that ends in exploration of the results using ContrApption.

Basic

We conduct a basic limma-voom analysis:

# the experimental design matrix
design <- model.matrix(~1 + condition, coldata)

# edgeR differential expression list
dge <- DGEList(counts = as.matrix(cts))

# create an index of expressed genes to keep
toKeep <- filterByExpr(dge, design)

# filter out non-expressed
dge <- dge[toKeep, keep.lib.sizes = FALSE]

# calculate normalization factors
dge <- calcNormFactors(dge)

# add voom adjustment
voomObj <- voom(dge, design, plot = FALSE)

# fit the model
fit <- lmFit(voomObj, design)

# calculate test statistics
fit <- eBayes(fit)

# return the results
resLV <- topTable(fit, n = Inf, coef = NULL) %>% filter(adj.P.Val < 0.05) %>%
  format(digits = 2)
## Removing intercept from test coefficients
# "normalized counts" https://support.bioconductor.org/p/103747/#103769
normCountsLV <- cpm(dge, log = TRUE, prior.count = 3) %>% 
  format(digits = 2)

normCountsSigLV <- normCountsLV %>%
  data.frame %>%
  dplyr::filter(rownames(.) %in% rownames(resLV))

We pass the results to ContrApption:

ContrApption(data = normCountsSigLV, annotation = coldata)

Interactive counts

An example with the interactive counts, customized explicitly:

normCountsSigLV$gene <- rownames(normCountsSigLV)

normCountsSigLVShared <- SharedData$new(normCountsSigLV)

dtDEwidgetLV <- datatable(
  normCountsSigLVShared,
  extensions = "Scroller",
  style = "bootstrap",
  class = "compact",
  width = "100%",
  selection = 'single',
  options = list(
    deferRender = TRUE,
    scroller = TRUE,
    scrollY = 300,
    sScrollX = "100%",
    pagingType = "simple",
    initComplete = JS(
      "function(settings, json) {",
        "$(this.api().table().header()).css({'font-size': '75%'});",
      "}"
    )
  )
) %>% formatStyle(columns = seq(1, ncol(normCountsSigLVShared$origData())), fontSize = '75%')

cntrDEwidgetLV <- ContrApption(
  data = normCountsSigLVShared, # new shared results
  annotation = coldata,
  scaleWidth = 0.5
)

createContrApptionsDELV <- bscols(dtDEwidgetLV, cntrDEwidgetLV)

createContrApptionsDELV

Differential expression

An example with the interactive differential expression, customized explicitly:

resLVShared <-  SharedData$new(resLV)

dDiffExWidgetLV <- datatable(
  resLVShared,
  extensions = "Scroller",
  style = "bootstrap",
  class = "compact",
  width = "100%",
  selection = 'single',
  options = list(
    deferRender = TRUE,
    scroller = TRUE,
    scrollY = 300,
    sScrollX = "100%",
    pagingType = "simple",
    initComplete = JS(
      "function(settings, json) {",
        "$(this.api().table().header()).css({'font-size': '75%'});",
      "}"
    )
  )
) %>% formatStyle(columns = seq(1, ncol(resLVShared$origData())), fontSize = '75%')

cntrDiffExWidgetLV <- ContrApption(
  data = resLVShared,
  countsData = normCountsSigLV,
  annotation = coldata,
  mode = "diff-expression",
  scaleWidth = 0.5
)

createContrApptionsDELV <- bscols(dDiffExWidgetLV, cntrDiffExWidgetLV)

createContrApptionsDELV

Specifying targetCol

Using a user-specified column instead of the implicit rownames.

# remove gene column
normCountsSig$gene <- NULL

# make our own
normCountsSig$ArbitraryString <- rownames(normCountsSig)

# stash rownames so we can add them back for next example
rowNames <- rownames(normCountsSig)

rownames(normCountsSig) <- NULL

createContrApption(counts = normCountsSig, annotation = coldata, targetCol = "ArbitraryString")

Specifying sampleCol

Using a user-specified column instead of the implicit rownames.

# restore rownames
rownames(normCountsSig) <- rowNames

# remove gene column
normCountsSig$ArbitraryString <- NULL

# make a column in the annotation matrix
coldata$MyCustomID <- rownames(coldata)

# remove rownames
rownames(coldata) <- NULL

createContrApption(counts = normCountsSig, annotation = coldata, sampleCol = "MyCustomID")

Arbitrary data

Given the simple inputs used by ContrApption, it can be used for any dataset that shares a structure with RNA-Seq experiments.

arbData <- data.frame(one = c(1,5,3), two = c(1,5,3), three = c(5,1,3), four = c(5,1,3))

arbData
##   one two three four
## 1   1   1     5    5
## 2   5   5     1    1
## 3   3   3     3    3
arbSamples <- data.frame(sampleNames = c("one", "two", "three", "four"), group = c("A", "A", "B", "B"))

arbSamples
##   sampleNames group
## 1         one     A
## 2         two     A
## 3       three     B
## 4        four     B
createContrApption(counts = arbData, annotation = arbSamples, sampleCol = "sampleNames")
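
Data from other domains often arrive in long format, with one row per measurement. The following is a minimal base R sketch of reshaping such a table into the targets-by-samples layout used above. toyLong and its column names are hypothetical, and the sketch assumes each target/sample pair appears exactly once.

```r
# Toy long-format table (made-up values): one row per target/sample pair
toyLong <- data.frame(
  target = c("t1", "t1", "t2", "t2"),
  sample = c("one", "two", "one", "two"),
  value = c(1, 5, 3, 7),
  stringsAsFactors = FALSE
)

# one row per target, one column per sample; with a single value per
# pair, taking the first element simply moves that value into its cell
wideMat <- tapply(
  toyLong$value,
  list(toyLong$target, toyLong$sample),
  function(x) x[1]
)

toyWide <- as.data.frame(wideMat)
toyWide
```

The resulting data.frame has targets as row names and samples as column names, so it can be paired with an annotation table whose sample IDs match those column names.
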
sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] pasilla_1.18.1              edgeR_3.32.1               
##  [3] limma_3.46.0                DESeq2_1.30.1              
##  [5] SummarizedExperiment_1.20.0 Biobase_2.50.0             
##  [7] MatrixGenerics_1.2.1        matrixStats_0.58.0         
##  [9] GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
## [11] IRanges_2.24.1              S4Vectors_0.28.1           
## [13] BiocGenerics_0.36.0         crosstalk_1.1.1            
## [15] DT_0.18                     forcats_0.5.1              
## [17] stringr_1.4.0               dplyr_1.0.5                
## [19] purrr_0.3.4                 readr_1.4.0                
## [21] tidyr_1.1.3                 tibble_3.1.1               
## [23] ggplot2_3.3.3               tidyverse_1.3.0            
## [25] ContrApption_0.0.4.9000    
## 
## loaded via a namespace (and not attached):
##  [1] colorspace_2.0-0       ellipsis_0.3.1         XVector_0.30.0        
##  [4] fs_1.5.0               rstudioapi_0.13        bit64_4.0.5           
##  [7] AnnotationDbi_1.52.0   fansi_0.4.2            lubridate_1.7.10      
## [10] xml2_1.3.2             splines_4.0.4          cachem_1.0.4          
## [13] geneplotter_1.68.0     knitr_1.32             jsonlite_1.7.2        
## [16] broom_0.7.6            annotate_1.68.0        dbplyr_2.1.1          
## [19] shiny_1.6.0            compiler_4.0.4         httr_1.4.2            
## [22] backports_1.2.1        assertthat_0.2.1       Matrix_1.3-2          
## [25] fastmap_1.1.0          cli_2.4.0              later_1.1.0.1         
## [28] htmltools_0.5.1.1      tools_4.0.4            gtable_0.3.0          
## [31] glue_1.4.2             GenomeInfoDbData_1.2.4 Rcpp_1.0.6            
## [34] cellranger_1.1.0       jquerylib_0.1.3        vctrs_0.3.7           
## [37] xfun_0.22              rvest_1.0.0            mime_0.10             
## [40] lifecycle_1.0.0        XML_3.99-0.6           zlibbioc_1.36.0       
## [43] scales_1.1.1           hms_1.0.0              promises_1.2.0.1      
## [46] RColorBrewer_1.1-2     yaml_2.2.1             memoise_2.0.0         
## [49] sass_0.3.1             stringi_1.5.3          RSQLite_2.2.6         
## [52] genefilter_1.72.1      BiocParallel_1.24.1    rlang_0.4.10          
## [55] pkgconfig_2.0.3        bitops_1.0-6           evaluate_0.14         
## [58] lattice_0.20-41        htmlwidgets_1.5.3      bit_4.0.4             
## [61] tidyselect_1.1.0       magrittr_2.0.1         R6_2.5.0              
## [64] generics_0.1.0         DelayedArray_0.16.3    DBI_1.1.1             
## [67] pillar_1.6.0           haven_2.4.0            withr_2.4.1           
## [70] survival_3.2-10        RCurl_1.98-1.3         modelr_0.1.8          
## [73] crayon_1.4.1           utf8_1.2.1             rmarkdown_2.7         
## [76] locfit_1.5-9.4         grid_4.0.4             readxl_1.3.1          
## [79] blob_1.2.1             reprex_2.0.0           digest_0.6.27         
## [82] xtable_1.8-4           httpuv_1.5.5           munsell_0.5.0         
## [85] bslib_0.2.4