Why do we need a list of ranges?

We showed previously how to specify multiple genomic ranges in a GRanges object. And this made sense for specifying the general region in the genome where the genes start and end, for example.

However, genes in many organisms are themselves made up of sub-regions: exons. The exons and introns are transcribed into RNA, but then the introns are spliced out, resulting in molecules of mRNA which only contain the exonic sequence.

In order to specify where the exonic sequence lies in the genome, we can use a single GRanges. However, if we want to specify multiple genes or transcripts, we need a list of GRanges. Bioconductor has a specific object for this purpose called the GRangesList.

What’s the distinction between genes and transcripts? A gene is generally defined as a region in the genome which is transcribed. However, the specific combination of exons is referred to as a transcript. The different transcripts that a gene produces can be referred to as isoforms of the gene. To give an example, suppose we have a gene with exons E1, E2, E3, E4. And there are two transcripts, or isoforms, of this gene: one which includes E3, and one which excludes E3. To specify all the isoforms of the gene, we need a GRangesList, which has two elements: one of length 4 (E1,E2,E3,E4), and one of length 3 (E1,E2,E4). We will see concrete examples of this below, using the human transcriptome: all of the transcripts produced by all of the genes in the human genome.

We start by loading the Ensembl database used previously.

suppressPackageStartupMessages(library(EnsDb.Hsapiens.v86))
edb <- EnsDb.Hsapiens.v86

We don’t run the following code, chunk but we include it to show you could also build an Ensembl database using a Bioconductor package called AnnotationHub, which we’ll see more of soon.

library(AnnotationHub)
ah <- AnnotationHub()
edb <- query(ah, c("EnsDb", "Hsapiens", "v87"))[[1]]

Let’s take a look at the information contained in edb:

columns(edb)
##  [1] "ENTREZID"            "EXONID"              "EXONIDX"            
##  [4] "EXONSEQEND"          "EXONSEQSTART"        "GENEBIOTYPE"        
##  [7] "GENEID"              "GENENAME"            "GENESEQEND"         
## [10] "GENESEQSTART"        "INTERPROACCESSION"   "ISCIRCULAR"         
## [13] "PROTDOMEND"          "PROTDOMSTART"        "PROTEINDOMAINID"    
## [16] "PROTEINDOMAINSOURCE" "PROTEINID"           "PROTEINSEQUENCE"    
## [19] "SEQCOORDSYSTEM"      "SEQLENGTH"           "SEQNAME"            
## [22] "SEQSTRAND"           "SYMBOL"              "TXBIOTYPE"          
## [25] "TXCDSSEQEND"         "TXCDSSEQSTART"       "TXID"               
## [28] "TXNAME"              "TXSEQEND"            "TXSEQSTART"         
## [31] "UNIPROTDB"           "UNIPROTID"           "UNIPROTMAPPINGTYPE"

Let’s pull out a table of the gene and transcript IDs. We ask for all the genes, and then provide those as keys to the select function. This produces a table linking every transcript to every gene.

However, if we want to index the GRangesList it would be better to add this information directly to metadata columns of the object. We will do this subsequently.

txdf <- select(edb,
               keys=keys(edb, "GENEID"),
               columns=c("GENEID","TXID"),
               keytype="GENEID")
head(txdf,20)
##             GENEID            TXID
## 1  ENSG00000000003 ENST00000373020
## 2  ENSG00000000003 ENST00000496771
## 3  ENSG00000000003 ENST00000494424
## 4  ENSG00000000003 ENST00000612152
## 5  ENSG00000000003 ENST00000614008
## 6  ENSG00000000005 ENST00000373031
## 7  ENSG00000000005 ENST00000485971
## 8  ENSG00000000419 ENST00000371588
## 9  ENSG00000000419 ENST00000466152
## 10 ENSG00000000419 ENST00000371582
## 11 ENSG00000000419 ENST00000494752
## 12 ENSG00000000419 ENST00000371584
## 13 ENSG00000000419 ENST00000413082
## 14 ENSG00000000457 ENST00000367771
## 15 ENSG00000000457 ENST00000367770
## 16 ENSG00000000457 ENST00000423670
## 17 ENSG00000000457 ENST00000470238
## 18 ENSG00000000457 ENST00000367772
## 19 ENSG00000000460 ENST00000498289
## 20 ENSG00000000460 ENST00000472795

If a gene appears once in this table, it means it has only one isoform, if it appears twice, it has two isoforms, and so on. We can run table on the GENEID column to calculate how many isoforms each gene has. If we run table again, we will find out how many 1-isoform genes there are, how many 2-isoform genes there are, etc.

table(table(txdf$GENEID))
## 
##     1     2     3     4     5     6     7     8     9    10    11    12 
## 39890  5024  3035  2349  2021  1644  1454  1258  1118   939   780   748 
##    13    14    15    16    17    18    19    20    21    22    23    24 
##   551   506   411   351   284   211   209   179   143   132   131    88 
##    25    26    27    28    29    30    31    32    33    34    35    36 
##    70    52    50    44    46    39    19    15    27    11    24     8 
##    37    38    39    40    41    42    43    44    45    46    47    48 
##    10    14     6     2     7     3     2     4     2     5     4     3 
##    49    50    51    52    53    55    56    57    58    59    61    63 
##     1     2     2     2     2     2     4     3     1     3     1     1 
##    64    65    67    68    69    71    73    74    77    79    90   103 
##     1     1     2     1     1     1     2     1     1     1     1     1 
##   138   139   142   170 
##     3     2     3     1

Let’s construct a GRangesList of all the exons, grouped by the transcript they belong to. It takes maybe half a minute to pull out this information from the database object:

ebt <- exonsBy(edb, by="tx")

Just to show we have a GRangesList:

class(ebt)
## [1] "CompressedGRangesList"
## attr(,"package")
## [1] "GenomicRanges"
length(ebt)
## [1] 216741

If we try to print it, it will just show us the first few list elements (each a GRanges), and then say how many additional elements are in the list.

ebt
## GRangesList object of length 216741:
## $ENST00000000233
## GRanges object with 6 ranges and 2 metadata columns:
##       seqnames              ranges strand |         exon_id exon_rank
##          <Rle>           <IRanges>  <Rle> |     <character> <integer>
##   [1]        7 127588345-127588565      + | ENSE00001872691         1
##   [2]        7 127589083-127589163      + | ENSE00003494180         2
##   [3]        7 127589485-127589594      + | ENSE00003504066         3
##   [4]        7 127590066-127590137      + | ENSE00003678978         4
##   [5]        7 127590963-127591088      + | ENSE00003676786         5
##   [6]        7 127591213-127591705      + | ENSE00000882271         6
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
## 
## $ENST00000000412
## GRanges object with 7 ranges and 2 metadata columns:
##       seqnames          ranges strand |         exon_id exon_rank
##          <Rle>       <IRanges>  <Rle> |     <character> <integer>
##   [1]       12 8949488-8949955      - | ENSE00002286327         1
##   [2]       12 8946229-8946405      - | ENSE00003523177         2
##   [3]       12 8945418-8945584      - | ENSE00003631241         3
##   [4]       12 8943801-8943910      - | ENSE00003492441         4
##   [5]       12 8943405-8943535      - | ENSE00000717490         5
##   [6]       12 8942416-8942542      - | ENSE00003610229         6
##   [7]       12 8940365-8941940      - | ENSE00002254457         7
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
## 
## $ENST00000000442
## GRanges object with 7 ranges and 2 metadata columns:
##       seqnames            ranges strand |         exon_id exon_rank
##          <Rle>         <IRanges>  <Rle> |     <character> <integer>
##   [1]       11 64305578-64305736      + | ENSE00001884684         1
##   [2]       11 64307168-64307504      + | ENSE00001195360         2
##   [3]       11 64313951-64314067      + | ENSE00003589560         3
##   [4]       11 64314239-64314367      + | ENSE00003471121         4
##   [5]       11 64314741-64314911      + | ENSE00001195335         5
##   [6]       11 64315001-64315270      + | ENSE00000727249         6
##   [7]       11 64315707-64316738      + | ENSE00001271942         7
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
## 
## ...
## <216738 more elements>

Note the difference between using single square brackets…

ebt[1]
## GRangesList object of length 1:
## $ENST00000000233
## GRanges object with 6 ranges and 2 metadata columns:
##       seqnames              ranges strand |         exon_id exon_rank
##          <Rle>           <IRanges>  <Rle> |     <character> <integer>
##   [1]        7 127588345-127588565      + | ENSE00001872691         1
##   [2]        7 127589083-127589163      + | ENSE00003494180         2
##   [3]        7 127589485-127589594      + | ENSE00003504066         3
##   [4]        7 127590066-127590137      + | ENSE00003678978         4
##   [5]        7 127590963-127591088      + | ENSE00003676786         5
##   [6]        7 127591213-127591705      + | ENSE00000882271         6
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome

…and double square brackets:

ebt[[1]]
## GRanges object with 6 ranges and 2 metadata columns:
##       seqnames              ranges strand |         exon_id exon_rank
##          <Rle>           <IRanges>  <Rle> |     <character> <integer>
##   [1]        7 127588345-127588565      + | ENSE00001872691         1
##   [2]        7 127589083-127589163      + | ENSE00003494180         2
##   [3]        7 127589485-127589594      + | ENSE00003504066         3
##   [4]        7 127590066-127590137      + | ENSE00003678978         4
##   [5]        7 127590963-127591088      + | ENSE00003676786         5
##   [6]        7 127591213-127591705      + | ENSE00000882271         6
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome

We have metadata for each element, but none at the transcript level:

mcols(ebt)
## DataFrame with 216741 rows and 0 columns

Let’s add the gene identifiers:

table(names(ebt) %in% txdf$TXID)
## 
##   TRUE 
## 216741
mcols(ebt)$GENEID <- mapIds(edb, names(ebt), "GENEID", "TXID")
mcols(ebt)
## DataFrame with 216741 rows and 1 column
##                          GENEID
##                     <character>
## ENST00000000233 ENSG00000004059
## ENST00000000412 ENSG00000003056
## ENST00000000442 ENSG00000173153
## ENST00000001008 ENSG00000004478
## ENST00000001146 ENSG00000003137
## ...                         ...
## LRG_96t1                 LRG_96
## LRG_97t1                 LRG_97
## LRG_98t1                 LRG_98
## LRG_992t1               LRG_992
## LRG_99t1                 LRG_99

One way to find out the number of exons per transcript would be to apply length to each element:

sapply(ebt[1:5], length)
## ENST00000000233 ENST00000000412 ENST00000000442 ENST00000001008 
##               6               7               7              10 
## ENST00000001146 
##               6

A faster implementation is lengths():

mcols(ebt)$num_exons <- lengths(ebt)
mcols(ebt)
## DataFrame with 216741 rows and 2 columns
##                          GENEID num_exons
##                     <character> <integer>
## ENST00000000233 ENSG00000004059         6
## ENST00000000412 ENSG00000003056         7
## ENST00000000442 ENSG00000173153         7
## ENST00000001008 ENSG00000004478        10
## ENST00000001146 ENSG00000003137         6
## ...                         ...       ...
## LRG_96t1                 LRG_96         7
## LRG_97t1                 LRG_97         7
## LRG_98t1                 LRG_98         2
## LRG_992t1               LRG_992         8
## LRG_99t1                 LRG_99         2

Compare isoforms of a gene

Suppose we want to compare the different isoforms of a specific gene. We can pull out the names of all the transcripts for a given gene, from the table we made earlier:

gid <- "ENSG00000196839"
idx <- mcols(ebt)$GENEID == gid
sum(idx) # how many transcripts?
## [1] 9

Now, we want to see all the exons for these transcripts. We’ll pull these out of larger GRangesList using the [ ] brackets, as this returns a GRangesList.

ebt_sub <- ebt[ idx ]
ebt_sub
## GRangesList object of length 9:
## $ENST00000372874
## GRanges object with 12 ranges and 2 metadata columns:
##        seqnames            ranges strand |         exon_id exon_rank
##           <Rle>         <IRanges>  <Rle> |     <character> <integer>
##    [1]       20 44651575-44651742      - | ENSE00001458877         1
##    [2]       20 44636227-44636288      - | ENSE00003478001         2
##    [3]       20 44629047-44629169      - | ENSE00003645967         3
##    [4]       20 44626456-44626599      - | ENSE00003463626         4
##    [5]       20 44625569-44625684      - | ENSE00003691626         5
##    ...      ...               ...    ... .             ...       ...
##    [8]       20 44622829-44622930      - | ENSE00003512095         8
##    [9]       20 44622588-44622652      - | ENSE00003551087         9
##   [10]       20 44621018-44621147      - | ENSE00003524429        10
##   [11]       20 44620299-44620401      - | ENSE00003572076        11
##   [12]       20 44619522-44619847      - | ENSE00003479916        12
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
## 
## $ENST00000464097
## GRanges object with 7 ranges and 2 metadata columns:
##       seqnames            ranges strand |         exon_id exon_rank
##          <Rle>         <IRanges>  <Rle> |     <character> <integer>
##   [1]       20 44626456-44626491      - | ENSE00001889246         1
##   [2]       20 44625569-44625684      - | ENSE00003494592         2
##   [3]       20 44624202-44624329      - | ENSE00003581053         3
##   [4]       20 44622829-44623078      - | ENSE00001924195         4
##   [5]       20 44622588-44622652      - | ENSE00003667259         5
##   [6]       20 44620299-44621147      - | ENSE00001878832         6
##   [7]       20 44619522-44619847      - | ENSE00003563480         7
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
## 
## ...
## <7 more elements>

We can look at the exons of the first transcript:

ebt_sub[[1]]
## GRanges object with 12 ranges and 2 metadata columns:
##        seqnames            ranges strand |         exon_id exon_rank
##           <Rle>         <IRanges>  <Rle> |     <character> <integer>
##    [1]       20 44651575-44651742      - | ENSE00001458877         1
##    [2]       20 44636227-44636288      - | ENSE00003478001         2
##    [3]       20 44629047-44629169      - | ENSE00003645967         3
##    [4]       20 44626456-44626599      - | ENSE00003463626         4
##    [5]       20 44625569-44625684      - | ENSE00003691626         5
##    ...      ...               ...    ... .             ...       ...
##    [8]       20 44622829-44622930      - | ENSE00003512095         8
##    [9]       20 44622588-44622652      - | ENSE00003551087         9
##   [10]       20 44621018-44621147      - | ENSE00003524429        10
##   [11]       20 44620299-44620401      - | ENSE00003572076        11
##   [12]       20 44619522-44619847      - | ENSE00003479916        12
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
l <- length(ebt_sub[[1]])
ebt_sub[[1]][1] # just first exon
## GRanges object with 1 range and 2 metadata columns:
##       seqnames            ranges strand |         exon_id exon_rank
##          <Rle>         <IRanges>  <Rle> |     <character> <integer>
##   [1]       20 44651575-44651742      - | ENSE00001458877         1
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome
ebt_sub[[1]][l] # last exon
## GRanges object with 1 range and 2 metadata columns:
##       seqnames            ranges strand |         exon_id exon_rank
##          <Rle>         <IRanges>  <Rle> |     <character> <integer>
##   [1]       20 44619522-44619847      - | ENSE00003479916        12
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome

Note that, because this gene is transcribed on the - strand, the first exon is to the right of the last exon.

start(ebt_sub[[1]][1]) > start(ebt_sub[[1]][l])
## [1] TRUE

Visualizing gene models with Gviz

One Bioconductor package for visualizing genomic “tracks” in R is Gviz. A track basically refers to a set of features displayed with the genome going from left to right. Users typically want to see annotations (where the genes and exons are), as well as experiment data (RNA sequences, epigenetic information, etc.)

Why would you use this package? If you wanted to make plots of regions programmatically, or to add data on top of the plot (see the Gviz vignette for examples). If however, you just want to explore the gene model interactively, I recommend UCSC or Ensembl Genome Browsers, or the stand-alone program IGV.

It’s a little bit of extra work to visualize a GRangesList (you have to convert the data to a data.frame first) using Gviz, but the package does have in-depth (albiet dense) documentation. In my opinion, this could still be easier. There are people working on other approaches, such as epivizr, or plotgardener, which we will explore in the next section.

The following function I define takes a GRangesList object and turns it into a data.frame that can be read by the Gviz package.

library(Gviz)
granges2df <- function(x) {
  df <- as(x, "data.frame")
  df <- df[,c("seqnames","start","end","strand","group_name")]
  colnames(df)[1] <- "chromosome"
  colnames(df)[5] <- "transcript"
  df
}

Now we will make a data.frame called df with the information about the transcripts of our gene, and make a GeneRegionTrack out of it.

df <- granges2df(ebt_sub)
df$gene <- gid
grt <- GeneRegionTrack(df)

We need to set the range of the plot, and for this we can find out the range of all the ranges within ebt_sub. I do this by unlist-ing the GRangesList (making a big GRanges object), and then asking the range.

range(unlist(ebt_sub))
## GRanges object with 1 range and 0 metadata columns:
##       seqnames            ranges strand
##          <Rle>         <IRanges>  <Rle>
##   [1]       20 44619522-44652233      -
##   -------
##   seqinfo: 357 sequences (1 circular) from GRCh38 genome

Now we put our tracks together, with a genome axis track as well. What can we see here? There transcripts are mostly overlapping. There are a cluster of transcripts that are long and basically overlapping except for a few exons excluded. Then there are transcripts which only contain the beginning exons (the arrows indicate that the beginning of the transcript is on the right-hand side), and there is one transcript with just the ending exons (left-hand side). This is pretty typical for a multiple-isoform gene.

gax <- GenomeAxisTrack()
plotTracks(list(gax,grt), chromosome="20",
           transcriptAnnotation="transcript",
           from=44619522 - 10000, to=44652233 + 10000)

Note that it is also possible to label certain features. We just create a new column called feature, and then we can put whatever grouping or labels we like. Here we are coloring an individual exon of one of the transcripts:

df$feature <- "A"
df$feature[12] <- "B"
grt <- GeneRegionTrack(df)

Now we add a specific color for that exon:

plotTracks(list(gax,grt), chromosome="20",
           transcriptAnnotation="transcript",
           from=44619522 - 10000, to=44652233 + 10000,
           A="#FFD58A", B="dodgerblue")

# this hex code is found deep in the manual for Gviz...

Visualizing gene models with plotgardener

plotgardener has a different approach to visualizing genomic data compared to most other tools in R/Bioconductor. From the package homepage:

plotgardener empowers users to programmatically and flexibly generate multi-panel figures. plotgardener accomplishes these goals by utilizing 1) a coordinate-based plotting system, and 2) edge-to-edge containerized data visualization.

More motivation for the plotgardener package from the paper:

While [previous tools for visualization] have been transformative for data exploration, they are largely based on single-panel figures and vertical stacking of genomic tracks and are often ill-suited for the generation of complex multi-panel figures that include both genomic and non-genomic plot types. Such complex figures are often critical for evaluating the underlying biology and are almost always used to present multi-omic data in publications.

And on containerized plotting:

Each plot and annotation is drawn within its own defined graphical region, or viewport, and then placed on a larger plotgardener page. These viewports give the power to specify the size and placement of plot containers and clip data to precise genomic and data axis measurements.

So a key difference is that plotgardener allows for absolute positioning, whereas most exploratory visualization packages in R like Gviz often use relative positioning. For example, if we layer multiple tracks in Gviz, we don’t specify the location or width of these tracks, the plotTracks() function in Gviz mostly makes those decisions for you (although some customization is possible). This can be convenient when exploring data, but for publication diagrams, it is helpful to position the individual pieces based on a coordinate system.

Let’s try our transcript plot with plotgardener:

suppressPackageStartupMessages(library(plotgardener))
library("TxDb.Hsapiens.UCSC.hg38.knownGene")
library("org.Hs.eg.db")

We don’t have to use the “knownGene” TxDb, we could also specify a custom TxDb using assembly(), but we use this one here for simplicity.

pageCreate(width = 5, height = 3.25)
suppressMessages(
  plotTranscripts(
    chrom = "chr20", 
    chromstart = 44619522, chromend = 44652233,
    assembly = "hg38", labels = "transcript",
    x = 0.5, y = 2.5, width = 4, height = 3,
    just = c("left", "bottom")
  )
)
suppressMessages(
  plotGenomeLabel(
    chrom = "chr20", 
    chromstart = 44619522, chromend = 44652233,
    assembly = "hg38",
    x = 0.5, y = 2.75, length = 4
  )
)
suppressMessages(
  plotLegend(
    legend = c("+ strand", "- strand"),
    fill = c("#669fd9", "#abcc8e"), border = FALSE,
    x = 0.5, y = 0.1, width = 1, height = 0.5
    )
)

We can also programmatically color elements:

gr <- ebt_sub[[1]]
gr$highlight <- factor(rep(c("A","B"),c(11,1)))
gr <- keepStandardChromosomes(gr, pruning.mode="coarse")
seqlevelsStyle(gr) <- "UCSC"
pageCreate(width = 5, height = 2)
 # colorblind-friendly palette
cols <- function(n) palette.colors(n+1)[-1]
suppressMessages(
  plotRanges(
    data = gr,
    chrom = "chr20", 
    chromstart = 44619522, chromend = 44652233,
    assembly = "hg38",
    spaceWidth = 0,
    fill = colorby("highlight", palette = cols),
    x = 0.5, y = 0.5, width = 4, height = 0.5
  )
)
suppressMessages(
  plotGenomeLabel(
    chrom = "chr20", 
    chromstart = 44619522, chromend = 44652233,
    assembly = "hg38",
    x = 0.5, y = 1.1, length = 4
  )
)

Once you’ve customized your plot, you can turn off the guides by setting showGuides=FALSE in pageCreate().

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 11.6.8
## 
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices datasets  utils    
## [8] methods   base     
## 
## other attached packages:
##  [1] org.Hs.eg.db_3.15.0                     
##  [2] TxDb.Hsapiens.UCSC.hg38.knownGene_3.15.0
##  [3] plotgardener_1.3.7                      
##  [4] Gviz_1.41.1                             
##  [5] EnsDb.Hsapiens.v86_2.99.0               
##  [6] ensembldb_2.21.2                        
##  [7] AnnotationFilter_1.21.0                 
##  [8] GenomicFeatures_1.49.5                  
##  [9] AnnotationDbi_1.59.1                    
## [10] Biobase_2.57.1                          
## [11] GenomicRanges_1.49.0                    
## [12] GenomeInfoDb_1.33.3                     
## [13] IRanges_2.31.0                          
## [14] S4Vectors_0.35.1                        
## [15] BiocGenerics_0.43.0                     
## [16] pkgdown_2.0.5                           
## [17] testthat_3.1.4                          
## [18] rmarkdown_2.14                          
## [19] devtools_2.4.3                          
## [20] usethis_2.1.6                           
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.4.1             Hmisc_4.7-0                
##   [3] BiocFileCache_2.5.0         lazyeval_0.2.2             
##   [5] splines_4.2.1               BiocParallel_1.31.10       
##   [7] ggplot2_3.3.6               digest_0.6.29              
##   [9] yulab.utils_0.0.5           htmltools_0.5.2            
##  [11] fansi_1.0.3                 magrittr_2.0.3             
##  [13] checkmate_2.1.0             memoise_2.0.1              
##  [15] BSgenome_1.65.2             cluster_2.1.3              
##  [17] remotes_2.4.2               Biostrings_2.65.1          
##  [19] matrixStats_0.62.0          prettyunits_1.1.1          
##  [21] jpeg_0.1-9                  colorspace_2.0-3           
##  [23] blob_1.2.3                  rappdirs_0.3.3             
##  [25] xfun_0.31                   dplyr_1.0.9                
##  [27] jsonlite_1.8.0              callr_3.7.0                
##  [29] crayon_1.5.1                RCurl_1.98-1.7             
##  [31] survival_3.3-1              VariantAnnotation_1.43.2   
##  [33] glue_1.6.2                  gtable_0.3.0               
##  [35] zlibbioc_1.43.0             XVector_0.37.0             
##  [37] strawr_0.0.9                DelayedArray_0.23.0        
##  [39] plyranges_1.17.0            pkgbuild_1.3.1             
##  [41] scales_1.2.0                DBI_1.1.3                  
##  [43] Rcpp_1.0.8.3                progress_1.2.2             
##  [45] htmlTable_2.4.1             gridGraphics_0.5-1         
##  [47] foreign_0.8-82              bit_4.0.4                  
##  [49] Formula_1.2-4               htmlwidgets_1.5.4          
##  [51] httr_1.4.3                  RColorBrewer_1.1-3         
##  [53] ellipsis_0.3.2              pkgconfig_2.0.3            
##  [55] XML_3.99-0.10               sass_0.4.1                 
##  [57] nnet_7.3-17                 dbplyr_2.2.1               
##  [59] deldir_1.0-6                utf8_1.2.2                 
##  [61] ggplotify_0.1.0             tidyselect_1.1.2           
##  [63] rlang_1.0.3                 munsell_0.5.0              
##  [65] tools_4.2.1                 cachem_1.0.6               
##  [67] cli_3.3.0                   generics_0.1.3             
##  [69] RSQLite_2.2.14              evaluate_0.15              
##  [71] stringr_1.4.0               fastmap_1.1.0              
##  [73] yaml_2.3.5                  processx_3.6.1             
##  [75] knitr_1.39                  bit64_4.0.5                
##  [77] fs_1.5.2                    purrr_0.3.4                
##  [79] KEGGREST_1.37.2             xml2_1.3.3                 
##  [81] biomaRt_2.53.2              brio_1.1.3                 
##  [83] compiler_4.2.1              rstudioapi_0.13            
##  [85] filelock_1.0.2              curl_4.3.2                 
##  [87] png_0.1-7                   tibble_3.1.7               
##  [89] bslib_0.3.1                 stringi_1.7.6              
##  [91] highr_0.9                   ps_1.7.1                   
##  [93] lattice_0.20-45             ProtGenerics_1.29.0        
##  [95] Matrix_1.4-1                vctrs_0.4.1                
##  [97] pillar_1.7.0                lifecycle_1.0.1            
##  [99] BiocManager_1.30.18         jquerylib_0.1.4            
## [101] data.table_1.14.2           bitops_1.0-7               
## [103] rtracklayer_1.57.0          R6_2.5.1                   
## [105] BiocIO_1.7.1                latticeExtra_0.6-30        
## [107] gridExtra_2.3               sessioninfo_1.2.2          
## [109] codetools_0.2-18            dichromat_2.0-0.1          
## [111] assertthat_0.2.1            pkgload_1.3.0              
## [113] SummarizedExperiment_1.27.1 rjson_0.2.21               
## [115] GenomicAlignments_1.33.0    Rsamtools_2.13.3           
## [117] GenomeInfoDbData_1.2.8      parallel_4.2.1             
## [119] hms_1.1.1                   rpart_4.1.16               
## [121] MatrixGenerics_1.9.1        biovizBase_1.45.0          
## [123] base64enc_0.1-3             interp_1.1-3               
## [125] restfulr_0.0.15