| Title: | Gene-Set Enrichment Analysis Tools in 'ribios' |
|---|---|
| Description: | Provides data structure and functions for gene-set analysis and post-processing of analysis results. |
| Authors: | Jitao David Zhang [aut, cre] (ORCID: <https://orcid.org/0000-0002-3085-0909>), Balasz Banfai [ctb], F.Hoffmann-La Roche AG [cph] |
| Maintainer: | Jitao David Zhang <[email protected]> |
| License: | GPL-3 |
| Version: | 1.6.7 |
| Built: | 2026-05-27 06:40:41 UTC |
| Source: | https://github.com/bedapub/ribiosGSEA |
Subset an AnnoBroadGseaRes object
## S4 method for signature 'AnnoBroadGseaRes,ANY,ANY,ANY'
x[i, j, ..., drop = FALSE]

| Argument | Description |
|---|---|
| x | An AnnoBroadGseaRes object |
| i | An integer or logical subsetting index |
| j | Not used |
| ... | Not used |
| drop | Not used |

A subset of the original data as an AnnoBroadGseaRes object
Subset a FisherResultList object by indexing
## S4 method for signature 'FisherResultList,ANY,missing,missing'
x[i, j, ..., drop = FALSE]

| Argument | Description |
|---|---|
| x | A FisherResultList object |
| i | An integer or logical subsetting index |
| j | Not used |
| ... | Not used |
| drop | Not used |

A subset of the original data as a FisherResultList object
Subset a FisherResultList object by namespace and name
## S4 method for signature 'FisherResultList,character,character,missing'
x[i, j, ..., drop = TRUE]

| Argument | Description |
|---|---|
| x | A FisherResultList object |
| i | Character string, gene-set namespace |
| j | Character string, gene-set name |
| ... | Not used |
| drop | Not used |

Convert a list of AnnoBroadGseaResItem objects to an AnnoBroadGseaRes object
AnnoBroadGseaRes(object)

| Argument | Description |
|---|---|
| object | A list of AnnoBroadGseaResItem objects |

An AnnoBroadGseaRes object
Annotated BROAD GSEA Results for one contrast
An object of class AnnoBroadGseaRes.
Convert a BroadGseaResItem object to an AnnoBroadGseaResItem object
AnnoBroadGseaResItem(object, genes, geneValues)

| Argument | Description |
|---|---|
| object | A BroadGseaResItem object |
| genes | A character string vector |
| geneValues | A numeric vector |

An AnnoBroadGseaResItem object
Annotated BROAD GSEA result item
An object of class AnnoBroadGseaResItem.
gsGenes: Vector of character strings, gene-set genes
gsGeneValues: Vector of numeric values, statistics of gene-set genes
A list of AnnoBroadGseaRes objects
An object of class AnnoBroadGseaResList.
Convert a FisherResultList object into a data.frame
## S4 method for signature 'FisherResultList'
as.data.frame(x, row.names = NULL)

| Argument | Description |
|---|---|
| x | A FisherResultList object |
| row.names | Character strings. |

A data.frame
An adapted and enhanced version of limma::camera
biosCamera(
  y, index, design = NULL, contrast = ncol(design), weights = NULL,
  geneLabels = NULL, use.ranks = FALSE, allow.neg.cor = FALSE,
  trend.var = FALSE, sort = FALSE, .fixed.inter.gene.cor = NULL,
  .approx.zscoreT = FALSE
)

| Argument | Description |
|---|---|
| y | a numeric matrix of log-expression values or log-ratios of expression values, or any data object containing such a matrix. Rows correspond to probes and columns to samples. Any type of object that can be processed by getEAWP is acceptable. |
| index | an index vector or a list of index vectors. Can be any vector such that y[index,] or statistic[index] selects the rows corresponding to the test set. The list can be made using |
| design | Design matrix |
| contrast | contrast of the linear model coefficients for which the test is required. Can be an integer specifying a column of design, or else a numeric vector of the same length as the number of columns of design. |
| weights | numeric matrix of observation weights of same size as |
| geneLabels | Labels of the features in the input matrix. |
| use.ranks | do a rank-based test (TRUE) or a parametric test (FALSE)? |
| allow.neg.cor | should reduced variance inflation factors be allowed for negative correlations? |
| trend.var | logical, should an empirical Bayes trend be estimated? See |
| sort | logical, should the results be sorted by p-value? |
| .fixed.inter.gene.cor | Numeric value, vector, or |
| .approx.zscoreT | logical, advanced parameter only used for debugging purposes. If |

The function was adapted from

A data.frame with one row per set and the following columns:
Gene set name
Number of genes in the set
Estimated correlation
Estimated difference between the mean values of genes in the gene set and the background genes
Direction of set-wise regulation, Up or Down
Gene-set enrichment score, defined as log10(pValue)*I(directionality), where I(directionality) equals 1 if the directionality is Up and -1 if the directionality is Down
A character string containing the labels of all genes that are in the set and regulated in the same direction as the set-wise direction, together with the respective statistic
Since limma 3.29.6, the default setting of allow.neg.cor has changed from TRUE to FALSE, and a new parameter, inter.gene.cor, has been added with a default value of 0.01; that is, a prior inter-gene correlation is set for all gene sets. Currently, biosCamera does not have the parameter inter.gene.cor, but allow.neg.cor is set to FALSE by default to be consistent with the latest camera function.
y <- matrix(rnorm(1000*6),1000,6)
design <- cbind(Intercept=1,Group=c(0,0,0,1,1,1))
# First set of 20 genes are genuinely differentially expressed
index1 <- 1:20
y[index1,4:6] <- y[index1,4:6]+1
# The second set of 20 genes are not
index2 <- 21:40
biosCamera(y, index1, design)
biosCamera(y, index2, design)
biosCamera(y, list(index1, index2), design)
# compare with the output of camera: columns 'GeneSet', 'Score',
# 'ContributingGenes' are missing, and in case inter.gene.cor is (as
# default) set to a numeric value, the column 'Correlation' is also missing
limmaDefOut <- limma::camera(y, index1, design)
limmaCorDefOut <- limma::camera(y, index1, design, inter.gene.cor=NA)
## Not run:
# when .approx.zscoreT=TRUE, PValue reported by
# limma::camera(inter.gene.cor=NA) and ribiosGSEA::biosCamera
# should be equal
biosCorOut <- biosCamera(y, index1, design, .approx.zscoreT=TRUE)
# when .fixed.inter.gene.cor=0.01 and .approx.zscoreT=TRUE,
# PValue reported by limma::camera and ribiosGSEA::biosCamera
# should be equal
biosFixCorOut <- biosCamera(y, index1, design,
                            .fixed.inter.gene.cor=0.01, .approx.zscoreT=TRUE)
testthat::expect_equal(biosFixCorOut$PValue, limmaDefOut$PValue)
testthat::expect_equal(biosCorOut$PValue, limmaCorDefOut$PValue)
## End(Not run)
An S4 class representing the atomic unit of results of the BROAD GSEA tool
An object of class BroadGseaResItem.
geneset: Character, gene-set name
es: Numeric, enrichment score
nes: Numeric, normalised enrichment score
np: Numeric
fdr: Numeric, false discovery rate
fwer: Numeric, family-wise error rate
geneIndices: Integer vector, gene indices
esProfile: Numeric, enrichment score profile
coreEnrichThr: Numeric
Build the command-line command to run BROAD GSEA
buildBroadGSEAcomm(
  gseaJar, javaBin, rnkFiles, gmtFile, chipFile, nperm = 1000L,
  collapse = FALSE, plotTopX = 25, outdir = "./", addShebang = TRUE
)

| Argument | Description |
|---|---|
| gseaJar | Character string, full file name of BROAD GSEA (gene permutation) jar file |
| javaBin | Character string, java binary file |
| rnkFiles | Character string, rank files |
| gmtFile | A |
| chipFile | A |
| nperm | Integer, number of permutations |
| collapse | Logical, whether to collapse duplicated features |
| plotTopX | Integer, number of top gene-sets to be visualized |
| outdir | Character string, the path of output |
| addShebang | Logical, whether to add Shebang to the script |

The function builds the command-line commands needed to run gene-permutation GSEA over many rank files.

A character vector of shell commands to run GSEA.
Run CAMERA method using EdgeResult
## S3 method for class 'EdgeResult'
camera(y, gmtList, doParallel = FALSE, ...)

| Argument | Description |
|---|---|
| y | An EdgeResult object |
| gmtList | Gene set collections, for example read by |
| doParallel | Logical, whether |
| ... | Not used |

Note that the EdgeResult object must have a column 'GeneSymbol' in its

A data.frame containing CAMERA results.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6)
exGroups <- gl(2,3, labels=c("Group1", "Group2"))
exDesign <- model.matrix(~0+exGroups)
colnames(exDesign) <- levels(exGroups)
exContrast <- matrix(c(-1,1,1,-1), ncol=2,
                     dimnames=list(c("Group1", "Group2"),
                                   c("Group2.vs.Group1", "Group1.vs.Group2")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("GeneSymbol%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exDgeList <- DGEList(exMat, genes=exFdata, samples=exPdata)
exDgeList <- edgeR::estimateDisp(exDgeList, exDesign)
exEdgeObject <- EdgeObject(exDgeList, exDescon)
exEdgeRes <- ribiosNGS::dgeWithEdgeR(exEdgeObject)
exGmt <- BioQC::GmtList(list(GeneSet1=sprintf("GeneSymbol%d", 1:5),
                             GeneSet2=sprintf("GeneSymbol%d", 6:10)))
exCameraRes <- camera(exEdgeRes, exGmt)
Run the CAMERA method using LimmaVoomResult
## S3 method for class 'LimmaVoomResult'
camera(y, gmtList, doParallel = FALSE, ...)

| Argument | Description |
|---|---|
| y | A LimmaVoomResult object |
| gmtList | Gene set collections, for example read by |
| doParallel | Logical, whether |
| ... | Passed to |

Note that the LimmaVoomResult object must have a column 'GeneSymbol' in its

A data.frame containing CAMERA results.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6)
exGroups <- gl(2,3, labels=c("Group1", "Group2"))
exDesign <- model.matrix(~0+exGroups)
colnames(exDesign) <- levels(exGroups)
exContrast <- matrix(c(-1,1), ncol=1,
                     dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exDgeList <- DGEList(exMat, genes=exFdata, samples=exPdata)
exDgeList <- edgeR::estimateDisp(exDgeList, exDesign)
edgeObj <- EdgeObject(exDgeList, exDescon)
limmaVoomRes <- ribiosNGS::dgeWithLimmaVoom(edgeObj)
# gene-set members use the same "Gene%d" pattern as the GeneSymbol column above
exGmt <- BioQC::GmtList(list(GeneSet1=sprintf("Gene%d", 1:5),
                             GeneSet2=sprintf("Gene%d", 6:10)))
camera(limmaVoomRes, exGmt)
Apply the CAMERA method to a DGEList object and a contrast
cameraDGEListByContrast(dgeList, index, design, contrasts, doParallel = FALSE)

| Argument | Description |
|---|---|
| dgeList | A DGEList object, with GeneSymbol available, and dispersion must be estimated |
| index | List of integer indices of genesets, names are names of gene sets |
| design | Design matrix |
| contrasts | Contrast matrix |
| doParallel | Logical, whether |

A data.frame containing CAMERA results across contrasts.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6)
exGroups <- gl(2,3, labels=c("Group1", "Group2"))
exDesign <- model.matrix(~0+exGroups)
colnames(exDesign) <- levels(exGroups)
exContrast <- matrix(c(-1,1), ncol=1,
                     dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exDgeList <- DGEList(exMat, genes=exFdata, samples=exPdata)
exDgeList <- edgeR::estimateDisp(exDgeList, exDesign)
cameraDGEListByContrast(exDgeList, index=1:5, design=exDesign, contrasts=exContrast)
cameraDGEListByContrast(exDgeList, index=list(1:5, 6:10), design=exDesign,
                        contrasts=exContrast)
Apply the CAMERA method to a LimmaVoomResults object
cameraLimmaVoomResultsByContrast(limmaVoomResults, index, doParallel = FALSE, ...)

| Argument | Description |
|---|---|
| limmaVoomResults | A LimmaVoomResults object, with GeneSymbol available |
| index | List of integer indices of genesets, names are names of gene sets |
| doParallel | Logical, whether |
| ... | Not used |

A data.frame containing CAMERA results across contrasts.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6)
exGroups <- gl(2,3, labels=c("Group1", "Group2"))
exDesign <- model.matrix(~0+exGroups)
colnames(exDesign) <- levels(exGroups)
exContrast <- matrix(c(-1,1), ncol=1,
                     dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exDgeList <- DGEList(exMat, genes=exFdata, samples=exPdata)
exDgeList <- edgeR::estimateDisp(exDgeList, exDesign)
edgeObj <- EdgeObject(exDgeList, exDescon)
limmaVoomRes <- ribiosNGS::dgeWithLimmaVoom(edgeObj)
cameraLimmaVoomResultsByContrast(limmaVoomRes, index=c(1:5))
cameraLimmaVoomResultsByContrast(limmaVoomRes, index=list(GS1=1:5, GS2=6:10))
Convert a CAMERA table into a graph
cameraTable2graph(df, jacThr = 0.25, plot = TRUE, ...)

| Argument | Description |
|---|---|
| df | Data.frame, CAMERA results |
| jacThr | Numeric, between 0 and 1, Jaccard Index threshold |
| plot | Logical, whether to plot the results |
| ... | Passed to |

A list with two elements: graph (an igraph object) and resTbl (a data.frame with columns Namespace, GeneSet, Score).
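The Jaccard index behind jacThr can be illustrated in base R. This is only a sketch, not the package's implementation: the helper names jaccard_index and jaccard_adjacency are hypothetical, each gene set is assumed to be a character vector of gene identifiers, and two sets are assumed to be linked when their index is at or above the threshold (the section does not say whether the comparison is strict).

```r
# Jaccard index of two gene sets: size of intersection / size of union
jaccard_index <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  union_size <- length(union(a, b))
  if (union_size == 0) return(0)  # convention chosen here for two empty sets
  length(intersect(a, b)) / union_size
}

# Pairwise Jaccard indices, thresholded into a logical adjacency matrix
# (assumption: an edge is drawn when the index is >= jacThr)
jaccard_adjacency <- function(gene_sets, jacThr = 0.25) {
  n <- length(gene_sets)
  jac <- matrix(0, nrow = n, ncol = n,
                dimnames = list(names(gene_sets), names(gene_sets)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      jac[i, j] <- jaccard_index(gene_sets[[i]], gene_sets[[j]])
    }
  }
  adj <- jac >= jacThr
  diag(adj) <- FALSE  # no self-loops
  adj
}

# Toy gene sets, made up for illustration
sets <- list(GS1 = c("A", "B", "C"),
             GS2 = c("B", "C", "D"),
             GS3 = c("X", "Y"))
adj <- jaccard_adjacency(sets, jacThr = 0.25)
adj
```

A matrix like adj could then be turned into a graph, for instance with igraph::graph_from_adjacency_matrix; whether cameraTable2graph builds its graph this way is not confirmed by this section.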
Perform gene-set enrichment (GSE) analysis
doGse(edgeResult, gmtList, doParallel = FALSE)

| Argument | Description |
|---|---|
| edgeResult | An object of the class |
| gmtList | An object of the class |
| doParallel | Logical, whether |

The function performs gene-set enrichment analysis. By default, the CAMERA method is applied. If this is not successful, for instance because biological replicates are lacking, the GAGE method (Generally Applicable Gene-set Enrichment for pathway analysis) is applied.

A data.frame containing results of the gene-set enrichment analysis.
gseWithLogFCgage and gseWithCamera are wrapped by this function to perform analysis with GAGE and CAMERA, respectively. logFCgage, camera.EdgeResult, and camera.LimmaVoomResult implement the logic and return the enrichment table.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6)
exGroups <- gl(2,3, labels=c("Group1", "Group2"))
exDesign <- model.matrix(~0+exGroups)
exContrast <- matrix(c(-1,1), ncol=1,
                     dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata)
exDgeRes <- ribiosNGS::dgeWithEdgeR(exObj)
exGeneSets <- BioQC::GmtList(list(
  list(name="Set1", desc="set 1", genes=c("Gene1", "Gene2", "Gene3"), namespace="default"),
  list(name="Set2", desc="set 2", genes=c("Gene18", "Gene6", "Gene4"), namespace="default")
))
exGse <- doGse(exDgeRes, exGeneSets)
## Not run:
# one count per cell, so that no values are recycled across columns
exMat <- matrix(rpois(20000*12, 10), nrow=20000, ncol=12)
exGroups <- gl(4,3, labels=c("Group1", "Group2", "Group3", "Group4"))
exDesign <- model.matrix(~0+exGroups)
exContrast <- matrix(c(-1,1,0,0, 0,0,-1,1), ncol=2, byrow=FALSE,
                     dimnames=list(c("Group1", "Group2", "Group3", "Group4"),
                                   c("Group2.vs.Group1", "Group4.vs.Group3")))
exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups)
exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat)))
exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups)
exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata)
exDgeRes <- ribiosNGS::dgeWithEdgeR(exObj)
ngeneset <- 1000
genesetSizes <- round(runif(ngeneset)*100)+1
exGeneSets <- BioQC::GmtList(lapply(seq_len(ngeneset), function(i) {
  name <- paste0("GeneSet", i)
  desc <- paste0("GeneSet", i)
  genes <- sample(exFdata$GeneSymbol, genesetSizes[i])
  list(name=name, desc=desc, genes=genes, namespace="default")
}))
exGse <- doGse(exDgeRes, exGeneSets)
## End(Not run)
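The "try CAMERA first, fall back to GAGE" behaviour described for doGse follows a common control-flow pattern that can be sketched in base R. Everything in the block is hypothetical: run_primary and run_fallback are stand-ins rather than package functions, and using tryCatch to detect failure is an assumption, since this section does not say how doGse decides that CAMERA was unsuccessful.

```r
# Hypothetical stand-in for the preferred method: fails with too few samples
run_primary <- function(x) {
  if (ncol(x) < 4) stop("not enough replicates for the primary method")
  data.frame(Method = "primary", NSamples = ncol(x))
}

# Hypothetical stand-in for the fallback method: always returns a table
run_fallback <- function(x) {
  data.frame(Method = "fallback", NSamples = ncol(x))
}

# Try the primary method; if it raises an error, report it and fall back
gse_with_fallback <- function(x) {
  tryCatch(
    run_primary(x),
    error = function(e) {
      message("Primary method failed: ", conditionMessage(e))
      run_fallback(x)
    }
  )
}

res_ok <- gse_with_fallback(matrix(0, nrow = 5, ncol = 6))  # primary succeeds
res_fb <- gse_with_fallback(matrix(0, nrow = 5, ncol = 2))  # fallback is used
```

Both branches return a data.frame of the same shape, so downstream code does not need to know which method produced the result.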
Expand genes in the CAMERA result table
expandCameraTableGenes(tbl)

| Argument | Description |
|---|---|
| tbl | A |

A longer data.frame, with each row one gene.
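The idea of going from one row per gene set to one row per gene can be sketched in base R. The string format of the contributing-genes column is not specified in this section, so the block assumes a comma-separated "Gene(statistic)" format purely for illustration; the helper expand_genes_sketch and the output column names Gene and Statistic are hypothetical as well.

```r
# Toy CAMERA-like table; the ContributingGenes format is an assumption
tbl <- data.frame(
  GeneSet = c("GS1", "GS2"),
  ContributingGenes = c("GeneA(1.20),GeneB(0.85)", "GeneC(-2.10)"),
  stringsAsFactors = FALSE
)

expand_genes_sketch <- function(tbl) {
  # Split each string into its "Gene(statistic)" items
  pieces <- strsplit(tbl$ContributingGenes, ",", fixed = TRUE)
  n_per_set <- lengths(pieces)
  flat <- unlist(pieces)
  data.frame(
    GeneSet = rep(tbl$GeneSet, times = n_per_set),
    # Text before the opening parenthesis is the gene label
    Gene = sub("\\(.*$", "", flat),
    # Text inside the parentheses is the statistic
    Statistic = as.numeric(sub("^.*\\((.*)\\)$", "\\1", flat)),
    stringsAsFactors = FALSE
  )
}

long_tbl <- expand_genes_sketch(tbl)
long_tbl
```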
Make a factor vector from a character vector by the order of the parsed numbers
factorByNumberInStr(str, decreasing = TRUE)

| Argument | Description |
|---|---|
| str | Strings |
| decreasing | Logical, whether decreasing or increasing order is desired, passed to |

A factor with levels ordered by the parsed numbers.
orderByNumberInStr, which returns the order of strings by numbers in them
factorByNumberInStr(c("D1", "D10", "D15", "D3.5"))
factorByNumberInStr(c("D1", "D10", "D15", "D3.5"), decreasing=FALSE)
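Ordering factor levels by the number embedded in each string can be sketched in base R as follows. The helper factor_by_number_sketch is hypothetical and not the package's code; it assumes that every string contains at least one non-negative number and that only the first number in each string matters.

```r
factor_by_number_sketch <- function(str, decreasing = TRUE) {
  # Extract the first number (integer or decimal) found in each string
  num_txt <- regmatches(str, regexpr("[0-9]+(\\.[0-9]+)?", str))
  # regmatches drops non-matching strings, so lengths must agree
  stopifnot(length(num_txt) == length(str))
  nums <- as.numeric(num_txt)
  # Levels follow the numeric order, not the alphabetical one
  lvls <- unique(str[order(nums, decreasing = decreasing)])
  factor(str, levels = lvls)
}

f_dec <- factor_by_number_sketch(c("D1", "D10", "D15", "D3.5"))
f_inc <- factor_by_number_sketch(c("D1", "D10", "D15", "D3.5"), decreasing = FALSE)
levels(f_inc)
```

Sorting alphabetically would place "D10" and "D15" before "D3.5"; parsing the numbers first avoids that.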
Return FDR values
fdrValue(object, ...)

| Argument | Description |
|---|---|
| object | An object |
| ... | Other parameters |

A numeric vector of FDR values.
Filter by size
filterBySize(object, min, max)

| Argument | Description |
|---|---|
| object | An object |
| min | Integer, minimum size |
| max | Integer, maximum size |

The filtered object.
Result of Fisher's exact test
An object of class FisherResult.
A list of results of Fisher's exact test
An object of class FisherResultList.
Fisher's method to combine multiple p-values
fishersMethod(p, returnValiePvalues = FALSE)

| Argument | Description |
|---|---|
| p | Numeric vector, p values to be combined |
| returnValiePvalues | Logical, whether the valid p-values used should be returned as part of the list |

A FisherMethodResult S3 object, a list with the following elements:
chisq: Chi-square statistic
df: Degrees of freedom (twice the number of valid p-values used in the calculation)
p: p-value
validp (optional): valid p-values used in the calculation
The function returns the combined p-value using the sum-of-logs (Fisher's) method.
The function was adapted from metap::sumlog.
ps <- c(0.05, 0.75)
fishersMethod(ps)
fishersMethod(ps, returnValiePvalues=TRUE)
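The sum-of-logs calculation itself is short enough to write out in base R. The helper fishers_method_sketch below is a hypothetical re-derivation, not the package's code, and it assumes that "valid" means non-missing p-values in the interval (0, 1]. The statistic is -2 times the sum of the log p-values, compared against a chi-square distribution with twice as many degrees of freedom as there are valid p-values.

```r
fishers_method_sketch <- function(p) {
  # Assumption: valid p-values are non-missing and lie in (0, 1]
  valid <- p[!is.na(p) & p > 0 & p <= 1]
  chisq <- -2 * sum(log(valid))   # Fisher's combined statistic
  df <- 2 * length(valid)         # two degrees of freedom per p-value
  list(chisq = chisq,
       df = df,
       p = pchisq(chisq, df = df, lower.tail = FALSE),
       validp = valid)
}

res <- fishers_method_sketch(c(0.05, 0.75))
res$p
```

For exactly two p-values the combined p-value has the closed form q * (1 - log(q)), where q is their product, which gives a quick sanity check of the sketch.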
Perform Fisher's exact test
fisherTest(genes, genesets, universe, ...)

| Argument | Description |
|---|---|
| genes | Genes |
| genesets | Gene-sets |
| universe | The universe of genes |
| ... | Other parameters |

A FisherResult object or a data.table of results.
Perform Fisher's exact test on a gene set
## S4 method for signature 'character,character,character'
fisherTest(
  genes, genesets, universe, gsName, gsNamespace,
  makeUniqueNonNA = TRUE, checkUniverse = TRUE, useEASE = FALSE
)

| Argument | Description |
|---|---|
| genes | a collection of genes of which over-representation of the gene set is tested |
| genesets | A vector of character strings, genes belonging to one gene set. |
| universe | universe of genes |
| gsName | gene set name, can be left missing |
| gsNamespace | gene set namespace name, can be left missing |
| makeUniqueNonNA | Logical, whether genes, geneSetGenes, and universe should be filtered to remove NA and made unique. The default is set to |
| checkUniverse | Logical, if |
| useEASE | Logical, whether to use the EASE method to report the p-value. |

This function performs a one-sided Fisher's exact test to test the over-representation of gene-set genes in the input gene list. If

Duplicated items in genes, in the genes of the gene set, and in the universe are removed by default.
Hosack, Douglas A., Glynn Dennis, Brad T. Sherman, H. Clifford Lane, and Richard A. Lempicki. Identifying Biological Themes within Lists of Genes with EASE. Genome Biology 4 (2003): R70. doi:10.1186/gb-2003-4-10-r70
myGenes <- LETTERS[1:3]
myGeneSet1 <- LETTERS[1:6]
myGeneSet2 <- LETTERS[4:7]
myUniverse <- LETTERS
fisherTest(genes=myGenes, genesets=myGeneSet1, universe=myUniverse)
fisherTest(genes=myGenes, genesets=myGeneSet2, universe=myUniverse)
fisherTest(genes=myGenes, genesets=myGeneSet1, universe=myUniverse,
           gsName="My gene set1", gsNamespace="Letters")
## note that duplicated items are removed by default
resWoRp <- fisherTest(genes=rep(myGenes,2), genesets=myGeneSet1, universe=myUniverse)
resWithRp <- fisherTest(genes=rep(myGenes,2), genesets=myGeneSet1,
                        universe=rep(myUniverse,2))
identical(resWoRp, resWithRp)
resWithRpNoUnique <- fisherTest(genes=rep(myGenes,2), genesets=myGeneSet1,
                                universe=rep(myUniverse,2), makeUniqueNonNA=FALSE)
identical(resWoRp, resWithRpNoUnique)
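The one-sided test described above can be reproduced with base R's fisher.test on a 2x2 table. The helper fisher_overrep_sketch is hypothetical and is not the package's implementation. It assumes that genes and gene-set members outside the universe are dropped, and its use_ease option follows the EASE idea from Hosack et al. of removing one hit before testing; whether fisherTest(useEASE=TRUE) computes exactly this is not confirmed here.

```r
fisher_overrep_sketch <- function(genes, geneset, universe, use_ease = FALSE) {
  # Remove NA and duplicates; restrict genes and gene set to the universe
  universe <- unique(universe[!is.na(universe)])
  genes <- intersect(unique(genes[!is.na(genes)]), universe)
  geneset <- intersect(unique(geneset[!is.na(geneset)]), universe)

  n_hit <- length(intersect(genes, geneset))
  n_gene_only <- length(genes) - n_hit      # in gene list, not in set
  n_set_only <- length(geneset) - n_hit     # in set, not in gene list
  n_neither <- length(universe) - n_hit - n_gene_only - n_set_only

  # EASE-style penalty (assumption): one hit is removed before testing
  if (use_ease) n_hit <- max(n_hit - 1, 0)

  # Rows: in gene list / not in gene list; columns: in set / not in set
  tab <- matrix(c(n_hit, n_set_only, n_gene_only, n_neither), nrow = 2)
  fisher.test(tab, alternative = "greater")$p.value
}

p_set1 <- fisher_overrep_sketch(LETTERS[1:3], LETTERS[1:6], LETTERS)
p_set2 <- fisher_overrep_sketch(LETTERS[1:3], LETTERS[4:7], LETTERS)
p_ease <- fisher_overrep_sketch(LETTERS[1:3], LETTERS[1:6], LETTERS, use_ease = TRUE)
```

With alternative = "greater" the test asks whether the odds of being in the gene set are higher inside the gene list than outside it, which is the over-representation question. The EASE-style variant is more conservative, so p_ease is larger than p_set1.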
Perform Fisher's exact test on a GmtList object
## S4 method for signature 'character,GmtList,character'
fisherTest(
  genes, genesets, universe, gsNamespace,
  makeUniqueNonNA = TRUE, checkUniverse = TRUE, useEASE = FALSE
)

| Argument | Description |
|---|---|
| genes | character strings of gene list to be tested |
| genesets | A GmtList object |
| universe | Universe (background) gene list |
| gsNamespace | Character string, gene-set namespace(s) |
| makeUniqueNonNA | Logical, whether genes and universe should be filtered to remove NA and made unique. The default is set to |
| checkUniverse | Logical, if |
| useEASE | Logical, whether to use the EASE method to report the p-value. |

A data.table containing Fisher's exact test results of all gene-sets, in the same order as the input gene-sets, with the following columns:
GeneSetNamespace
GeneSetName
GeneSetEffectiveSize, the count of genes in the gene-set that are found in the universe
HitCount, the count of genes in the genes input that are in the gene-set
Hits, a vector of character strings, representing hits
PValue
FDR, PValue adjusted by the Benjamini-Hochberg method. If gene-sets from more than one namespace are provided, the FDR correction is performed per namespace
gs1 <- list(name="GeneSet1", desc="desc", genes=LETTERS[1:4], namespace="A")
gs2 <- list(name="GeneSet2", desc="desc", genes=LETTERS[5:8], namespace="A")
gs3 <- list(name="GeneSet3", desc="desc", genes=LETTERS[seq(2,8,2)], namespace="A")
gs4 <- list(name="GeneSet3", desc="desc", genes=LETTERS[seq(1,7,2)], namespace="B")
gmtList <- BioQC::GmtList(list(gs1, gs2, gs3, gs4))
myInput <- LETTERS[2:6]
myUniverse <- LETTERS
myFisherRes <- fisherTest(myInput, gmtList, myUniverse)
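The per-namespace FDR correction mentioned for the FDR column can be sketched in base R with p.adjust applied within each namespace. The p-values below are made up for illustration and are not output of fisherTest; only the column names are taken from the description above.

```r
# Toy results table with invented p-values
res <- data.frame(
  GeneSetNamespace = c("A", "A", "A", "B"),
  GeneSetName = c("GeneSet1", "GeneSet2", "GeneSet3", "GeneSet3"),
  PValue = c(0.01, 0.04, 0.03, 0.20),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment within each namespace, keeping the row order
res$FDR <- ave(res$PValue, res$GeneSetNamespace,
               FUN = function(p) p.adjust(p, method = "BH"))
res
```

Adjusting per namespace means that the single gene-set in namespace "B" keeps its raw p-value as its FDR, whereas adjusting all four p-values together would inflate it.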
Perform Fisher's exact test on a GeneSet object
## S4 method for signature 'character,list,character'
fisherTest(
  genes, genesets, universe,
  makeUniqueNonNA = TRUE, checkUniverse = TRUE, useEASE = FALSE
)

| Argument | Description |
|---|---|
| genes | a collection of genes of which over-representation of the gene set is tested |
| genesets | A |
| universe | universe of genes |
| makeUniqueNonNA | Logical, whether genes and universe should be filtered to remove NA and made unique. The default is set to |
| checkUniverse | Logical, if |
| useEASE | Logical, whether to use the EASE method to report the p-value. |

This function performs a one-sided Fisher's exact test to test the over-representation of gene-set genes in the input gene list.
myGenes <- LETTERS[1:3]
myS4GeneSet1 <- list(name="GeneSet1", desc="GeneSet", genes=LETTERS[1:6],
                     namespace="My namespace 1")
myS4GeneSet2 <- list(name="GeneSet1", desc="GeneSet", genes=LETTERS[2:7],
                     namespace="My namespace 2")
myUniverse <- LETTERS
fisherTest(myGenes, myS4GeneSet1, myUniverse)
fisherTest(myGenes, myS4GeneSet2, myUniverse)
Run Fisher's exact test on an EdgeResult object
fisherTestEdgeResult(edgeResult, gmtList, contrast, thr.abs.logFC = 1, thr.FDR = 0.05, minGeneSetEffectiveSize = 5, maxGeneSetEffectiveSize = 500, ...)
edgeResult |
An |
gmtList |
A |
contrast |
Character, the contrast of interest |
thr.abs.logFC |
Numeric, threshold of absolute log2 fold-change to define positively and negatively regulated genes |
thr.FDR |
Numeric, threshold of FDR values |
minGeneSetEffectiveSize |
Integer, minimal number of genes of a geneset that are quantified |
maxGeneSetEffectiveSize |
Integer, maximal number of genes of a geneset that are quantified |
... |
Passed to |
A data.table containing Fisher's exact test results for
positively and negatively regulated genes.
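The role of the two thresholds can be illustrated with a small base-R sketch. It assumes a differential-expression table with hypothetical columns `gene`, `logFC`, and `FDR` (names chosen for illustration; an EdgeResult object is accessed differently) and shows how `thr.abs.logFC` and `thr.FDR` could split genes into positively and negatively regulated sets, each of which would then be tested separately.

```r
# Hypothetical differential-expression table; column names are assumptions
degTable <- data.frame(
  gene  = c("A", "B", "C", "D", "E"),
  logFC = c(2.1, -1.5, 0.3, 1.2, -3.0),
  FDR   = c(0.01, 0.03, 0.001, 0.20, 0.04),
  stringsAsFactors = FALSE
)

thr.abs.logFC <- 1
thr.FDR <- 0.05

# A gene counts as regulated if it passes both thresholds.
# Whether the package uses strict or non-strict comparisons is an assumption.
isSig <- degTable$FDR < thr.FDR & abs(degTable$logFC) >= thr.abs.logFC
posGenes <- degTable$gene[isSig & degTable$logFC > 0]
negGenes <- degTable$gene[isSig & degTable$logFC < 0]
```

In this sketch, gene C fails the fold-change threshold and gene D fails the FDR threshold, so neither enters a test.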
data.table returned by fisherTest
Append NewHitsProp to the result data.table returned by fisherTest
fisherTestResultNewHitsProp(fisherTestResults)
fisherTestResults |
|
A new data.table containing all columns of the input and NewHitsProp, a new column including the proportion of new hits in the gene-set
GeMS base URL
To set the GeMS base URL in your environment, use 'GeMS_BASE_URL=value' in your "~/.Renviron" file.
GeMS_BASE_URL
An object of class character of length 1.
GeMS genesets retrieval URL
GeMS_GENESETS_URL
An object of class character of length 1.
GeMS insert URL
GeMS_INSERT_URL
An object of class character of length 1.
GeMS remove URL
GeMS_REMOVE_URL
An object of class character of length 1.
GeMS geneset retrieval URL for testing
GeMS_TEST_GENESETS_URL
An object of class character of length 1.
GeMS URL for testing
GeMS_TEST_URL
An object of class character of length 1.
Test gene set enrichment by permuting gene labels of statistics
geneSetPerm(stats, indList, Nsim = 9999)
stats |
Statistics |
indList |
a list of integers, indicating indices of genes of gene sets (index starts from 1, following R's convention) |
Nsim |
number of simulations |
A data frame containing mean statistic, gene set size, and p-values
geneSetTest, an R implementation in the limma package
set.seed(1887)
stats <- rnorm(1000)
gsList <- list(gs1=c(3,4,5), gs2=c(7,8,9))
geneSetPerm(stats, gsList, Nsim=99)
gsList2 <- list(gs1=c(3,4,5), gs2=c(7,8,9), gs3=integer())
geneSetPerm(stats, gsList2, Nsim=99)
gsList3 <- sample(1:1000, 200)
geneSetPerm(stats, gsList3, Nsim=99)
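The idea of a gene-label permutation test can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: the helper name `permGeneSet`, the two-sided comparison, and the "+1" correction of the p-value are all assumptions.

```r
# Sketch of a gene-label permutation test (not the package's implementation)
permGeneSet <- function(stats, indList, Nsim = 999) {
  res <- lapply(indList, function(ind) {
    n <- length(ind)
    if (n == 0) {
      # Empty gene sets cannot be tested
      return(data.frame(mean = NA_real_, size = 0L, p = NA_real_))
    }
    obs <- mean(stats[ind])
    # Mean statistic of Nsim random gene sets of the same size
    sim <- replicate(Nsim, mean(sample(stats, n)))
    # Two-sided p-value with the +1 correction, so that p is never zero
    p <- (sum(abs(sim) >= abs(obs)) + 1) / (Nsim + 1)
    data.frame(mean = obs, size = n, p = p)
  })
  out <- do.call(rbind, res)
  rownames(out) <- names(indList)
  out
}

set.seed(1887)
stats <- rnorm(1000)
permRes <- permGeneSet(stats, list(gs1 = c(3, 4, 5), gs2 = c(7, 8, 9)), Nsim = 99)
```

With `Nsim = 99` the smallest attainable p-value in this sketch is 0.01, which is why larger values of `Nsim` are preferable in practice.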
A generic, virtual S4 class for gene-set analysis result
An object of class GeneSetResult (virtual).
Get the name of the column that stores false-discovery rates (adjusted p-values) from topTables
getFDRCol(colnames)
colnames |
A character string vector of column names |
The column name of the FDRs, NA if not found.
getFDRCol(c("Feature", "logFC", "PValue", "FDR"))
getFDRCol(c("Feature", "logFC", "P.Value", "FDR"))
getFDRCol(c("Feature", "logFC", "p.Value", "adjPvalue"))
getFDRCol(c("Feature", "logFC", "PValue", "adj.PValue"))
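A lookup of this kind can be sketched with a case-insensitive regular expression. The helper `findFDRCol` and its pattern are hypothetical; the matching rules actually used by getFDRCol may be broader or narrower.

```r
# Hypothetical helper; the pattern is an assumption, not the package's rule
findFDRCol <- function(colnames) {
  pattern <- "^(fdr|adj[._]?p[._]?val(ue)?)$"
  hits <- grep(pattern, colnames, ignore.case = TRUE, value = TRUE)
  # Return the first matching column name, or NA if nothing matches
  if (length(hits) == 0) NA_character_ else hits[1]
}

findFDRCol(c("Feature", "logFC", "PValue", "FDR"))
findFDRCol(c("Feature", "logFC", "p.Value", "adjPvalue"))
findFDRCol(c("Feature", "logFC", "PValue", "adj.PValue"))
findFDRCol(c("Feature", "logFC"))
```

Anchoring the pattern with `^` and `$` keeps unadjusted columns such as `PValue` from being matched by accident.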
Send a list as a JSON query to a URL and fetch the response
getJsonResponse(url, body)
url |
The destination URL |
body |
A list to be sent to the URL, which will be encoded in the JSON format internally |
The response from the webserver
## Not run:
## getJsonResponse(GeMS_GENESETS_URL, list(user=ribiosUtils::whoami()))
## End(Not run)
Get the name of the column that stores unadjusted p-values from topTables
getPvalCol(colnames)
colnames |
A character string vector of column names |
The column name of the unadjusted p-values, NA if not found.
getPvalCol(c("Feature", "logFC", "PValue", "FDR"))
getPvalCol(c("Feature", "logFC", "P.Value", "FDR"))
getPvalCol(c("Feature", "logFC", "p.Value", "adjPvalue"))
getPvalCol(c("Feature", "logFC", "pval", "adjPvalue"))
Get one or more gene-sets with their names
getSetsWithNamesFromGeMS(setNames = NULL)
setNames |
Character strings |
A GmtList object
## Not run:
getSetsWithNamesFromGeMS(c("Plasma_sc", "Bcell_l_Danaher17"))
## End(Not run)
Get gene-sets for application
getSetsWithPropertyFromGeMS(property = "meta.application", value = "")
property |
Character string, property to query |
value |
Character string, property value |
A GmtList object
## Not run:
getSetsWithPropertyFromGeMS("meta.application", "rtbeda_CIT")
## End(Not run)
Get one gene-set with its name
getSetWithNameFromGeMS(setName)
setName |
Character string |
A list of two elements
name
genes
## Not run:
getSetWithNameFromGeMS("Plasma_sc")
## End(Not run)
Get gene sets of a user from GeMS
getUserSetsFromGeMS(user = ribiosUtils::whoami())
user |
User name |
A data.frame including following columns:
setName
desc
domain
source
subtype
## Not run:
#### my gene-sets
## getUserSetsFromGeMS()
#### from another user
## getUserSetsFromGeMS("kanga6")
## End(Not run)
Return GSEA core enrichment genes (also known as leading-edge genes)
gseaCoreEnrichGenes(object)
## S4 method for signature 'AnnoBroadGseaResItem'
gseaCoreEnrichGenes(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaCoreEnrichGenes(object)
object |
An object |
A character vector of core enrichment genes.
gseaCoreEnrichGenes(AnnoBroadGseaResItem): Return core enriched genes (also known as
leading-edge genes) in an AnnoBroadGseaResItem object as a character string
vector.
gseaCoreEnrichGenes(AnnoBroadGseaRes): Return core enriched genes (also known as
leading-edge genes) in an AnnoBroadGseaRes object as a list of character
string vectors.
Return GSEA core enrichment score threshold
gseaCoreEnrichThr(object)
## S4 method for signature 'BroadGseaResItem'
gseaCoreEnrichThr(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaCoreEnrichThr(object)
object |
An object |
A numeric value.
gseaCoreEnrichThr(BroadGseaResItem): Get the threshold value of GSEA core enrichment
from a BroadGseaResItem object
gseaCoreEnrichThr(AnnoBroadGseaRes): Get the threshold value of GSEA core enrichment
from an AnnoBroadGseaRes object
Return GSEA enrichment scores
gseaES(object)
## S4 method for signature 'BroadGseaResItem'
gseaES(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaES(object)
## S4 method for signature 'AnnoBroadGseaResList'
gseaES(object)
object |
An object |
A numeric vector of enrichment scores.
gseaES(BroadGseaResItem): Get GSEA enrichment score from a BroadGseaResItem object
gseaES(AnnoBroadGseaRes): Get GSEA enrichment score from an AnnoBroadGseaRes object
gseaES(AnnoBroadGseaResList): Get GSEA enrichment score from an AnnoBroadGseaResList object
Return GSEA enrichment score profile
gseaESprofile(object)
## S4 method for signature 'BroadGseaResItem'
gseaESprofile(object)
object |
An object |
A numeric vector of enrichment score profiles.
gseaESprofile(BroadGseaResItem): Get GSEA enrichment profile
from a BroadGseaResItem object
Return GSEA FDR
gseaFDR(object)
## S4 method for signature 'BroadGseaResItem'
gseaFDR(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaFDR(object)
## S4 method for signature 'AnnoBroadGseaResList'
gseaFDR(object)
object |
An object |
A numeric vector of FDR values.
gseaFDR(BroadGseaResItem): Get GSEA FDR values
from a BroadGseaResItem object
gseaFDR(AnnoBroadGseaRes): Get GSEA FDR values
from an AnnoBroadGseaRes object
gseaFDR(AnnoBroadGseaResList): Get GSEA FDR values
from an AnnoBroadGseaResList object
gseaFingerprint extracts pathway fingerprints from one
GSEA result. gseaFingerprintMatrix extracts multiple signatures and
organizes them into a rectangular matrix.
gseaFingerprint(gseaDir, value = c("q", "es", "nes"), threshold = 1e-04, sortByName = TRUE)
gseaFingerprintMatrix(gseaDirs, value = c("q", "es", "nes"), ...)
gseaDir |
Character, a GSEA output directory. Notice the directory must be accessible by the R session. A common mistake is to use a relative path which cannot be found. |
value |
Character, the statistic to extract, currently supporting
|
threshold |
Numeric, minimum threshold of q-value, passed to
|
sortByName |
Logical, whether signatures should be sorted by name |
gseaDirs |
Character vector, GSEA output directories |
... |
Parameters passed to |
gseaFingerprint extracts pathway signatures from one GSEA output
directory, whereas gseaFingerprintMatrix extracts them from several
GSEA output directories at once and organizes the pathway signatures in a
rectangular matrix.
gseaFingerprintMatrix takes care of signature mapping between
different GSEA result sets.
gseaFingerprint returns a data.frame with two columns
name and value, recording gene signature (pathway) names and
the statistic chosen by the user.
gseaFingerprintMatrix returns a matrix, with the union set of
gene signatures from all GSEA output result sets as rows, and GSEA result
names as columns.
Jitao David Zhang <[email protected]>
See gseaResQvalue and gseaResES for how to choose the
statistic to produce pathway signatures.
gseaDirZip <- system.file(package="ribiosGSEA","extdata/gseaDirs.zip")
tmpDir <- tempdir()
utils::unzip(gseaDirZip, exdir=tmpDir)
gseaDir <- file.path(tmpDir, "gseaDirs")
gseaDirs <- dir(gseaDir, full.names=TRUE)
gseaFp <- gseaFingerprint(gseaDirs[1], value="q")
gseaFps <- gseaFingerprintMatrix(gseaDirs, value="q")
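The mapping of signatures between result sets can be pictured with a base-R sketch that combines several two-column fingerprints (name, value) into one matrix over the union of signature names. The input data are invented for illustration, and filling signatures that are absent from a result set with NA is an assumption about how gaps are handled.

```r
# Two hypothetical fingerprints with partially overlapping signatures
fp1 <- data.frame(name = c("PathwayA", "PathwayB"), value = c(0.01, 0.20),
                  stringsAsFactors = FALSE)
fp2 <- data.frame(name = c("PathwayB", "PathwayC"), value = c(0.05, 0.50),
                  stringsAsFactors = FALSE)
fps <- list(cond1 = fp1, cond2 = fp2)

# Rows: union of all signature names; columns: one per GSEA result set
allNames <- sort(unique(unlist(lapply(fps, function(fp) fp$name))))
# match() aligns each fingerprint to the common row order; gaps become NA
fpMat <- sapply(fps, function(fp) fp$value[match(allNames, fp$name)])
rownames(fpMat) <- allNames
```

The resulting `fpMat` has three rows and two columns, with NA where a pathway was not reported for a condition.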
Return GSEA FWER values
gseaFWER(object)
## S4 method for signature 'BroadGseaResItem'
gseaFWER(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaFWER(object)
## S4 method for signature 'AnnoBroadGseaResList'
gseaFWER(object)
object |
An object |
A numeric vector of FWER values.
gseaFWER(BroadGseaResItem): Get GSEA FWER values
from a BroadGseaResItem object
gseaFWER(AnnoBroadGseaRes): Get GSEA FWER values
from an AnnoBroadGseaRes object
gseaFWER(AnnoBroadGseaResList): Get GSEA FWER values
from an AnnoBroadGseaResList object
Return GSEA normalized enrichment scores
gseaNES(object)
## S4 method for signature 'BroadGseaResItem'
gseaNES(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaNES(object)
## S4 method for signature 'AnnoBroadGseaResList'
gseaNES(object)
object |
An object |
A numeric vector of normalized enrichment scores.
gseaNES(BroadGseaResItem): Get GSEA normalized enrichment score
from a BroadGseaResItem object
gseaNES(AnnoBroadGseaRes): Get GSEA normalized enrichment score
from an AnnoBroadGseaRes object
gseaNES(AnnoBroadGseaResList): Get GSEA normalized enrichment score
from an AnnoBroadGseaResList object
Return GSEA number of permutations
gseaNP(object)
## S4 method for signature 'BroadGseaResItem'
gseaNP(object)
## S4 method for signature 'AnnoBroadGseaRes'
gseaNP(object)
## S4 method for signature 'AnnoBroadGseaResList'
gseaNP(object)
object |
An object |
A numeric vector.
gseaNP(BroadGseaResItem): Get GSEA number of permutations
from a BroadGseaResItem object
gseaNP(AnnoBroadGseaRes): Get GSEA number of permutations
from an AnnoBroadGseaRes object
gseaNP(AnnoBroadGseaResList): Get GSEA number of permutations
from an AnnoBroadGseaResList object
Read GSEA statistics (log-transformed q-value [q], Enrichment Score [ES], or normalized Enrichment Score [NES]) to profile pathway activities.
gseaResQvalue(file, threshold = 1e-04, log = FALSE, posLog = FALSE)
gseaResES(file, normalized = FALSE)
file |
GSEA output tab-delimited file, usually with the file name ‘gsea_report_for.*_pos_.*.xls’ or ‘gsea_report_for.*_neg_.*.xls’. Located in GSEA output directory. |
threshold |
Valid for q value: what is the minimum threshold of q-value
(FDR)? It can be set to the number of permutation tests divided by |
log |
Valid for q value: whether the FDR q value should be transformed
by base-10 (log10) logarithm. By default |
posLog |
Valid for q value: whether the logged FDR q value should be
negated to get a positive value. This is useful when the sign of |
normalized |
Valid for enrichment score: if set to |
In many cases we want to extract pathway signatures from a set of
experiments. Both gseaResQvalue and gseaResES can read GSEA
output files and extract the desired statistic: q-value, ES, or NES.
See the GSEA documentation for definitions of the three values. For comparing a few conditions with one another, we recommend using the q-value. For large-scale comparisons between pathways (or other gene signatures), we have found ES very useful. Choose the statistic only once the aim of the analysis is clear: using any statistic without good reasoning may lead to wrong interpretations of the data.
These functions are usually not called directly by end-users. See
gseaFingerprint and gseaFingerprintMatrix
instead.
A data.frame with two columns: name and value.
The column name contains gene signatures (e.g. pathways), and
value contains the statistic.
gseaResQvalue(): The function to extract the Q-value
Extract Q-values from GSEA result file
Jitao David Zhang <[email protected]>, with input from Martin Ebeling, Laura Badi and Isabelle Wells.
GSEA documentation http://www.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html
End-users will probably find gseaFingerprint and
gseaFingerprintMatrix more useful, since they operate on the
level of GSEA result directories, instead of single output tab-delimited
files.
gseaDirZip <- system.file(package="ribiosGSEA","extdata/gseaDirs.zip")
tmpDir <- tempdir()
utils::unzip(gseaDirZip, exdir=tmpDir)
gseaDir <- file.path(tmpDir, "gseaDirs")
gseaFile <- file.path(gseaDir, "VitaminA_24h_High", "gsea_report_for_na_neg_1336489010730.xls")
gseaQ <- gseaResQvalue(gseaFile)
gseaLogQ <- gseaResQvalue(gseaFile, log=TRUE)
gseaQscore <- gseaResQvalue(gseaFile, log=TRUE, posLog=TRUE)
gseaEs <- gseaResES(gseaFile)
gseaNes <- gseaResES(gseaFile, normalized=TRUE)
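How the three q-value arguments interact can be sketched on a plain numeric vector. The helper `transformQ` is hypothetical and only mirrors the documented arguments `threshold`, `log`, and `posLog`; in particular, flooring q-values at the threshold before taking the logarithm is an assumption about how the minimum threshold is applied.

```r
# Hypothetical helper mirroring the documented arguments of gseaResQvalue
transformQ <- function(q, threshold = 1e-04, log = FALSE, posLog = FALSE) {
  # Floor q-values at the threshold so that log10(0) = -Inf cannot occur
  q <- pmax(q, threshold)
  if (log) {
    q <- log10(q)
    # Optionally negate so that smaller q-values give larger positive scores
    if (posLog) q <- -q
  }
  q
}

q <- c(0, 0.01, 1)
rawQ <- transformQ(q)
logQ <- transformQ(q, log = TRUE)
posLogQ <- transformQ(q, log = TRUE, posLog = TRUE)
```

A q-value of exactly 0 is reported by GSEA when no permutation was as extreme as the observed result, which is why a floor is needed before log-transformation.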
One way to score GSEA results is to multiply the absolute value of log10-transformed p-values (nominal p-value, FDR, or FWER) by the sign of the enrichment scores. This score is intuitive since it combines statistical significance and the sign of regulation.
gseaScore(x, type = c("fdr", "p", "fwer"))
gseaScores(..., names = NULL, type = c("fdr", "p", "fwer"))
x |
An |
type |
Character string, the type of p-value used to calculate the score. |
... |
Objects of |
names |
Character strings, names given to the result score sets. See examples below. |
gseaScores takes care of the situation where some gene sets are
missing in one or more conditions.
gseaScore returns a double vector of scores with gene set
names.
gseaScores returns a data frame of scores, with gene set names as row
names.
gseaScores(): gseaScore applied to multiple objects
Jitao David Zhang <[email protected]>
gseaNP, gseaFDR, gseaFWER
to get p-values.
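The scoring rule described above amounts to |log10(p)| multiplied by sign(ES). A minimal base-R sketch on plain vectors follows; the helper `signedLogScore` and its `minP` floor are hypothetical and are not part of the package, which operates on GSEA result objects instead.

```r
# Hypothetical helper: signed score from p-values and enrichment scores
signedLogScore <- function(p, es, minP = 1e-04) {
  # Floor p-values to avoid infinite scores; minP is an assumed default
  abs(log10(pmax(p, minP))) * sign(es)
}

p  <- c(GS1 = 0.01, GS2 = 0.1, GS3 = 1)
es <- c(GS1 = -0.5, GS2 = 0.3, GS3 = 0.1)
scores <- signedLogScore(p, es)
```

A strongly significant, negatively enriched gene set therefore receives a large negative score, while a non-significant one stays close to zero regardless of sign.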
Return the effective size of gene-set
gsEffectiveSize(object, ...)
## S4 method for signature 'FisherResult'
gsEffectiveSize(object)
## S4 method for signature 'FisherResultList'
gsEffectiveSize(object)
object |
An object |
... |
Other parameters |
An integer vector of effective sizes.
gsEffectiveSize(FisherResult): Effective sizes of gene-set, returning an integer.
gsEffectiveSize(FisherResultList): Effective sizes of Gene-sets, returning an integer vector.
The core algorithm to perform Fisher's exact test on a gene set
gsFisherTestCore(genes, geneSetGenes, universe, makeUniqueNonNA = TRUE, checkUniverse = TRUE, useEASE = FALSE)
genes |
Character vector, a collection of genes of which over-representation of the gene set is tested |
geneSetGenes |
Character vector, genes belonging to a gene set |
universe |
Character vector, universe of genes |
makeUniqueNonNA |
Logical, whether genes, geneSetGenes, and universe should be filtered to remove NA and made unique. The default is set to |
checkUniverse |
Logical, if |
useEASE |
Logical, whether to use the EASE method to report the p-value. This function performs one-sided Fisher's exact test to test the over-representation of the genes given as If |
A list of three elements
p The p-value of the one-sided Fisher's exact test (over-representation)
gsEffectiveSize Gene-set's effective size, namely number of genes that are in the universe
hits Character vector, genes that are found in the gene sets
Hosack, Douglas A., Glynn Dennis, Brad T. Sherman, H. Clifford Lane, and Richard A. Lempicki. Identifying Biological Themes within Lists of Genes with EASE. Genome Biology 4 (2003): R70. doi:10.1186/gb-2003-4-10-r70
myGenes <- LETTERS[1:3]
myGeneSet1 <- LETTERS[1:6]
myGeneSet2 <- LETTERS[4:7]
myUniverse <- LETTERS
gsFisherTestCore(myGenes, myGeneSet1, myUniverse)
gsFisherTestCore(myGenes, myGeneSet2, myUniverse)
## use EASE for a conservative estimate
gsFisherTestCore(myGenes, myGeneSet1, myUniverse, useEASE=FALSE)
gsFisherTestCore(myGenes, myGeneSet1, myUniverse, useEASE=TRUE)
## checkUniverse makes sure that universe contains all elements in genes
gsFisherTestCore(c("OutOfUniverse", myGenes), myGeneSet1, myUniverse, checkUniverse=FALSE)
gsFisherTestCore(c("OutOfUniverse", myGenes), myGeneSet1, myUniverse, checkUniverse=TRUE)
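For readers who want to see the underlying 2x2 table, the test can be sketched with `stats::fisher.test`. The helper `fisherCoreSketch` is hypothetical. Two points are assumptions rather than documented behaviour: the EASE variant is written in one common formulation (the hit count is reduced by one before testing, following Hosack et al.), and genes outside the universe are simply dropped, which may differ from what `checkUniverse` does.

```r
# Sketch of a one-sided Fisher's exact test for over-representation.
# Not the package's implementation; see the assumptions in the text above.
fisherCoreSketch <- function(genes, geneSetGenes, universe, useEASE = FALSE) {
  genes <- unique(genes[!is.na(genes)])
  universe <- unique(universe[!is.na(universe)])
  # Only gene-set genes found in the universe count towards the effective size
  gsGenes <- intersect(unique(geneSetGenes), universe)
  genes <- intersect(genes, universe)
  hits <- intersect(genes, gsGenes)

  nHits <- length(hits)
  # EASE: penalize the hit count by one (assumed formulation)
  if (useEASE) nHits <- max(nHits - 1, 0)
  tbl <- matrix(c(nHits,                                   # in input, in set
                  length(gsGenes) - length(hits),          # not in input, in set
                  length(genes) - length(hits),            # in input, not in set
                  length(universe) - length(union(genes, gsGenes))),
                nrow = 2)
  p <- fisher.test(tbl, alternative = "greater")$p.value
  list(p = p, gsEffectiveSize = length(gsGenes), hits = hits)
}

res <- fisherCoreSketch(LETTERS[1:3], LETTERS[1:6], LETTERS)
resEase <- fisherCoreSketch(LETTERS[1:3], LETTERS[1:6], LETTERS, useEASE = TRUE)
```

In this sketch the EASE p-value is larger than the plain Fisher p-value for the same input, which is the sense in which EASE is the more conservative estimate.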
Return gene-set gene count
gsGeneCount(object, ...)
object |
An object |
... |
Other parameters |
An integer vector of gene counts.
Return gene-set gene indices
gsGeneIndices(object)
## S4 method for signature 'BroadGseaResItem'
gsGeneIndices(object)
object |
An object |
An integer vector of gene indices.
gsGeneIndices(BroadGseaResItem): Get gene-set gene indices
from a BroadGseaResItem object, returning a vector of integers.
Return gene-set genes
gsGenes(object, ...)
## S4 method for signature 'AnnoBroadGseaResItem'
gsGenes(object)
## S4 method for signature 'AnnoBroadGseaRes'
gsGenes(object)
## S4 method for signature 'GmtList'
gsGenes(object)
object |
An object |
... |
Other parameters |
A character vector or list of character vectors of gene-set genes.
gsGenes(AnnoBroadGseaResItem): Get gene-set genes
from an AnnoBroadGseaResItem object, returning a character string vector.
gsGenes(AnnoBroadGseaRes): Get gene-set genes from an AnnoBroadGseaRes object,
returning a list of character string vectors.
gsGenes(GmtList): Get gene-set genes from a GmtList object, returning a
list of character string vectors. It uses the implementation in BioQC.
Set gene-set genes
gsGenes(object) <- value
## S4 replacement method for signature 'AnnoBroadGseaResItem,character'
gsGenes(object) <- value
object |
An object |
value |
Value |
The modified object.
gsGenes(object = AnnoBroadGseaResItem) <- value: Assign gene-set genes to AnnoBroadGseaResItem
Return gene-set gene values
gsGeneValues(object)
## S4 method for signature 'AnnoBroadGseaResItem'
gsGeneValues(object)
## S4 method for signature 'AnnoBroadGseaRes'
gsGeneValues(object)
object |
An object |
A numeric vector or list of numeric vectors of gene values.
gsGeneValues(AnnoBroadGseaResItem): Return values associated with the genes in a
gene-set in an AnnoBroadGseaResItem object in a numeric vector.
gsGeneValues(AnnoBroadGseaRes): Return values associated with the genes in a
gene-set in an AnnoBroadGseaRes object in a list of numeric vectors.
Set gene-set gene statistics (values)
gsGeneValues(object) <- value
## S4 replacement method for signature 'AnnoBroadGseaResItem,numeric'
gsGeneValues(object) <- value
object |
An object |
value |
Value |
The modified object.
gsGeneValues(object = AnnoBroadGseaResItem) <- value: Assign values associated with gene-set genes to
an AnnoBroadGseaResItem object
Core algorithm to perform Fisher's exact test on a list of gene sets
gsListFisherTestCore(genes, geneSetGenesList, universe, makeUniqueNonNA = TRUE, checkUniverse = TRUE, useEASE = FALSE)
genes |
Character vector, a collection of genes of which over-representation of the gene set is tested |
geneSetGenesList |
A list of character vectors, genes belonging to each gene set |
universe |
Character vector, universe of genes |
makeUniqueNonNA |
Logical, whether genes, geneSetGenes, and universe should be filtered to remove NA and made unique. The default is set to |
checkUniverse |
Logical, if |
useEASE |
Logical, whether to use the EASE method to report the p-value. This function performs one-sided Fisher's exact test to test the over-representation of the genes given as If |
A list of lists, of the same length as the input geneSetGenesList, each list consisting of three elements
p The p-value of the one-sided Fisher's exact test (over-representation)
gsEffectiveSize Gene-set's effective size, namely number of genes that are in the universe
hits Character vector, genes that are found in the gene sets
Hosack, Douglas A., Glynn Dennis, Brad T. Sherman, H. Clifford Lane, and Richard A. Lempicki. Identifying Biological Themes within Lists of Genes with EASE. Genome Biology 4 (2003): R70. doi:10.1186/gb-2003-4-10-r70
myGenes <- LETTERS[1:3]
myGeneSet1 <- LETTERS[1:6]
myGeneSet2 <- LETTERS[4:7]
myUniverse <- LETTERS
gsListFisherTestCore(myGenes, list(myGeneSet1, myGeneSet2), myUniverse)
Return gene-set name
gsName(object, ...)
## S4 method for signature 'BroadGseaResItem'
gsName(object)
## S4 method for signature 'AnnoBroadGseaRes'
gsName(object)
## S4 method for signature 'FisherResult'
gsName(object)
## S4 method for signature 'FisherResultList'
gsName(object, ...)
## S4 method for signature 'GmtList'
gsName(object)
## S4 method for signature 'FisherResultList'
gsName(object, ...)
object |
An object |
... |
Other parameters |
A character vector of gene-set names.
gsName(BroadGseaResItem): Get gene-set name from a BroadGseaResItem object
gsName(AnnoBroadGseaRes): Get gene-set name from an AnnoBroadGseaRes object
gsName(FisherResult): Get gene-set name from a FisherResult object
gsName(FisherResultList): Get gene-set name from a FisherResultList object
gsName(GmtList): Get gene-set name from a GmtList object
gsName(FisherResultList): Get gene-set name from a FisherResultList object
Return gene-set namespace
gsNamespace(object, ...)
## S4 method for signature 'GmtList'
gsNamespace(object)
## S4 method for signature 'FisherResult'
gsNamespace(object)
## S4 method for signature 'FisherResultList'
gsNamespace(object)
object |
An object |
... |
Other parameters |
A character vector of gene-set namespaces.
gsNamespace(GmtList): Return gene-set namespace from a GmtList object
gsNamespace(FisherResult): Return gene-set namespace from a FisherResult object
gsNamespace(FisherResultList): Return gene-set namespace from a FisherResultList
object.
Return the size (unique length) of gene-sets
gsSize(gmtList)
gmtList |
a |
An integer vector
Return hits
hits(object, ...)
## S4 method for signature 'FisherResult'
hits(object)
## S4 method for signature 'FisherResultList'
hits(object, geneset)
object |
An object |
... |
Other parameters |
geneset |
Character string, gene-set name |
A character vector or list of hit genes.
hits(FisherResult): Return hits from a FisherResult object
hits(FisherResultList): Return hits from a FisherResultList object, returning a list
if geneset is missing, or gene-set genes if geneset is present.
Insert a GmtList object to GeMS
insertGmtListToGeMS(gmtList, geneFormat = 0, source = "PubMed", taxID = 9606, user = ribiosUtils::whoami(), subtype = "", domain = "")
gmtList |
A |
geneFormat |
Integer index of gene format. 0 stands for official human gene symbol |
source |
Character, source of the gene set |
taxID |
Integer, NCBI taxonomy ID of the species. |
user |
The user name |
subtype |
Subtype of the geneset |
domain |
Domain of the geneset |
Response code or error message returned by the GeMS API. A value of 200 indicates a successful insertion.
## Not run:
testList <- list(list(name="GS_A", desc=NULL, genes=c("MAPK14", "JAK1", "EGFR")),
                 list(name="GS_B", desc="gene set B", genes=c("ABCA1", "DDR1", "DDR2")),
                 list(name="GS_C", desc="gene set C", genes=NULL))
testGmt <- BioQC::GmtList(testList)
## insertGmtListToGeMS(testGmt, geneFormat=0, source="Test")
## removeFromGeMS(setName=c("GS_A", "GS_B", "GS_C"), source="Test")
## End(Not run)
Construct message body to insert into GeMS
insertGmtListToGeMSBody(gmtList, geneFormat = 0, source = "PubMed", taxID = 9606, user = ribiosUtils::whoami(), subtype = "", domain = "")
gmtList |
A |
geneFormat |
Integer index of gene format. 0 stands for official human gene symbol |
source |
Character, source of the gene set |
taxID |
Integer, NCBI taxonomy ID of the species. |
user |
The user name |
subtype |
Subtype of the geneset |
domain |
Domain of the geneset |
A list with three items: headers, parsed, and params
testList <- list(list(name="GS_A", desc=NULL, genes=c("MAPK14", "JAK1", "EGFR")),
                 list(name="GS_B", desc="gene set B", genes=c("ABCA1", "DDR1", "DDR2")),
                 list(name="GS_C", desc="gene set C", genes=NULL))
testGmt <- BioQC::GmtList(testList)
insertGmtListToGeMSBody(testGmt, geneFormat=0, source="Test")
Test whether GeMS is reachable
isGeMSReachable()
Logical value
## Not run:
## isGeMSReachable()
## End(Not run)
Return a vector of logical values, indicating whether genes belong to core enrichment or not
isGseaCoreEnrich(object)
object |
An |
A logical vector
Return a logical vector indicating whether a gene-set is significantly enriched or not, given the FDR threshold
isSigGeneSet(object, fdr = 0.05)
object |
A FisherResultList object |
fdr |
Numeric, FDR value threshold |
A logical vector
S3 generic for kendallW
kendallW(object, ...)
object |
An object |
... |
Other parameters |
A matrix with an info attribute, see kendallWmat.
Compute Kendall's W for an eSet object
## S3 method for class 'eSet'
kendallW(
  object,
  row.factor,
  summary = c("none", "mean", "median", "max.mean.sig", "max.var.sig"),
  na.rm = TRUE,
  alpha = 0.01,
  ...
)
object |
An |
row.factor |
A factor indicating groups of rows. In expression analysis, for instance, this can be GeneIDs indicating which probesets in rows belong to the same gene. |
summary |
Summary type, passed to |
na.rm |
Logical, whether |
alpha |
Numeric, passed to |
... |
Not used |
An ExpressionSet object with consolidated features.
Compute Kendall's W for a matrix
## S3 method for class 'matrix'
kendallW(
  object,
  row.factor,
  summary = c("none", "mean", "median", "max.mean.sig", "max.var.sig"),
  na.rm = TRUE,
  alpha = 0.01,
  ...
)
object |
A numeric matrix |
row.factor |
A factor indicating groups of rows. In expression analysis, for instance, this can be GeneIDs indicating which probesets in rows belong to the same gene. |
summary |
Summary type, passed to |
na.rm |
Logical, whether |
alpha |
Numeric, passed to |
... |
Not used |
A matrix with an info attribute, see kendallWmat.
S3 generic for kendallW information
kendallWinfo(object)
## S3 method for class 'matrix'
kendallWinfo(object)
object |
An object |
A data.frame containing grouping information.
kendallWinfo(matrix): Extract kendallW information from a matrix
S3 method to assign kendallW information to a matrix
## S3 replacement method for class 'matrix'
kendallWinfo(object) <- value
object |
matrix |
value |
assigned value |
The matrix containing grouping information
Kendall's W, also known as Kendall's coefficient of concordance, is a non-parametric statistic developed to assess agreement among raters used in psychological or similar experimental settings.
kendallWmat(
  mat,
  row.factor,
  summary = c("none", "mean", "median", "max.mean.sig", "max.var.sig"),
  na.rm = TRUE,
  alpha = 0.01
)
mat |
A numeric matrix. It must contain at least 2 rows and 2 columns. |
row.factor |
A factor indicating groups of rows. In expression analysis, for instance, this can be GeneIDs indicating which probesets in rows belong to the same gene. |
summary |
Character, action to take once the sub-groups have been
determined. ‘none’ indicates that no action is taken: the original
data are returned together with the sub-grouping information. The option
‘mean’ (or ‘median’) takes the mean/median of the features in each
sub-group as the result. In contrast, |
na.rm |
Logical, should those features whose |
alpha |
Numeric value, the significance level of the Kendall's W
statistic. The larger the value, the more deviations from strong
associations are allowed in sub-groups. Default is |
In computational biology, the concept of associating features with similar patterns while keeping outliers can be useful in many cases. See the Details section for examples.
This function implements Kendall's W recursively with graph theory. It splits grouped measurements into strongly associated sub-groups. See the Details section.
We take a microarray experiment as an example to demonstrate how the
function works. In microarrays, a gene is often represented by more than one
probeset, and it is not rare that they do not all show the same
expression pattern. Usually a one gene-one value relation is desired.
Common practices include choosing the probeset with the highest average
signal or the highest variance, as well as taking the mean/median value of
all probesets mapped to one gene as the representative value.
Kendall's W takes a very different approach. First it tries to judge whether multiple probesets of one gene are concordant. The concordance is determined by a non-parametric statistic closely related to the Spearman correlation coefficient as well as Friedman's test. If all probesets are concordant, their expression patterns are closely associated with each other, and any one of them, or their mean value, can then be used to represent the expression level of the gene.
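To make the statistic concrete, here is a minimal base R sketch that computes Kendall's W for a single group of rows, without correction for ties. The name `kendallWSketch` is hypothetical and not part of the package; grouping by `row.factor`, handling of missing values and significance testing are all left out.

```r
# Kendall's W without tie correction (illustrative sketch only).
# Rows are the "raters" (e.g. probesets of one gene),
# columns are the ranked items (e.g. samples).
kendallWSketch <- function(mat) {
  m <- nrow(mat)                      # number of raters
  n <- ncol(mat)                      # number of ranked items
  ranks <- t(apply(mat, 1, rank))     # rank the items within each rater
  rankSums <- colSums(ranks)          # total rank of each item
  S <- sum((rankSums - mean(rankSums))^2)
  12 * S / (m^2 * (n^3 - n))
}

# Three rows with identical rank order: perfect concordance
concordant <- rbind(c(1, 2, 3, 4),
                    c(2, 4, 6, 8),
                    c(0.1, 0.5, 0.7, 0.9))
kendallWSketch(concordant)   # 1

# Two rows with opposite rank order: no concordance
discordant <- rbind(c(1, 2, 3, 4),
                    c(4, 3, 2, 1))
kendallWSketch(discordant)   # 0
```

In the absence of ties, W is linked to the mean pairwise Spearman correlation r of the m rows by r = (mW - 1)/(m - 1), which is why the two-row discordant example with W = 0 corresponds to a Spearman correlation of -1.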
In cases where there is little concordance among probesets, we can make use
of graph theory to iteratively search for sub-groups of probesets that resemble
each other's expression patterns. In the extreme case, each probeset can be
different from the rest, and in this case the number of sub-groups will be
equal to the number of probesets mapped to the gene. Such cases can appear,
for instance, when each probeset was designed to target a different region
of a transcript with splice variants. By using Kendall's W statistic with
graph theory, the kendallWmat function can detect sub-groups with
strongly correlated expression patterns, while keeping outliers on their
own, therefore providing help for both conventional expression analysis and
post-hoc analysis with the help of sequence analysis. See the references for
examples of this application.
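The idea of splitting a group into concordant sub-groups while leaving outliers on their own can be illustrated with a deliberately simplified stand-in. The sketch below does not use the recursive Kendall's W test of `kendallWmat`. Instead it connects rows whose pairwise Spearman correlation reaches a hypothetical cutoff (`minCor`) and reports the connected components of the resulting graph. Both the function name and the cutoff are assumptions made for illustration.

```r
# Simplified stand-in for sub-group detection (NOT the SCOREM algorithm):
# rows are nodes, an edge links two rows whose Spearman correlation is
# at least `minCor`, and sub-groups are the connected components.
subgroupSketch <- function(mat, minCor = 0.8) {
  adj <- cor(t(mat), method = "spearman") >= minCor
  n <- nrow(mat)
  group <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(group[start])) next    # already assigned to a sub-group
    current <- current + 1L
    queue <- start
    while (length(queue) > 0) {       # breadth-first search
      node <- queue[1]
      queue <- queue[-1]
      if (!is.na(group[node])) next
      group[node] <- current
      queue <- c(queue, which(adj[node, ] & is.na(group)))
    }
  }
  group
}

# Four mock probesets of one gene: p1, p2 and p4 rise together, p3 falls
probes <- rbind(p1 = c(1, 2, 3, 4, 5),
                p2 = c(2, 3, 5, 7, 9),
                p3 = c(5, 4, 3, 2, 1),
                p4 = c(1.5, 2.5, 3.1, 4.2, 6))
groups <- subgroupSketch(probes)
groups   # 1 1 2 1: p3 stays on its own
```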
We believe this approach is not only useful for microarrays, but can also be
of interest for other applications such as next-generation sequencing (NGS) or
pathway/network analysis. For instance, in NGS experiments, this method can
help to determine which splice variants of a transcript have similar
expression patterns, and how different the other variants are. In pathway
analysis, when rows indicate gene expression values and row.factor
indicates pathway membership, the result reveals which sub-networks are
regulated associatively.
Currently a matrix with one attribute slot named info.
Jitao David Zhang <[email protected]>
The concept of Kendall's W was introduced in the seminal paper The problem of m rankings by M.G. Kendall and B.B. Smith (The Annals of Mathematical Statistics, 1939). Schneider, Smith and Hansen developed the SCOREM algorithm combining this statistic with graph theory (SCOREM: statistical consolidation of redundant expression measures, Nucleic Acids Research, 2011). This implementation is largely based on the SCOREM algorithm. The main changes are: (1) the current implementation is more generic and applicable to native R data structures, and can therefore be applied in scenarios other than microarray analysis; (2) it also takes non-annotated features into account; and (3) it can directly calculate summary statistics from sub-groups.
## use a mock example
emat <- matrix(c(2,3,5, 8,9,2, 3,4,7, 0,2,1, NA, 3, 1.2, 5, -3,4, 5,7,11),
               ncol=3, byrow=TRUE,
               dimnames=list(paste("row", 1:7, sep=""),NULL))
efac <- factor(c("a", "b", "c", NA, "b", "a", "a"), levels=letters[1:5])
print(emat)
kendallWmat(emat, efac, summary="none")
kendallWmat(emat, efac, summary="none", na.rm=FALSE)
kendallWmat(emat, efac, summary="mean")
kendallWmat(emat, efac, summary="mean", na.rm=FALSE)
kendallWmat(emat, efac, summary="median")
kendallWmat(emat, efac, summary="median", na.rm=FALSE)
kendallWmat(emat, efac, summary="max.mean.sig")
kendallWmat(emat, efac, summary="max.mean.sig", na.rm=FALSE)
kendallWmat(emat, efac, summary="max.var.sig")
kendallWmat(emat, efac, summary="max.var.sig", na.rm=TRUE)
## kendallW acts as an interface to matrix
kendallW(emat, efac, summary="none")
## kendallW acts as an interface to ExpressionSet
data(ribios.ExpressionSet, package="ribiosExpression")
kendallW(ribios.ExpressionSet, Biobase::fData(ribios.ExpressionSet)$GeneID, summary="none")
kendallW(ribios.ExpressionSet, Biobase::fData(ribios.ExpressionSet)$GeneID, summary="mean")
Cluster gene-sets by enrichment profiles with k-means clustering, and select representative gene-sets by gene-set composition
kmeansGeneset(
  enrichProfMatrix,
  genesetGenes,
  optK = pmin(25, floor(nrow(enrichProfMatrix)/2)),
  iter.max = 15,
  nstart = 50,
  thrCumJaccardIndex = 0.5,
  maxRepPerCluster = 10,
  metaClusterColumns = 1:ncol(enrichProfMatrix)
)
enrichProfMatrix |
A numeric matrix representing gene-set enrichment profiles. Each row represents one gene-set and each column represents one enrichment profile, for instance a contrast in differential gene expression analysis. The values of the matrix represent enrichment of gene-sets; for instance, enrichment scores or absolute log10-transformed p-values can be used. The row names are gene-set names. |
genesetGenes |
A list of character strings, each element being genes of a gene-set in the |
optK |
Integer, the number of initial clusters of gene-sets. Because one or more gene-sets may be selected from each gene-set cluster, the number of finally selected gene-sets is equal to or larger than |
iter.max |
Integer, the maximum numbers of iterations allowed. This parameter is passed to |
nstart |
Integer, how many random sets should be chosen to initialize cluster centers. This parameter is passed to |
thrCumJaccardIndex |
Numeric, between 0 and 1, the threshold of cumulative Jaccard Index. The larger the value is, the more gene-sets will be selected from each cluster |
maxRepPerCluster |
Integer, maximum number of representative genesets per cluster. If NULL or NA, no limit is set. |
metaClusterColumns |
Columns used to cluster the clusters by their average enrichment profile. By default, all columns are used. This function performs The geneset clusters are ordered by their average profiles - similar clusters are near to each other. |
A list:
kmeans Result object returned by kmeans.
genesetClusterData A data.frame with following columns: GenesetCluster, GenesetInd, GenesetName, JaccardIndex, CumJaccardIndex, IsRepresentative.
repGenesets Character vector, gene-set names that are selected as representative gene-sets from each gene-set cluster.
gsCompOverlapSelInd Factor vector, indicating the gene-set clusters represented by each representative gene-set.
set.seed(1887)
profMat <- matrix(rnorm(100), nrow=20,
                  dimnames=list(sprintf("geneset%d", 1:20), sprintf("contrast%d", 1:5)))
gsGenes <- lapply(1:nrow(profMat), function(x) unique(sample(LETTERS, 10, replace=TRUE)))
names(gsGenes) <- rownames(profMat)
kmeansGeneset(profMat, gsGenes, optK=5)
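The two ingredients named in the description, k-means clustering of enrichment profiles and the Jaccard index of gene-set composition, can each be sketched in base R. How `kmeansGeneset` accumulates the Jaccard index within a cluster is not specified here, so the sketch stops at the pairwise index. `jaccardIndex` is a hypothetical helper, not a function of the package.

```r
# Jaccard index of two gene sets: shared genes over all distinct genes
jaccardIndex <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}
jaccardIndex(c("A", "B", "C"), c("B", "C", "D"))   # 2 shared / 4 distinct = 0.5

# k-means on a mock enrichment profile matrix, using the same iter.max
# and nstart values as the defaults listed in the usage
set.seed(1887)
mockProfiles <- matrix(rnorm(100), nrow = 20,
                       dimnames = list(sprintf("geneset%d", 1:20),
                                       sprintf("contrast%d", 1:5)))
km <- kmeans(mockProfiles, centers = 5, iter.max = 15, nstart = 50)
# Cluster assignment of each gene-set, split by cluster
split(names(km$cluster), km$cluster)
```

A representative-selection step would then compare the gene-sets inside each cluster by their gene composition, for which the pairwise index above is the basic building block.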
The first level of the list must contain vectors of basic R data types such as
character, integer, numeric, and logical. The
function transforms such a list into an adjacency matrix, the rows of which are
vector elements and the columns of which are names of the list.
list2mat(list)
list |
A one-level list. See details |
An adjacency matrix. Row and column names are defined by unique elements and list names, respectively.
Jitao David Zhang <[email protected]>
testList <- list(HSV=c("Adler", "Westermann", "Jansen"),
                 FCB=c("Robben", "Jansen", "Neuer"),
                 S04=c("Westermann", "Neuer"))
list2mat(testList)
testList2 <- list(c("A", "B", "C"), c("B", "C", "D"), c("D", "E", "F"))
list2mat(testList2)
testList3 <- list(Worker1=0:8L, Worker2=5:13L, Worker3=8:16L, Worker4=16:24L)
list2mat(testList3)
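The transformation from a one-level list to an element-by-name matrix can be sketched in a few lines of base R. This is an illustration only: `list2matSketch` is a hypothetical name, and encoding membership as 0/1 integers is an assumption, since the cell values produced by `list2mat` are not stated above.

```r
# Rows: unique elements across all vectors; columns: list items.
# A cell is 1 if the element occurs in that list item, otherwise 0.
list2matSketch <- function(lst) {
  elements <- unique(unlist(lst))
  mat <- matrix(0L, nrow = length(elements), ncol = length(lst),
                dimnames = list(as.character(elements), names(lst)))
  for (i in seq_along(lst)) {
    mat[elements %in% lst[[i]], i] <- 1L
  }
  mat
}

clubs <- list(HSV = c("Adler", "Westermann", "Jansen"),
              FCB = c("Robben", "Jansen", "Neuer"),
              S04 = c("Westermann", "Neuer"))
clubMat <- list2matSketch(clubs)
clubMat["Jansen", ]   # HSV = 1, FCB = 1, S04 = 0
```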
Perform the GAGE analysis for EdgeResult and GmtList
logFCgage(edgeResult, gmtList)
edgeResult |
An |
gmtList |
A |
A data.frame containing enrichment analysis results.
Merge CAMERA results using limma default parameters and biosCamera parameters
mergeCameraResults(
  matrix,
  index,
  designMatrix,
  contrast,
  featureLabels,
  weights = NULL,
  use.ranks = FALSE
)
matrix |
A numeric matrix, passed to |
index |
An index vector or a list of index vectors of features. |
designMatrix |
Design matrix. |
contrast |
A numeric vector of the same length as the number of columns in the design matrix, coefficients of contrasts. |
featureLabels |
A character vector of the same length as the number of rows of the matrix, feature labels, for instance gene symbols. |
weights |
NULL or numeric matrix of precision weights, passed to |
use.ranks |
Logical, passed to The function merges the output of |
A data.frame containing merged CAMERA results.
y <- matrix(rnorm(1000*6),1000,6)
features <- sprintf("Feature%d", 1:nrow(y))
design <- cbind(Intercept=1,Group=c(0,0,0,1,1,1))
# First set of 20 genes are genuinely differentially expressed
index1 <- 1:20
y[index1,4:6] <- y[index1,4:6]+1
# The second set of 20 genes are not
index2 <- 21:40
index1Res <- mergeCameraResults(y, index=index1, designMatrix=design,
                                contrast=c(0,1), featureLabels=features)
index1ListRes <- mergeCameraResults(y, index=list(index1), designMatrix=design,
                                    contrast=c(0,1), featureLabels=features)
index12ListRes <- mergeCameraResults(y, index=list(index1, index2), designMatrix=design,
                                     contrast=c(0,1), featureLabels=features)
Return the minimal FDR value from a FisherResultList
minFDRvalue(object)
object |
A FisherResultList object |
A numeric value
Return the minimal p-value from a FisherResultList
minPvalue(object)
object |
A FisherResultList object |
A numeric value
Wrap the gage::gage method to report consistent results as the CAMERA method
myGage(logFC, gmtList, ...)
logFC |
A named vector of logFC values of genes |
gmtList |
A |
... |
Other parameters passed to |
A data.frame containing enrichment analysis results.
Order strings by numbers in them
orderByNumberInStr(str, ...)
str |
A vector of character strings |
... |
Passed to |
An integer vector of indices.
factorByNumberInStr, which makes factors with levels
ordered by numbers in the string
orderByNumberInStr(c("D1", "D10", "D15", "D3.5"))
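Ordering strings by the number they contain differs from plain alphabetical ordering, where "D10" sorts before "D3.5". The base R sketch below shows one way to obtain such an ordering. It is not the package's implementation: `orderByNumberSketch` is a hypothetical name, and using only the first number found in each string is an assumption.

```r
# Extract the first (possibly decimal) number in each string and order by it.
# Strings without any number get NA and are placed last by order().
orderByNumberSketch <- function(str) {
  hit <- regexpr("[0-9]+(\\.[0-9]+)?", str)
  num <- rep(NA_real_, length(str))
  num[hit > 0] <- as.numeric(regmatches(str, hit))
  order(num)
}

doses <- c("D1", "D10", "D15", "D3.5")
doseOrder <- orderByNumberSketch(doses)
doseOrder          # 1 4 2 3
doses[doseOrder]   # "D1" "D3.5" "D10" "D15"
```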
Parse contributing genes by genesets from the result data.frame of the CAMERA method
parseCameraContributingGenes(cameraResTbl, genesets)
cameraResTbl |
A |
genesets |
Character strings, geneset labels |
A list of gene symbols, indexed by geneset names that are found in the results.
Parse contributing genes from the CAMERA output file
parseContributingGenes(str)
str |
Character string, containing contributing genes |
A list of data.frames, each containing two columns,
Gene and Stat
parseContributingGenes("AKR1C4(-1.25), AKR1D1(-1.11)")
parseContributingGenes(c("AKR1C4(-1.25), AKR1D1(-1.11)",
                         "AKT1(1.24), AKT2(1.11), AKT3(1.05)"))
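The input format used in the examples, gene symbols followed by a statistic in parentheses and separated by commas, can be parsed with base R string functions. The following sketch is an assumption about how such parsing could work, not the package's code; `parseOneSketch` is a hypothetical helper.

```r
# Parse one string such as "AKR1C4(-1.25), AKR1D1(-1.11)" into a
# data.frame with the columns Gene and Stat.
parseOneSketch <- function(s) {
  items <- trimws(strsplit(s, ",")[[1]])
  data.frame(Gene = sub("\\(.*$", "", items),                        # text before "("
             Stat = as.numeric(sub("^.*\\((.*)\\)$", "\\1", items)), # value inside "()"
             stringsAsFactors = FALSE)
}

parsed <- lapply(c("AKR1C4(-1.25), AKR1D1(-1.11)",
                   "AKT1(1.24), AKT2(1.11), AKT3(1.05)"),
                 parseOneSketch)
parsed[[1]]
```

Applying the helper with `lapply` yields one data.frame per input string, which mirrors the list structure described in the return value above.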
Parse contributing genes by genesets
parseGenesetsContributingGenes(str, genesets)
str |
Character strings, containing contributing genes |
genesets |
Character strings, geneset labels. Its length must match the
length of |
A data.frame containing genesets, genes, and statistics
parseGenesetsContributingGenes("AKR1C4(-1.25), AKR1D1(-1.11)", "Metabolism")
parseGenesetsContributingGenes(c("AKR1C4(-1.25), AKR1D1(-1.11)",
                                 "AKT1(1.24), AKT2(1.11), AKT3(1.05)"),
                               c("Metabolism", "AKTs"))
Parse an output directory of the Broad GSEA tool
parseGSEAdir(dir)
dir |
Character string, path to output directory |
An AnnoBroadGseaRes object
Pretty RONET Gene-set Names
prettyRonetGenesetNames(x, nchar = 50)
x |
Character strings, RONET gene-set names |
nchar |
Integer, number of characters to be displayed. |
Character strings
strs <- c("ARNT_GeneID405_negativeTargets",
          "Neurophysiological_process_nNOS_signaling_in_neuronal_synapses",
          "NR5A1_GeneID2516_allTargets",
          "IL4_GeneID3565_negativeTargets",
          "Apoptosis_REACTOME")
prettyRonetGenesetNames(strs)
Print a FisherResult object
## S3 method for class 'FisherResult'
print(x, ...)
x |
A FisherResult object |
... |
Not used |
x, invisibly.
Print a FisherResultList object
## S3 method for class 'FisherResultList'
print(x, ...)
x |
A |
... |
Not used |
x, invisibly.
Print S3 object FishersMethodResult
## S3 method for class 'FishersMethodResult'
print(x, ...)
x |
An object of the |
... |
Not used |
x, invisibly.
Print contributing genes
printContributingGenes(geneLabels, geneValues)
geneLabels |
A vector of character strings |
geneValues |
A vector of numeric values |
A vector of character strings
Return P-values
pValue(object, ...)
## S4 method for signature 'FisherResult'
pValue(object)
## S4 method for signature 'FisherResultList'
pValue(object, ind, ...)
## S4 method for signature 'FisherResult'
fdrValue(object)
## S4 method for signature 'FisherResultList'
fdrValue(object, ind, ...)
object |
An object |
... |
Other parameters |
ind |
An integer or logical vector for subsetting |
A numeric vector of p-values.
pValue(FisherResult): Return the p-value from a FisherResult
pValue(FisherResultList): Return the p-values from a FisherResultList. If ind
is missing, all p-values are returned; otherwise, the subset indicated by
ind is returned.
fdrValue(FisherResult): Return the FDR-value from a FisherResult
fdrValue(FisherResultList): Return the FDR-values from a FisherResultList. If ind
is missing, all FDR-values are returned; otherwise, the subset indicated by
ind is returned.
Read CAMERA results into a tibble object
readCameraResults(file, minNGenes = 3, maxNGenes = 1000)
file |
CAMERA results file |
minNGenes |
NULL or integer, genesets with fewer genes are filtered out |
maxNGenes |
NULL or integer, genesets with more genes are filtered out |
A tibble containing the CAMERA results.
In Roche Bioinformatics we use a default collection of gene-sets for gene-set enrichment analysis. This function loads this collection.
readDefaultGenesets(path, mps = FALSE)
path |
Character, path to the directory where the gmt files are stored |
mps |
Logical, whether molecular-phenotypic screening (MPS) genesets should be read in as pathway-centric namespaces ( |
The default collection includes both publicly available genesets as well as proprietary genesets, and therefore they are not included as part of the ribios package.
Publicly available genesets include
MSigDB: collections C2, C7 and Hallmark
RONET: which is a collection of publicly available pathway databases including REACTOME and NCI-Nature
goslim
A GmtList object containing the default gene-set collections.
## Not run:
## this cannot be run because the files are not located there
## readDefaultGenesets("/tmp/defaultGmts")
## End(Not run)
Read molecular-phenotyping genesets
readMPSGmt(file)
file |
GMT file which stores default molecular-phenotyping genesets |
A GmtList object containing molecular-phenotypic screening (MPS) categories and genes
Read RONET GMT files with namespace information
readRonetGmt(file)
file |
A GMT file in the RONET format, where in the 'desc' field a namespace is appended at the beginning, separated from the rest of the description with a pipe |
A GmtList object with an additional 'namespace' item in each list
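The RONET convention described above, a namespace placed at the beginning of the 'desc' field and separated from the rest by a pipe, can be illustrated with a short base R sketch. The helper name `splitRonetDescSketch` and the example descriptions are made up for illustration, and returning NA when no pipe is present is an assumption rather than documented behaviour.

```r
# Split a RONET-style description into namespace and remaining description.
# Only the first pipe is treated as the separator.
splitRonetDescSketch <- function(desc) {
  hasPipe <- grepl("|", desc, fixed = TRUE)
  data.frame(
    namespace = ifelse(hasPipe, sub("\\|.*$", "", desc), NA_character_),
    description = ifelse(hasPipe, sub("^[^|]*\\|", "", desc), desc),
    stringsAsFactors = FALSE
  )
}

ronetDesc <- splitRonetDescSketch(c("REACTOME|Apoptosis",
                                    "A description without namespace"))
ronetDesc
```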
Read significant CAMERA results into a tibble
readSigCameraResults(
  file,
  returnAllContrasts = TRUE,
  maxPValue = 0.01,
  minAbsEffectSize = 0.5,
  minNGenes = 5,
  maxNGenes = 200,
  excludeNamespace = c("goslim", "immunespace", "immunomics", "mbdisease", "mbpathology", "mbtoxicity", "msigdbC7", "msigdbC2", "MolecularPhenotyping")
)
file |
A tsv file, output of |
returnAllContrasts |
Logical, if TRUE, results of all contrasts for gene-sets that are significant in at least one contrast are returned. |
maxPValue |
Numeric, max unadjusted P-value of CAMERA that is considered significant |
minAbsEffectSize |
Numeric, minimal absolute effect size |
minNGenes |
Integer, size of the smallest gene set that is considered |
maxNGenes |
Integer, size of the largest gene set that is considered |
excludeNamespace |
Character, vector of namespaces to be excluded |
A tibble containing filtered CAMERA results.
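The filtering logic implied by the arguments can be sketched on a plain data.frame. The column names (`GeneSet`, `Contrast`, `Namespace`, `PValue`, `EffectSize`, `NGenes`) and the inclusive comparisons are assumptions made for this illustration. The real function reads a file and returns a tibble.

```r
# Keep gene-sets passing all thresholds. With returnAllContrasts = TRUE,
# return every contrast of a gene-set that is significant at least once.
filterSigSketch <- function(tbl, returnAllContrasts = TRUE, maxPValue = 0.01,
                            minAbsEffectSize = 0.5, minNGenes = 5,
                            maxNGenes = 200, excludeNamespace = character(0)) {
  isSig <- tbl$PValue <= maxPValue &
    abs(tbl$EffectSize) >= minAbsEffectSize &
    tbl$NGenes >= minNGenes & tbl$NGenes <= maxNGenes &
    !tbl$Namespace %in% excludeNamespace
  if (returnAllContrasts) {
    tbl[tbl$GeneSet %in% tbl$GeneSet[isSig], , drop = FALSE]
  } else {
    tbl[isSig, , drop = FALSE]
  }
}

# Mock results: three gene-sets, two contrasts each
demo <- data.frame(GeneSet = c("A", "A", "B", "B", "C", "C"),
                   Contrast = rep(c("c1", "c2"), 3),
                   Namespace = rep(c("ronet", "ronet", "goslim"), each = 2),
                   PValue = c(0.001, 0.5, 0.2, 0.3, 0.001, 0.001),
                   EffectSize = c(1.2, 0.1, 0.9, -0.8, 2, 2),
                   NGenes = c(20, 20, 50, 50, 30, 30),
                   stringsAsFactors = FALSE)
allRes <- filterSigSketch(demo, excludeNamespace = "goslim")
sigOnly <- filterSigSketch(demo, returnAllContrasts = FALSE,
                           excludeNamespace = "goslim")
```

In this mock data, gene-set A is significant only in contrast c1, so `allRes` holds both contrasts of A while `sigOnly` holds the single significant row. Gene-set C is dropped because its namespace is excluded.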
Read significant CAMERA results into a matrix
readSigCameraScoreMatrix(file, ...)
file |
A tsv file, output of |
... |
passed to |
A numeric matrix with gene-sets as rows and contrasts as columns.
Remove one or more gene sets of the same source and user from GeMS
removeFromGeMS(
  setName = "",
  source = "",
  user = ribiosUtils::whoami(),
  subtype = ""
)
setName |
A vector of character strings, defining set names to be removed. They must all have the same |
source |
Character string, source of the gene set(s) |
user |
Character string, user name |
subtype |
Character string, subtype of the gene set(s) |
Response code or error message returned by the GeMS API. A value of 200 indicates a successful removal.
## Not run:
testList <- list(list(name="GS_A", desc=NULL, genes=c("MAPK14", "JAK1", "EGFR")),
                 list(name="GS_B", desc="gene set B", genes=c("ABCA1", "DDR1", "DDR2")),
                 list(name="GS_C", desc="gene set C", genes=NULL))
testGmt <- BioQC::GmtList(testList)
## insertGmtListToGeMS(testGmt, geneFormat=0, source="Test")
## removeFromGeMS(setName=c("GS_A", "GS_B", "GS_C"), source="Test")
## End(Not run)
Message body to remove one or more gene sets of the same source and user from GeMS
removeFromGeMSBody(
  setName = "",
  source = "",
  user = ribiosUtils::whoami(),
  subtype = ""
)
setName |
A vector of character strings, defining set names to be removed. They must all have the same |
source |
Character string, source of the gene set(s) |
user |
Character string, user name |
subtype |
Character string, subtype of the gene set(s) |
A list of genesets to be removed, to be sent as message body
removeFromGeMSBody(setName=c("GS_A", "GS_B", "GS_C"), source="Test")
Extract gene-set namespace from RONET GMT files
ronetGeneSetNamespace(gmtList)
gmtList |
A |
Character vector of the same length, indicating categories
Show an AnnoBroadGseaRes object
## S4 method for signature 'AnnoBroadGseaRes'
show(object)
object |
An AnnoBroadGseaRes object |
Show an AnnoBroadGseaResItem object
## S4 method for signature 'AnnoBroadGseaResItem'
show(object)
object |
An AnnoBroadGseaResItem object |
Show a BroadGseaResItem object
## S4 method for signature 'BroadGseaResItem'
show(object)
object |
A BroadGseaResItem object |
Return names of gene-sets that are significantly enriched given the FDR threshold
sigGeneSet(object, fdr)
object |
A FisherResultList object |
fdr |
Numeric, FDR value threshold |
A character vector
Return a data.frame of significantly enriched gene-sets
sigGeneSetTable(object, fdr)
object |
A FisherResultList object |
fdr |
Numeric, FDR value threshold |
A data.frame
Return a data.frame of top gene-sets with the lowest p-values
topGeneSetTable(object, N)
object |
A FisherResultList object |
N |
Integer, the number of returned gene-sets |
A data.frame
Return a data.frame of significantly enriched gene-sets with a minimum number
topOrSigGeneSetTable(object, fdr = 0.05, N = 10)
object |
A FisherResultList object |
fdr |
Numeric, the threshold of FDR value |
N |
Integer, the number of returned gene-sets.
The total number of returned gene-sets is determined by the maximum of
|
A data.frame.
Write a GmtList object into a file
writeGmt(gmtList, file)
gmtList |
A |
file |
Character string, output file name |
Invisibly returns NULL. Called for its side effect of writing
the GMT file.
The function will be moved to BioQC once ribiosIO is deposited on CRAN
gmtFile <- system.file("extdata", "example.gmt", package="ribiosGSEA")
mySet <- BioQC::readGmt(gmtFile)[1:5]
myTempFile <- tempfile()
writeGmt(mySet, file=myTempFile)
readLines(myTempFile)
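For readers unfamiliar with the GMT format: each line holds one gene-set as tab-separated fields, with the name first, a description second, and the genes after that. The base R sketch below builds such lines from a plain list with `name`, `desc` and `genes` items. It is not the code of `writeGmt`; the helper name `gmtLinesSketch` is hypothetical, and writing an empty string for a missing description is an assumption.

```r
# Turn a list of gene-sets into GMT-formatted lines:
# name <TAB> description <TAB> gene1 <TAB> gene2 ...
gmtLinesSketch <- function(genesets) {
  vapply(genesets, function(gs) {
    desc <- if (is.null(gs$desc)) "" else gs$desc   # assumption: empty if missing
    paste(c(gs$name, desc, gs$genes), collapse = "\t")
  }, character(1))
}

mockSets <- list(list(name = "GS_A", desc = NULL, genes = c("MAPK14", "JAK1", "EGFR")),
                 list(name = "GS_B", desc = "gene set B", genes = c("ABCA1", "DDR1", "DDR2")),
                 list(name = "GS_C", desc = "gene set C", genes = NULL))
gmtLines <- gmtLinesSketch(mockSets)

# Write to a temporary file and read it back
mockFile <- tempfile(fileext = ".gmt")
writeLines(gmtLines, mockFile)
readLines(mockFile)
```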
Calculate mid-p quantile residuals
zscoreDGE(y, design = NULL, contrast = ncol(design))
y |
A DGEList object |
design |
Design matrix |
contrast |
Contrast vector. The function is a carbon copy of edgeR:::.zscoreDGE, which is unfortunately not exported |
A numeric matrix of z-scores with the same dimensions as the input count matrix.
dgeMatrix <- matrix(rpois(1200, 10), nrow=200)
dgeList <- edgeR::DGEList(dgeMatrix)
dgeList <- edgeR::estimateCommonDisp(dgeList)
dgeDesign <- model.matrix(~gl(2,3))
dgeZscore <- zscoreDGE(dgeList, dgeDesign, contrast=c(0,1))
head(dgeZscore)
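The notion of a mid-p quantile residual can be shown for a single negative binomial count without edgeR. For an observed count y, the mid-p value is P(Y < y) + 0.5 P(Y = y), and the residual is the standard normal quantile of that value. The sketch below is a textbook-style illustration under the assumption that the fitted mean and dispersion are known. `midpZscoreSketch` is a hypothetical name, and edgeR's internal implementation may differ in numerical details.

```r
# Mid-p quantile residuals for negative binomial counts.
# y: observed counts; mu: fitted mean; dispersion: NB dispersion (size = 1/dispersion)
midpZscoreSketch <- function(y, mu, dispersion) {
  size <- 1 / dispersion
  pLower <- pnbinom(y - 1, size = size, mu = mu)   # P(Y < y)
  pPoint <- dnbinom(y, size = size, mu = mu)       # P(Y = y)
  qnorm(pLower + 0.5 * pPoint)
}

mockCounts <- 0:30
mockZ <- midpZscoreSketch(mockCounts, mu = 10, dispersion = 0.1)
# Counts below the fitted mean give negative z-scores, counts far above it positive ones
round(mockZ[c(1, 31)], 2)
```

Using half of the point probability avoids the residual being pushed to an extreme at the boundaries of the discrete distribution, which is the usual motivation for the mid-p variant.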