| Title: | Differential gene expression analysis for next-generation sequencing data in the 'ribios' suite |
|---|---|
| Description: | Provides a comprehensive workflow for differential gene expression analysis of RNA-seq data using 'edgeR' and 'limma'-voom pipelines. The package supports count filtering, normalization, surrogate variable analysis, batch correction, and visualization including volcano plots, smear plots, and PCA. It integrates with the 'ribios' suite of packages and is designed for reproducible bioinformatics analyses. |
| Authors: | Jitao David Zhang [aut, cre, ctb] (ORCID: <https://orcid.org/0000-0002-3085-0909>), Ailu Mading [ctb] |
| Maintainer: | Jitao David Zhang <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.0 |
| Built: | 2026-05-16 08:40:19 UTC |
| Source: | https://github.com/bedapub/ribiosNGS |
Annotate an EdgeObject
## S4 method for signature 'EdgeObject,character,logical' annotate(object, target, check.target)## S4 method for signature 'EdgeObject,character,logical' annotate(object, target, check.target)
object |
An EdgeObject. |
target |
Character, target of annotation. |
check.target |
Logical, check whether the target is valid or not. |
An annotated EdgeObject.
Annotate an EdgeObject, without checking the target
## S4 method for signature 'EdgeObject,character,missing' annotate(object, target)## S4 method for signature 'EdgeObject,character,missing' annotate(object, target)
object |
An EdgeObject |
target |
Character, target of annotation |
An annotated EdgeObject.
Annotate an EdgeObject automatically without checking the target
## S4 method for signature 'EdgeObject,missing,missing' annotate(object)## S4 method for signature 'EdgeObject,missing,missing' annotate(object)
object |
An EdgeObject |
An annotated EdgeObject.
Get annotation information from an EdgeObject
## S4 method for signature 'EdgeObject' annotation(object)## S4 method for signature 'EdgeObject' annotation(object)
object |
An |
A character string, indicating the annotation type
Append degree of freedom and pseudo t-statistics to dgeTable
appendPseudoT(edgeResult, dgeTable)appendPseudoT(edgeResult, dgeTable)
edgeResult |
An |
dgeTable |
A |
A new data.frame with two new columns, df and pseudoT, containing degree of freem and the (pseudo) t-statistic, respective.y
The function relies on the fact that the degree of freedom ('df.residual' in GLM result) is the same for all genes.
If this is not the case, it will use the 'FeatureName' column in gene annotation to match the degree of freedoms.
The function is only used internally by dgeTablesWithPseudoT and dgeTableWithPseudoT.
the lower.tail option in qt is not vectorized, therefore the sign should be provided separately.
Append ranks to dgeTbl
appendRanks(dgeTbl)appendRanks(dgeTbl)
dgeTbl |
A dgeTbl, a data.frame that at least contain following columns
|
The dgeTble with four columns appended: rank_logFC,
rank_PValue, rank_absLogFC, and total_features, and
sorted by increasing P-values
myTbl <- data.frame(GeneSymbol=LETTERS[1:5], logFC=rnorm(5), PValue=runif(5, 0, 1)) appendRanks(myTbl)myTbl <- data.frame(GeneSymbol=LETTERS[1:5], logFC=rnorm(5), PValue=runif(5, 0, 1)) appendRanks(myTbl)
Append zscores to dgeTable
appendZScore(dgeTable)appendZScore(dgeTable)
dgeTable |
A |
A new data.frame with one new column, zScore, containing the z-score transformed from p-values. If that column exists, it will be rewritten.
The function is similar to appendPseudoT, with the difference that the Gaussian distribution is used instead of the t-distribution. This solution
delivers less extreme values, because t-distribution is heavy-tailed. It allows comparison between studies of different sample sizes.
Assert that the input data.frame is a valid EdgeTopTable
assertEdgeToptable(x)assertEdgeToptable(x)
x |
A data.frame |
Logical
Get aveExpr threshold in LimmaSigFilter
aveExpr(limmaSigFilter)aveExpr(limmaSigFilter)
limmaSigFilter |
An |
Numeric values of the logCPM filter
Return a data.frame of BCV values
BCV(x) ## S4 method for signature 'DGEList' BCV(x) ## S4 method for signature 'EdgeResult' BCV(x)BCV(x) ## S4 method for signature 'DGEList' BCV(x) ## S4 method for signature 'EdgeResult' BCV(x)
x |
An object |
A data.frame of BCV values.
BCV(DGEList): Method for DGEList
BCV(EdgeResult): Method for EdgeResult
Boxplot of an EdgeObject
## S4 method for signature 'EdgeObject' boxplot( x, type = c("normFactors", "modLogCPM"), xlab = "", ylab = NULL, main = "", ... )## S4 method for signature 'EdgeObject' boxplot( x, type = c("normFactors", "modLogCPM"), xlab = "", ylab = NULL, main = "", ... )
x |
An EdgeObject |
type |
The type of boxplot: 'normFactors' and 'modLogCPM' are supported. |
xlab |
Character, xlab. |
ylab |
Character, ylab. |
main |
Character, title. |
... |
Passed to |
Called for its side effect of plotting; returns invisibly NULL.
Calculate normalisation factor if not
calcNormFactorsIfNot(dgeList)calcNormFactorsIfNot(dgeList)
dgeList |
A |
Updated dgeList object with norm.factors filled
Check sample annotation data.frame or tibble meets the requirement of the biokit pipeline
checkBiokitSampleAnnotation(df)checkBiokitSampleAnnotation(df)
df |
Sample annotation, can be either a |
Invisible NULL if the requirement is met, otherwise an error
is printed and the function stops
The biokit pipeline requires that values in each column contain no empty spaces. This function ensures that.
writeBiokitSampleAnnotation, which calls this
function to ensure that the sample annotation file is ok
test <- data.frame(Char=LETTERS[1:6], Integer=1:6, Number=pi*1:6, Factor=gl(2,3, labels = c("level 1", "level 2")), stringsAsFactors=FALSE) testthat::expect_error(checkBiokitSampleAnnotation(test), regexp = "level 1.*level 2") testFix <- data.frame(Char=LETTERS[1:6], Integer=1:6, Number=pi*1:6, Factor=gl(2,3, labels = c("level1", "level2")), stringsAsFactors=FALSE) testthat::expect_silent(checkBiokitSampleAnnotation(testFix)) if(require("tibble")) { testthat::expect_error(checkBiokitSampleAnnotation(tibble::as_tibble(test)), regexp = "level 1.*level 2") testthat::expect_silent(checkBiokitSampleAnnotation(tibble::as_tibble(testFix))) }test <- data.frame(Char=LETTERS[1:6], Integer=1:6, Number=pi*1:6, Factor=gl(2,3, labels = c("level 1", "level 2")), stringsAsFactors=FALSE) testthat::expect_error(checkBiokitSampleAnnotation(test), regexp = "level 1.*level 2") testFix <- data.frame(Char=LETTERS[1:6], Integer=1:6, Number=pi*1:6, Factor=gl(2,3, labels = c("level1", "level2")), stringsAsFactors=FALSE) testthat::expect_silent(checkBiokitSampleAnnotation(testFix)) if(require("tibble")) { testthat::expect_error(checkBiokitSampleAnnotation(tibble::as_tibble(test)), regexp = "level 1.*level 2") testthat::expect_silent(checkBiokitSampleAnnotation(tibble::as_tibble(testFix))) }
Check a contrast matrix to make sure that it is likely o.k.
checkContrastNames(contrastMatrix, action = c("message", "warning", "error"))checkContrastNames(contrastMatrix, action = c("message", "warning", "error"))
contrastMatrix |
A contrast matrix |
action |
Character strings, the action to perform in case the names show irregularities Right now, the function checks no column names contain the equal sign. |
Invisibly returns NULL. Called for its side effect of issuing
a message, warning, or error if column names contain equal signs.
testDesign <- cbind(Control=rep(1,8), Treatment=rep(c(0,1),4), Batch=rep(c(0, 1), each=4)) problemContrast <- limma::makeContrasts("Treatment"="Treatment", "Batch=Batch", ## problematic levels=testDesign) checkContrastNames(problemContrast, action="message") if(requireNamespace("testthat")) { testthat::expect_warning(checkContrastNames(problemContrast, action="warning")) testthat::expect_error(checkContrastNames(problemContrast, action="error")) }testDesign <- cbind(Control=rep(1,8), Treatment=rep(c(0,1),4), Batch=rep(c(0, 1), each=4)) problemContrast <- limma::makeContrasts("Treatment"="Treatment", "Batch=Batch", ## problematic levels=testDesign) checkContrastNames(problemContrast, action="message") if(requireNamespace("testthat")) { testthat::expect_warning(checkContrastNames(problemContrast, action="warning")) testthat::expect_error(checkContrastNames(problemContrast, action="error")) }
Common biological coefficients of variance (BCV)
commonBCV(x) ## S4 method for signature 'DGEList' commonBCV(x) ## S4 method for signature 'EdgeResult' commonBCV(x)commonBCV(x) ## S4 method for signature 'DGEList' commonBCV(x) ## S4 method for signature 'EdgeResult' commonBCV(x)
x |
An object |
A numeric value of the common BCV.
commonBCV(DGEList): method for DGEList
commonBCV(EdgeResult): method for EdgeResult
Common dispersion
commonDisp(object) ## S4 method for signature 'DGEList' commonDisp(object) ## S4 method for signature 'EdgeObject' commonDisp(object)commonDisp(object) ## S4 method for signature 'DGEList' commonDisp(object) ## S4 method for signature 'EdgeObject' commonDisp(object)
object |
An object |
A numeric value of the common dispersion.
commonDisp(DGEList): Method for DGEList
commonDisp(EdgeObject): Method for EdgeObject
Set common dispersion
commonDisp(object) <- value ## S4 replacement method for signature 'DGEList,numeric' commonDisp(object) <- value ## S4 replacement method for signature 'EdgeObject,numeric' commonDisp(object) <- valuecommonDisp(object) <- value ## S4 replacement method for signature 'DGEList,numeric' commonDisp(object) <- value ## S4 replacement method for signature 'EdgeObject,numeric' commonDisp(object) <- value
object |
An object |
value |
Numeric value |
The updated object with the common dispersion set.
commonDisp(object = DGEList) <- value: Method for DGEList
commonDisp(object = EdgeObject) <- value: Method for EdgeObject
Assign contrast matrix
contrastMatrix(object) <- value ## S4 replacement method for signature 'EdgeObject,matrix' contrastMatrix(object) <- valuecontrastMatrix(object) <- value ## S4 replacement method for signature 'EdgeObject,matrix' contrastMatrix(object) <- value
object |
An object |
value |
Matrix |
The updated object with the new contrast matrix.
contrastMatrix(object = EdgeObject) <- value: Method for EdgeObject
Extract contrast matrix from an EdgeObject object
## S4 method for signature 'EdgeObject' contrastMatrix(object)## S4 method for signature 'EdgeObject' contrastMatrix(object)
object |
An EdgeObject object |
A contrast matrix.
Extract contrast matrix from an EdgeResult object
## S4 method for signature 'EdgeResult' contrastMatrix(object)## S4 method for signature 'EdgeResult' contrastMatrix(object)
object |
An EdgeResult object |
A contrast matrix.
Extract contrast names from an EdgeObject object
## S4 method for signature 'EdgeObject' contrastNames(object)## S4 method for signature 'EdgeObject' contrastNames(object)
object |
An EdgeObject object |
A character vector of contrast names.
Extract contrast names from an EdgeResult object
## S4 method for signature 'EdgeResult' contrastNames(object)## S4 method for signature 'EdgeResult' contrastNames(object)
object |
An EdgeResult object |
A character vector of contrast names.
Extract contrast sample indices
## S4 method for signature 'EdgeResult,character' contrastSampleIndices(object, contrast)## S4 method for signature 'EdgeResult,character' contrastSampleIndices(object, contrast)
object |
An EdgeResult object. |
contrast |
Character, indicating the contrast of interest. |
A list of integer vectors indicating sample indices for each group in the contrast.
Extract contrast sample indices
## S4 method for signature 'EdgeResult,integer' contrastSampleIndices(object, contrast)## S4 method for signature 'EdgeResult,integer' contrastSampleIndices(object, contrast)
object |
An EdgeResult object. |
contrast |
Character, indicating the contrast of interest. |
A list of integer vectors indicating sample indices for each group in the contrast.
Sample counts by group
countByGroup(edgeObj) maxCountByGroup(edgeObj) hasNoReplicate(edgeObj)countByGroup(edgeObj) maxCountByGroup(edgeObj) hasNoReplicate(edgeObj)
edgeObj |
An EdgeObject object |
A named vector if integers, sample counts by group
maxCountByGroup(): Returns the max count
hasNoReplicate(): Returns TRUE if the largest group has only one sample
Object that contains count data, dgeTables, and sigFilter
dgeTablesA list of dgeTable
sigFilterSignificantly regulated gene filter
The object is used only for inheritance
Return counts in a DGEList object
## S4 method for signature 'DGEList' counts(object)## S4 method for signature 'DGEList' counts(object)
object |
A |
A numeric matrix of counts.
Return counts in EdgeObject
## S4 method for signature 'EdgeObject' counts(object, filter = TRUE)## S4 method for signature 'EdgeObject' counts(object, filter = TRUE)
object |
An EdgeObject |
filter |
Logical, whether filtered matrix (by default) or unfiltered matrix should be returned |
A numeric matrix of counts.
Apply SVA to transformed count data and return the transformed matrix removing the effect of surrogate variables
countsRemoveSV( counts, designMatrix, transformFunc = function(counts, designMatrix) voom(counts, designMatrix)$E )countsRemoveSV( counts, designMatrix, transformFunc = function(counts, designMatrix) voom(counts, designMatrix)$E )
counts |
A matrix of counts |
designMatrix |
Design matrix |
transformFunc |
A function to transform the count data |
The expression matrix, with SV effects removed
exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(countsRemoveSV(exCounts, designMatrix=exDesign))exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(countsRemoveSV(exCounts, designMatrix=exDesign))
Apply SVA to transformed count data
countsSVA( counts, designMatrix, transformFunc = function(counts, designMatrix) voom(counts, designMatrix)$E, ... )countsSVA( counts, designMatrix, transformFunc = function(counts, designMatrix) voom(counts, designMatrix)$E, ... )
counts |
A matrix of counts |
designMatrix |
Design matrix |
transformFunc |
A function to transform the count data |
... |
Passed to |
The SV matrix
set.seed(1887) exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) countsSVA(exCounts, designMatrix=exDesign)set.seed(1887) exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) countsSVA(exCounts, designMatrix=exDesign)
cpm for EdgeObject
## S3 method for class 'EdgeObject' cpm(y, ...)## S3 method for class 'EdgeObject' cpm(y, ...)
y |
An EdgeObject object |
... |
Passed to |
A numeric matrix of counts per million values.
Filter by counts per million (cpm)
cpmFilter(object)cpmFilter(object)
object |
An object |
The filtered object.
Apply cpm to voom-transformed count data, and return the voom expression matrix with surrogate variables' effect removed
cpmRemoveSV(counts, designMatrix)cpmRemoveSV(counts, designMatrix)
counts |
A matrix of counts |
designMatrix |
Design matrix |
The cpm matrix, with SV effects removed
exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(cpmRemoveSV(exCounts, designMatrix=exDesign)) ## compare the results without SV removal, note the values in the second ## and third column are much larger than the rest head(cpm(exCounts))exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(cpmRemoveSV(exCounts, designMatrix=exDesign)) ## compare the results without SV removal, note the values in the second ## and third column are much larger than the rest head(cpm(exCounts))
Apply SVA to cpm-transformed count data
cpmSVA(counts, designMatrix)cpmSVA(counts, designMatrix)
counts |
A matrix of counts |
designMatrix |
Design matrix |
The SV matrix
exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) cpmSVA(exCounts, designMatrix=exDesign)exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) cpmSVA(exCounts, designMatrix=exDesign)
Custom smear plot
customSmearPlot( tbl, main, xlab, ylab, pch = 19, cex = 0.2, smearWidth = 0.5, panel.first = grid(), smooth.scatter = FALSE, lowess = FALSE, ... )customSmearPlot( tbl, main, xlab, ylab, pch = 19, cex = 0.2, smearWidth = 0.5, panel.first = grid(), smooth.scatter = FALSE, lowess = FALSE, ... )
tbl |
A |
main |
Character string, title of the plot |
xlab |
Character string, xlab |
ylab |
Character string, ylab |
pch |
Point symbol |
cex |
Font size |
smearWidth |
Smear with |
panel.first |
Passed to |
smooth.scatter |
Passed to |
lowess |
Passed to |
... |
Passed to |
Called for its side effect of plotting; returns invisibly NULL.
Retrieve the design/contrast object
designContrast(edgeObject)designContrast(edgeObject)
edgeObject |
An |
A DesignContrast object.
designContrast(exampleEdgeObject())designContrast(exampleEdgeObject())
Assign the design matrix
designMatrix(object) <- value ## S4 replacement method for signature 'EdgeObject,matrix' designMatrix(object) <- valuedesignMatrix(object) <- value ## S4 replacement method for signature 'EdgeObject,matrix' designMatrix(object) <- value
object |
An object |
value |
Matrix |
The updated object with the new design matrix.
designMatrix(object = EdgeObject) <- value: Method for EdgeObject
Extract design matrix from an EdgeObject object
## S4 method for signature 'EdgeObject' designMatrix(object)## S4 method for signature 'EdgeObject' designMatrix(object)
object |
An EdgeObject object |
A design matrix.
Extract design matrix from an EdgeResult object
## S4 method for signature 'EdgeResult' designMatrix(object)## S4 method for signature 'EdgeResult' designMatrix(object)
object |
An EdgeResult object |
A design matrix.
Return the dgeGML method
dgeGML(edgeResult)dgeGML(edgeResult)
edgeResult |
An |
A DGEGLM object.
Extract DGEList from an EdgeObject
Extract DGEList from an EdgeResult object
dgeList(object) ## S4 method for signature 'EdgeObject' dgeList(object) ## S4 method for signature 'EdgeResult' dgeList(object)dgeList(object) ## S4 method for signature 'EdgeObject' dgeList(object) ## S4 method for signature 'EdgeResult' dgeList(object)
object |
An EdgeResult object |
A DGEList object.
dgeList(EdgeObject): Extract DGEList from EdgeObject
dgeList(EdgeResult): Extract DGEList from EdgeResult
Construct a DGEListList object
DGEListList(...)DGEListList(...)
... |
A list of DGEListList objects, can be passed as individual objects or in a list |
A DGEListList object.
An S4 class to represent a list of DGEListList objects
Convert a DGEList object to a long data.frame containing expression, feature annotation, and sample annotation
DGEListToLongTable(x, exprsFun = function(dgeList) cpm(dgeList, log = TRUE))DGEListToLongTable(x, exprsFun = function(dgeList) cpm(dgeList, log = TRUE))
x |
A |
exprsFun |
A function to convert counts to expression data. Default: logCPM |
A long-format data.frame with expression values and annotations.
Columns with empty names will be discard.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) y <- edgeR::DGEList(counts=mat, group=rep(1:2, each=5)) DGEListToLongTable(y)mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) y <- edgeR::DGEList(counts=mat, group=rep(1:2, each=5)) DGEListToLongTable(y)
Return the top table in a unified format
dgeTable(edgeResult, contrast = NULL)dgeTable(edgeResult, contrast = NULL)
edgeResult |
An |
contrast |
Character, contrast name of interest. If |
A data.frame
Return the top tables of specified contrast(s) in a list
dgeTableList(edgeResult, contrast = NULL)dgeTableList(edgeResult, contrast = NULL)
edgeResult |
An |
contrast |
Character, contrast name(s) of interest. If |
A list of data.frames
Return a list of differential gene expression tables
dgeTables(edgeResult)dgeTables(edgeResult)
edgeResult |
An |
A list of data.frames, each containing the DGEtable for one contrast.
dgeTable which returns one data.frame for one or more given contrasts.
Append dgeTables with pseudo t-statistic
dgeTablesWithPseudoT(edgeResult)dgeTablesWithPseudoT(edgeResult)
edgeResult |
An |
Similar as dgeTables, a list of data.frame, but with additional columns df (degree of freedom) and pseudoT.
The pseudo t-statistic is calculated based on the P-value of the likelihood ratio test and the residual degree of freedom by the function. qt, and its sign is given by the sign of logFC.
Append dgeTables with z-scores
dgeTablesWithZScore(edgeResult)dgeTablesWithZScore(edgeResult)
edgeResult |
An |
Similar as dgeTables, a list of data.frame, but with an additional column zScore.
appendZScore, dgeTableWithZScore, dgeTableWithPseudoT
Append dgeTable with pseudo t-statistic
dgeTableWithPseudoT(edgeResult, contrast = NULL)dgeTableWithPseudoT(edgeResult, contrast = NULL)
edgeResult |
An |
contrast |
A character string, or integer index, or |
Similar as dgeTable, a data.frame, but with additional columns df (degree of freedom) and pseudoT.
The pseudo t-statistic is calculated based on the P-value of the likelihood ratio test and the residual degree of freedom by the function qt, and its sign is given by the sign of logFC.
dgeTablesWithPseudoT, dgeTable
Append dgeTable with z-scores
dgeTableWithZScore(edgeResult, contrast = NULL)dgeTableWithZScore(edgeResult, contrast = NULL)
edgeResult |
An |
contrast |
A character string, or integer index, or |
Similar as dgeTable, a data.frame, with an additional column zScore.
dgeTablesWithZScore, dgeTable, dgeTableWithPseudoT
Perform differential gene expression analysis with edgeR
dgeWithEdgeR(edgeObj)dgeWithEdgeR(edgeObj)
edgeObj |
An object of The function performs end-to-end differential gene expression (DGE) analysis with common best practice using edgeR |
An EdgeResult object
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(Identifier=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) dgeTable(exDgeRes)exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(Identifier=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) dgeTable(exDgeRes)
Perform differential gene expression analysis with edgeR-limma
dgeWithLimmaVoom(edgeObj, ...)dgeWithLimmaVoom(edgeObj, ...)
edgeObj |
An object of |
... |
Passed to The function performs end-to-end differential gene expression (DGE) analysis with common best practice using voom-limma |
A LimmaVoomResult object.
set.seed(1887) exObj <- exampleEdgeObject() exLimmaVoomRes <- dgeWithLimmaVoom(exObj) dgeTable(exLimmaVoomRes) ## compare with edgeR dgeTable(dgeWithEdgeR(exObj)) ## LimmaVoomResult can be also used with exportEdgeResult exportEdgeResult(exLimmaVoomRes, tempdir(), "overwrite")set.seed(1887) exObj <- exampleEdgeObject() exLimmaVoomRes <- dgeWithLimmaVoom(exObj) dgeTable(exLimmaVoomRes) ## compare with edgeR dgeTable(dgeWithEdgeR(exObj)) ## LimmaVoomResult can be also used with exportEdgeResult exportEdgeResult(exLimmaVoomRes, tempdir(), "overwrite")
Dimensions of an EdgeResults
## S3 method for class 'EdgeResult' dim(x)## S3 method for class 'EdgeResult' dim(x)
x |
An EdgeResult object |
An integer vector of length two (features, samples).
Get display labels of sample groups
## S4 method for signature 'EdgeObject' dispGroups(object)## S4 method for signature 'EdgeObject' dispGroups(object)
object |
An |
A character vector of display labels for sample groups.
Perform surrogate variable analysis (SVA) to an EdgeObject object
doSVA(edgeObj, transform = c("voom", "cpm"))doSVA(edgeObj, transform = c("voom", "cpm"))
edgeObj |
An |
transform |
Function name to perform transformation, currently supported values include voom and cpm The count data associated with the EdgeObject object is first transformed, and surrogate variables are estimated from the transformed data. Correspondingly the design matrix and contrast matrix associated with the object are updated, too. |
An updated EdgeObject with the design and contrast matrices
augmented by surrogate variables, if any are detected.
set.seed(1887) exMat <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exMat[1:100,2:3] <- exMat[1:100, 2:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exSVAobj <- doSVA(exObj, transform="voom") designMatrix(exSVAobj) contrastMatrix(exSVAobj) ## Note that the SVA is sensitive against parameterisation, see ## the example below. Also notice that in the zero-intercept parameterisation, ## the SVA does not give meaningful results. designMatrix(exObj) <- model.matrix(~0+exGroups) designMatrix(doSVA(exObj, transform="voom"))set.seed(1887) exMat <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exMat[1:100,2:3] <- exMat[1:100, 2:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exSVAobj <- doSVA(exObj, transform="voom") designMatrix(exSVAobj) contrastMatrix(exSVAobj) ## Note that the SVA is sensitive against parameterisation, see ## the example below. Also notice that in the zero-intercept parameterisation, ## the SVA does not give meaningful results. designMatrix(exObj) <- model.matrix(~0+exGroups) designMatrix(doSVA(exObj, transform="voom"))
Construct an EdgeObject object by a count matrix and DesignContrast
EdgeObject(object, designContrast, ...)EdgeObject(object, designContrast, ...)
object |
A matrix containing counts of features |
designContrast |
A |
... |
Other parameters |
An EdgeObject.
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(Identifier=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon) exObj2 <- EdgeObject(exMat, exDescon, fData=exFdata) exObj3 <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) fData(exObj3) dgeList <- edgeR::DGEList(counts=exMat, samples=exPdata, genes=exFdata) exObj4 <- EdgeObject(dgeList, exDescon) ## note that pData are appended after count information pData(exObj2) pData(exObj3)exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(Identifier=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon) exObj2 <- EdgeObject(exMat, exDescon, fData=exFdata) exObj3 <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) fData(exObj3) dgeList <- edgeR::DGEList(counts=exMat, samples=exPdata, genes=exFdata) exObj4 <- EdgeObject(dgeList, exDescon) ## note that pData are appended after count information pData(exObj2) pData(exObj3)
EdgeObject argumenting DGEList by including designContrast information
## S4 method for signature 'matrix,DesignContrast' EdgeObject( object, designContrast, fData = NULL, pData = NULL, remove.zeros = FALSE ) ## S4 method for signature 'FeatAnnoExprs,DesignContrast' EdgeObject(object, designContrast, pData = NULL, remove.zeros = FALSE) ## S4 method for signature 'DGEList,DesignContrast' EdgeObject(object, designContrast)## S4 method for signature 'matrix,DesignContrast' EdgeObject( object, designContrast, fData = NULL, pData = NULL, remove.zeros = FALSE ) ## S4 method for signature 'FeatAnnoExprs,DesignContrast' EdgeObject(object, designContrast, pData = NULL, remove.zeros = FALSE) ## S4 method for signature 'DGEList,DesignContrast' EdgeObject(object, designContrast)
object |
A |
designContrast |
A |
fData |
A |
pData |
A |
remove.zeros |
Logical, whether to remove rows that have 0 total count |
EdgeObject(object = matrix, designContrast = DesignContrast): The method for matrix as input
EdgeObject(object = FeatAnnoExprs, designContrast = DesignContrast): The method for FeatAnnoExprs as input
EdgeObject(object = DGEList, designContrast = DesignContrast): The method for DGEList as input
dgeListA DGEList object
designContrastA designContrast object
Export an DGEList, designMatrix, and contrastMatrix to files and return the command to run the edgeR script
edgeRcommand( dgeList, designMatrix, contrastMatrix, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, debug = FALSE, rootPath = "/pstore/apps/bioinfo/geneexpression/", contrastAnno = NULL )edgeRcommand( dgeList, designMatrix, contrastMatrix, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, debug = FALSE, rootPath = "/pstore/apps/bioinfo/geneexpression/", contrastAnno = NULL )
dgeList |
An |
designMatrix |
The design matrix to model the data |
contrastMatrix |
The contrast matrix matching the design matrix |
outdir |
Output directory of the edgeR script. Default value "edgeR_output". |
outfilePrefix |
Prefix of the output files, for instance a reasonable
name of the project, to identify the files uniquely. The files will be written in
|
mps |
Logical, whether molecular-phenotyping analysis is run. |
limmaVoom |
Logical, whether the limma-voom model is run instead of the edgeR model |
appendGmt |
|
debug |
Logical, if |
rootPath |
Character, the root path of the script |
contrastAnno |
A |
A character string containing the command to run the edgeR script.
Following checks are done internally:
The design matrix must have the same number of rows as the columns of the count matrix.
The contrast matrix must have the same number of rows as the columns of the design matrix.
Row names of the design matrix match the column names of the expression matrix. In case of suspect, the program will stop and report.
The output file names start with the outfilePrefix, followed by '-' and customed file suffixes.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) edgeRcommand(y, designMatrix=myDesign, contrastMatrix=myContrast, outfilePrefix="test", outdir=tempdir()) ## clean up unlink(file.path(tempdir(), "input_data"), recursive=TRUE)mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) edgeRcommand(y, designMatrix=myDesign, contrastMatrix=myContrast, outfilePrefix="test", outdir=tempdir()) ## clean up unlink(file.path(tempdir(), "input_data"), recursive=TRUE)
Return a list of differential gene expression tables
EdgeResult(edgeObj, dgeGLM, dgeTables)EdgeResult(edgeObj, dgeGLM, dgeTables)
edgeObj |
An EdgeObject |
dgeGLM |
A DGEGLM object |
dgeTables |
A list of DGEtables. |
An EdgeResult object
Object that contains test results, dgeTable, and SigFilter
dgeGLMA DGEGLM class object that contains GLM test results
dgeTablesA list of dgeTable
sigFilterSignificantly regulated gene filter
Extends BaseSigFilter to filter genes base on logCPM and LR
logCPMNumeric, logCPM threshold (larger values are kept)
Transform an EexpressionSet to a DGEList object
eset2DGEList(eset, groupVar = "group")eset2DGEList(eset, groupVar = "group")
eset |
An |
groupVar |
Character string, column in phenoDatta |
A DGEList object
eset <- new("ExpressionSet", exprs=matrix(rpois(120, 20), nrow=20, dimnames=list(LETTERS[1:20], letters[1:6])), phenoData = new("AnnotatedDataFrame", data.frame(Sample=letters[1:6], group=gl(2, 3), row.names=letters[1:6])), featureData = new("AnnotatedDataFrame", data.frame(Feature=LETTERS[1:20], row.names=LETTERS[1:20]))) eset2DGEList(eset)eset <- new("ExpressionSet", exprs=matrix(rpois(120, 20), nrow=20, dimnames=list(LETTERS[1:20], letters[1:6])), phenoData = new("AnnotatedDataFrame", data.frame(Sample=letters[1:6], group=gl(2, 3), row.names=letters[1:6])), featureData = new("AnnotatedDataFrame", data.frame(Feature=LETTERS[1:20], row.names=LETTERS[1:20]))) eset2DGEList(eset)
Return an example of EdgeObject
exampleEdgeObject(nfeat = 20, nsample = 6, ngroup = 3, lambda = 10)exampleEdgeObject(nfeat = 20, nsample = 6, ngroup = 3, lambda = 10)
nfeat |
Integer, number of features |
nsample |
Integer, number of samples |
ngroup |
Integer, number of groups, must be divisible by |
lambda |
Integer, passed to |
An EdgeObject
set.seed(1887) ## fix random generators exampleEdgeObject() exampleEdgeObject(50, 12, ngroup=4)set.seed(1887) ## fix random generators exampleEdgeObject() exampleEdgeObject(50, 12, ngroup=4)
Export dgeTest results
exportEdgeResult( edgeResult, outRootDir, action = c("ask", "append", "overwrite", "no") )exportEdgeResult( edgeResult, outRootDir, action = c("ask", "append", "overwrite", "no") )
edgeResult |
A |
outRootDir |
Character string, output directory |
action |
Character string, what happens if the output directory exists |
Invisibly returns NULL. Called for its side effects of writing
result files to disk.
Export static gene-level plots in PDF
exportStaticGeneLevelPlots(edgeResult, file)exportStaticGeneLevelPlots(edgeResult, file)
edgeResult |
An |
file |
Character string, the PDF file name |
Called for its side effect of writing plots to a PDF file; returns invisibly NULL.
Get fData
## S4 method for signature 'DGEList' fData(object)## S4 method for signature 'DGEList' fData(object)
object |
A DGEList |
A data.frame of feature (gene) annotations.
Get fData
## S4 method for signature 'EdgeObject' fData(object)## S4 method for signature 'EdgeObject' fData(object)
object |
An EdgeObject |
A data.frame of feature (gene) annotations.
Set fData
## S4 replacement method for signature 'DGEList,data.frame' fData(object) <- value## S4 replacement method for signature 'DGEList,data.frame' fData(object) <- value
object |
A DGEList |
value |
A |
The updated DGEList object with new feature annotations.
Set fData
## S4 replacement method for signature 'EdgeObject,data.frame' fData(object) <- value## S4 replacement method for signature 'EdgeObject,data.frame' fData(object) <- value
object |
An EdgeObject |
value |
A |
The updated EdgeObject with new feature annotations.
A class that contain feature annotation and expression matrix
exprs |
A matrix of expression |
genes |
A data.frame |
Feature names
## S4 method for signature 'EdgeObject' featureNames(object)## S4 method for signature 'EdgeObject' featureNames(object)
object |
An EdgeObject |
A character vector of feature names.
Filter lowly expressed genes by counts per million (CPM)
filterByCPM(obj, ...)filterByCPM(obj, ...)
obj |
An object |
... |
Other parameters |
The return value depends on the class of obj. See method-specific documentation.
Filter lowly expressed genes by CPM in DGEList
## S3 method for class 'DGEList' filterByCPM( obj, minCPM = 1, minCount = minGroupCount(obj), lib.size = NULL, ... )## S3 method for class 'DGEList' filterByCPM( obj, minCPM = 1, minCount = minGroupCount(obj), lib.size = NULL, ... )
obj |
A |
minCPM |
Numeric, the minimum CPM accepted as expressed in one sample |
minCount |
Integer, how many samples must have CPM larger than
|
lib.size |
Integers of library size, or |
... |
Not used |
Another DGEList object, with lowly expressed genes removed.
The original counts and gene annotation can be found in
counts.unfiltered and genes.unfiltered fields, respectively.
The logical vector of the filter is saved in the cpmFilter field.
set.seed(1887) mat <- rbind(matrix(rbinom(150, 5, 0.25), nrow=25), rep(0, 6)) d <- DGEList(mat, group=rep(1:3, each=2), genes=data.frame(Gene=sprintf("Gene%d", 1:nrow(mat)))) df <- filterByCPM(d) nrow(df$counts.unfiltered) ## 26 nrow(df$counts) ## 25set.seed(1887) mat <- rbind(matrix(rbinom(150, 5, 0.25), nrow=25), rep(0, 6)) d <- DGEList(mat, group=rep(1:3, each=2), genes=data.frame(Gene=sprintf("Gene%d", 1:nrow(mat)))) df <- filterByCPM(d) nrow(df$counts.unfiltered) ## 26 nrow(df$counts) ## 25
Filter EdgeObj and remove lowly expressed genes
## S3 method for class 'EdgeObject' filterByCPM(obj, minCPM = 1, minCount = minGroupCount(obj), ...)## S3 method for class 'EdgeObject' filterByCPM(obj, minCPM = 1, minCount = minGroupCount(obj), ...)
obj |
An EdgeObject object |
minCPM |
Minimal CPM value, see descriptions below |
minCount |
Minimal count of samples in which the CPM value is no less
than |
... |
Not used The filter is recommended by the authors of the The filter removes genes that are less expressed than 1 copy per million
reads (cpm) in at least |
An EdgeObject with lowly expressed genes removed from the
internal DGEList. The unfiltered counts and gene annotation are
preserved in the counts.unfiltered and genes.unfiltered
fields of the DGEList.
myFac <- gl(3,2) set.seed(1234) myMat <- matrix(rpois(1200,100), nrow=200, ncol=6) myMat[1:3,] <- 0 myEdgeObj <- EdgeObject(myMat, DesignContrast(designMatrix=model.matrix(~myFac), contrastMatrix=matrix(c(0,1,0), ncol=1), groups=myFac), fData=data.frame(GeneSymbol=sprintf("Gene%d", 1:200))) myFilteredEdgeObj <- filterByCPM(myEdgeObj) dim(counts(myEdgeObj)) dim(counts(myFilteredEdgeObj)) ## show unfiltered count matrix dim(counts(myFilteredEdgeObj, filter=FALSE))myFac <- gl(3,2) set.seed(1234) myMat <- matrix(rpois(1200,100), nrow=200, ncol=6) myMat[1:3,] <- 0 myEdgeObj <- EdgeObject(myMat, DesignContrast(designMatrix=model.matrix(~myFac), contrastMatrix=matrix(c(0,1,0), ncol=1), groups=myFac), fData=data.frame(GeneSymbol=sprintf("Gene%d", 1:200))) myFilteredEdgeObj <- filterByCPM(myEdgeObj) dim(counts(myEdgeObj)) dim(counts(myFilteredEdgeObj)) ## show unfiltered count matrix dim(counts(myFilteredEdgeObj, filter=FALSE))
Filter lowly expressed genes by CPM
## S3 method for class 'matrix' filterByCPM(obj, minCPM = 1, minCount = 1, ...)## S3 method for class 'matrix' filterByCPM(obj, minCPM = 1, minCount = 1, ...)
obj |
A matrix |
minCPM |
Numeric, the minimum CPM accepted as expressed in one sample |
minCount |
Integer, how many samples must have CPM larger than |
... |
Not used |
A logical vector of the same length as the row count of the matrix. TRUE means the gene is reasonably expressed, and FALSE means the gene is lowly expressed and should be filtered (removed)
set.seed(1887) mat <- rbind(matrix(rbinom(125, 5, 0.25), nrow=25), rep(0, 5)) filterByCPM(mat)set.seed(1887) mat <- rbind(matrix(rbinom(125, 5, 0.25), nrow=25), rep(0, 5)) filterByCPM(mat)
Fit generalized linear model
fitGLM(object, ...) ## S4 method for signature 'EdgeObject' fitGLM(object, ...)fitGLM(object, ...) ## S4 method for signature 'EdgeObject' fitGLM(object, ...)
object |
An |
... |
Passed to |
A DGEGLM object containing the GLM fit.
The fit object
fitGLM(EdgeObject): Method for EdgeObject
Get GCT filename from a directory
gctFilename(dir)gctFilename(dir)
dir |
Character string, path to a directory where a GCT file is saved |
Character string, full name of the GCT file If no file is found, the function reports an error. If more than one file is found, a warning message is raised, and only the first file is used.
Return gene count
geneCount(countDgeResult)geneCount(countDgeResult)
countDgeResult |
An EdgeResult object |
Integer
Return gene identifier types
geneIdentifierTypes(dgeResult)geneIdentifierTypes(dgeResult)
dgeResult |
An DgeResult object |
A character string indicating the gene identifiers found The following terms are recognized: GeneID, EnsemblID, GeneSymbol, FeatureName
Note that the order matters: FeatureName > GeneID = EnsemblID > GeneSymbol
Get automatic group color
groupCol(edgeObj, panel = "Set1")groupCol(edgeObj, panel = "Set1")
edgeObj |
An EdgeObject or EdgeResult object |
panel |
passed to |
A fcbase object
Get sample groups from an EdgeObject object
## S4 method for signature 'EdgeObject' groups(object)## S4 method for signature 'EdgeObject' groups(object)
object |
An |
A factor of sample group assignments.
Print a kdTable nicely with gt
gtKdTable(kdTable, feature_label = "GeneSymbol", ...)gtKdTable(kdTable, feature_label = "GeneSymbol", ...)
kdTable |
a knockdown table (kdTable) returned by |
feature_label |
The column which contains feature label, gene symbol by |
... |
Passed to |
A gt table object
Tells whether common dispersion has been set
hasCommonDisp(object) ## S4 method for signature 'DGEList' hasCommonDisp(object) ## S4 method for signature 'EdgeObject' hasCommonDisp(object)hasCommonDisp(object) ## S4 method for signature 'DGEList' hasCommonDisp(object) ## S4 method for signature 'EdgeObject' hasCommonDisp(object)
object |
An object |
Logical, whether the common dispersion has been set.
hasCommonDisp(DGEList): Method for DGEList
hasCommonDisp(EdgeObject): Method for EdgeObject
Get human gene symbols for gene-set enrichment analysis
humanGeneSymbols(object) ## S4 method for signature 'DGEList' humanGeneSymbols(object) ## S4 method for signature 'EdgeObject' humanGeneSymbols(object)humanGeneSymbols(object) ## S4 method for signature 'DGEList' humanGeneSymbols(object) ## S4 method for signature 'EdgeObject' humanGeneSymbols(object)
object |
An object |
A character vector of human gene symbols.
humanGeneSymbols(DGEList): Method for DGEList
humanGeneSymbols(EdgeObject): Method for EdgeObject
Infer surrogate variables
inferSV(object, design, ...) ## S4 method for signature 'matrix,matrix' inferSV(object, design, ...) ## S4 method for signature 'DGEList,matrix' inferSV(object, design, ...) ## S4 method for signature 'DGEList,formula' inferSV(object, design, ...)inferSV(object, design, ...) ## S4 method for signature 'matrix,matrix' inferSV(object, design, ...) ## S4 method for signature 'DGEList,matrix' inferSV(object, design, ...) ## S4 method for signature 'DGEList,formula' inferSV(object, design, ...)
object |
An object |
design |
Design matrix or formula |
... |
Other parameters |
A matrix of surrogate variables.
inferSV(object = matrix, design = matrix): method for matrix as input
inferSV(object = DGEList, design = matrix): method for voom-transformed DGEList and design matrix
as input
inferSV(object = DGEList, design = formula): method for voom-transformed DGEList and formula
as input
Is the object annotated
isAnnotated(object) ## S4 method for signature 'EdgeObject' isAnnotated(object)isAnnotated(object) ## S4 method for signature 'EdgeObject' isAnnotated(object)
object |
An object |
Logical, whether the object is annotated.
isAnnotated(EdgeObject): Method for EdgeObject
Is the Surrogate Variable (SV) matrix empty
isEmptySV(sv)isEmptySV(sv)
sv |
A surrogate variable (SV) matrix returned by |
TRUE if no valid SV was estimated; otherwise FALSE.
isEmptySV(matrix(0, 1,1)) isEmptySV(matrix(rnorm(5), nrow=5))isEmptySV(matrix(0, 1,1)) isEmptySV(matrix(rnorm(5), nrow=5))
Return logical vector indicating which genes are significantly regulated
isSig(data.frame, sigFilter) isSigPos(data.frame, sigFilter) isSigNeg(data.frame, sigFilter)isSig(data.frame, sigFilter) isSigPos(data.frame, sigFilter) isSigNeg(data.frame, sigFilter)
data.frame |
A |
sigFilter |
An SigFilter object |
A logical vector of the same length as the row number of the input data.frame
isSigPos(): Returns which genes are significantly positively regulated
isSigNeg(): Returns which genes are significantly negatively regulated
Whether the aveExpr filter is set
isUnsetAveExpr(limmaSigFilter)isUnsetAveExpr(limmaSigFilter)
limmaSigFilter |
A |
Logical, whether the aveExpr threshold is the default value.
Whether the logCPM filter is set
isUnsetLogCPM(edgeSigFilter)isUnsetLogCPM(edgeSigFilter)
edgeSigFilter |
A |
Logical, whether the logCPM threshold is the default value.
Tells whether the threshold was not set
isUnsetPosLogFC(sigFilter) isUnsetNegLogFC(sigFilter) isUnsetPValue(sigFilter) isUnsetFDR(sigFilter)isUnsetPosLogFC(sigFilter) isUnsetNegLogFC(sigFilter) isUnsetPValue(sigFilter) isUnsetFDR(sigFilter)
sigFilter |
An SigFilter object |
Logical, whether the thresholds are the default values
Whether the SigFilter is the default one
isUnsetSigFilter(object)isUnsetSigFilter(object)
object |
An SigFilter object |
Logical, whether it is unset
Retrieve a knockdown table from edgeRes
kdTable(edgeRes, feature, feature_label = "GeneSymbol")kdTable(edgeRes, feature, feature_label = "GeneSymbol")
edgeRes |
An |
feature |
The feature to be retrieved |
feature_label |
The column which contains feature label, gene symbol by default |
A compact table containing essential information about knockdown
LimmaSigFilter Extending BaseSigFilter to filter genes base on aveExpr
aveExprNumeric, AveExpr threshold (larger values are kept)
Construct a LimmaVoomResult object
LimmaVoomResult(edgeObj, voom, marrayLM, dgeTables)LimmaVoomResult(edgeObj, voom, marrayLM, dgeTables)
edgeObj |
An EdgeObject. |
voom |
The voom ( |
marrayLM |
A MArrayLM object. |
dgeTables |
A list of DGEtables. |
An LimmaVoomResult object.
The LimmaVoom Object that contains test results, dgeTable, and SigFilter
marrayLMA MArrayLM class object that contains results of eBayesFit
voomThe voom object
Get settings in the EdgeSigFilter
logCPM(edgeSigFilter)logCPM(edgeSigFilter)
edgeSigFilter |
An |
Numeric values of the logCPM filter
Extract a matrix of log2(fold-change) values
logFCmatrix( edgeResult, featureIdentifier = "GeneSymbol", contrasts = NULL, removeNAfeatures = TRUE, minAveExpr = NULL )logFCmatrix( edgeResult, featureIdentifier = "GeneSymbol", contrasts = NULL, removeNAfeatures = TRUE, minAveExpr = NULL )
edgeResult |
An |
featureIdentifier |
Character, column name in |
contrasts |
|
removeNAfeatures |
Logical, if |
minAveExpr |
|
A numeric matrix of log2(fold-change) values with features in rows and contrasts in columns.
TODO: add edgeResult data example
Perform principal component analysis to the log fold-change matrix
logFCmatrixPCA(lfc)logFCmatrixPCA(lfc)
lfc |
A matrix of log2 fold changes, with features in rows and contrasts in columns |
A PCAScoreMatrix object
The function performs principal component analysis (PCA) to the log fold-change matrix.
By using a column of zeros during the PCA analysis, which was removed from the final result, the point of origin represents an ideal contrast which yield absolutely no differential gene expression. It is easier to interpret the PCA results with this transformation.
my_lfc_mat <- matrix(rnorm(1000), nrow=100, ncol=10) my_lfc_pca <- logFCmatrixPCA(my_lfc_mat) my_lfc_pcamy_lfc_mat <- matrix(rnorm(1000), nrow=100, ncol=10) my_lfc_pca <- logFCmatrixPCA(my_lfc_mat) my_lfc_pca
Perform principal component analysis to an EdgeResult object
logFCpca(edgeResult)logFCpca(edgeResult)
edgeResult |
An |
A PCAScoreMatrix object
The function performs principal component analysis (PCA) to the log fold-change matrix.
By using a column of zeros during the PCA analysis, which was removed from the final result, the point of origin represents an ideal contrast which yield absolutely no differential gene expression. It is easier to interpret the PCA results with this transformation.
logFCmatrixPCA
## TODO: add edgeResult sample## TODO: add edgeResult sample
Send an edgeR analysis job to SLF
lsfEdgeR( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", overwrite = c("ask", "overwrite", "append", "no"), mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("preempt_cpu", "interactive_cpu", "batch_cpu"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )lsfEdgeR( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", overwrite = c("ask", "overwrite", "append", "no"), mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("preempt_cpu", "interactive_cpu", "batch_cpu"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )
dgeList |
An |
designContrast |
The DesignContrast object to model the data |
outdir |
Output directory of the edgeR script. Default value "edgeR_output". |
outfilePrefix |
Prefix of the output files. It can include directories,
e.g. |
overwrite |
If |
mps |
Logical, whether molecular-phenotyping analysis is run. |
limmaVoom |
Logical, whether the limma-voom model is run instead of the edgeR model. |
appendGmt |
|
qos |
Character, specifying Quality of Service of Slurm. Available values include |
rootPath |
Character string, the directory of geneexpression scripts, under which |
debug |
Logical, if |
A list of two items, command, the command line call, and
output, the output of the SLURM command in bash
Even if the output directory is empty, if overwrite is set to
no (or if the user answers no), the job will not be started.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) ## \dontrun{ ## lsfEdgeR(y, designMatrix=myDesign, contrastMatrix=myContrast, ## outfilePrefix="test", outdir=tempdir()) ## }mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) ## \dontrun{ ## lsfEdgeR(y, designMatrix=myDesign, contrastMatrix=myContrast, ## outfilePrefix="test", outdir=tempdir()) ## }
Return the LSF command to run the edgeR script
lsfEdgeRcommand( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("long", "preempty", "short"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE, bsubFile = NULL )lsfEdgeRcommand( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("long", "preempty", "short"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE, bsubFile = NULL )
dgeList |
An |
designContrast |
The DesignContrast object to model the data |
outdir |
Output directory of the edgeR script. Default value "edgeR_output". |
outfilePrefix |
Prefix of the output files. It can include directories,
e.g. |
mps |
Logical, whether molecular-phenotyping analysis is run. |
limmaVoom |
Logical, whether the limma-voom model is run instead of the edgeR model |
appendGmt |
|
qos |
Character, specifying Quality of Service of LSF Available values include |
rootPath |
Character string, the directory of geneexpression scripts, under which |
debug |
Logical, if |
bsubFile |
This function wraps the function It uses |
A character string containing the LSF bsub command to
submit the edgeR analysis job.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDesCon <- DesignContrast(designMatrix=myDesign, contrastMatrix=myContrast) lsfEdgeRcommand(y, designContrast=myDesCon, outfilePrefix="test", outdir=tempdir()) ## clean up unlink(file.path(tempdir(), "input_data"), recursive=TRUE) unlink("test.bsub")mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDesCon <- DesignContrast(designMatrix=myDesign, contrastMatrix=myContrast) lsfEdgeRcommand(y, designContrast=myDesCon, outfilePrefix="test", outdir=tempdir()) ## clean up unlink(file.path(tempdir(), "input_data"), recursive=TRUE) unlink("test.bsub")
Merge two DGEList objects into one
mergeDGEList(firstDgeList, secondDgeList, DGEListLabels = NULL)mergeDGEList(firstDgeList, secondDgeList, DGEListLabels = NULL)
firstDgeList |
First |
secondDgeList |
Second |
DGEListLabels |
Labels, either The function merges two
In case |
A DGEList object containing the merged counts, sample
annotation, and gene annotation from both input objects.
y1 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1))) rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), each=2))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) y2 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes2 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y2)+100)) rownames(y2) <- rownames(genes1) <- 1:nrow(y2)+100 anno2 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), sex=factor(rep(c("m", "f"), each=2))) d2 <- DGEList(counts=y2, genes=genes2, samples=anno2) md <- mergeDGEList(d1, d2) md2 <- mergeDGEList(d1, d2, DGEListLabels=c("d1", "d2"))y1 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1))) rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), each=2))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) y2 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes2 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y2)+100)) rownames(y2) <- rownames(genes1) <- 1:nrow(y2)+100 anno2 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), sex=factor(rep(c("m", "f"), each=2))) d2 <- DGEList(counts=y2, genes=genes2, samples=anno2) md <- mergeDGEList(d1, d2) md2 <- mergeDGEList(d1, d2, DGEListLabels=c("d1", "d2"))
Return the size of the smallest group
minGroupCount(obj) ## S3 method for class 'DGEList' minGroupCount(obj) ## S3 method for class 'EdgeObject' minGroupCount(obj)minGroupCount(obj) ## S3 method for class 'DGEList' minGroupCount(obj) ## S3 method for class 'EdgeObject' minGroupCount(obj)
obj |
A |
Integer
minGroupCount(DGEList): Return the size of the smallest group defined in
the DGEList object
minGroupCount(EdgeObject): Return the size of the smallest group defined in
the EdgeObject object
y <- matrix(rnbinom(12000,mu=10,size=2),ncol=6) d <- DGEList(counts=y, group=rep(1:3,each=2)) minGroupCount(d) ## 2 d2 <- DGEList(counts=y, group=rep(1:2,each=3)) minGroupCount(d2) ## 3 d3 <- DGEList(counts=y, group=rep(1:3, 1:3)) minGroupCount(d3) ## 1y <- matrix(rnbinom(12000,mu=10,size=2),ncol=6) d <- DGEList(counts=y, group=rep(1:3,each=2)) minGroupCount(d) ## 2 d2 <- DGEList(counts=y, group=rep(1:2,each=3)) minGroupCount(d2) ## 3 d3 <- DGEList(counts=y, group=rep(1:3, 1:3)) minGroupCount(d3) ## 1
Build design matrix from a DGEList object
model.DGEList(object, formula, ...)model.DGEList(object, formula, ...)
object |
A DGEList object |
formula |
Formula, passed to Sample annotation is used to construct the formula |
... |
Not used so far |
A design matrix.
Modulated logCPM
modLogCPM(object, ...) ## S4 method for signature 'DGEList' modLogCPM(object, prior.count = 2) ## S4 method for signature 'EdgeObject' modLogCPM(object)modLogCPM(object, ...) ## S4 method for signature 'DGEList' modLogCPM(object, prior.count = 2) ## S4 method for signature 'EdgeObject' modLogCPM(object)
object |
A DGEList object |
... |
Other parameters |
prior.count |
Integer, prior count. |
A numeric matrix of modulated log-CPM values.
modLogCPM(DGEList): Method for DGEList
modLogCPM(EdgeObject): Method for EdgeObject
Return either NA (if input is NULL) or sqrt
naOrSqrt(x)naOrSqrt(x)
x |
Numeric value |
NA or sqrt value
Return number of samples
## S4 method for signature 'EdgeResult' ncol(x)## S4 method for signature 'EdgeResult' ncol(x)
x |
An EdgeResults object |
Integer
Return the number of contrasts
## S4 method for signature 'EdgeResult' nContrast(object)## S4 method for signature 'EdgeResult' nContrast(object)
object |
An |
An integer, the number of contrasts.
Normalize an EdgeObject
## S4 method for signature 'EdgeObject' normalize(object, method = "RLE", ...)## S4 method for signature 'EdgeObject' normalize(object, method = "RLE", ...)
object |
An |
method |
Method passed to |
... |
Other parameters passed to |
An EdgeObject with updated normalization factors in the
internal DGEList.
normalize(EdgeObject): Normalize an EdgeObject by calculating normalization
factors using the given method
Plot distribution of normalized counts
normBoxplot(before.norm, after.norm, ...)normBoxplot(before.norm, after.norm, ...)
before.norm |
An |
after.norm |
An |
... |
Other parameters passed to |
Called for its side effect of plotting; returns invisibly NULL.
Extract normalisation factors from the object
normFactors(object) ## S4 method for signature 'DGEList' normFactors(object) ## S4 method for signature 'EdgeObject' normFactors(object)normFactors(object) ## S4 method for signature 'DGEList' normFactors(object) ## S4 method for signature 'EdgeObject' normFactors(object)
object |
An object |
A numeric vector of normalisation factors.
normFactors(DGEList): Method for DGEList
normFactors(EdgeObject): Method for EdgeObject
Return number of features
## S4 method for signature 'EdgeResult' nrow(x)## S4 method for signature 'EdgeResult' nrow(x)
x |
An EdgeResults object |
Integer
Pairs plot for EdgeResult
## S3 method for class 'EdgeResult' pairs( x, lower.panel = panel.lmSmooth, upper.panel = panel.cor, freeRelation = TRUE, pch = 19, ... )## S3 method for class 'EdgeResult' pairs( x, lower.panel = panel.lmSmooth, upper.panel = panel.cor, freeRelation = TRUE, pch = 19, ... )
x |
An EdgeResult object. |
lower.panel |
Lower panel, passed to |
upper.panel |
Upper panel, passed to |
freeRelation |
Logical, whether x- and y-axis shoule have the same range |
pch |
Point symbol |
... |
passed to Plot pairwise logFCs |
Called for its side effect of plotting; returns invisibly NULL.
Parse feature information from molecular-phenotyping GCT files
parseMolPhenFeat(gctMatrix)parseMolPhenFeat(gctMatrix)
gctMatrix |
A |
A data.frame with following columns: GeneID (as integer),
GeneSymbol (as character), and Transcript,
with the original names as row names
readMolPhenCoverageGct, which calls this function
internally to parse molecular phenotyping gene features
Get pData (sample annotation)
## S4 method for signature 'DGEList' pData(object)## S4 method for signature 'DGEList' pData(object)
object |
A DGEList |
A data.frame of sample annotations.
Get pData
## S4 method for signature 'EdgeObject' pData(object)## S4 method for signature 'EdgeObject' pData(object)
object |
An EdgeObject |
A data.frame of sample annotations.
Set pData (sample annotation)
## S4 replacement method for signature 'DGEList,data.frame' pData(object) <- value## S4 replacement method for signature 'DGEList,data.frame' pData(object) <- value
object |
A DGEList |
value |
A |
The updated DGEList object with new sample annotations.
Set pData (sample annotation)
## S4 replacement method for signature 'EdgeObject,data.frame' pData(object) <- value## S4 replacement method for signature 'EdgeObject,data.frame' pData(object) <- value
object |
A DGEList |
value |
A |
The updated EdgeObject with new sample annotations.
Plot BCV
plotBCV(x, ...) ## S4 method for signature 'DGEList' plotBCV(x, ...) ## S4 method for signature 'EdgeObject' plotBCV(x, ...) ## S4 method for signature 'EdgeResult' plotBCV(x, ...)plotBCV(x, ...) ## S4 method for signature 'DGEList' plotBCV(x, ...) ## S4 method for signature 'EdgeObject' plotBCV(x, ...) ## S4 method for signature 'EdgeResult' plotBCV(x, ...)
x |
An object |
... |
Other paramters |
Called for its side effect of plotting; returns invisibly NULL.
plotBCV(DGEList): Method for DGEList
plotBCV(EdgeObject): Method for EdgeObject
plotBCV(EdgeResult): Method for EdgeResult
Plot gene expression with knockdown efficiency
plotKnockdown( goiExpr, exprsVar = "exprs", groupVar = "group", controlGroup = NULL, trans = "identity", exprsUnit = "Arbitrary Unit", test = c("wilcox.test", "t.test") )plotKnockdown( goiExpr, exprsVar = "exprs", groupVar = "group", controlGroup = NULL, trans = "identity", exprsUnit = "Arbitrary Unit", test = c("wilcox.test", "t.test") )
goiExpr |
A data.frame containing expression of gene of interest in linear scale, which
must contain columns given below as |
exprsVar |
Character, the variable name of expression. The unit must be in linear scale, not in logarithmic scale, otherwise the knockdown efficiency calculation will be wrong. |
groupVar |
Character, the variable name of grouping. The column must be a factor or a character. |
controlGroup |
|
trans |
Character, transformation of the y-axis, commonly used values include |
exprsUnit |
Character, unit name of expression (TPM, CPM, RPKM, etc.) |
test |
Character, statistical test, |
A ggplot object displaying boxplots of gene expression across
groups with a secondary axis showing knockdown efficiency.
myData <- data.frame(group=gl(3,4), exprs=as.vector(sapply(c(100, 10, 1), function(x) rnorm(4, x)))) plotKnockdown(myData) myData2 <- data.frame(group=rep(c("Vehicle", "Dose1", "Dose2"), each=4), exprs=as.vector(sapply(c(100, 10, 1), function(x) rnorm(4, x)))) plotKnockdown(myData2, controlGroup="Vehicle")myData <- data.frame(group=gl(3,4), exprs=as.vector(sapply(c(100, 10, 1), function(x) rnorm(4, x)))) plotKnockdown(myData) myData2 <- data.frame(group=rep(c("Vehicle", "Dose1", "Dose2"), each=4), exprs=as.vector(sapply(c(100, 10, 1), function(x) rnorm(4, x)))) plotKnockdown(myData2, controlGroup="Vehicle")
plotMDS for EdgeObject
## S3 method for class 'EdgeObject' plotMDS(x, ...)## S3 method for class 'EdgeObject' plotMDS(x, ...)
x |
An EdgeObject object |
... |
Other parameters passed to |
An MDS object (invisibly), as returned by
plotMDS.
Plot top significantly differentially expressed genes by contrast
plotTopSigGenes( countDgeResult, n = 5, nSigned = NULL, identifier = "GeneSymbol" )plotTopSigGenes( countDgeResult, n = 5, nSigned = NULL, identifier = "GeneSymbol" )
countDgeResult |
A |
n |
Integer, how many genes should be visualized per contrast. |
nSigned |
|
identifier |
Character string, column name in |
A ggplot object
plotTopSigGenesByContrast plots one contrast at a time.
edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) plotTopSigGenes(edgeRes, n=6) ## display top two positive and top three negative genes plotTopSigGenes(edgeRes, nSigned=2)edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) plotTopSigGenes(edgeRes, n=6) ## display top two positive and top three negative genes plotTopSigGenes(edgeRes, nSigned=2)
Plot top significantly differentially expressed genes by contrast
plotTopSigGenesByContrast( countDgeResult, contrast, n = 5, nSigned = NULL, identifier = "GeneSymbol" )plotTopSigGenesByContrast( countDgeResult, contrast, n = 5, nSigned = NULL, identifier = "GeneSymbol" )
countDgeResult |
A |
contrast |
A character string, or an index (integer), or a logical vector with one |
n |
Integer, how many genes should be visualized. |
nSigned |
|
identifier |
Character string, column name in |
A ggplot object. If nSigned is not NULL, genes are
plotted with colors: blue indicating down-regulated genes and red indicating up-regulated genes.
plotTopSigGenesByContrast plots all contrasts at once.
edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) plotTopSigGenesByContrast(edgeRes, n=6, contrast=1) plotTopSigGenesByContrast(edgeRes, n=6, contrast=2) ## display top three positive and top three negative genes plotTopSigGenesByContrast(edgeRes, nSigned=3, contrast=1) plotTopSigGenesByContrast(edgeRes, nSigned=4, contrast=2)edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) plotTopSigGenesByContrast(edgeRes, n=6, contrast=1) plotTopSigGenesByContrast(edgeRes, n=6, contrast=2) ## display top three positive and top three negative genes plotTopSigGenesByContrast(edgeRes, nSigned=3, contrast=1) plotTopSigGenesByContrast(edgeRes, nSigned=4, contrast=2)
Get settings in the significance filter
posLogFC(sigFilter) negLogFC(sigFilter) pValue(sigFilter) FDR(sigFilter)posLogFC(sigFilter) negLogFC(sigFilter) pValue(sigFilter) FDR(sigFilter)
sigFilter |
An SigFilter object |
Numeric values of the thresholds
Update SigFilter
posLogFC(object) <- value negLogFC(object) <- value logFC(object) <- value pValue(object) <- value FDR(object) <- value logCPM(object) <- value aveExpr(object) <- value ## S3 method for class 'SigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, ...) ## S3 method for class 'EdgeSigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, logCPM, ...) ## S3 method for class 'LimmaSigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, aveExpr, ...)posLogFC(object) <- value negLogFC(object) <- value logFC(object) <- value pValue(object) <- value FDR(object) <- value logCPM(object) <- value aveExpr(object) <- value ## S3 method for class 'SigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, ...) ## S3 method for class 'EdgeSigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, logCPM, ...) ## S3 method for class 'LimmaSigFilter' update(object, logFC, posLogFC, negLogFC, pValue, FDR, aveExpr, ...)
object |
An |
value |
Numeric, vssigned threshold value |
logFC |
Numeric, logFC filter value, optional. |
posLogFC |
Numeric, positive logFC filter value, optional. |
negLogFC |
Numeric, negative logFC filter value, optional. |
pValue |
Numeric, pValue filter value, optional |
FDR |
Numeric, FDR filter value, optional |
... |
not used now |
logCPM |
Numeric, logCPM filter value, optional (only for |
aveExpr |
Numeric, aveExpr filter value, optional (only for |
An updated SigFilter object.
An updated EdgeSigFilter object.
An updated LimmaSigFilter object.
posLogFC(object) <- value: Updates the posLogFC threshold value
negLogFC(object) <- value: Updates the negLogFC threshold value
logFC(object) <- value: Updates the posLogFC threshold value
pValue(object) <- value: Updates the pValue threshold value
FDR(object) <- value: Updates the FDR threshold value
logCPM(object) <- value: Updates the logCPM threshold value
aveExpr(object) <- value: Updates the aveExpr threshold value
Principal component analysis of DGEList
## S3 method for class 'DGEList' prcomp(x, ntop = NULL, scale = FALSE, verbose = FALSE, ...)## S3 method for class 'DGEList' prcomp(x, ntop = NULL, scale = FALSE, verbose = FALSE, ...)
x |
A |
ntop |
Integer, how many top-variable features should be used? If |
scale |
Logical, whether variance of features should be scaled to 1. |
verbose |
Logical, whether the function should print messages. |
... |
Other parameters passed to The function first remove all-zero-count features, because they can make the PCA plot of samples delusive. Next, it applies Finally, PCA is applied to the vsn-transformed matrix. |
The function returns a prcomp object. The fit object is saved in the vsnFit field in the returned object, and the transformed matrix is saved in the vsnMat field.
myCounts <- matrix(rnbinom(10000, 3, 0.25), nrow=1000) myDgeList <- DGEList(counts=myCounts, samples=data.frame(group=gl(5,2))) myPrcomp <- prcomp(myDgeList) vsn::meanSdPlot(myPrcomp$vsnFit) ## features with zero count in all samples do not contribute to the PCA analysis myDgeList2 <- DGEList(counts=rbind(myCounts, rep(0, 10)), samples=data.frame(group=gl(5,2))) myPrcomp2 <- prcomp(myDgeList2) stopifnot(identical(myPrcomp, myPrcomp2))myCounts <- matrix(rnbinom(10000, 3, 0.25), nrow=1000) myDgeList <- DGEList(counts=myCounts, samples=data.frame(group=gl(5,2))) myPrcomp <- prcomp(myDgeList) vsn::meanSdPlot(myPrcomp$vsnFit) ## features with zero count in all samples do not contribute to the PCA analysis myDgeList2 <- DGEList(counts=rbind(myCounts, rep(0, 10)), samples=data.frame(group=gl(5,2))) myPrcomp2 <- prcomp(myDgeList2) stopifnot(identical(myPrcomp, myPrcomp2))
Run principal component analysis on a DGEListList object
## S3 method for class 'DGEListList' prcomp(x, ntop = NULL, fun = function(x) cpm(x, log = TRUE), ...)## S3 method for class 'DGEListList' prcomp(x, ntop = NULL, fun = function(x) cpm(x, log = TRUE), ...)
x |
A |
ntop |
NULL or integer. If set, only |
fun |
Function, used to transform count data into continuous data used by PCA |
... |
Not used. |
A list of prcomp objects.
Principal component analysis of an expression matrix
prcompExprs(matrix, ntop = NULL, scale = FALSE, nbin = NULL)prcompExprs(matrix, ntop = NULL, scale = FALSE, nbin = NULL)
matrix |
Numeric matrix. Features in rows and samples in columns. |
ntop |
Integer or NULL. If not |
scale |
Logical, whether variance of features should be scaled to 1.
Default |
nbin |
Integer. Genes are divided into |
A prcomp object.
Nguyen, Lan Huong, and Susan Holmes. "Ten Quick Tips for Effective Dimensionality Reduction." PLOS Computational Biology 15, no. 6 (2019): e1006907
myTestExprs <- matrix(rnorm(1000), ncol=10, byrow=FALSE) myTestExprs[1:50, 6:10] <- myTestExprs[1:50, 6:10] + 2 myTopPca <- prcompExprs(myTestExprs, ntop=50, nbin=5)myTestExprs <- matrix(rnorm(1000), ncol=10, byrow=FALSE) myTestExprs[1:50, 6:10] <- myTestExprs[1:50, 6:10] + 2 myTopPca <- prcompExprs(myTestExprs, ntop=50, nbin=5)
Convert p-values to t-statistics
pseudoTfromPvalue(p, df, sign, replaceZero = TRUE)pseudoTfromPvalue(p, df, sign, replaceZero = TRUE)
p |
Numeric, a numeric vector between 0 and 1. |
df |
Numeric, degree of freedom. |
sign |
Logical or integer, positive numbers or |
replaceZero |
Logical, whether small p values or 0 should be replaced by a sufficient small number. Default and recommended: |
A numeric vector of pseudo t-statistics.
pVals <- 10^(seq(-11,0)) signs <- rep(c(TRUE, FALSE), 6) tVals <- pseudoTfromPvalue(pVals, 5, sign=signs) logFCs <- rep(c(1.2,-1.2),6) tValsLogFCs <- pseudoTfromPvalue(pVals, 5, sign=logFCs)pVals <- 10^(seq(-11,0)) signs <- rep(c(TRUE, FALSE), 6) tVals <- pseudoTfromPvalue(pVals, 5, sign=signs) logFCs <- rep(c(1.2,-1.2),6) tValsLogFCs <- pseudoTfromPvalue(pVals, 5, sign=logFCs)
Return a range determined by the quantile of the data
quantileRange(x, outlier = 0.01, symmetric = TRUE)quantileRange(x, outlier = 0.01, symmetric = TRUE)
x |
A numeric vector |
outlier |
Quantile (lower and higher) threshold |
symmetric |
Logical, whether the range must be symmetric around zero. |
A numeric vector of two (c(low, high)).
Read Illumina MolPhen sample sheet from XLS files
read_illumina_sampleSheet_xls(file)read_illumina_sampleSheet_xls(file)
file |
A XLS/XLSX file containing in the first sheet the sample sheet of a molecular phenotyping experiment |
A data.frame annotating the samples
Read a Biokit output directory into a DGEList object for downstream analysis
readBiokitAsDGEList( dir, anno = c("refseq", "ensembl", "gencode"), useCollapsedData = FALSE, verbose = FALSE )readBiokitAsDGEList( dir, anno = c("refseq", "ensembl", "gencode"), useCollapsedData = FALSE, verbose = FALSE )
dir |
Biokit output directory |
anno |
Annotation type, either |
useCollapsedData |
Logical, |
verbose |
Logical
The function depends on |
A DGEList object with count and TPM matrices, sample
annotation, feature annotation, and an additional BiokitAnno
element indicating the annotation type used.
##... (TODO: add a mock output directory in testdata)##... (TODO: add a mock output directory in testdata)
Read feature annotation from Biokit directory
readBiokitFeatureAnnotation( dir, anno = c("refseq", "ensembl", "gencode"), verbose = FALSE )readBiokitFeatureAnnotation( dir, anno = c("refseq", "ensembl", "gencode"), verbose = FALSE )
dir |
Character string, a Biokit output directory. |
anno |
Character, indicating the annotation type. |
verbose |
Logical |
A data.frame containing feature annotation, with feature IDs
as characters in rownames. The data frame contains following columns
depending on the anno parameter:
FeatureName, the primary key of feature name as characters
GeneID (refseq only) or EnsemblID (ensembl only)
GeneSymbol
mean: mean length
median: median length
longest_isoform: longest isoform
merged: total length of merged exons
The function depends on the refseq.annot.gz (ensembl.annot.gz)
and refseq.geneLength.gz (ensembl.geneLength.gz) files in the
biokit directory.
If .annot.gz file is not found (which can be the case, for instance,
when older biokit output directories are used), feature annotation is
read from the count GCT file. The resulting data.frame will only
contain two columns: FeatureName and Description.
If .geneLength.gz file is not found, no gene length information is
appended.
## TODO add small example files## TODO add small example files
Read GCT files from Biokit output directory
readBiokitGctFile( dir, anno = c("refseq", "ensembl", "gencode"), type = c("count", "tpm", "count_collapsed", "tpm_collapsed", "log2tpm"), verbose = FALSE )readBiokitGctFile( dir, anno = c("refseq", "ensembl", "gencode"), type = c("count", "tpm", "count_collapsed", "tpm_collapsed", "log2tpm"), verbose = FALSE )
dir |
Biokit output directory |
anno |
Annotation type, either |
type |
GCT file type, |
verbose |
Logical, if TRUE, verbose mode is turned on. |
A numeric matrix with the attribute desc encoding the values
in the description column of the GCT format.
The function depends on gct (in case anno="refseq") or
gct-ens (in case anno="ensembl") sub-directory in the biokit
output directory.
##... (TODO: add a mock output directory in testdata)##... (TODO: add a mock output directory in testdata)
Read Biokit phenodata
readBiokitPhenodata(dir, verbose = FALSE)readBiokitPhenodata(dir, verbose = FALSE)
dir |
Character string, Biokit output directory |
verbose |
Logical |
A data.frame with sample annotation in columns, and sample
names (identical as the names in gct files, character strings) are row
names. Nmes of the first three columns are fixed:
SampleName, SampleID and group concatenated by
underscore
SampleID
group
The function depends on the annot/phenoData.meta file in the biokit
output directory.
Read feature annotation for EdgeR pipeline
readFeatureAnnotationForEdgeR(featureNames, file = NULL)readFeatureAnnotationForEdgeR(featureNames, file = NULL)
featureNames |
Character string, feature names |
file |
A tab-delimited file with header that provides feature annotation |
A data.frame, with the first column named FeatureName,
which are the input feature names. The rest columns contain annotations.
The functions tries to parse feature annotation file if it is present. If
not, it will use guessAndAnnotate to
annotate the features.
Note that for gene symbols, only human gene symbols are supported.
anno <- "GeneID\tGeneSymbol\n1234\tCCR5\n1235\tCCR6" annoFile <- tempfile() writeLines(anno, annoFile) featIds <- c("1235", "1234") ## use file readFeatureAnnotationForEdgeR(featIds, file=annoFile) ## Not run: ## use ribiosAnnotation, depending on database connection readFeatureAnnotationForEdgeR(featIds, file=NULL) ## End(Not run)anno <- "GeneID\tGeneSymbol\n1234\tCCR5\n1235\tCCR6" annoFile <- tempfile() writeLines(anno, annoFile) featIds <- c("1235", "1234") ## use file readFeatureAnnotationForEdgeR(featIds, file=annoFile) ## Not run: ## use ribiosAnnotation, depending on database connection readFeatureAnnotationForEdgeR(featIds, file=NULL) ## End(Not run)
Read molecular phenotyping output folder into a DGEList object
readMolPhenAsDGEList(dir)readMolPhenAsDGEList(dir)
dir |
Path of molecular phenotyping output folder, generated by the mpsnake tool. |
A DGEList object containing counts, gene and sample annotation.
#todo#todo
Read molecular phenotyping coverage file
readMolPhenCoverageGct(file)readMolPhenCoverageGct(file)
file |
Character string, a coverage GCT file of a molecular phenotyping experiment. |
A list of two elements: coverage, which represents the
coverage matrix, and genes, which represents feature annotation.
mpsCov <- readMolPhenCoverageGct(system.file(file.path("extdata", "AmpliSeq_files", "MolPhen-coverage-example-20200115.gct"), package="ribiosNGS"))mpsCov <- readMolPhenCoverageGct(system.file(file.path("extdata", "AmpliSeq_files", "MolPhen-coverage-example-20200115.gct"), package="ribiosNGS"))
Read mpsnake output directory into a DGEList object
readMpsnakeAsDGEList(dir, minReads = NULL)readMpsnakeAsDGEList(dir, minReads = NULL)
dir |
Character string, path of mpsnake pipeline directory (or the |
minReads |
|
A DGEList object containing counts, gene, and sample annotation
mpsnakeDir <- system.file("extdata/mpsnake-minimal-outdir", package="ribiosNGS") mpsDgeList <- readMpsnakeAsDGEList(mpsnakeDir) ## equivalent mpsnakeResDir <- system.file("extdata/mpsnake-minimal-outdir", "results", package="ribiosNGS") mpsDgeList <- readMpsnakeAsDGEList(mpsnakeResDir)mpsnakeDir <- system.file("extdata/mpsnake-minimal-outdir", package="ribiosNGS") mpsDgeList <- readMpsnakeAsDGEList(mpsnakeDir) ## equivalent mpsnakeResDir <- system.file("extdata/mpsnake-minimal-outdir", "results", package="ribiosNGS") mpsDgeList <- readMpsnakeAsDGEList(mpsnakeResDir)
Read sample annotation from tab-delimited file for EdgeR analysis
readSampleAnnotationForEdgeR(sampleNames, file = NULL, ...)readSampleAnnotationForEdgeR(sampleNames, file = NULL, ...)
sampleNames |
Character string, giving sample names |
file |
Character string, path to a tab-delimited file, or |
... |
Other parameter passed to |
A data.frame containing sample annotation, removing
'lib.size', and 'norm.factors' because they will be added
by the edgeR pipeline
phenoDataFile <- system.file("extdata/phenoData/test-phenoData.txt", package="ribiosNGS") readSampleAnnotationForEdgeR(phenoDataFile) readSampleAnnotationForEdgeR(file=NULL, sampleNames=as.character(1:4))phenoDataFile <- system.file("extdata/phenoData/test-phenoData.txt", package="ribiosNGS") readSampleAnnotationForEdgeR(phenoDataFile) readSampleAnnotationForEdgeR(file=NULL, sampleNames=as.character(1:4))
Rename contrast by a pair of vectors
renameContrast(edgeResult, oldContrastName, newContrastName)renameContrast(edgeResult, oldContrastName, newContrastName)
edgeResult |
An |
oldContrastName |
A vector of character strings giving old contrast names |
newContrastName |
completeA vector of character strings giving new contrast names, which match the |
A new EdgeResult object
Rename contrast by a function
renameContrastByFunc(edgeResult, func)renameContrastByFunc(edgeResult, func)
edgeResult |
An |
func |
A function receiving a vector of character strings as input, and returns
another vector of the same length as output, for instance |
A new EdgeResult object
Replace NA counts with zero counts
replaceNAwithZero(edgeObj)replaceNAwithZero(edgeObj)
edgeObj |
An EdgeObject object |
An EdgeObject object
ribiosNGS provides data structures and functions for next-generation sequencing gene expression analysis
Variance of features in rows
rowVars(x, na.rm = TRUE)rowVars(x, na.rm = TRUE)
x |
Numeric matrix |
na.rm |
Logical. Should missing values (including NaN) be omitted from the calculations? |
A numeric vector of row variances.
myVal <- matrix(1:9, nrow=3, byrow=FALSE) myVar <- rowVars(myVal) stopifnot(identical(myVar, c(9,9,9)))myVal <- matrix(1:9, nrow=3, byrow=FALSE) myVar <- rowVars(myVal) stopifnot(identical(myVar, c(9,9,9)))
Convert a RPKM matrix to a TPM matrix
rpkm2tpm(x)rpkm2tpm(x)
x |
A count matrix or other objects that can be converted to a matrix
by |
transcripts per million (TPM) values
testMatrix <- matrix(rnbinom(200, size=5, prob=0.1), nrow=20, ncol=10) testMatrixGeneLen <- as.integer(10^rnorm(20, mean=3, sd=0.5)) testMatrixTpm <- tpm(testMatrix, testMatrixGeneLen) testMatrixRpkm <- edgeR::rpkm(testMatrix, testMatrixGeneLen) testthat::expect_equal(testMatrixTpm, rpkm2tpm(testMatrixRpkm))testMatrix <- matrix(rnbinom(200, size=5, prob=0.1), nrow=20, ncol=10) testMatrixGeneLen <- as.integer(10^rnorm(20, mean=3, sd=0.5)) testMatrixTpm <- tpm(testMatrix, testMatrixGeneLen) testMatrixRpkm <- edgeR::rpkm(testMatrix, testMatrixGeneLen) testthat::expect_equal(testMatrixTpm, rpkm2tpm(testMatrixRpkm))
Return sample names from a DGEList object
## S4 method for signature 'DGEList' sampleNames(object)## S4 method for signature 'DGEList' sampleNames(object)
object |
A DGEList object |
A character vector of sample names.
Sample names
## S4 method for signature 'EdgeObject' sampleNames(object)## S4 method for signature 'EdgeObject' sampleNames(object)
object |
An EdgeObject |
A character vector of sample names.
Set common dispersion if missing
setCommonDispIfMissing(object, value) ## S4 method for signature 'DGEList,numeric' setCommonDispIfMissing(object, value) ## S4 method for signature 'EdgeObject,numeric' setCommonDispIfMissing(object, value)setCommonDispIfMissing(object, value) ## S4 method for signature 'DGEList,numeric' setCommonDispIfMissing(object, value) ## S4 method for signature 'EdgeObject,numeric' setCommonDispIfMissing(object, value)
object |
An object |
value |
Numeric |
The object, with common dispersion set if it was previously missing.
setCommonDispIfMissing(object = DGEList, value = numeric): Method for DGEList
setCommonDispIfMissing(object = EdgeObject, value = numeric): Method for EdgeObject
Show DGEList
## S4 method for signature 'DGEList' show(object)## S4 method for signature 'DGEList' show(object)
object |
A DGEList object |
Called for its side effect of printing; returns invisibly NULL.
Show DGEListList
## S4 method for signature 'DGEListList' show(object)## S4 method for signature 'DGEListList' show(object)
object |
A DGEListList object |
Called for its side effect of printing; returns invisibly NULL.
Show an EdgeResult object
## S4 method for signature 'EdgeResult' show(object)## S4 method for signature 'EdgeResult' show(object)
object |
An EdgeResult object |
Invisibly returns the formatted message string.
Show an EdgeSigFilter object
## S4 method for signature 'EdgeSigFilter' show(object)## S4 method for signature 'EdgeSigFilter' show(object)
object |
An SigFilter object |
Invisibly returns the formatted message strings.
Show an LimmaSigFilter object
## S4 method for signature 'LimmaSigFilter' show(object)## S4 method for signature 'LimmaSigFilter' show(object)
object |
An LimmaSigFilter object |
Invisibly returns the formatted message strings.
Show an SigFilter object
## S4 method for signature 'SigFilter' show(object)## S4 method for signature 'SigFilter' show(object)
object |
An SigFilter object |
Invisibly returns the formatted message string.
Retrieve SigFilter objects from other objects Return the SigFilter in use
sigFilter(countDgeResult)sigFilter(countDgeResult)
countDgeResult |
An |
An SigFilter object
Build a SigFilter
SigFilter(logFC, posLogFC, negLogFC, pValue, FDR) EdgeSigFilter(logFC, posLogFC, negLogFC, pValue, FDR, logCPM) LimmaSigFilter(logFC, posLogFC, negLogFC, pValue, FDR, aveExpr)SigFilter(logFC, posLogFC, negLogFC, pValue, FDR) EdgeSigFilter(logFC, posLogFC, negLogFC, pValue, FDR, logCPM) LimmaSigFilter(logFC, posLogFC, negLogFC, pValue, FDR, aveExpr)
logFC |
Missing or positive numeric |
posLogFC |
Missing or positive numeric |
negLogFC |
Missing or negative numeric |
pValue |
Missing or numeric between 0 and 1 |
FDR |
Missing or numeric between 0 and 1 |
logCPM |
logCPM filter, only valid for |
aveExpr |
Average expression filter, only valid for |
A SigFilter object
SigFilter() SigFilter(logFC=2) SigFilter(negLogFC=-1) SigFilter(FDR=0.05) esf <- EdgeSigFilter(logFC=2, FDR=0.05, logCPM=0) LimmaSigFilter(logFC=1, FDR=0.05, aveExpr=10)SigFilter() SigFilter(logFC=2) SigFilter(negLogFC=-1) SigFilter(FDR=0.05) esf <- EdgeSigFilter(logFC=2, FDR=0.05, logCPM=0) LimmaSigFilter(logFC=1, FDR=0.05, aveExpr=10)
Base result filter for significantly regulated genes
posLogFCNumeric, positive logFC threshold (larger values are kept)
negLogFCNumeric, negative logFC threshold (more negative values are kept)
pValueNumeric, p-value treshold (smaller values are kept)
FDRNumeric, FDR treshold
Replace the SigFilter of an CountDgeResult
sigFilter(countDgeResult) <- valuesigFilter(countDgeResult) <- value
countDgeResult |
An |
value |
An SigFilter object |
An updated countDgeResult object
Return significantly regulated genes
sigGene(countDgeResult, contrast, value = NULL) sigPosGene(countDgeResult, contrast, value = NULL) sigNegGene(countDgeResult, contrast, value = NULL)sigGene(countDgeResult, contrast, value = NULL) sigPosGene(countDgeResult, contrast, value = NULL) sigNegGene(countDgeResult, contrast, value = NULL)
countDgeResult |
An EdgeResult object |
contrast |
Character, contrast(s) of interest |
value |
|
A vector of identifiers
sigPosGene(): Only return positively significantly regulated genes
sigNegGene(): Only return negatively significantly regulated genes
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneID=1:nrow(exMat), GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGenes(exDgeRes) sigPosGenes(exDgeRes) sigNegGenes(exDgeRes) ## specify the value type to return sigGenes(exDgeRes, value="GeneSymbol") sigPosGenes(exDgeRes, value="GeneSymbol") sigNegGenes(exDgeRes, value="GeneSymbol")exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneID=1:nrow(exMat), GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGenes(exDgeRes) sigPosGenes(exDgeRes) sigNegGenes(exDgeRes) ## specify the value type to return sigGenes(exDgeRes, value="GeneSymbol") sigPosGenes(exDgeRes, value="GeneSymbol") sigNegGenes(exDgeRes, value="GeneSymbol")
Barchart of significantly regulated genes
sigGeneBarchart( edgeResult, logy = FALSE, scales = list(x = list(rot = 45), y = list(alternating = 1, tck = c(1, 0))), stack = FALSE, ylab = "Significant DEGs", col = c(positive = "orange", negative = "lightblue"), auto.key = list(columns = 2), ... )sigGeneBarchart( edgeResult, logy = FALSE, scales = list(x = list(rot = 45), y = list(alternating = 1, tck = c(1, 0))), stack = FALSE, ylab = "Significant DEGs", col = c(positive = "orange", negative = "lightblue"), auto.key = list(columns = 2), ... )
edgeResult |
An EdgeResult object |
logy |
Logical, whether y-axis should be log-10 transformed |
scales |
passed to |
stack |
passed to |
ylab |
passed to |
col |
passed to |
auto.key |
passed to |
... |
passed to |
A trellis object (lattice barchart).
Return counts of significantly regulated genes
sigGeneCounts(countDgeResult, value = NULL)sigGeneCounts(countDgeResult, value = NULL)
countDgeResult |
An EdgeResult object |
value |
|
A data.frame containing counts of positively and negatively regulated genes, the sum, as well as total number of features
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGeneCounts(exDgeRes)exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGeneCounts(exDgeRes)
Return dgeTable containing significantly regulated genes in respective contrasts
sigGeneDgeTable(countDgeResult, value = "FeatrueName")sigGeneDgeTable(countDgeResult, value = "FeatrueName")
countDgeResult |
An EdgeResult object |
value |
A character string, it must be a column name in the feature annotation data. Default: FeatureName. |
A data.frame containing dgeTable of positively and negatively regulated genes in respective contrasts
exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGeneDgeTable(exDgeRes, value="GeneSymbol")exMat <- matrix(rpois(120, 10), nrow=20, ncol=6) exMat[2:4, 4:6] <- exMat[2:4, 4:6]+20 exMat[7:9, 1:3] <- exMat[7:9, 1:3]+20 exGroups <- gl(2,3, labels=c("Group1", "Group2")) exDesign <- model.matrix(~0+exGroups) colnames(exDesign) <- levels(exGroups) exContrast <- matrix(c(-1,1), ncol=1, dimnames=list(c("Group1", "Group2"), c("Group2.vs.Group1"))) exDescon <- DesignContrast(exDesign, exContrast, groups=exGroups) exFdata <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(exMat))) exPdata <- data.frame(Name=sprintf("Sample%d", 1:ncol(exMat)), Group=exGroups) exObj <- EdgeObject(exMat, exDescon, fData=exFdata, pData=exPdata) exDgeRes <- dgeWithEdgeR(exObj) sigGeneDgeTable(exDgeRes, value="GeneSymbol")
Return gene identifiers of significant DGEs
sigGeneIdentifiers(dgeResult, contrast, sigFunc = isSig, value = NULL)sigGeneIdentifiers(dgeResult, contrast, sigFunc = isSig, value = NULL)
dgeResult |
An DgeResult object. |
contrast |
A character string, a contrast of interest. |
sigFunc |
A function, defining the type of significant genes. |
value |
|
A vector of character strings indicating the gene identifiers that are significantly regulated. If no defined types are found, either rownames or the first column is returned
Return significantly regulated genes of all contrasts
sigGenes(countDgeResult, value = NULL) sigPosGenes(countDgeResult, value = NULL) sigNegGenes(countDgeResult, value = NULL)sigGenes(countDgeResult, value = NULL) sigPosGenes(countDgeResult, value = NULL) sigNegGenes(countDgeResult, value = NULL)
countDgeResult |
An EdgeResult object |
value |
|
A list of vectors of identifiers
sigPosGenes(): Only return significantly positively regulated genes
sigNegGenes(): Only return significantly negatively regulated genes
TODO fix: add InputFeature
Send an edgeR analysis job to SLURM
slurmEdgeR( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", overwrite = c("ask", "overwrite", "append", "no"), mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("3h", "1d", "3d", "15d", "interactive", "preempt"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )slurmEdgeR( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", overwrite = c("ask", "overwrite", "append", "no"), mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("3h", "1d", "3d", "15d", "interactive", "preempt"), rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )
dgeList |
An |
designContrast |
The DesignContrast object to model the data |
outdir |
Output directory of the edgeR script. Default value "edgeR_output". |
outfilePrefix |
Prefix of the output files. It can include directories,
e.g. |
overwrite |
If |
mps |
Logical, whether molecular-phenotyping analysis is run. |
limmaVoom |
Logical, whether the limma-voom model is run instead of the edgeR model. |
appendGmt |
|
qos |
Character, specifying Quality of Service of Slurm. Available values include |
rootPath |
Character string, the directory of geneexpression scripts, under which |
debug |
Logical, if |
A list of two items, command, the command line call, and
output, the output of the SLURM command in bash
Even if the output directory is empty, if overwrite is set to
no (or if the user answers no), the job will not be started.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDescon <- DesignContrast(myDesign, myContrast) ## \dontrun{ ## slurmEdgeR(y, myDescon, outfilePrefix="test", outdir=tempdir()) ## }mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDescon <- DesignContrast(myDesign, myContrast) ## \dontrun{ ## slurmEdgeR(y, myDescon, outfilePrefix="test", outdir=tempdir()) ## }
Return the SLURM command to run the edgeR script
slurmEdgeRcommand( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("3h", "1d", "3d", "15d", "interactive", "preempt"), params = "", rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )slurmEdgeRcommand( dgeList, designContrast, outdir = "edgeR_output", outfilePrefix = "an-unnamed-project-", mps = FALSE, limmaVoom = FALSE, appendGmt = NULL, qos = c("3h", "1d", "3d", "15d", "interactive", "preempt"), params = "", rootPath = "/apps/rocs/pRED/groups/bioinfo/geneexpression", debug = FALSE )
dgeList |
An |
designContrast |
The DesignContrast object to model the data |
outdir |
Output directory of the edgeR script. Default value "edgeR_output". |
outfilePrefix |
Prefix of the output files. It can include directories,
e.g. |
mps |
Logical, whether molecular-phenotyping analysis is run. |
limmaVoom |
Logical, whether the limma-voom model is run instead of the edgeR model |
appendGmt |
|
qos |
Character, specifying Quality of Service of Slurm. Available values include |
params |
Character, further parameters to pass to sbatch, for instance "–partition ANOTHER_PARITION" |
rootPath |
Character string, the directory of geneexpression scripts, under which |
debug |
Logical, if This function wraps the function It uses |
A character string containing the SLURM sbatch command to
submit the edgeR analysis job.
mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDesCon <- DesignContrast(designMatrix=myDesign, contrastMatrix=myContrast) mytempdir <- tempdir() slurmEdgeRcommand(y, myDesCon, outfilePrefix="test", outdir=mytempdir) ## clean up unlink(file.path(mytempdir, "input_data"), recursive=TRUE) slurmFile <- file.path(dirname(mytempdir), paste0("slurm-", basename(mytempdir), ".sh")) unlink(slurmFile)mat <- matrix(rnbinom(100, mu=5, size=2), ncol=10) rownames(mat) <- sprintf("gene%d", 1:nrow(mat)) myFac <- gl(2,5, labels=c("Control", "Treatment")) y <- edgeR::DGEList(counts=mat, group=myFac) myDesign <- model.matrix(~myFac); colnames(myDesign) <- levels(myFac) myContrast <- limma::makeContrasts(Treatment, levels=myDesign) myDesCon <- DesignContrast(designMatrix=myDesign, contrastMatrix=myContrast) mytempdir <- tempdir() slurmEdgeRcommand(y, myDesCon, outfilePrefix="test", outdir=mytempdir) ## clean up unlink(file.path(mytempdir, "input_data"), recursive=TRUE) slurmFile <- file.path(dirname(mytempdir), paste0("slurm-", basename(mytempdir), ".sh")) unlink(slurmFile)
Smear plot
smearPlot(object, ...) ## S4 method for signature 'EdgeResult' smearPlot( object, contrast = NULL, freeRelation = FALSE, xlab = "Average logCPM", ylab = "logFC", pch = 19, cex = 0.2, smearWidth = 0.5, panel.first = grid(), smooth.scatter = FALSE, lowess = FALSE, multipage = FALSE, ... )smearPlot(object, ...) ## S4 method for signature 'EdgeResult' smearPlot( object, contrast = NULL, freeRelation = FALSE, xlab = "Average logCPM", ylab = "logFC", pch = 19, cex = 0.2, smearWidth = 0.5, panel.first = grid(), smooth.scatter = FALSE, lowess = FALSE, multipage = FALSE, ... )
object |
An object |
... |
Other parameters |
contrast |
Character, contrast of interest |
freeRelation |
Logical |
xlab |
Character |
ylab |
Character |
pch |
Character or integer |
cex |
Numeric |
smearWidth |
Numeric |
panel.first |
Grid |
smooth.scatter |
Logical |
lowess |
Logical |
multipage |
Logical |
Called for its side effect of plotting; returns invisibly NULL.
smearPlot(EdgeResult): Method for EdgeResult
Split a DGEList object by a factor of samples (default) or genes
## S3 method for class 'DGEList' split(x, f, drop = FALSE, bySample = TRUE, sampleDropLevels = TRUE, ...) splitDGEList(x, f, drop = FALSE, bySample = TRUE, sampleDropLevels = TRUE, ...)## S3 method for class 'DGEList' split(x, f, drop = FALSE, bySample = TRUE, sampleDropLevels = TRUE, ...) splitDGEList(x, f, drop = FALSE, bySample = TRUE, sampleDropLevels = TRUE, ...)
x |
A |
f |
A factor vector. Other types will be coereced into factors. |
drop |
Not used now |
bySample |
Logical, if |
sampleDropLevels |
Logical, if |
... |
Not used so far. |
A DGEListList object, a list of DGEList objects split
by the factor f.
splitDGEList(): A wrapper of split.DGEList
y1 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1)), GeneType=gl(5,50)) rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), each=2))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) d1SampleSplit <- split(d1, d1$samples$donor) d1GeneSplit <- split(d1, d1$genes$GeneType, bySample=FALSE)y1 <- matrix(rnbinom(1000, mu=5, size=2), ncol=4) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1)), GeneType=gl(5,50)) rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,2, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), each=2))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) d1SampleSplit <- split(d1, d1$samples$donor) d1GeneSplit <- split(d1, d1$genes$GeneType, bySample=FALSE)
Make static gene-level plots of an EdgeResult object
staticGeneLevelPlots(edgeResult)staticGeneLevelPlots(edgeResult)
edgeResult |
An EdgeResult object |
Called for its side effect of generating plots; returns invisibly NULL.
edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) staticGeneLevelPlots(edgeRes) limmaVoomRes <- dgeWithEdgeR(edgeObj) staticGeneLevelPlots(limmaVoomRes)edgeObj <- exampleEdgeObject() edgeRes <- dgeWithEdgeR(edgeObj) staticGeneLevelPlots(edgeRes) limmaVoomRes <- dgeWithEdgeR(edgeObj) staticGeneLevelPlots(limmaVoomRes)
Detect surrogate variables from count data and remove their effects VSN-transformed counts
svaseqRemove(dgeList, design, nullModel, verbose = FALSE, offset)svaseqRemove(dgeList, design, nullModel, verbose = FALSE, offset)
dgeList |
An |
design |
Design matrix |
nullModel |
Null model matrix |
verbose |
Logical |
offset |
If provided, it is passed to In case no significant surrogate variables are detected, PCA analysis is applied to the vsn-transformed matrix. |
A list with following items
Results of svaseq
Fit object of vsn
Fitted matrix of vsn
Fitted matrix of vsn, with surrogates' effect removed
PCA object derived from vsnBatchRemoved
PCA scores with annotations
Design matrix with surrogates variables appended if any
This function needs to be harmonized with the other SVA functions. The reason is that svaseq was for a long time not stable until recently. Therefore this function is written later, and unfortunately the outcome is not harmonized yet.
y1org <- matrix(rnbinom(4000, mu=5, size=2), ncol=8) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1org))) y1 <- y1org y1[30:120, 4:7] <- y1[30:120, 4:7]+9 ## mimicking batch effect rownames(y1org) <- rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,4, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), 4))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) d2 <- DGEList(counts=y1org, genes=genes1, samples=anno1) design <- model.matrix(~treatment+donor, data=d1$samples) nullModel <- model.matrix(~donor, data=d1$samples) d1VsnSvaRes <- svaseqRemove(d1, design, nullModel) d2VsnSvaRes <- svaseqRemove(d2, design, nullModel)y1org <- matrix(rnbinom(4000, mu=5, size=2), ncol=8) genes1 <- data.frame(GeneSymbol=sprintf("Gene%d", 1:nrow(y1org))) y1 <- y1org y1[30:120, 4:7] <- y1[30:120, 4:7]+9 ## mimicking batch effect rownames(y1org) <- rownames(y1) <- rownames(genes1) <- 1:nrow(y1) anno1 <- data.frame(treatment=gl(2,4, labels=c("ctrl", "tmt")), donor=factor(rep(c(1,2), 4))) d1 <- DGEList(counts=y1, genes=genes1, samples=anno1) d2 <- DGEList(counts=y1org, genes=genes1, samples=anno1) design <- model.matrix(~treatment+donor, data=d1$samples) nullModel <- model.matrix(~donor, data=d1$samples) d1VsnSvaRes <- svaseqRemove(d1, design, nullModel) d2VsnSvaRes <- svaseqRemove(d2, design, nullModel)
Tagwise biological coefficients of variance
tagwiseBCV(x) ## S4 method for signature 'DGEList' tagwiseBCV(x) ## S4 method for signature 'EdgeResult' tagwiseBCV(x)tagwiseBCV(x) ## S4 method for signature 'DGEList' tagwiseBCV(x) ## S4 method for signature 'EdgeResult' tagwiseBCV(x)
x |
An object |
A numeric vector of tagwise BCV values.
tagwiseBCV(DGEList): method for DGEList
tagwiseBCV(EdgeResult): method for EdgeResult
Test GLM
testGLM(object, fit) ## S4 method for signature 'EdgeObject,DGEGLM' testGLM(object, fit)testGLM(object, fit) ## S4 method for signature 'EdgeObject,DGEGLM' testGLM(object, fit)
object |
An object |
fit |
A fit object |
An EdgeResult object containing the test results.
testGLM(object = EdgeObject, fit = DGEGLM): Method for EdgeObject and DGEGLM.
Return raw expression of top differentially expressed genes of multiple contrasts
topDgeExpression( edgeResult, ntop = 10, contrast = NULL, exprsFun = function(dgeList) cpm(dgeList, log = TRUE) )topDgeExpression( edgeResult, ntop = 10, contrast = NULL, exprsFun = function(dgeList) cpm(dgeList, log = TRUE) )
edgeResult |
An |
ntop |
Integer, number of top differentially expressed genes. |
contrast |
|
exprsFun |
A function to derive expression values from |
A tibble object with Contrast in the first column and
a wide table of raw expression, feature annotation, and sample annotations
in the rest columns
Return raw expression of top differentially expressed genes of one contrast
topDgeExpressionByContrast( edgeResult, ntop = 10, contrast, exprsFun = function(dgeList) cpm(dgeList, log = TRUE) )topDgeExpressionByContrast( edgeResult, ntop = 10, contrast, exprsFun = function(dgeList) cpm(dgeList, log = TRUE) )
edgeResult |
An |
ntop |
Integer, number of top differentially expressed genes. |
contrast |
A character string or an integer value, specifying which contrast is considered. |
exprsFun |
A function to derive expression values from |
A tibble object with Contrast in the first column and
a wide table of raw expression, feature annotation, and sample annotations
in the rest columns
Convert count matrix to TPM values
tpm(x, gene.length)tpm(x, gene.length)
x |
A count matrix or other objects that can be converted to a matrix
by |
gene.length |
A numeric vector of the same length as the row count of
|
transcripts per million (TPM) values
testMatrix <- matrix(rnbinom(200, size=5, prob=0.1), nrow=20, ncol=10) testMatrixGeneLen <- as.integer(10^rnorm(20, mean=3, sd=0.5)) testMatrixTpm <- tpm(testMatrix, testMatrixGeneLen)testMatrix <- matrix(rnbinom(200, size=5, prob=0.1), nrow=20, ncol=10) testMatrixGeneLen <- as.integer(10^rnorm(20, mean=3, sd=0.5)) testMatrixTpm <- tpm(testMatrix, testMatrixGeneLen)
Trended biological coefficients of variance
trendedBCV(x) ## S4 method for signature 'DGEList' trendedBCV(x) ## S4 method for signature 'EdgeResult' trendedBCV(x)trendedBCV(x) ## S4 method for signature 'DGEList' trendedBCV(x) ## S4 method for signature 'EdgeResult' trendedBCV(x)
x |
An object |
A numeric vector of trended BCV values.
trendedBCV(DGEList): method for DGEList
trendedBCV(EdgeResult): method for EdgeResult
Update a contrast matrix given a surrogate variable matrix
updateContrastMatrixWithSV(contrastMatrix, svMatrix)updateContrastMatrixWithSV(contrastMatrix, svMatrix)
contrastMatrix |
A contrast matrix |
svMatrix |
A surrogate-variable matrix, for instance returned by sva |
An updated matrix, with surrogate matrix variables appended to the rows
exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) colnames(exDesign) <- c("Baseline", "Treatment") exContrast <- limma::makeContrasts("Treatment"="Treatment", levels=exDesign) exVoomSvaRes <- voomSVA(exCounts, exDesign) updateContrastMatrixWithSV(exContrast, exVoomSvaRes)exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) colnames(exDesign) <- c("Baseline", "Treatment") exContrast <- limma::makeContrasts("Treatment"="Treatment", levels=exDesign) exVoomSvaRes <- voomSVA(exCounts, exDesign) updateContrastMatrixWithSV(exContrast, exVoomSvaRes)
Update design matrix by SVA
updateDesignMatrixBySVA(object, design, ...) ## S4 method for signature 'DGEList,formula' updateDesignMatrixBySVA(object, design, ...)updateDesignMatrixBySVA(object, design, ...) ## S4 method for signature 'DGEList,formula' updateDesignMatrixBySVA(object, design, ...)
object |
An object |
design |
Design matrix or formula |
... |
Other parameters |
A design matrix updated with surrogate variables.
updateDesignMatrixBySVA(object = DGEList, design = formula): for DGEList and formula
Update the SigFilter
updateSigFilter(countDgeResult, logFC, posLogFC, negLogFC, pValue, FDR, ...)updateSigFilter(countDgeResult, logFC, posLogFC, negLogFC, pValue, FDR, ...)
countDgeResult |
An |
logFC |
Numeric |
posLogFC |
Numeric |
negLogFC |
Numeric |
pValue |
Numeric |
FDR |
Numeric |
... |
Other parameters, now used ones including |
An updated CountDgeResult object with updated SigFilter
Volcano plot
volcanoPlot(object, ...) ## S4 method for signature 'EdgeResult' volcanoPlot( object, contrast = NULL, freeRelation = FALSE, colramp = ribiosPlot::heat, multipage = FALSE, yValue = c("PValue", "FDR"), xlim = NULL, ylim = NULL, main = NULL, topLabel = NULL, labelType = NULL, ... )volcanoPlot(object, ...) ## S4 method for signature 'EdgeResult' volcanoPlot( object, contrast = NULL, freeRelation = FALSE, colramp = ribiosPlot::heat, multipage = FALSE, yValue = c("PValue", "FDR"), xlim = NULL, ylim = NULL, main = NULL, topLabel = NULL, labelType = NULL, ... )
object |
An object |
... |
Other parameters |
contrast |
Character, contrast of interest. If |
freeRelation |
Logical. |
colramp |
Function, color palette. |
multipage |
Logical. |
yValue |
Character string, either |
xlim |
NULL or a numeric vector of two |
ylim |
NULL or a numeric vector of two. |
main |
Character, title. |
topLabel |
NULL or an integer number, number of top features to be labelled |
labelType |
NULL or a character string, a column name in the feature annotation |
Called for its side effect of plotting; returns invisibly NULL.
volcanoPlot(EdgeResult): Method for EdgeResult
Perform VOOM analysis
voom(object, ...) ## S4 method for signature 'DGEList' voom(object, ...) ## S4 method for signature 'matrix' voom(object, ...) ## S4 method for signature 'ExpressionSet' voom(object, ...) ## S4 method for signature 'EdgeObject' voom(object, ...)voom(object, ...) ## S4 method for signature 'DGEList' voom(object, ...) ## S4 method for signature 'matrix' voom(object, ...) ## S4 method for signature 'ExpressionSet' voom(object, ...) ## S4 method for signature 'EdgeObject' voom(object, ...)
object |
An object |
... |
Other parameters |
An EList object containing voom-transformed values.
voom(DGEList): Method for DGEList
voom(matrix): Method for matrix
voom(ExpressionSet): Method for matrix
voom(EdgeObject): Method for EdgeObject, norm.factors are calculated first if not done yet
Perform the voom+limma procedure
voomLimma( dgeList, design, contrasts, normalize.method = "none", block = NULL, correlation = NULL, weights = NULL, plot = FALSE, ... )voomLimma( dgeList, design, contrasts, normalize.method = "none", block = NULL, correlation = NULL, weights = NULL, plot = FALSE, ... )
dgeList |
A DGEList object, it should be ideally already filtered |
design |
The design matrix |
contrasts |
The contrast matrix |
normalize.method |
Character string, passed to |
block |
Blocking factor, passed to |
correlation |
Correlation between duplicates, passed to |
weights |
Weights, passed to |
plot |
Logical, whether the variance-mean relationship should be ploted |
... |
Passed to |
MArrayLM object returned by eBayes, with voom object in the voom element of the list
y <- matrix(rnbinom(10000,mu=5,size=2),ncol=4) d <- edgeR::DGEList(counts=y, group=rep(1:2,each=2)) d <- edgeR::calcNormFactors(d) design <- model.matrix(~gl(2,2)) colnames(design) <- c("baseline", "treatment") contrasts <- limma::makeContrasts("treatment", levels=design) dvl <- voomLimma(d, design=design, contrasts=contrasts)y <- matrix(rnbinom(10000,mu=5,size=2),ncol=4) d <- edgeR::DGEList(counts=y, group=rep(1:2,each=2)) d <- edgeR::calcNormFactors(d) design <- model.matrix(~gl(2,2)) colnames(design) <- c("baseline", "treatment") contrasts <- limma::makeContrasts("treatment", levels=design) dvl <- voomLimma(d, design=design, contrasts=contrasts)
Apply SVA to voom-transformed count data, and return the voom expression matrix with surrogate variables' effect removed
voomRemoveSV(counts, designMatrix)voomRemoveSV(counts, designMatrix)
counts |
A matrix of counts |
designMatrix |
Design matrix |
The voom expression matrix, with SV effects removed
A numeric matrix of voom-transformed expression values with surrogate variable effects removed.
exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(voomRemoveSV(exCounts, designMatrix=exDesign)) ## compare the results without SV removal, note the values in the ## second and third column are much larger than the rest head(voom(exCounts, exDesign)$E)exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) head(voomRemoveSV(exCounts, designMatrix=exDesign)) ## compare the results without SV removal, note the values in the ## second and third column are much larger than the rest head(voom(exCounts, exDesign)$E)
Run SVA on a count matrix transformed by voom
voomSVA(object, design, ...) ## S4 method for signature 'matrix,matrix' voomSVA(object, design) ## S4 method for signature 'DGEList,matrix' voomSVA(object, design) ## S4 method for signature 'DGEList,formula' voomSVA(object, design)voomSVA(object, design, ...) ## S4 method for signature 'matrix,matrix' voomSVA(object, design) ## S4 method for signature 'DGEList,matrix' voomSVA(object, design) ## S4 method for signature 'DGEList,formula' voomSVA(object, design)
object |
A count matrix |
design |
Design matrix or formula |
... |
Other parameters |
SV matrix
voomSVA(object = matrix, design = matrix): Method for count matrix and design matrix
voomSVA(object = DGEList, design = matrix): Method for DGEList and design matrix
voomSVA(object = DGEList, design = formula): Method for count matrix and design formula
set.seed(1887) exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) voomSVA(exCounts, design=exDesign)set.seed(1887) exCounts <- matrix(rpois(12000, 10), nrow=2000, ncol=6) exCounts[1:100, 2:3] <- exCounts[1:100,2:3]+20 exDesign <- model.matrix(~gl(2,3)) voomSVA(exCounts, design=exDesign)
Write sample annotation into a tab-delimited file to start the Biokit pipeline
writeBiokitSampleAnnotation(df, con)writeBiokitSampleAnnotation(df, con)
df |
A data.frame or anything that can be converted to a data.frame |
con |
Connection, can be a character string indicating file name |
Called for its side effect of writing the sample annotation file; returns invisibly NULL.
Starting from version 1.0-36, the function checks the input
data.frame or tbl_df before writing to the file
testDf <- data.frame(ID=LETTERS[1:4], GROUP=gl(2,2), FASTQ1=sprintf("%s_R1.fastq.gz", LETTERS[1:4]), FASTQ2=sprintf("%s_R1.fastq.gz", LETTERS[1:4])) tmp <- tempfile() writeBiokitSampleAnnotation(testDf, con=tmp) readLines(tmp)testDf <- data.frame(ID=LETTERS[1:4], GROUP=gl(2,2), FASTQ1=sprintf("%s_R1.fastq.gz", LETTERS[1:4]), FASTQ2=sprintf("%s_R1.fastq.gz", LETTERS[1:4])) tmp <- tempfile() writeBiokitSampleAnnotation(testDf, con=tmp) readLines(tmp)
Write an DGEList object as plain files for downstream analysis
writeDGEList( dgeList, exprs.file, fData.file, pData.file, group.file, groupLevels.file, feat.name = NULL, feat.desc = NULL )writeDGEList( dgeList, exprs.file, fData.file, pData.file, group.file, groupLevels.file, feat.name = NULL, feat.desc = NULL )
dgeList |
An DGEList object |
exprs.file |
File name where counts are saved |
fData.file |
File name where feature annotations are saved |
pData.file |
File name where sample annotations are saved |
group.file |
File name where the sample group information is saved |
groupLevels.file |
File where the sample group levels are saved |
feat.name |
Feature names. Can be a column name in |
feat.desc |
Feature descriptions, used in GCT files. If Expression values are saved by default in the gct format, unless the file name ends with tsv in which case a tab-separated value (TSV) file will be saved. Sample group and group level information are derived from the |
Called for its side effect of writing files; returns invisibly NULL.
In case the input matrix has no feature name, the feature names are set to be the integer array starting from 1.
In case no genes item is available in the DGEList, a minimal
data.frame containing one column, Feature, is exported with row names
of the count matrix used as both row names as well as the content of the
Feature column.
y <- matrix(rnbinom(10000,mu=5,size=2),ncol=4) d <- DGEList(counts=y, group=rep(1:2,each=2)) exprsFile <- tempfile() fDataFile <- tempfile() pDataFile <- tempfile() groupFile <- tempfile() groupLevelsFile <- tempfile() writeDGEList(d, exprs.file=exprsFile, fData.file=fDataFile, pData.file=pDataFile, group.file=groupFile, groupLevels.file=groupLevelsFile) head(ribiosIO::read_gct_matrix(exprsFile)) head(ribiosIO::readMatrix(fDataFile)) head(ribiosIO::readMatrix(pDataFile)) head(readLines(groupFile)) head(readLines(groupLevelsFile))y <- matrix(rnbinom(10000,mu=5,size=2),ncol=4) d <- DGEList(counts=y, group=rep(1:2,each=2)) exprsFile <- tempfile() fDataFile <- tempfile() pDataFile <- tempfile() groupFile <- tempfile() groupLevelsFile <- tempfile() writeDGEList(d, exprs.file=exprsFile, fData.file=fDataFile, pData.file=pDataFile, group.file=groupFile, groupLevels.file=groupLevelsFile) head(ribiosIO::read_gct_matrix(exprsFile)) head(ribiosIO::readMatrix(fDataFile)) head(ribiosIO::readMatrix(pDataFile)) head(readLines(groupFile)) head(readLines(groupLevelsFile))
Write DGE tables in individual files, and the merged table in one file
writeDgeTables(edgeResult, outdir = getwd())writeDgeTables(edgeResult, outdir = getwd())
edgeResult |
An |
outdir |
Output directory |
NULL, side effects are used
Write dgeTables with pseudo T statistics
writeDgeTablesWithPseudoT(edgeResult, outdir = getwd())writeDgeTablesWithPseudoT(edgeResult, outdir = getwd())
edgeResult |
An |
outdir |
Output directory |
NULL, side effects are used
Write truncated DGE tables
writeTruncatedDgeTables(edgeResult, outdir = getwd())writeTruncatedDgeTables(edgeResult, outdir = getwd())
edgeResult |
An |
outdir |
Output directory |
NULL, side effects are used