The Geneset Management System (GeMS, link visible within Roche), as its name suggests, is a system to manage genesets.
This document shows the basic operations for working with the system from the R console.
Note that the code in this vignette is evaluated only when GeMS is reachable; otherwise, compiling the vignette would fail whenever the web service is unavailable.
The following command fetches genesets associated with a given user, test in this case. If no parameter is given, the genesets of the current user are returned.
New genesets can be inserted into GeMS using the GmtList data structure defined in the BioQC package. A GmtList is essentially a list of lists, each of which contains three elements: name, desc, and genes. See the example below.
testGmt <- BioQC::GmtList(list(
list(name="Test gene set 1",
desc="Test", genes=c("CXCL13", "TNFRSF1B", "RGS2")),
list(name="Test gene set 2", desc="Test",
genes=c("ACADSB", "AFTPH", "ARL1"))
))
insertGmtListToGeMS(testGmt, geneFormat=0, source="Test", xref="PMID:30397336", user="test")
The return value indicates that the insertion was successful.
Now we check the genesets of the user test; we expect to see the two gene sets that we inserted.
As an alternative to manual specification, a GmtList can be constructed by reading a valid GMT file with the function readGmt in BioQC.
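To make the link between the list-of-lists structure and a GMT file concrete, here is a minimal base R sketch. It assumes the common tab-separated GMT layout (name, description, then genes on one line per gene set). The helpers to_gmt_lines and from_gmt_lines are hypothetical names introduced for illustration only; they are not part of BioQC, and in practice readGmt should be preferred.

```r
# Gene sets as a plain list of lists (name, desc, genes), mirroring the
# structure described above. No BioQC or GeMS access is needed here.
gene_sets <- list(
  list(name = "Test gene set 1", desc = "Test",
       genes = c("CXCL13", "TNFRSF1B", "RGS2")),
  list(name = "Test gene set 2", desc = "Test",
       genes = c("ACADSB", "AFTPH", "ARL1"))
)

# Hypothetical helper: one GMT line per gene set, with name, description,
# and genes separated by tabs (assumed layout).
to_gmt_lines <- function(sets) {
  vapply(sets, function(s) {
    paste(c(s$name, s$desc, s$genes), collapse = "\t")
  }, character(1))
}

# Hypothetical helper: parse GMT lines back into the list-of-lists form.
from_gmt_lines <- function(lines) {
  lapply(strsplit(lines, "\t", fixed = TRUE), function(fields) {
    list(name = fields[1], desc = fields[2], genes = fields[-(1:2)])
  })
}

# Write to a temporary file and read it back.
gmt_file <- tempfile(fileext = ".gmt")
writeLines(to_gmt_lines(gene_sets), gmt_file)
round_trip <- from_gmt_lines(readLines(gmt_file))
```

The round trip recovers the original structure, which is the property that makes GMT files usable as snapshots of gene sets.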
The following example shows how a geneset can be removed.
Now we check the genesets of the user test again; the two gene sets that we previously inserted should be gone.
The following example shows how genesets can be retrieved by their names. The example was kindly provided by Martha Serrano.
Warning: As shown in the examples above, it is very easy to alter genesets in the database. Therefore, always work from gene set snapshots and never read live data from the database in an analysis.
gmt_test1_filename <- "data/test1.gmt"
if (file.exists(gmt_test1_filename)) {
gmt_test1 <- readGmt(gmt_test1_filename)
} else {
gmt_test1 <- getSetsWithNamesFromGeMS(c("Plasma_sc", "Bcell_l_Danaher17"))
writeGmt(gmt_test1, gmt_test1_filename)
}
gmt_test1
A caveat of using this function is that it invokes an API call for each geneset. If you need to extract multiple gene-sets, use getSetsWithPropertyFromGeMS (see below).
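The read-the-snapshot-or-query pattern used above can be factored into a small helper. The sketch below is one possible way to do this, not part of any GeMS client: with_snapshot and fake_fetch are hypothetical names, the gene names are placeholders, and readRDS/saveRDS stand in for readGmt/writeGmt so that the example runs without database access.

```r
# Hypothetical helper: return the snapshot at `path` if it exists;
# otherwise call `fetch()`, save the result with `write`, and return it.
with_snapshot <- function(path, fetch, read = readRDS, write = saveRDS) {
  if (file.exists(path)) {
    return(read(path))
  }
  value <- fetch()
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  write(value, path)
  value
}

# Stand-in for a database query; it counts how often it is called.
n_calls <- 0
fake_fetch <- function() {
  n_calls <<- n_calls + 1
  list(demo_set = c("GENE_A", "GENE_B"))  # placeholder gene names
}

snapshot_path <- file.path(tempdir(), "gems-demo", "demo.rds")
unlink(snapshot_path)  # start from a clean state
first <- with_snapshot(snapshot_path, fake_fetch)   # queries, then saves
second <- with_snapshot(snapshot_path, fake_fetch)  # reads the snapshot
```

Only the first call triggers the query; the second returns the saved copy. Assuming readGmt and writeGmt take the arguments used in the example above, the same helper could be called with read = readGmt, write = writeGmt, and a fetch function wrapping getSetsWithNamesFromGeMS.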
The following example shows how genesets can be retrieved by matching properties, in this case all cell-marker gene-sets provided by the BESCA package.
Again, we are using GMT snapshots for reproducibility.
besca_markers_filename <- "data/besca_markers.gmt"
if (file.exists(besca_markers_filename)) {
besca_markers <- readGmt(besca_markers_filename)
} else {
besca_markers <- getSetsWithPropertyFromGeMS("meta.geneset", "besca_markers")
writeGmt(besca_markers, besca_markers_filename)
}
print(besca_markers)
This is a very brief introduction to working with GeMS within R. Make sure to work from snapshots instead of using live data from the database. In case of questions or suggestions, please reach out to Jitao David Zhang or Laura Badi.