--- title: "Basic example of using designit: plate layout with two factors" output: rmarkdown::html_vignette: html_document: df_print: paged mathjax: default number_sections: true toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{Basic example of using designit: plate layout with two factors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This vignette demonstrates the use of the `desingit` package with a series of examples deriving from the same task, namely to randomize samples of a two-factor experiment into plate layouts. We shall start with the most basic use and gradually exploring some basic yet useful utilities provided by the package. ```{r include=FALSE, message=FALSE, warning=FALSE} knitr::opts_chunk$set( echo = TRUE, fig.height = 6, fig.width = 6, collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(designit) library(ggplot2) library(dplyr) library(tidyr) ``` # The samples and the conditions Our task is to randomize samples of an in-vivo experiment with multiple conditions. Our aim is to place them in several 48-well plates. These are the conditions: ```{r} # conditions to use conditions <- data.frame( group = c(1, 2, 3, 4, 5), treatment = c( "vehicle", "TRT1", "TRT2", "TRT1", "TRT2" ), dose = c(0, 25, 25, 50, 50) ) gt::gt(conditions) ``` We will have 3 animals per group, with 4 replicates of each animal. ```{r} # sample table n_reps <- 4 n_animals <- 3 animals <- bind_rows(replicate(n_animals, conditions, simplify = FALSE), .id = "animal" ) samples <- bind_rows(replicate(n_reps, animals, simplify = FALSE), .id = "replicate" ) |> mutate( SampleID = paste0(treatment, "_", animal, "_", replicate), AnimalID = paste0(treatment, "_", animal) ) |> mutate(dose = factor(dose)) samples |> head(10) |> arrange(animal, group, replicate) |> gt::gt() ``` # Plate layout requirements Corner wells of the plates should be left empty. This means on a 48 well plate we can place 44 samples. Since we have `r nrow(samples)` samples, they will fit on `r ceiling(nrow(samples)/44)` plates. ```{r} n_samp <- nrow(samples) n_loc_per_plate <- 48 - 4 n_plates <- ceiling(n_samp / n_loc_per_plate) exclude_wells <- expand.grid(plate = seq(n_plates), column = c(1, 8), row = c(1, 6)) ``` # Setting up a BatchContainer object First, we create a BatchContainer object that provides all possible locations. ```{r} bc <- BatchContainer$new( dimensions = c("plate" = n_plates, "column" = 8, "row" = 6), exclude = exclude_wells ) bc bc$n_locations bc$get_locations() |> head() ``` # Moving samples Next, we use the random assignment function to place samples to plate locations. ```{r} bc <- assign_random(bc, samples) bc$get_samples() bc$get_samples(remove_empty_locations = TRUE) ``` To check the results visually, we can plot of the result using the `plot_plate` function. ```{r, fig.width=6, fig.height=3.5} plot_plate(bc, plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) ``` To not show empty wells, we can directly plot the sample table as well. ```{r, fig.width=6, fig.height=3.5} plot_plate(bc$get_samples(remove_empty_locations = TRUE), plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) ``` Sometimes we may wish to move samples, or to swap samples, or to manually assign some locations. To move individual samples or manually assigning all locations we can use the `BatchContainer$move_samples()` method. *Warning*: The `$move_samples()` method will modify the `BatchContainer` object in place. That is usually faster than creating a copy. Most of the time you will probably call `optimize_design()` instead of moving samples manually. To swap two or more samples, use ```{r, fig.width=6, fig.height=3.5} bc$move_samples(src = c(1L, 2L), dst = c(2L, 1L)) plot_plate(bc$get_samples(remove_empty_locations = TRUE), plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) ``` To assign all samples in one go, use the option `location_assignment`. The example below orders samples by ID and adds the empty locations afterwards ```{r, fig.width=6, fig.height=3.5} bc$move_samples( location_assignment = c( 1:nrow(samples), rep(NA, (bc$n_locations - nrow(samples))) ) ) plot_plate(bc$get_samples(remove_empty_locations = TRUE, include_id = TRUE), plate = plate, column = column, row = row, .color = .sample_id ) ``` # Running an optimization Once we have setup an initial layout, which may be suboptimal, we can optimize it in multiple ways, for instance by sample shuffling. The optimization procedure is invoked with e.g. `optimize_design`. Here we use a simple shuffling schedule: swap 10 samples for 100 times, then swap 2 samples for 400 times. In the context of randomization, a good layout means that known independent variables and/or covariates that may affect the dependent variable(s) are as uncorrelated as possible with the layout. To evaluate how good a layout is, we need a scoring function, which we pass a scoring function to the `optimize_design` function. This function will assess how well treatment and dose are balanced across the two plates. ```{r} bc <- optimize_design(bc, scoring = osat_score_generator( batch_vars = "plate", feature_vars = c("treatment", "dose") ), # shuffling schedule n_shuffle = c(rep(10, 200), rep(2, 400)) ) ``` Development of the score can be viewed with ```{r, fig.width=3.5, fig.height=3} bc$plot_trace() ``` The layout after plate batching looks the following ```{r, fig.width=6, fig.height=3.5} plot_plate(bc$get_samples(remove_empty_locations = TRUE), plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) ``` Looking at treatment, we see it's evenly distributed across the plates ```{r, fig.width=4, fig.height=3.5} ggplot( bc$get_samples(remove_empty_locations = TRUE), aes(x = treatment, fill = treatment) ) + geom_bar() + facet_wrap(~plate) ``` # Customizing the plate layout To properly distinguish between empty and excluded locations one can do the following. * Supply the BatchContainer directly; * set `add_excluded = TRUE` and set `rename_empty = TRUE`; * supply a custom color palette; * excluded wells have NA values and can be colored with `na.value`. ```{r, fig.width=6, fig.height=3.5} color_palette <- c( TRT1 = "blue", TRT2 = "purple", vehicle = "orange", empty = "white" ) plot_plate(bc, plate = plate, column = column, row = row, .color = treatment, .alpha = dose, add_excluded = TRUE, rename_empty = TRUE ) + scale_fill_manual(values = color_palette, na.value = "darkgray") ``` To remove all empty wells from the plot, hand the pruned sample list to `plot_plate` rather than the whole `BatchContainer` object. You can still assign your own colors. ```{r, fig.width=6, fig.height=3.5} plot_plate(bc$get_samples(remove_empty_locations = TRUE), plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) + scale_fill_viridis_d() ``` Note: removing all empty and excluded wells will lead to omitting completely empty rows or columns! ```{r, fig.width=6, fig.height=3.5} plot_plate( bc$get_samples(remove_empty_locations = TRUE) |> filter(column != 2), plate = plate, column = column, row = row, .color = treatment, .alpha = dose ) + scale_fill_viridis_d() ``` # Summary To summarize 1. In order to randomize the layout of samples from an experiment, create an instance of `BatchContainer` with `BatchContainer$new()`. 2. Use functions `assign_random` and `plot_plate` to assign samples randomly and to plot the plate layout. If necessary, you can retrieve the samples from the BatchContainer instance `bc` with the method `bc$get_samples()`, or move samples with the method `bc$move_samples()`. The better approach usually is to optimize the design with `optimize_design()`. 3. The scoring function can be set by passing `scoring` parameter to the `optimize_design()` function. The sample assignent is optimized by shuffling the samples. 4. Various options are available to further customize the design. Now you have already the first experience of using `designit` for randomization, it is time to apply the learning to your work. If you need more examples or if you want to understand more details of the package, please explore other vignettes of the package as well as check out the documentations. # Session information ```{r sessionInfo} sessionInfo() ```