This vignette demonstrates the use of the designit package with a series of examples deriving from the same task, namely to randomize samples of a two-factor experiment into plate layouts. We shall start with the most basic use and gradually explore some basic yet useful utilities provided by the package.
Our task is to randomize samples of an in-vivo experiment with multiple conditions. Our aim is to place them in several 48-well plates.
These are the conditions:
# packages used throughout this vignette
library(designit)
library(tidyverse)
# conditions to use
conditions <- data.frame(
group = c(1, 2, 3, 4, 5),
treatment = c(
"vehicle", "TRT1", "TRT2",
"TRT1", "TRT2"
),
dose = c(0, 25, 25, 50, 50)
)
gt::gt(conditions)
group | treatment | dose |
---|---|---|
1 | vehicle | 0 |
2 | TRT1 | 25 |
3 | TRT2 | 25 |
4 | TRT1 | 50 |
5 | TRT2 | 50 |
We will have 3 animals per group, with 4 replicates of each animal.
# sample table
n_reps <- 4
n_animals <- 3
animals <- bind_rows(replicate(n_animals, conditions, simplify = FALSE),
.id = "animal"
)
samples <- bind_rows(replicate(n_reps, animals, simplify = FALSE),
.id = "replicate"
) |>
mutate(
SampleID = paste0(treatment, "_", animal, "_", replicate),
AnimalID = paste0(treatment, "_", animal)
) |>
mutate(dose = factor(dose))
samples |>
head(10) |>
arrange(animal, group, replicate) |>
gt::gt()
replicate | animal | group | treatment | dose | SampleID | AnimalID |
---|---|---|---|---|---|---|
1 | 1 | 1 | vehicle | 0 | vehicle_1_1 | vehicle_1 |
1 | 1 | 2 | TRT1 | 25 | TRT1_1_1 | TRT1_1 |
1 | 1 | 3 | TRT2 | 25 | TRT2_1_1 | TRT2_1 |
1 | 1 | 4 | TRT1 | 50 | TRT1_1_1 | TRT1_1 |
1 | 1 | 5 | TRT2 | 50 | TRT2_1_1 | TRT2_1 |
1 | 2 | 1 | vehicle | 0 | vehicle_2_1 | vehicle_2 |
1 | 2 | 2 | TRT1 | 25 | TRT1_2_1 | TRT1_2 |
1 | 2 | 3 | TRT2 | 25 | TRT2_2_1 | TRT2_2 |
1 | 2 | 4 | TRT1 | 50 | TRT1_2_1 | TRT1_2 |
1 | 2 | 5 | TRT2 | 50 | TRT2_2_1 | TRT2_2 |
Corner wells of the plates should be left empty. This means that on a 48-well plate we can place 44 samples. Since we have 60 samples, they will fit on 2 plates.
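The BatchContainer call that follows uses n_plates and exclude_wells, whose definitions are not shown in this text. A minimal sketch of how they could be derived is given here, assuming that the exclude argument accepts a data frame with one row per excluded location and columns named after the container dimensions. The helper names n_samples and n_loc_per_plate are illustrative only.

```r
# Assumed derivation; n_samples and n_loc_per_plate are hypothetical helper names
n_samples <- 5 * 3 * 4        # groups x animals x replicates
n_loc_per_plate <- 8 * 6 - 4  # 48 wells minus the 4 corner wells
n_plates <- ceiling(n_samples / n_loc_per_plate)

# one row per excluded well: the four corners (columns 1/8, rows 1/6) of every plate
exclude_wells <- expand.grid(
  plate = seq_len(n_plates),
  column = c(1, 8),
  row = c(1, 6)
)

n_plates
nrow(exclude_wells)
```

With two plates of 48 wells and 8 excluded corner wells, this leaves the 88 locations reported by the container below.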
First, we create a BatchContainer object that provides all possible locations.
bc <- BatchContainer$new(
dimensions = c("plate" = n_plates, "column" = 8, "row" = 6),
exclude = exclude_wells
)
bc
#> Batch container with 88 locations.
#> Dimensions: plate, column, row
bc$n_locations
#> [1] 88
bc$get_locations() |> head()
#> # A tibble: 6 × 3
#> plate column row
#> <int> <int> <int>
#> 1 1 1 2
#> 2 1 1 3
#> 3 1 1 4
#> 4 1 1 5
#> 5 1 2 1
#> 6 1 2 2
Next, we use the random assignment function assign_random to assign samples to plate locations.
bc <- assign_random(bc, samples)
bc$get_samples()
#> # A tibble: 88 × 10
#> plate column row replicate animal group treatment dose SampleID AnimalID
#> <int> <int> <int> <chr> <chr> <dbl> <chr> <fct> <chr> <chr>
#> 1 1 1 2 1 3 4 TRT1 50 TRT1_3_1 TRT1_3
#> 2 1 1 3 3 2 2 TRT1 25 TRT1_2_3 TRT1_2
#> 3 1 1 4 4 3 1 vehicle 0 vehicle_3… vehicle…
#> 4 1 1 5 <NA> <NA> NA <NA> <NA> <NA> <NA>
#> 5 1 2 1 4 1 1 vehicle 0 vehicle_1… vehicle…
#> 6 1 2 2 1 1 5 TRT2 50 TRT2_1_1 TRT2_1
#> 7 1 2 3 2 2 3 TRT2 25 TRT2_2_2 TRT2_2
#> 8 1 2 4 <NA> <NA> NA <NA> <NA> <NA> <NA>
#> 9 1 2 5 1 3 5 TRT2 50 TRT2_3_1 TRT2_3
#> 10 1 2 6 3 1 2 TRT1 25 TRT1_1_3 TRT1_1
#> # ℹ 78 more rows
bc$get_samples(remove_empty_locations = TRUE)
#> # A tibble: 60 × 10
#> plate column row replicate animal group treatment dose SampleID AnimalID
#> <int> <int> <int> <chr> <chr> <dbl> <chr> <fct> <chr> <chr>
#> 1 1 1 2 1 3 4 TRT1 50 TRT1_3_1 TRT1_3
#> 2 1 1 3 3 2 2 TRT1 25 TRT1_2_3 TRT1_2
#> 3 1 1 4 4 3 1 vehicle 0 vehicle_3… vehicle…
#> 4 1 2 1 4 1 1 vehicle 0 vehicle_1… vehicle…
#> 5 1 2 2 1 1 5 TRT2 50 TRT2_1_1 TRT2_1
#> 6 1 2 3 2 2 3 TRT2 25 TRT2_2_2 TRT2_2
#> 7 1 2 5 1 3 5 TRT2 50 TRT2_3_1 TRT2_3
#> 8 1 2 6 3 1 2 TRT1 25 TRT1_1_3 TRT1_1
#> 9 1 3 3 4 2 5 TRT2 50 TRT2_2_4 TRT2_2
#> 10 1 3 4 1 1 1 vehicle 0 vehicle_1… vehicle…
#> # ℹ 50 more rows
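Conceptually, a random assignment of 60 samples to 88 locations can be pictured as a random permutation of the sample indices padded with empty slots. The base R sketch below illustrates that idea only; it is not the implementation used by assign_random, and the variable names are hypothetical.

```r
set.seed(1)  # only to make this illustration reproducible

n_locations <- 88  # free wells on the two plates
n_samples <- 60    # samples to place

# pad the sample indices with NA (empty wells), then permute the whole vector
padded <- c(seq_len(n_samples), rep(NA_integer_, n_locations - n_samples))
assignment <- sample(padded)

length(assignment)       # one entry per location
sum(!is.na(assignment))  # number of occupied locations
```

Each position of assignment stands for one location, and its value is either a sample index or NA for a well that stays empty.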
To check the results visually, we can plot the result using the plot_plate function.
To hide empty wells, we can instead plot the sample table directly.
plot_plate(bc$get_samples(remove_empty_locations = TRUE),
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
)
Sometimes we may wish to move samples, to swap samples, or to manually assign some locations. To move individual samples or to manually assign all locations, we can use the BatchContainer$move_samples() method.
Warning: The $move_samples() method will modify the BatchContainer object in place. That is usually faster than creating a copy. Most of the time you will probably call optimize_design() instead of moving samples manually.
To swap two or more samples, use:
bc$move_samples(src = c(1L, 2L), dst = c(2L, 1L))
plot_plate(bc$get_samples(remove_empty_locations = TRUE),
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
)
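The effect of the src and dst arguments in the swap above can be pictured on a plain vector that maps each location to a sample index. The sketch below is a base R illustration of that reading of src and dst (contents of the source locations end up at the destination locations); the toy layout is hypothetical and this is not the package's internal code.

```r
# toy layout (hypothetical): position = location, value = sample index, NA = empty well
assignment <- c(11L, 12L, 13L, NA)

src <- c(1L, 2L)
dst <- c(2L, 1L)

swapped <- assignment
swapped[dst] <- assignment[src]  # contents of the src locations go to the dst locations
swapped
```

Because src and dst contain the same two locations in opposite order, the samples at locations 1 and 2 trade places and all other locations are unchanged.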
To assign all samples in one go, use the option location_assignment.
The example below orders samples by ID and adds the empty locations afterwards.
bc$move_samples(
location_assignment = c(
1:nrow(samples),
rep(NA, (bc$n_locations - nrow(samples)))
)
)
plot_plate(bc$get_samples(remove_empty_locations = TRUE, include_id = TRUE),
plate = plate, column = column, row = row,
.color = .sample_id
)
Once we have set up an initial layout, which may be suboptimal, we can optimize it in multiple ways, for instance by sample shuffling. The optimization procedure is invoked with, e.g., optimize_design. Here we use a simple shuffling schedule: swap 10 samples 200 times, then swap 2 samples 400 times.
In the context of randomization, a good layout means that known independent variables and/or covariates that may affect the dependent variable(s) are as uncorrelated as possible with the layout. To evaluate how good a layout is, we need a scoring function, which we pass to the optimize_design function.
This function will assess how well treatment and dose are balanced across the two plates.
bc <- optimize_design(bc,
scoring = osat_score_generator(
batch_vars = "plate",
feature_vars = c("treatment", "dose")
),
# shuffling schedule
n_shuffle = c(rep(10, 200), rep(2, 400))
)
#> Warning in osat_score(bc, batch_vars = batch_vars, feature_vars = feature_vars,
#> : NAs in features / batch columns; they will be excluded from scoring
#> Checking variances of 1-dim. score vector.
#> ... (282.827) - OK
#> Initial score: 80
#> Achieved score: 70 at iteration 2
#> Achieved score: 46 at iteration 4
#> Achieved score: 22 at iteration 5
#> Achieved score: 18 at iteration 8
#> Achieved score: 16 at iteration 14
#> Achieved score: 2 at iteration 15
#> Achieved score: 0 at iteration 254
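To make the interplay of scoring and shuffling more concrete, here is a minimal base R sketch. It assumes an OSAT-style score (sum of squared differences between observed and expected counts per plate and treatment) and a simple accept-if-better rule. The function balance_score, the toy data, and the short schedule are all hypothetical; this is neither the code of osat_score_generator nor that of optimize_design.

```r
set.seed(42)  # only to make this illustration reproducible

# OSAT-style score: squared deviation of observed from expected counts
balance_score <- function(batch, feature) {
  observed <- table(feature, batch)
  # expected counts if every feature level were spread proportionally over batches
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2)
}

# 24 hypothetical samples on two plates of 12, initially clustered by treatment
plate <- rep(1:2, each = 12)
treatment <- rep(c("vehicle", "TRT1", "TRT2"), each = 8)

# shuffling schedule: number of positions permuted in each iteration
n_shuffle <- c(rep(6, 50), rep(2, 100))

best <- treatment
initial_score <- balance_score(plate, best)
best_score <- initial_score

for (k in n_shuffle) {
  idx <- sample(length(best), k)       # positions to permute in this iteration
  candidate <- best
  candidate[idx] <- best[sample(idx)]  # permute the samples among those positions
  s <- balance_score(plate, candidate)
  if (s < best_score) {                # keep the candidate only if it improves the score
    best <- candidate
    best_score <- s
  }
}

c(initial = initial_score, final = best_score)
```

A score of 0 would mean that every treatment is split evenly between the two plates. Starting with larger shuffles and ending with pairwise swaps mirrors the idea of the schedule used above: coarse moves first, fine adjustments later.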
The development of the score over the iterations can be viewed as a plot.
The layout after plate batching looks as follows:
plot_plate(bc$get_samples(remove_empty_locations = TRUE),
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
)
Looking at treatment, we see that it is evenly distributed across the plates:
ggplot(
bc$get_samples(remove_empty_locations = TRUE),
aes(x = treatment, fill = treatment)
) +
geom_bar() +
facet_wrap(~plate)
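To properly distinguish between empty and excluded locations, pass the BatchContainer itself to plot_plate and set add_excluded = TRUE as well as rename_empty = TRUE. Empty wells can then be given their own entry ("empty") in a custom color palette, while excluded wells are colored via na.value.
color_palette <- c(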
TRT1 = "blue", TRT2 = "purple",
vehicle = "orange", empty = "white"
)
plot_plate(bc,
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose,
add_excluded = TRUE, rename_empty = TRUE
) +
scale_fill_manual(values = color_palette, na.value = "darkgray")
To remove all empty wells from the plot, hand the pruned sample list to plot_plate rather than the whole BatchContainer object. You can still assign your own colors.
plot_plate(bc$get_samples(remove_empty_locations = TRUE),
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
) +
scale_fill_viridis_d()
Note: removing all empty and excluded wells will lead to omitting completely empty rows or columns!
plot_plate(
bc$get_samples(remove_empty_locations = TRUE) |>
filter(column != 2),
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
) +
scale_fill_viridis_d()
To summarize
Create an instance of BatchContainer with BatchContainer$new().
Use assign_random and plot_plate to assign samples randomly and to plot the plate layout. If necessary, you can retrieve the samples from the BatchContainer instance bc with the method bc$get_samples(), or move samples with the method bc$move_samples(). The better approach usually is to optimize the design with optimize_design().
Specify the scoring function with the scoring parameter of the optimize_design() function. The sample assignment is optimized by shuffling the samples.
Now that you have had a first experience of using designit for randomization, it is time to apply what you have learned to your own work. If you need more examples or want to understand the package in more detail, please explore the other vignettes of the package and check out the documentation.
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1
#> [4] dplyr_1.1.4 purrr_1.0.4 readr_2.1.5
#> [7] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
#> [10] tidyverse_2.0.0 designit_0.5.0.9000 rmarkdown_2.29
#>
#> loaded via a namespace (and not attached):
#> [1] gt_0.11.1 utf8_1.2.4 sass_0.4.9 generics_0.1.3
#> [5] xml2_1.3.6 stringi_1.8.4 hms_1.1.3 digest_0.6.37
#> [9] magrittr_2.0.3 evaluate_1.0.3 grid_4.4.2 timechange_0.3.0
#> [13] RColorBrewer_1.1-3 fastmap_1.2.0 jsonlite_1.8.9 viridisLite_0.4.2
#> [17] scales_1.3.0 jquerylib_0.1.4 cli_3.6.3 rlang_1.1.5
#> [21] cowplot_1.1.3 munsell_0.5.1 withr_3.0.2 cachem_1.1.0
#> [25] yaml_2.3.10 tools_4.4.2 tzdb_0.4.0 colorspace_2.1-1
#> [29] assertthat_0.2.1 buildtools_1.0.0 vctrs_0.6.5 R6_2.5.1
#> [33] lifecycle_1.0.4 pkgconfig_2.0.3 pillar_1.10.1 bslib_0.9.0
#> [37] gtable_0.3.6 data.table_1.16.4 glue_1.8.0 xfun_0.50
#> [41] tidyselect_1.2.1 sys_3.4.3 knitr_1.49 farver_2.1.2
#> [45] htmltools_0.5.8.1 maketools_1.3.1 labeling_0.4.3 compiler_4.4.2