Package 'designit' reference manual

Title:	Blocking and Randomization for Experimental Design
Description:	Intelligently assign samples to batches in order to reduce batch effects. Batch effects can have a significant impact on data analysis, especially when the assignment of samples to batches coincides with the contrast groups being studied. By defining a batch container and a scoring function that reflects the contrasts, this package allows users to assign samples in a way that minimizes the potential impact of batch effects on the comparison of interest. Among other functionality, we provide an implementation for OSAT score by Yan et al. (2012, <doi:10.1186/1471-2164-13-689>).
Authors:	Iakov I. Davydov [aut, cre, cph], Juliane Siebourg-Polster [aut, cph], Guido Steiner [aut, cph], Konrad Rudolph [ctb], Jitao David Zhang [aut, cph], Balazs Banfai [aut, cph], F. Hoffman-La Roche [cph, fnd]
Maintainer:	Iakov I. Davydov <[email protected]>
License:	MIT + file LICENSE
Version:	0.5.0.9000
Built:	2025-03-11 06:06:26 UTC
Source:	https://github.com/bedapub/designit

Alternative acceptance function for multi-dimensional scores in which order (left to right, e.g. first to last) denotes relevance.

Description

Alternative acceptance function for multi-dimensional scores in which order (left to right, e.g. first to last) denotes relevance.

Usage

accept_leftmost_improvement(current_score, best_score, ..., tolerance = 0)

Arguments

`current_score`	One- or multi-dimensional score from the current optimizing iteration (double or vector of doubles)
`best_score`	Best one- or multi-dimensional score found so far (double or vector of doubles)
`...`	Ignored arguments that may be used by alternative acceptance functions
`tolerance`	Tolerance value: When comparing score vectors from left to right, differences within +/- `tolerance` do not immediately shortcut the comparison at that position, which allows an improvement in a less important score to have some influence

Value

Boolean, TRUE if current score should be taken as the new optimal score, FALSE otherwise
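
The left-to-right comparison described above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: the helper name `leftmost_improvement_sketch` is hypothetical, and the sketch assumes that lower scores are better and that a comparison in which every element stays within the tolerance counts as no improvement.

```r
# Hypothetical helper illustrating a leftmost-improvement rule.
# Assumption: lower scores are better.
leftmost_improvement_sketch <- function(current_score, best_score, tolerance = 0) {
  stopifnot(length(current_score) == length(best_score))
  for (i in seq_along(current_score)) {
    delta <- current_score[i] - best_score[i]
    if (delta < -tolerance) return(TRUE)   # clear improvement at position i
    if (delta > tolerance) return(FALSE)   # clear deterioration at position i
    # otherwise the difference is within the tolerance: look at the next score
  }
  FALSE  # no position showed a clear improvement
}

leftmost_improvement_sketch(c(1, 5), c(2, 0))                       # TRUE
leftmost_improvement_sketch(c(2.05, 1), c(2, 3))                    # FALSE
leftmost_improvement_sketch(c(2.05, 1), c(2, 3), tolerance = 0.1)   # TRUE
```

In the last call the first difference (0.05) lies within the tolerance, so the improvement in the second, less important score decides the outcome.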

Distributes samples based on a sample sheet.

Description

Distributes samples based on a sample sheet.

Usage

assign_from_table(batch_container, samples)

Arguments

`batch_container`	Instance of BatchContainer class
`samples`	`data.frame` with samples (a sample sheet). This `data.frame` (or `tibble::tibble()`) should contain samples together with their locations. No `.sample_id` column can be present in the sample sheet. If `batch_container` already has samples assigned, the function will check whether the samples in `batch_container` are identical to the ones in the `samples` argument.

Value

Returns a new BatchContainer.

Examples

bc <- BatchContainer$new(
  dimensions = list(
    plate = 2,
    column = list(values = letters[1:3]),
    row = 3
  )
)

sample_sheet <- tibble::tribble(
  ~plate, ~column, ~row, ~sampleID, ~group,
  1, "a", 1, 1, "TRT",
  1, "b", 2, 2, "CNTRL",
  2, "a", 1, 3, "TRT",
  2, "b", 2, 4, "CNTRL",
  2, "a", 3, 5, "TRT",
)
# assign samples from the sample sheet
bc <- assign_from_table(bc, sample_sheet)

bc$get_samples(remove_empty_locations = TRUE)

Distributes samples in order.

Description

First sample is assigned to the first location, second sample is assigned to the second location, etc.

Usage

assign_in_order(batch_container, samples = NULL)

Arguments

`batch_container`	Instance of BatchContainer class
`samples`	data.frame with samples.

Value

Returns a new BatchContainer.

Examples

samples <- data.frame(sampId = 1:3, sampName = letters[1:3])
samples

bc <- BatchContainer$new(dimensions = c("row" = 3, "column" = 2))
bc

set.seed(42)
# assigns samples randomly
bc <- assign_random(bc, samples)
bc$get_samples()

# assigns samples in order
bc <- assign_in_order(bc)
bc$get_samples()

Assignment function which distributes samples randomly.

Description

Assignment function which distributes samples randomly.

Usage

assign_random(batch_container, samples = NULL)

Arguments

`batch_container`	Instance of BatchContainer class
`samples`	data.frame with samples.

Value

Returns a new BatchContainer.

Examples

samples <- data.frame(sampId = 1:3, sampName = letters[1:3])
samples

bc <- BatchContainer$new(dimensions = c("row" = 3, "column" = 2))
bc

set.seed(42)
# assigns samples randomly
bc <- assign_random(bc, samples)
bc$get_samples()

# assigns samples in order
bc <- assign_in_order(bc)
bc$get_samples()

Creates a BatchContainer from a table (data.frame/tibble::tibble) containing sample and location information.

Description

Creates a BatchContainer from a table (data.frame/tibble::tibble) containing sample and location information.

Usage

batch_container_from_table(tab, location_cols)

Arguments

`tab`	A table with location and sample information. Table rows with all `NA`s in sample information columns are treated as empty locations.
`location_cols`	Names of columns containing information about locations.

Value

A BatchContainer with assigned samples.

Examples

tab <- data.frame(
  row = rep(1:3, each = 3),
  column = rep(1:3, 3),
  sample_id = c(1, 2, 3, NA, 5, 6, 7, NA, 9)
)
bc <- batch_container_from_table(tab, location_cols = c("row", "column"))
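
The rule that rows with all `NA`s in the sample information columns count as empty locations can be illustrated in base R. The sketch below only mimics that rule on a plain `data.frame` and does not call the package; the variable names `sample_cols` and `is_empty` are illustrative.

```r
tab <- data.frame(
  row = rep(1:3, each = 3),
  column = rep(1:3, 3),
  sample_id = c(1, 2, 3, NA, 5, 6, 7, NA, 9)
)
location_cols <- c("row", "column")

# every column that is not a location column carries sample information
sample_cols <- setdiff(names(tab), location_cols)

# a location is empty when all of its sample information is NA
is_empty <- rowSums(!is.na(tab[sample_cols])) == 0

# locations that would be treated as empty
tab[is_empty, location_cols]
```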

R6 Class representing a batch container.

Description

Describes container dimensions and samples to container location assignment.

Details

A typical workflow starts with creating a BatchContainer. Then samples can be assigned to locations in that container.

Public fields

trace: Optimization trace, a tibble::tibble()

Active bindings

scoring_f

Scoring functions used for optimization. Each scoring function should receive a BatchContainer and return a floating point score value for the assignment. This is a list of functions; upon assignment, a single function is automatically converted to a list. When the list holds several functions, each of them is called.

has_samples

Returns TRUE if BatchContainer has samples.

has_samples_attr

Returns TRUE if BatchContainer has sample attributes assigned.

n_locations

Returns number of locations in a BatchContainer.

n_dimensions

Returns number of dimensions in a BatchContainer. This field cannot be assigned.

dimension_names

character vector with dimension names. This field cannot be assigned.

samples

Samples in the batch container. When assigning, the data.frame should not have a column named .sample_id.

samples_attr

Extra attributes of samples. If set, this is included into BatchContainer$get_samples() output.

assignment

Sample assignment vector. Should contain NAs for empty locations.

Assigning this field is deprecated, please use $move_samples() instead.

Methods

Method `new()`

Create a new BatchContainer object.

Usage

BatchContainer$new(locations_table, dimensions, exclude = NULL)

Arguments

locations_table: A table with available locations.
dimensions: A vector or list of dimensions. Every dimension should have a name. Could be an integer vector of dimensions or a named list. Every value of a list could be either dimension size or parameters for BatchContainerDimension$new(). Can be used as an alternative to passing locations_table.
exclude: data.frame with excluded locations of a container. Only used together with dimensions.

Examples

bc <- BatchContainer$new(
  dimensions = list(
    "plate" = 3,
    "row" = list(values = letters[1:3]),
    "column" = list(values = c(1, 3))
  ),
  exclude = data.frame(plate = 1, row = "a", column = c(1, 3), stringsAsFactors = FALSE)
)

bc

Method `get_samples()`

Return table with samples and sample assignment.

Usage

BatchContainer$get_samples(
  assignment = TRUE,
  include_id = FALSE,
  remove_empty_locations = FALSE,
  as_tibble = TRUE
)

Arguments

assignment: Return sample assignment. If FALSE, only the samples table is returned, without batch assignment.
include_id: Keep .sample_id in the table. Use TRUE for lower overhead.
remove_empty_locations: Removes empty locations from the result tibble.
as_tibble: Return tibble. If FALSE returns data.table. This should have lower overhead, as internally there is a cached data.table.

Returns

table with samples and sample assignment.

Method `get_locations()`

Get a table with all the locations in a BatchContainer.

Usage

BatchContainer$get_locations()

Returns

A tibble with all the available locations.

Method `move_samples()`

Move samples between locations, modifying the BatchContainer in place.

This method can receive either src and dst or location_assignment.

Usage

BatchContainer$move_samples(src, dst, location_assignment)

Arguments

src: integer vector of source locations
dst: integer vector of destination locations (the same length as src).
location_assignment: integer vector with location assignment. The length of the vector should match the number of locations, NA should be used for empty locations.

Returns

BatchContainer, invisibly

Method `score()`

Score current sample assignment.

Usage

BatchContainer$score(scoring)

Arguments

scoring: a function or a named list of scoring functions. Each function should return a numeric vector.

Returns

Returns a named vector of all scoring functions values.

Method `copy()`

Create an independent copy (clone) of a BatchContainer

Usage

BatchContainer$copy()

Returns

Returns a new BatchContainer

Method `print()`

Prints information about BatchContainer.

Usage

BatchContainer$print(...)

Arguments

...: not used.

Method `scores_table()`

Return a table with scores from an optimization.

Usage

BatchContainer$scores_table(index = NULL, include_aggregated = FALSE)

Arguments

index: optimization index, all by default
include_aggregated: include aggregated scores

Returns

a tibble::tibble() with scores

Method `plot_trace()`

Plot trace

Usage

BatchContainer$plot_trace(index = NULL, include_aggregated = FALSE, ...)

Arguments

index: optimization index, all by default
include_aggregated: include aggregated scores
...: not used.

Returns

a ggplot2::ggplot() object

List of scoring functions. Tibble with batch container locations. Tibble with sample information and sample ids. Sample attributes, a data.table. Vector with assignment of sample ids to locations. Cached data.table with samples assignment. Validate sample assignment.

Examples


## ------------------------------------------------
## Method `BatchContainer$new`
## ------------------------------------------------

bc <- BatchContainer$new(
  dimensions = list(
    "plate" = 3,
    "row" = list(values = letters[1:3]),
    "column" = list(values = c(1, 3))
  ),
  exclude = data.frame(plate = 1, row = "a", column = c(1, 3), stringsAsFactors = FALSE)
)

bc

R6 Class representing a batch container dimension.

Description

R6 Class representing a batch container dimension.

Public fields

name: dimension name.
values: vector of dimension values.

Active bindings

size: Returns size of a dimension.
short_info: Returns a string summarizing the dimension. E.g., "mydim<size=10>".

Methods

Public methods

BatchContainerDimension$new()
BatchContainerDimension$clone()

Method `new()`

Create a new BatchContainerDimension object.

This is usually used implicitly via BatchContainer$new().

Usage

BatchContainerDimension$new(name, size = NULL, values = NULL)

Arguments

name

Dimension name, a character string. Required.

size

Dimension size. Setting this implies that dimension values are 1:size.

values

Explicit list of dimension values. Could be numeric, character or factor.

It is required to provide the dimension name and either size or values.

Examples

plate_dimension <- BatchContainerDimension$new("plate", size=3)
row_dimension <- BatchContainerDimension$new("row", values = letters[1:3])
column_dimension <- BatchContainerDimension$new("column", values = 1:3)

bc <- BatchContainer$new(
  dimensions = list(plate_dimension, row_dimension, column_dimension),
  exclude = data.frame(plate = 1, row = "a", column = c(1, 3), stringsAsFactors = FALSE)
)

bc

Method `clone()`

The objects of this class are cloneable with this method.

Usage

BatchContainerDimension$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `BatchContainerDimension$new`
## ------------------------------------------------

plate_dimension <- BatchContainerDimension$new("plate", size=3)
row_dimension <- BatchContainerDimension$new("row", values = letters[1:3])
column_dimension <- BatchContainerDimension$new("column", values = 1:3)

bc <- BatchContainer$new(
  dimensions = list(plate_dimension, row_dimension, column_dimension),
  exclude = data.frame(plate = 1, row = "a", column = c(1, 3), stringsAsFactors = FALSE)
)

bc

Compile list of all possible ways to assign levels of the allocation variable to a given set of subgroups

Description

All information needed to perform this function (primarily the number and size of subgroups plus the levels of the allocation variable) are contained in and extracted from the subgroup object.

Usage

compile_possible_subgroup_allocation(
  subgroup_object,
  fullTree = FALSE,
  maxCalls = 1e+06
)

Arguments

`subgroup_object`	A subgrouping object as returned by `form_homogeneous_subgroups()`
`fullTree`	Boolean: Enforce full search of the possibility tree, independent of the value of `maxCalls`
`maxCalls`	Maximum number of recursive calls in the search tree, to avoid long run times with very large trees

Value

List of possible allocations; Each allocation is an integer vector of allocation levels that are assigned in that order to the subgroups with given sizes

Reshuffle sample indices completely randomly

Description

This function was just added to test early on the functionality of optimize_design() to accept a permutation vector rather than a list with src and dst indices.

Usage

complete_random_shuffling(batch_container, ...)

Arguments

`batch_container`	The batch-container.
`...`	Other params that are passed to a generic shuffling function (like the iteration number).

Value

A random permutation of the sample assignment in the container.

Examples

data("invivo_study_samples")
bc <- BatchContainer$new(
  dimensions = c("plate" = 2, "column" = 5, "row" = 6)
)
scoring_f <- osat_score_generator("plate", "Sex")
bc <- optimize_design(
  bc, scoring = scoring_f, invivo_study_samples,
  max_iter = 100,
  shuffle_proposal_func = complete_random_shuffling
)

Drop highest order interactions

Description

Drop highest order interactions

Usage

drop_order(.terms, m = -1)

Arguments

`.terms`	`terms.object`
`m`	order of interaction (highest available if -1)

Aggregation of scores: take first (primary) score only

Description

This function enables comparison of the results of two scoring functions by just basing the decision on the first element. This reflects the original behavior of the optimization function, just evaluating the 'auxiliary' scores for the user's information.

Usage

first_score_only(scores, ...)

Arguments

`scores`	A score or multiple component score vector
`...`	Parameters to be ignored by this aggregation function

Value

The aggregated score, i.e. the first element of a multiple-component score vector.

Examples

first_score_only(c(1, 2, 3))

Form groups and subgroups of 'homogeneous' samples as defined by certain variables and size constraints

Description

Form groups and subgroups of 'homogeneous' samples as defined by certain variables and size constraints

Usage

form_homogeneous_subgroups(
  batch_container,
  allocate_var,
  keep_together_vars = c(),
  n_min = NA,
  n_max = NA,
  n_ideal = NA,
  subgroup_var_name = NULL,
  prefer_big_groups = TRUE,
  strict = TRUE
)

Arguments

`batch_container`	Batch container with all samples assigned that are to be grouped and sub-grouped
`allocate_var`	Name of a variable in the `samples` table to inform possible groupings, as (sub)group sizes must add up to the correct totals
`keep_together_vars`	Vector of column names in sample table; groups are formed by pooling samples with identical values of all those variables
`n_min`	Minimal number of samples in one sub(!)group; by default 1
`n_max`	Maximal number of samples in one sub(!)group; by default the size of the biggest group
`n_ideal`	Ideal number of samples in one sub(!)group; by default the floor or ceiling of `mean(n_min,n_max)`, depending on the setting of `prefer_big_groups`
`subgroup_var_name`	An optional column name for the subgroups which are formed (or NULL)
`prefer_big_groups`	Boolean; indicating whether or not bigger subgroups should be preferred in case of several possibilities
`strict`	Boolean; if TRUE, subgroup size constraints have to be met strictly, implying the possibility of finding no solution at all

Value

Subgroup object to be used in subsequent calls to compile_possible_subgroup_allocation()

Generate `terms.object` (formula with attributes)

Description

Generate terms.object (formula with attributes)

Usage

generate_terms(.tbl, ...)

Arguments

`.tbl`	data
`...`	columns to skip (unquoted)

Value

terms.object

Get highest order interaction

Description

Get highest order interaction

Usage

get_order(.terms)

Arguments

`.terms`	`terms.object`

Value

highest order (numeric).
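
Interaction orders are stored in the `order` attribute of a base R `terms` object, which is the kind of object `drop_order()` and `get_order()` operate on. The following base R sketch shows the idea using `terms()` and `drop.terms()` from the stats package; it illustrates the concept and is not necessarily how the two package functions are implemented.

```r
# terms object for a model with all interactions of three variables
trm <- terms(~ a * b * c)

# interaction order of each term: main effects are 1, a:b is 2, a:b:c is 3
ord <- attr(trm, "order")
highest <- max(ord)

# remove all terms of the highest order
reduced <- drop.terms(trm, which(ord == highest))
attr(reduced, "term.labels")
```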

A sample list from an in vivo experiment with multiple treatments and 2 strains

Description

This sample list is intended to be used in connection with the "invivo_study_treatments" data object

Usage

data(invivo_study_samples)

Format

An object of class "tibble"

AnimalID: The animal IDs, i.e. unique identifiers for each animal
Strain: Strain (A or B)
Sex: Female (F) or Male (M)
BirthDate: Date of birth, not available for all the animals
Earmark: Markings to distinguish individual animals, applied on the left (L), right (R) or both (B) ears
ArrivalWeight: Initial body weight of the animal
Arrival weight Unit: Unit of the body weight, here: grams
Litter: The litter IDs, grouping offspring from one set of parents

Author(s)

Guido Steiner

A treatment list together with additional constraints on the strain and sex of animals

Description

This treatment list is intended to be used in connection with the "invivo_study_samples" data object

Usage

data(invivo_study_treatments)

Format

An object of class "tibble"

Treatment: The treatment to be given to an individual animal (1-3, plus a few untreated cases)
Strain: Strain (A or B) - a constraint on which kind of animal may receive the respective treatment
Sex: Female (F) or Male (M) - a constraint on which kind of animal may receive the respective treatment

Author(s)

Guido Steiner

Aggregation of scores: L1 norm

Description

This function enables comparison of the results of two scoring functions by calculating an L1 norm (Manhattan distance from origin).

Usage

L1_norm(scores, ...)

Arguments

`scores`	A score or multiple component score vector
`...`	Parameters to be ignored by this aggregation function

Value

The L1 norm as an aggregated score.

Examples

L1_norm(c(2, 2))
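
For reference, the L1 norm of a score vector is the sum of the absolute values of its components. A base R sketch of that definition follows; the name `l1_sketch` is hypothetical and merely stands in for the package function.

```r
# L1 norm (Manhattan distance from the origin)
l1_sketch <- function(scores) sum(abs(scores))

l1_sketch(c(2, 2))    # 4
l1_sketch(c(-1, 3))   # 4
```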

Aggregation of scores: L2 norm squared

Description

This function enables comparison of the results of two scoring functions by calculating an L2 norm (Euclidean distance from origin). Since this is only used for ranking solutions, the squared L2 norm is returned.

Usage

L2s_norm(scores, ...)

Arguments

`scores`	A score or multiple component score vector
`...`	Parameters to be ignored by this aggregation function

Value

The squared L2 norm as an aggregated score.

Examples

L2s_norm(c(2, 2))
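
Returning the squared norm is sufficient for ranking because the square root is monotonic: two solutions are ordered the same way by the squared L2 norm and by the L2 norm itself. A base R sketch of this argument, with the hypothetical name `l2s_sketch` standing in for the package function:

```r
# squared L2 norm (squared Euclidean distance from the origin)
l2s_sketch <- function(scores) sum(scores^2)

a <- c(2, 2)
b <- c(1, 3)

l2s_sketch(a)   # 8
l2s_sketch(b)   # 10

# the ranking is the same with and without the square root
(l2s_sketch(a) < l2s_sketch(b)) == (sqrt(l2s_sketch(a)) < sqrt(l2s_sketch(b)))   # TRUE
```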

Create locations table from dimensions and exclude table

Description

Create locations table from dimensions and exclude table

Usage

locations_table_from_dimensions(dimensions, exclude)

Arguments

`dimensions`	A vector or list of dimensions. Every dimension should have a name. Could be an integer vector of dimensions or a named list. Every value of a list could be either dimension size or parameters for BatchContainerDimension$new().
`exclude`	data.frame with excluded locations of a container.

Value

a tibble::tibble() with all the available locations.

Subject sample list with group and time plus controls

Description

A sample list with 9 columns as described below. There are 3 types of records (rows), indicated by the SampleType variable: patient samples, controls and spike-in standards. Patient samples were collected at up to 7 time points. Controls and SpikeIns are QC samples for distribution of the samples on 96 well plates.

Usage

data(longitudinal_subject_samples)

Format

An object of class "tibble"

SampleID: A unique sample identifier.
SampleType: Indicates whether the sample is a patient sample, control or spike-in.
SubjectID: The subject identifier.
Group: Indicates the treatment group of a subject.
Week: Sampling time points in weeks of study.
Sex: Subject Sex, Female (F) or Male (M).
Age: Subject age.
BMI: Subject Body Mass Index.
SamplesPerSubject: Look up variable for the number of samples per subject. This varies as not all subjects have samples from all weeks.

Author(s)

Juliane Siebourg

Alternative acceptance function for multi-dimensional scores with exponentially downweighted score improvements from left to right

Description

Alternative acceptance function for multi-dimensional scores with exponentially downweighted score improvements from left to right

Usage

mk_exponentially_weighted_acceptance_func(
  kappa = 0.5,
  simulated_annealing = FALSE,
  temp_function = mk_simanneal_temp_func(T0 = 500, alpha = 0.8)
)

Arguments

`kappa`	Coefficient that determines how quickly the weights for the individual score improvements drop when going from left to right (i.e. first to last score). The weight for the first score's delta is 1; for the p'th score, the original delta is multiplied by kappa^(p-1)
`simulated_annealing`	Boolean; if TRUE, simulated annealing (SA) will be used to minimize the weighted improved score
`temp_function`	In case SA is used, a temperature function that returns the annealing temperature for a certain iteration number

Value

Acceptance function which returns TRUE if current score should be taken as the new optimal score, FALSE otherwise
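
The weighting by `kappa` can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper name `weighted_delta_sketch` is hypothetical, lower scores are assumed to be better, and the weighted deltas are assumed to be combined by summation, with a negative sum counting as an improvement.

```r
# Hypothetical helper: exponentially down-weighted score difference.
# The delta of the p'th score is multiplied by kappa^(p - 1).
weighted_delta_sketch <- function(current_score, best_score, kappa = 0.5) {
  p <- seq_along(current_score)
  weights <- kappa^(p - 1)   # 1, kappa, kappa^2, ...
  sum(weights * (current_score - best_score))
}

# deltas are -1, 0 and 2; weights are 1, 0.5 and 0.25
weighted_delta_sketch(c(1, 2, 3), c(2, 2, 1))   # -0.5

# under the assumptions above, a negative value means the current score is accepted
weighted_delta_sketch(c(1, 2, 3), c(2, 2, 1)) < 0   # TRUE
```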

Create a list of scoring functions (one per plate) that quantify the spatially homogeneous distribution of conditions across the plate

Description

Create a list of scoring functions (one per plate) that quantify the spatially homogeneous distribution of conditions across the plate

Usage

mk_plate_scoring_functions(
  batch_container,
  plate = NULL,
  row,
  column,
  group,
  p = 2,
  penalize_lines = "soft"
)

Arguments

`batch_container`	Batch container (bc) with all columns that denote plate related information
`plate`	Name of the bc column that holds the plate identifier (may be missing or NULL in case just one plate is used)
`row`	Name of the bc column that holds the plate row number (integer values starting at 1)
`column`	Name of the bc column that holds the plate column number (integer values starting at 1)
`group`	Name of the bc column that denotes a group/condition that should be distributed on the plate
`p`	p parameter for Minkowski type of distance metrics. Special cases: p=1 - Manhattan distance; p=2 - Euclidean distance
`penalize_lines`	How to penalize samples of the same group in one row or column of the plate. Valid options are: 'none' - there is no penalty and the pure distance metric counts, 'soft' - penalty will depend on the well distance within the shared plate row or column, 'hard' - samples in the same row/column will score a zero distance

Value

List of scoring functions, one per plate, that calculate a real valued measure for the quality of the group distribution (the lower the better).

Examples

data("invivo_study_samples")
bc <- BatchContainer$new(
  dimensions = c("column" = 6, "row" = 10)
)
bc <- assign_random(bc, invivo_study_samples)
scoring_f <- mk_plate_scoring_functions(
  bc,
  row = "row", column = "column", group = "Sex"
)
bc <- optimize_design(bc, scoring = scoring_f, max_iter = 100)
plot_plate(bc$get_samples(), .col = Sex)
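
The role of the `p` argument can be illustrated with a generic Minkowski distance between two wells. This base R sketch only shows the distance metric; the helper name `minkowski_sketch` is hypothetical, and how the package turns such distances into a plate score (including the line penalties) is not reproduced here.

```r
# Minkowski distance between two points given as numeric vectors
minkowski_sketch <- function(a, b, p = 2) {
  sum(abs(a - b)^p)^(1 / p)
}

well_1 <- c(row = 1, column = 1)
well_2 <- c(row = 4, column = 5)

minkowski_sketch(well_1, well_2, p = 1)   # 7 (Manhattan distance)
minkowski_sketch(well_1, well_2, p = 2)   # 5 (Euclidean distance)
```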

Generate acceptance function for an optimization protocol based on simulated annealing

Description

Generate acceptance function for an optimization protocol based on simulated annealing

Usage

mk_simanneal_acceptance_func(
  temp_function = mk_simanneal_temp_func(T0 = 500, alpha = 0.8)
)

Arguments

`temp_function`	A temperature function that returns the annealing temperature for a certain cycle k

Value

A function that takes parameters (current_score, best_score, iteration) for an optimization step and returns a Boolean indicating whether the current solution should be accepted or dismissed. The acceptance probability of a worse solution decreases as the annealing temperature decreases.
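
The relation between temperature and acceptance probability can be sketched with the classical Metropolis criterion. This is an assumption about the general technique, not a statement about the package's exact formula: the helper name `metropolis_accept_sketch` is hypothetical, scores are assumed to be scalar with lower values being better, and the random draw `u` is exposed as an argument only to make the example reproducible.

```r
# Classical Metropolis acceptance rule (illustrative sketch).
# A better score is always accepted; a worse score is accepted with
# probability exp(-(current - best) / temperature).
metropolis_accept_sketch <- function(current_score, best_score, temperature,
                                     u = runif(1)) {
  if (current_score < best_score) return(TRUE)
  u < exp(-(current_score - best_score) / temperature)
}

# same deterioration, different temperatures
metropolis_accept_sketch(11, 10, temperature = 100, u = 0.5)   # TRUE,  exp(-0.01) is about 0.99
metropolis_accept_sketch(11, 10, temperature = 0.1, u = 0.5)   # FALSE, exp(-10) is about 4.5e-05
```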

Create a temperature function that returns the annealing temperature at a given step (iteration)

Description

Supported annealing types are currently "Exponential multiplicative", "Logarithmic multiplicative", "Quadratic multiplicative" and "Linear multiplicative", each with dedicated constraints on alpha. For information, see http://what-when-how.com/artificial-intelligence/a-comparison-of-cooling-schedules-for-simulated-annealing-artificial-intelligence/

Usage

mk_simanneal_temp_func(T0, alpha, type = "Quadratic multiplicative")

Arguments

`T0`	Initial temperature at step 1 (when k=0)
`alpha`	Rate of cooling
`type`	Type of annealing protocol. Defaults to the quadratic multiplicative method which seems to perform well.

Value

Temperature at cycle k.
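
The four annealing types can be sketched as a function factory in base R. The formulas below follow the commonly stated definitions of these multiplicative cooling schedules and are an assumption; the package's exact formulas and its constraints on alpha are not reproduced here, and the name `temp_func_sketch` is hypothetical.

```r
# Returns a function of the cycle number k (k = 0 gives the initial temperature T0)
temp_func_sketch <- function(T0, alpha, type = "Quadratic multiplicative") {
  function(k) {
    switch(type,
      "Exponential multiplicative" = T0 * alpha^k,
      "Logarithmic multiplicative" = T0 / (1 + alpha * log(1 + k)),
      "Quadratic multiplicative" = T0 / (1 + alpha * k^2),
      "Linear multiplicative" = T0 / (1 + alpha * k),
      stop("unknown annealing type: ", type)
    )
  }
}

quadratic <- temp_func_sketch(T0 = 500, alpha = 0.8)
quadratic(0)   # 500
quadratic(2)   # 500 / 4.2, about 119.05

exponential <- temp_func_sketch(T0 = 500, alpha = 0.8, type = "Exponential multiplicative")
exponential(2)   # 320
```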

Create a shuffling function that permutes samples within certain subgroups of the container locations

Description

If length(n_swaps)==1, the returned function may be called an arbitrary number of times. If length(n_swaps)>1, the returned function may be called length(n_swaps) times before returning NULL, which would be the stopping criterion if all requested swaps have been exhausted.

Usage

mk_subgroup_shuffling_function(
  subgroup_vars,
  restrain_on_subgroup_levels = c(),
  n_swaps = 1
)

Arguments

`subgroup_vars`	Column names of the variables that together define the relevant subgroups
`restrain_on_subgroup_levels`	Permutations can be forced to take place only within a level of the factor of the subgrouping variable. In this case, the user must pass only one subgrouping variable and a number of levels that together define the permuted subgroup.
`n_swaps`	Vector with number of swaps to be proposed in successive calls to the returned function (each value should be in valid range from 1..floor(n_locations/2))

Value

Function to return a list with length n vectors src and dst, denoting source and destination index for the swap operation, or NULL if the user provided a defined protocol for the number of swaps and the last iteration has been reached

Examples

set.seed(42)

bc <- BatchContainer$new(
  dimensions = c(
    plate = 2,
    row = 4, col = 4
  )
)

bc <- assign_in_order(bc, samples = tibble::tibble(
  Group = c(rep(c("Grp 1", "Grp 2", "Grp 3", "Grp 4"), each = 8)),
  ID = 1:32
))

# here we use a 2-step approach:
# 1. Assign samples to plates.
# 2. Arrange samples within plates.

# overview of sample assignment before optimization
plot_plate(bc,
  plate = plate, row = row, column = col, .color = Group
)

# Step 1, assign samples to plates
scoring_f <- osat_score_generator(
  batch_vars = c("plate"), feature_vars = c("Group")
)
bc <- optimize_design(
  bc,
  scoring = scoring_f,
  max_iter = 10, # the real number of iterations should be bigger
  n_shuffle = 2,
  quiet = TRUE
)
plot_plate(
  bc,
  plate = plate, row = row, column = col, .color = Group
)

# Step 2, distribute samples within plates
scoring_f <- mk_plate_scoring_functions(
  bc,
  plate = "plate", row = "row", column = "col", group = "Group"
)
bc <- optimize_design(
  bc,
  scoring = scoring_f,
  max_iter = 50,
  shuffle_proposal_func = mk_subgroup_shuffling_function(subgroup_vars = c("plate")),
  aggregate_scores_func = L2s_norm,
  quiet = TRUE
)
plot_plate(bc,
  plate = plate, row = row, column = col, .color = Group
)

Create function to propose swaps of samples on each call, either with a constant number of swaps or following a user defined protocol

Description

If length(n_swaps)==1, the returned function may be called an arbitrary number of times. If length(n_swaps)>1 and the returned function is called without an argument, it may be called length(n_swaps) times before returning NULL, which serves as the stopping criterion once all requested swaps have been exhausted. Alternatively, the function may be called with an iteration number as the only argument, giving the user some freedom in how to iterate over the sample swapping protocol.

Usage

mk_swapping_function(n_swaps = 1)

Arguments

`n_swaps`	Vector with number of swaps to be proposed in successive calls to the returned function (each value should be in valid range from 1..floor(n_samples/2))

Value

A function that returns a list with vectors src and dst of length n, denoting source and destination indices for the swap operation, or NULL if the user provided a defined protocol for the number of swaps and the last iteration has been reached.
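
The difference between a constant number of swaps and a swapping protocol can be sketched with a plain closure. This is an illustration only: `make_swap_protocol` and its `n_locations` argument are hypothetical and do not mirror the signature of the function returned by `mk_swapping_function()`.

```r
# Hypothetical sketch (not the designit implementation): a closure that
# proposes swaps and returns NULL once a multi-step protocol is exhausted.
make_swap_protocol <- function(n_swaps = 1) {
  calls <- 0
  function(n_locations) {
    calls <<- calls + 1
    # a protocol of length > 1 ends after length(n_swaps) calls
    if (length(n_swaps) > 1 && calls > length(n_swaps)) {
      return(NULL)
    }
    n <- if (length(n_swaps) == 1) n_swaps else n_swaps[calls]
    # 2 * n distinct positions: first half sources, second half destinations
    idx <- sample.int(n_locations, 2 * n)
    list(src = idx[seq_len(n)], dst = idx[n + seq_len(n)])
  }
}

set.seed(42)
propose <- make_swap_protocol(c(3, 2, 1))
first <- propose(20)   # 3 swaps
second <- propose(20)  # 2 swaps
third <- propose(20)   # 1 swap
fourth <- propose(20)  # NULL, the protocol is exhausted
```

Drawing `2 * n` distinct positions is what limits each entry of `n_swaps` to at most half the number of available positions.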

Examples

data("invivo_study_samples")
bc <- BatchContainer$new(
  dimensions = c("plate" = 2, "column" = 5, "row" = 6)
)
scoring_f <- osat_score_generator("plate", "Sex")
optimize_design(
  bc, scoring = scoring_f, invivo_study_samples,
  max_iter = 100,
  shuffle_proposal_func = mk_swapping_function(1)
)

Unbalanced treatment and time sample list

Description

A sample list with 4 columns: SampleName, Well, Time and Treatment. Not all treatments are available at all time points. All samples are placed on the same plate.

Usage

data(multi_trt_day_samples)

Format

An object of class "tibble"

Author(s)

siebourj

Generic optimizer that can be customized by user provided functions for generating shuffles and progressing towards the minimal score

Description

Generic optimizer that can be customized by user provided functions for generating shuffles and progressing towards the minimal score

Usage

optimize_design(
  batch_container,
  samples = NULL,
  scoring = NULL,
  n_shuffle = NULL,
  shuffle_proposal_func = NULL,
  acceptance_func = accept_strict_improvement,
  aggregate_scores_func = identity,
  check_score_variance = TRUE,
  autoscale_scores = FALSE,
  autoscaling_permutations = 100,
  autoscale_useboxcox = TRUE,
  sample_attributes_fixed = FALSE,
  max_iter = 10000,
  min_delta = NA,
  quiet = FALSE
)

Arguments

`batch_container`	An instance of `BatchContainer`.
`samples`	A `data.frame` with sample information. Should be `NULL` if the `BatchContainer` already has samples in it.
`scoring`	Scoring function or a named `list()` of scoring functions.
`n_shuffle`	Vector of length 1 or larger, defining how many random sample swaps should be performed in each iteration. If `length(n_shuffle)==1`, this sets no limit to the number of iterations. Otherwise, the optimization stops if the swapping protocol is exhausted.
`shuffle_proposal_func`	A user defined function to propose the next shuffling of samples. Takes priority over n_shuffle if both are provided. The function is called with a BatchContainer `bc` and an integer parameter `iteration` for the current iteration number, allowing very flexible shuffling strategies. Mapper syntax is supported (see `purrr::as_mapper()`). The function must return either a list with fields `src` and `dst` (for pairwise sample swapping) or a numeric vector with a complete re-assigned sample order.
`acceptance_func`	Alternative function to select a new score as the best one. Defaults to strict improvement rule, i.e. all elements of a score have to be smaller or equal in order to accept the solution as better. This may be replaced with an alternative acceptance function included in the package (e.g. `mk_simanneal_acceptance_func()`) or a user provided function. Mapper syntax is supported (see `purrr::as_mapper()`).
`aggregate_scores_func`	A function to aggregate multiple scores AFTER (potential) auto-scaling and BEFORE acceptance evaluation. If a function is passed, (multi-dimensional) scores will be transformed (often to a single double value) before calling the acceptance function. E.g., see `first_score_only()` or `worst_score()`. Note that particular acceptance functions may require aggregation of a score to a single scalar in order to work, see for example those generated by `mk_simanneal_acceptance_func()`. Mapper syntax is supported (see `purrr::as_mapper()`).
`check_score_variance`	Logical: if TRUE, scores will be checked for variability under sample permutation and the optimization is not performed if at least one subscore appears to have a zero variance.
`autoscale_scores`	Logical: if TRUE, perform a transformation on the fly to equally scale scores to a standard normal. This makes scores more directly comparable and easier to aggregate.
`autoscaling_permutations`	How many random sample permutations should be done to estimate autoscaling parameters. (Note: minimum will be 20, regardless of the specified value)
`autoscale_useboxcox`	Logical; if TRUE, use a boxcox transformation for the autoscaling if possible at all. Requires installation of the `bestNormalize` package.
`sample_attributes_fixed`	Logical; if TRUE, sample shuffle function may generate altered sample attributes at each iteration. This affects estimation of score distributions. (Parameter only relevant if shuffle function does introduce attributes!)
`max_iter`	Stop optimization after a maximum number of iterations, independent from other stopping criteria (user defined shuffle proposal or min_delta).
`min_delta`	If not NA, optimization is stopped as soon as successive improvement (i.e. euclidean distance between score vectors from current best and previously best solution) drops below min_delta.
`quiet`	If TRUE, suppress non-critical warnings or messages.

Value

A trace object
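
The interplay of shuffle proposal, scoring and strict-improvement acceptance can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the package's optimizer: the names `score_balance` and `optimize_sketch` are hypothetical, the score is a single number, and the only stopping criterion is `max_iter`.

```r
# Hypothetical scalar score: squared deviation of the group-by-plate counts
# from a perfectly even table (smaller is better).
score_balance <- function(groups, plate) {
  counts <- table(groups, plate)
  sum((counts - mean(counts))^2)
}

# Minimal optimization loop: propose swaps, score, keep strict improvements.
optimize_sketch <- function(groups, plate, max_iter = 200, n_shuffle = 1) {
  best <- groups
  best_score <- score_balance(best, plate)
  trace <- best_score
  for (i in seq_len(max_iter)) {
    idx <- sample.int(length(best), 2 * n_shuffle)
    src <- idx[seq_len(n_shuffle)]
    dst <- idx[n_shuffle + seq_len(n_shuffle)]
    candidate <- best
    candidate[c(src, dst)] <- best[c(dst, src)]  # pairwise swap
    candidate_score <- score_balance(candidate, plate)
    if (candidate_score < best_score) {          # strict improvement only
      best <- candidate
      best_score <- candidate_score
    }
    trace <- c(trace, best_score)
  }
  list(groups = best, score = best_score, trace = trace)
}

set.seed(1)
groups <- rep(c("A", "B"), each = 8)  # samples, initially assigned in order
plate <- rep(1:2, each = 8)           # fixed plate of each location
result <- optimize_sketch(groups, plate, max_iter = 200)
table(result$groups, plate)
```

Because a candidate is only kept when its score is strictly smaller, the recorded trace can never increase.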

Examples

data("invivo_study_samples")
bc <- BatchContainer$new(
  dimensions = c("plate" = 2, "column" = 5, "row" = 6)
)
bc <- optimize_design(bc, invivo_study_samples,
  scoring = osat_score_generator("plate", "Sex"),
  max_iter = 100
)
plot_plate(bc$get_samples(), .color = Sex)

Convenience wrapper to optimize a typical multi-plate design

Description

The batch container will in the end contain the updated experimental layout

Usage

optimize_multi_plate_design(
  batch_container,
  across_plates_variables = NULL,
  within_plate_variables = NULL,
  plate = "plate",
  row = "row",
  column = "column",
  n_shuffle = 1,
  max_iter = 1000,
  quiet = FALSE
)

Arguments

`batch_container`	Batch container (bc) with all columns that denote plate related information
`across_plates_variables`	Vector with bc column name(s) that denote(s) groups/conditions to be balanced across plates, sorted by relative importance of the factors
`within_plate_variables`	Vector with bc column name(s) that denote(s) groups/conditions to be spaced out within each plate, sorted by relative importance of the factors
`plate`	Name of the bc column that holds the plate identifier
`row`	Name of the bc column that holds the plate row number (integer values starting at 1)
`column`	Name of the bc column that holds the plate column number (integer values starting at 1)
`n_shuffle`	Vector of length 1 or larger, defining how many random sample swaps should be performed in each iteration. See `optimize_design()`.
`max_iter`	Stop any of the optimization runs after this maximum number of iterations. See `optimize_design()`.
`quiet`	If TRUE, suppress informative messages.

Value

A list with named traces, one for each optimization step

Compute OSAT score for sample assignment.

Description

The OSAT score is intended to ensure even distribution of samples across batches and is closely related to the chi-square test on a contingency table (Yan et al. (2012) doi:10.1186/1471-2164-13-689).

Usage

osat_score(bc, batch_vars, feature_vars, expected_dt = NULL, quiet = FALSE)

Arguments

`bc`	BatchContainer with samples or `data.table`/data.frame where every row is a location in a container and a sample in this location.
`batch_vars`	character vector with batch variable names to take into account for the score computation.
`feature_vars`	character vector with sample variable names to take into account for score computation.
`expected_dt`	A `data.table` with the expected number of samples for each combination of sample variables and batch variables. This is not required; however, it does not change during the optimization process, so it is a good idea to cache this value.
`quiet`	Do not warn about `NA`s in feature columns.

Value

A list with two elements: `$score` (numeric score value) and `$expected_dt` (expected counts `data.table` for reuse)
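
As a rough illustration of what such a score measures, the base R sketch below sums the squared differences between observed and expected sample counts per batch. It rests on assumptions: the helper `osat_like_score` is hypothetical, expected counts are taken to be proportional to the number of locations in each batch, and the result is not guaranteed to match the value returned by `osat_score()`.

```r
# One row per location; NA marks an empty location.
layout <- data.frame(
  SampleType = c("Case", "Case", "Case", "Control", "Control", "Control", NA, NA),
  Sex = c("Female", "Female", "Male", "Female", "Female", "Male", NA, NA),
  plate = c(1, 1, 2, 2, 1, 2, 1, 2)
)

# Hypothetical helper: sum of squared (observed - expected) counts.
osat_like_score <- function(df, batch_var, feature_vars) {
  # keep only locations that actually hold a sample
  filled <- df[stats::complete.cases(df[, feature_vars, drop = FALSE]), , drop = FALSE]
  # one stratum per observed combination of feature values
  stratum <- interaction(filled[, feature_vars, drop = FALSE], drop = TRUE)
  batch <- factor(filled[[batch_var]], levels = sort(unique(df[[batch_var]])))
  observed <- table(stratum, batch)
  # share of all locations (filled or empty) that belongs to each batch
  batch_share <- as.numeric(table(factor(df[[batch_var]], levels = levels(batch)))) / nrow(df)
  expected <- outer(as.numeric(table(stratum)), batch_share)
  sum((observed - expected)^2)
}

osat_like_score(layout, "plate", c("SampleType", "Sex"))
```

A perfectly proportional layout would give 0 in this sketch; larger values indicate that some feature combinations are concentrated in particular batches.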

Examples

sample_assignment <- tibble::tribble(
  ~ID, ~SampleType, ~Sex, ~plate,
  1, "Case", "Female", 1,
  2, "Case", "Female", 1,
  3, "Case", "Male", 2,
  4, "Control", "Female", 2,
  5, "Control", "Female", 1,
  6, "Control", "Male", 2,
  NA, NA, NA, 1,
  NA, NA, NA, 2,
)

osat_score(sample_assignment,
  batch_vars = "plate",
  feature_vars = c("SampleType", "Sex")
)

Convenience wrapper for the OSAT score

Description

This function wraps osat_score() in order to take full advantage of the speed gain without managing the buffered objects in the user code.

Usage

osat_score_generator(batch_vars, feature_vars, quiet = FALSE)

Arguments

`batch_vars`	character vector with batch variable names to take into account for the score computation.
`feature_vars`	character vector with sample variable names to take into account for score computation.
`quiet`	Do not warn about `NA`s in feature columns.

Value

A function that returns the OSAT score for a specific sample arrangement
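
The speed gain comes from computing the expected counts once and reusing them, since they depend only on totals that do not change when samples are permuted. The closure below sketches that caching pattern in base R; `make_cached_scorer` is a hypothetical name and the scoring formula is a simplified stand-in, not the code behind `osat_score_generator()`.

```r
# Hypothetical sketch of the caching pattern: the expected table is built on
# the first call and reused by every later call of the returned function.
make_cached_scorer <- function(batch_var, feature_var) {
  expected <- NULL  # cache, filled lazily
  function(df) {
    if (is.null(expected)) {
      feature_totals <- as.numeric(table(df[[feature_var]]))
      batch_totals <- as.numeric(table(df[[batch_var]]))
      expected <<- outer(feature_totals, batch_totals) / nrow(df)
    }
    observed <- table(df[[feature_var]], df[[batch_var]])
    sum((observed - expected)^2)
  }
}

scorer <- make_cached_scorer("plate", "Sex")
clustered <- data.frame(Sex = c("F", "F", "M", "M"), plate = c(1, 1, 2, 2))
spread <- data.frame(Sex = c("F", "M", "F", "M"), plate = c(1, 1, 2, 2))
score_clustered <- scorer(clustered)  # builds and stores the expected counts
score_spread <- scorer(spread)        # reuses the stored expected counts
```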

Examples

sample_assignment <- tibble::tribble(
  ~ID, ~SampleType, ~Sex, ~plate,
  1, "Case", "Female", 1,
  2, "Case", "Female", 1,
  3, "Case", "Male", 2,
  4, "Control", "Female", 2,
  5, "Control", "Female", 1,
  6, "Control", "Male", 2,
  NA, NA, NA, 1,
  NA, NA, NA, 2,
)

osat_scoring_function <- osat_score_generator(
  batch_vars = "plate",
  feature_vars = c("SampleType", "Sex")
)

osat_scoring_function(sample_assignment)

Example dataset with a plate effect

Description

Here the top and bottom rows were both used as controls (in dilutions). The top row, however, was affected differently than the bottom one. This makes normalization virtually impossible.

Usage

data(plate_effect_example)

Format

An object of class "tibble"

row: Plate row
column: Plate column
conc: Sample concentration
log_conc: Logarithm of sample concentration
treatment: Sample treatment
readout: Readout from experiment

Author(s)

Balazs Banfai

Plot plate layouts

Description

Plot plate layouts

Usage

plot_plate(
  .tbl,
  plate = plate,
  row = row,
  column = column,
  .color,
  .alpha = NULL,
  .pattern = NULL,
  title = paste("Layout by", rlang::as_name(rlang::enquo(plate))),
  add_excluded = FALSE,
  rename_empty = FALSE
)

Arguments

`.tbl`	a `tibble` (or `data.frame`) with the samples assigned to locations. Alternatively a BatchContainer with samples can be supplied here.
`plate`	optional dimension variable used for the plate ids
`row`	the dimension variable used for the row ids
`column`	the dimension variable used for the column ids
`.color`	the continuous or discrete variable to color by
`.alpha`	a continuous variable encoding transparency
`.pattern`	a discrete variable encoding tile pattern (needs ggpattern)
`title`	string for the plot title
`add_excluded`	flag to add excluded wells (in bc$exclude) to the plot. A BatchContainer must be provided for this.
`rename_empty`	whether NA entries in sample table should be renamed to 'empty'.

Value

the ggplot object

Author(s)

siebourj

Examples

nPlate <- 3
nColumn <- 4
nRow <- 6

treatments <- c("CTRL", "TRT1", "TRT2")
timepoints <- c(1, 2, 3)


bc <- BatchContainer$new(
  dimensions = list(
    plate = nPlate,
    column = list(values = letters[1:nColumn]),
    row = nRow
  )
)

sample_sheet <- tibble::tibble(
  sampleID = 1:(nPlate * nColumn * nRow),
  Treatment = rep(treatments, each = floor(nPlate * nColumn * nRow) / length(treatments)),
  Timepoint = rep(timepoints, floor(nPlate * nColumn * nRow) / length(treatments))
)

# assign samples from the sample sheet
bc <- assign_random(bc, samples = sample_sheet)

plot_plate(bc$get_samples(),
  plate = plate, column = column, row = row,
  .color = Treatment, .alpha = Timepoint
)


plot_plate(bc$get_samples(),
  plate = plate, column = column, row = row,
  .color = Treatment, .pattern = Timepoint
)

Generate in one go a shuffling function that produces permutations with specific constraints on multiple sample variables and group sizes fitting one specific allocation variable

Description

Generate in one go a shuffling function that produces permutations with specific constraints on multiple sample variables and group sizes fitting one specific allocation variable

Usage

shuffle_grouped_data(
  batch_container,
  allocate_var,
  keep_together_vars = c(),
  keep_separate_vars = c(),
  n_min = NA,
  n_max = NA,
  n_ideal = NA,
  subgroup_var_name = NULL,
  report_grouping_as_attribute = FALSE,
  prefer_big_groups = FALSE,
  strict = TRUE,
  fullTree = FALSE,
  maxCalls = 1e+06
)

Arguments

`batch_container`	Batch container with all samples assigned that are to be grouped and sub-grouped
`allocate_var`	Name of a variable in the `samples` table to inform possible groupings, as (sub)group sizes must add up to the correct totals
`keep_together_vars`	Vector of column names in sample table; groups are formed by pooling samples with identical values of all those variables
`keep_separate_vars`	Vector of column names in sample table; items with identical values in those variables will not be put into the same subgroup if at all possible
`n_min`	Minimal number of samples in one sub(!)group; by default 1
`n_max`	Maximal number of samples in one sub(!)group; by default the size of the biggest group
`n_ideal`	Ideal number of samples in one sub(!)group; by default the floor or ceiling of `mean(n_min,n_max)`, depending on the setting of `prefer_big_groups`
`subgroup_var_name`	An optional column name for the subgroups which are formed (or NULL)
`report_grouping_as_attribute`	Boolean, if TRUE, add an attribute table to the permutation functions' output, to be used in scoring during the design optimization
`prefer_big_groups`	Boolean; indicating whether or not bigger subgroups should be preferred in case of several possibilities
`strict`	Boolean; if TRUE, subgroup size constraints have to be met strictly, implying the possibility of finding no solution at all
`fullTree`	Boolean: Enforce full search of the possibility tree, independent of the value of `maxCalls`
`maxCalls`	Maximum number of recursive calls in the search tree, to avoid long run times with very large trees

Value

Shuffling function that on each call returns an index vector for a valid sample permutation
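
The default for `n_ideal` described above can be written out as a small worked example. The helper `default_n_ideal` is hypothetical, and the sketch assumes that the ceiling is used when `prefer_big_groups` is TRUE and the floor otherwise. Note that the midpoint has to be computed as `mean(c(n_min, n_max))`; in R, `mean(n_min, n_max)` would pass `n_max` as the `trim` argument.

```r
# Hypothetical helper: midpoint of n_min and n_max, rounded up or down
# depending on whether bigger subgroups are preferred (assumed direction).
default_n_ideal <- function(n_min, n_max, prefer_big_groups = FALSE) {
  midpoint <- mean(c(n_min, n_max))
  if (prefer_big_groups) ceiling(midpoint) else floor(midpoint)
}

small_first <- default_n_ideal(2, 5)                          # midpoint 3.5 rounded down
big_first <- default_n_ideal(2, 5, prefer_big_groups = TRUE)  # midpoint 3.5 rounded up
```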

Shuffling proposal function with constraints.

Description

Can be used with optimize_design to improve convergence speed.

Usage

shuffle_with_constraints(src = TRUE, dst = TRUE)

Arguments

`src`	Expression to define possible source locations in the samples/locations table. Usually evaluated based on `BatchContainer$get_samples(include_id = TRUE, as_tibble = FALSE)` as an environment (see also `with()`). A single source location is selected from rows where the expression evaluates to `TRUE`.
`dst`	Expression to define possible destination locations in the samples/locations table. Usually evaluated based on `BatchContainer$get_samples()` as an environment. Additionally a special variable `.src` is available in this environment which describes the selected source row from the table.

Value

Returns a function which accepts a BatchContainer and an iteration number (i). This function returns a list with two elements: a vector src of length 2 and a vector dst of length 2. See BatchContainer$move_samples().
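
How the two expressions could be evaluated against the samples table is sketched below in base R. This is an assumption-laden illustration rather than the package's code: `propose_move` is a hypothetical helper that takes quoted expressions, exposes the selected source row as `.src`, and encodes the swap as two index vectors of length 2.

```r
# Hypothetical helper: pick a source row matching src_expr, then a
# destination row matching dst_expr, which may refer to the source via .src.
propose_move <- function(tbl, src_expr, dst_expr) {
  src_ok <- which(eval(src_expr, tbl))
  if (length(src_ok) == 0) {
    return(NULL)
  }
  src <- src_ok[sample.int(length(src_ok), 1)]
  # columns of the table plus the selected source row under the name .src
  env <- c(as.list(tbl), list(.src = tbl[src, , drop = FALSE]))
  dst_ok <- which(eval(dst_expr, env))
  if (length(dst_ok) == 0) {
    return(NULL)
  }
  dst <- dst_ok[sample.int(length(dst_ok), 1)]
  # swap: whatever sits at src goes to dst and vice versa
  list(src = c(src, dst), dst = c(dst, src))
}

tbl <- data.frame(
  .sample_id = c(1, 2, NA, 3, NA, 4),
  plate = c(1, 1, 1, 2, 2, 2)
)
set.seed(43)
move <- propose_move(
  tbl,
  src_expr = quote(!is.na(.sample_id)),  # source must hold a sample
  dst_expr = quote(plate != .src$plate)  # destination on another plate
)
```

Restricting proposals this way avoids spending iterations on swaps that cannot change the score, such as exchanging two empty locations.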

Examples

set.seed(43)

samples <- data.frame(
  id = 1:100,
  sex = sample(c("F", "M"), 100, replace = TRUE),
  group = sample(c("treatment", "control"), 100, replace = TRUE)
)

bc <- BatchContainer$new(
  dimensions = c("plate" = 5, "position" = 25)
)

scoring_f <- function(samples) {
  osat_score(
    samples,
    "plate",
    c("sex", "group")
  )$score
}

# in this example we treat all the positions in the plate as equal.
# when shuffling we enforce that source location is non-empty,
# and destination location has a different plate number
bc <- optimize_design(
  bc,
  scoring = scoring_f,
  samples,
  shuffle_proposal_func = shuffle_with_constraints(
    # source is non-empty location
    !is.na(.sample_id),
    # destination has a different plate
    plate != .src$plate
  ),
  max_iter = 10
)

Compose shuffling function based on already available subgrouping and allocation information

Description

Compose shuffling function based on already available subgrouping and allocation information

Usage

shuffle_with_subgroup_formation(
  subgroup_object,
  subgroup_allocations,
  keep_separate_vars = c(),
  report_grouping_as_attribute = FALSE
)

Arguments

`subgroup_object`	A subgrouping object as returned by `form_homogeneous_subgroups()`
`subgroup_allocations`	A list of possible assignments of the allocation variable as returned by `compile_possible_subgroup_allocation()`
`keep_separate_vars`	Vector of column names in sample table; items with identical values in those variables will not be put into the same subgroup if at all possible
`report_grouping_as_attribute`	Boolean, if TRUE, add an attribute table to the permutation functions' output, to be used in scoring during the design optimization

Value

Shuffling function that on each call returns an index vector for a valid sample permutation

Aggregation of scores: sum up all individual scores

Description

Aggregation of scores: sum up all individual scores

Usage

sum_scores(scores, na.rm = FALSE, ...)

Arguments

`scores`	A score or multiple component score vector
`na.rm`	Boolean. Should NA values be ignored when computing the sum? FALSE by default as ignoring NA values may render the sum meaningless.
`...`	Parameters to be ignored by this aggregation function

Value

The aggregated score, i.e. the sum of all individual scores.
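
The effect of `na.rm` on a summed score can be seen with base R's `sum()`, used here as a stand-in for the aggregation; the score names in the vector are made up for illustration.

```r
# A multi-component score in which one component could not be computed.
scores <- c(plate = 3, row = NA, column = 1)

total_strict <- sum(scores)                  # NA: the missing component propagates
total_lenient <- sum(scores, na.rm = TRUE)   # 4: the missing component is dropped
```

Dropping the NA silently turns a three-component score into a two-component one, which is why the stricter behavior is the default.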

Examples

sum_scores(c(3, 2, 1))

Validates sample data.frame.

Description

Validates sample data.frame.

Usage

validate_samples(samples)

Arguments

`samples`	A data.frame having a sample annotation per row.

Aggregation of scores: take the maximum (i.e. worst score only)

Description

This function enables comparison of the results of two scoring functions by just basing the decision on the largest element. This corresponds to the infinity-norm in ML terms.

Usage

worst_score(scores, na.rm = FALSE, ...)

Arguments

`scores`	A score or multiple component score vector
`na.rm`	Boolean. Should NA values be ignored when obtaining the maximum? FALSE by default as ignoring NA values may hide some issues with the provided scoring functions and also the aggregated value cannot be seen as the proper infinity norm anymore.
`...`	Parameters to be ignored by this aggregation function

Value

The aggregated score, i.e. the value of the largest element in a multiple-component score vector.
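
The relation to the infinity norm can be checked with base R, used here as a stand-in for the aggregation. For non-negative scores the largest element and the infinity norm coincide; with negative components they differ, because the norm uses absolute values.

```r
scores <- c(3, 2, 1)
largest <- max(scores)                                   # 3
inf_norm <- norm(matrix(scores, ncol = 1), type = "I")   # 3

mixed <- c(-5, 2)
largest_mixed <- max(mixed)                                  # 2
inf_norm_mixed <- norm(matrix(mixed, ncol = 1), type = "I")  # 5
```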

Examples

worst_score(c(3, 2, 1))
