generate_replicate_results.Rd

Wrapper for computing n_bt bootstrap replicates, combining the
functionality of compute_intermediate_results and
summarise_intermediate_results.

generate_replicate_results(
  base_compare,
  n_bt,
  grouping_var,
  seed = NULL,
  ps_flags = list(intermed = FALSE, summarise = FALSE),
  label_distribution = NULL,
  cost_fp = NULL,
  replace_zero_division_with = options::opt("replace_zero_division_with"),
  drop_empty_groups = options::opt("drop_empty_groups"),
  progress = options::opt("progress")
)

generate_replicate_results_dplyr(
  base_compare,
  n_bt,
  grouping_var,
  seed = NULL,
  label_distribution = NULL,
  ps_flags = list(intermed = FALSE, summarise = FALSE),
  cost_fp = NULL,
  progress = FALSE
)

base_compare: A data.frame as generated by create_comparison.
n_bt: An integer number of resamples to be used for bootstrapping.
grouping_var: A character vector of variables that must be present in
base_compare.
seed: A seed passed to the resampling step for reproducibility.
ps_flags: A list as returned by set_ps_flags.
label_distribution: A data.frame with columns "label_id",
"label_freq", "n_docs". label_freq is the number of occurrences of a
label in the gold standard. n_docs is the total number of documents in
the gold standard.
cost_fp: A numeric value > 0, defaults to NULL.
replace_zero_division_with: In macro-averaged results (doc-avg, subj-avg),
some instances may have no predictions or no gold standard. In these cases,
calculating precision and recall may lead to division by zero. By default,
CASIMiR removes these missing values from macro averages, which leads to a
smaller support (the count of instances that were averaged). Other
implementations of macro-averaged precision and recall default to 0 in these
cases. This option controls the value used instead. Set any value between 0
and 1. (Defaults to NULL, overwritable using option 'casimir.replace_zero_division_with' or environment variable 'R_CASIMIR_REPLACE_ZERO_DIVISION_WITH')
drop_empty_groups: Should empty levels of factor variables be dropped in
grouped set retrieval computation? (Defaults to TRUE, overwritable using option 'casimir.drop_empty_groups' or environment variable 'R_CASIMIR_DROP_EMPTY_GROUPS')
progress: Display progress bars for iterated computations (like bootstrap CI
or PR curves). (Defaults to FALSE, overwritable using option 'casimir.progress' or environment variable 'R_CASIMIR_PROGRESS')
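
The effect of the two zero-division conventions on a macro average can be illustrated in base R. This is a minimal sketch on made-up toy data; it does not call CASIMiR and is not meant to reproduce its internals. The vectors tp and n_pred are hypothetical.

```r
# Hypothetical toy data: true positives and number of predictions per document.
tp <- c(2, 0, 1)
n_pred <- c(4, 0, 2)

# Precision is undefined for the document without predictions (0 / 0 is NaN in R).
precision <- tp / n_pred

# Convention 1 (analogous to replace_zero_division_with = NULL):
# drop undefined values, so the support shrinks from 3 to 2.
is_defined <- !is.nan(precision)
macro_drop <- mean(precision[is_defined])
support_drop <- sum(is_defined)

# Convention 2 (analogous to replace_zero_division_with = 0):
# undefined values count as 0, and the support stays at 3.
precision_zero <- ifelse(is.nan(precision), 0, precision)
macro_zero <- mean(precision_zero)
support_zero <- length(precision_zero)
```

On this toy data the first convention averages over two documents and the second over three, so the second yields the lower macro-averaged precision.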
Returns a data.frame containing n_bt bootstrap replicates of the results as
returned by compute_intermediate_results and
summarise_intermediate_results.
generate_replicate_results_dplyr(): Variant with dplyr-based
internals rather than collapse internals.
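
The general idea of seeded bootstrap replicates can be sketched in base R. This is an illustration only, under the assumption that whole documents are resampled with replacement; the resampling unit used by CASIMiR, the toy data frame docs, and the column name boot_replicate are all hypothetical here and not taken from the package.

```r
# Hypothetical toy data: one row per document with a per-document score.
docs <- data.frame(
  doc_id = paste0("d", 1:5),
  precision = c(0.5, 1.0, 0.0, 0.75, 0.25)
)

n_bt <- 3L    # number of bootstrap replicates
set.seed(42)  # fixing the seed makes the resamples reproducible

# For each replicate, draw row indices with replacement and recompute the mean.
replicates <- lapply(seq_len(n_bt), function(i) {
  idx <- sample(nrow(docs), replace = TRUE)
  data.frame(boot_replicate = i, mean_precision = mean(docs$precision[idx]))
})

# Stack the per-replicate summaries into one data.frame with n_bt rows.
boot_results <- do.call(rbind, replicates)
```

Each row of boot_results corresponds to one replicate; the spread of mean_precision across rows is what a bootstrap confidence interval would later be computed from.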