summarise_intermediate_results.Rd
Compute the mean of intermediate results created by
compute_intermediate_results.
summarise_intermediate_results(
intermediate_results,
propensity_scored = FALSE,
label_distribution = NULL,
set = FALSE,
replace_zero_division_with = options::opt("replace_zero_division_with")
)
intermediate_results As produced by
compute_intermediate_results. This requires a list containing:
results_table A data.frame with columns "prec",
"rprec", "rec", "f1".
grouping_var A character vector of variables to group by.
propensity_scored Logical, whether to use propensity scores as weights.
label_distribution Expects a data.frame with columns "label_id",
"label_freq", "n_docs". label_freq is the number of occurrences of a
label in the gold standard. n_docs is the total number of documents in
the gold standard.
set Logical. Allow in-place modification of
intermediate_results. Only recommended for internal package usage.
replace_zero_division_with In macro-averaged results (doc-avg, subj-avg), some
instances may have no predictions or no gold standard. In these cases,
calculating precision and recall may lead to division by zero. By default,
CASIMiR removes these missing values from macro averages, which leads to a
smaller support (the count of instances that were averaged). Other
implementations of macro-averaged precision and recall default to 0 in these
cases. This option controls that behaviour. Set it to any value between 0
and 1. (Defaults to NULL; can be overridden with the option 'casimir.replace_zero_division_with' or the environment variable 'R_CASIMIR_REPLACE_ZERO_DIVISION_WITH'.)
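The effect of the zero-division setting can be illustrated with a small base R sketch. It does not call the package. The per-document precision values and the variable names are made up for illustration, and NA stands in for an undefined 0 / 0 result.

```r
# Per-document precision; NA marks a document without predictions (0 / 0).
# Values are made up for illustration.
prec <- c(d1 = 0.5, d2 = NA, d3 = 1.0)

# Behaviour described as the default: drop undefined values,
# which shrinks the support (number of averaged instances).
mean_dropped    <- mean(prec, na.rm = TRUE)
support_dropped <- sum(!is.na(prec))

# Alternative: replace undefined values with a constant (here 0),
# so every instance contributes to the average.
replace_with   <- 0
prec_filled    <- ifelse(is.na(prec), replace_with, prec)
mean_filled    <- mean(prec_filled)
support_filled <- length(prec_filled)
```

With these values, dropping the undefined entry averages two documents, while replacing it with 0 averages all three and lowers the mean. According to the description above, the package-wide default could presumably be changed with `options(casimir.replace_zero_division_with = 0)`.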
Returns a data.frame with columns "metric" and "value".
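For orientation, here is a minimal base R sketch of the documented input shape and of the kind of averaging this function performs. It is not the package implementation. The values are made up, the grouping column `doc_id` is a hypothetical name, and `sketch_summary` is an illustrative helper. The real function may require columns or attributes that this page does not list.

```r
# Hypothetical input mirroring the documented structure (values are made up).
intermediate_results <- list(
  results_table = data.frame(
    doc_id = c("d1", "d2", "d3"),  # hypothetical grouping column
    prec   = c(1.0, 0.5, 0.0),
    rprec  = c(1.0, 0.5, 0.0),
    rec    = c(0.5, 0.5, 0.0),
    f1     = c(2 / 3, 0.5, 0.0)
  ),
  grouping_var = "doc_id"
)

# Illustrative helper, NOT the package implementation: average each metric
# column and return a long table with columns "metric" and "value".
sketch_summary <- function(x) {
  metrics <- c("prec", "rprec", "rec", "f1")
  means <- colMeans(x$results_table[, metrics], na.rm = TRUE)
  data.frame(metric = names(means), value = unname(means))
}

summary_tbl <- sketch_summary(intermediate_results)
```

The result has one row per metric, matching the "metric"/"value" layout described above. In practice the list would come from compute_intermediate_results rather than being built by hand.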