Compute intermediate ranked retrieval results per group, such as Discounted Cumulative Gain (DCG), Ideal Discounted Cumulative Gain (IDCG), Normalised Discounted Cumulative Gain (NDCG) and Label Ranking Average Precision (LRAP).
compute_intermediate_results_rr(
gold_vs_pred,
grouping_var,
drop_empty_groups = options::opt("drop_empty_groups")
)
gold_vs_pred: A data.frame as generated by create_comparison, additionally containing a column "score".
grouping_var: A character vector of grouping variables that must be present in gold_vs_pred.
drop_empty_groups: Should empty levels of factor variables be dropped in grouped set retrieval computation? (Defaults to TRUE; can be overridden with the option 'casimir.drop_empty_groups' or the environment variable 'R_CASIMIR_DROP_EMPTY_GROUPS'.)
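The option named in the drop_empty_groups description can be set for a session with base R's options(). The snippet below is a minimal sketch that only sets and restores the option; it does not call casimir, and the variable names old and current are illustrative. Passing drop_empty_groups = FALSE directly in the call is the other route the usage signature allows.

```r
# Set the option named in the argument description for the current session.
# options() returns the previous values, so they can be restored later.
old <- options(casimir.drop_empty_groups = FALSE)

# Read the option back to confirm what is currently set.
current <- getOption("casimir.drop_empty_groups")
print(current)

# Restore the previous state (unsets the option if it was not set before).
options(old)
```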
A data.frame with columns "dcg", "idcg", "ndcg", "lrap".
library(casimir)
gold <- tibble::tribble(
~doc_id, ~label_id,
"A", "a",
"A", "b",
"A", "c",
"A", "d",
"A", "e",
)
pred <- tibble::tribble(
~doc_id, ~label_id, ~score,
"A", "f", 0.3277,
"A", "e", 0.32172,
"A", "b", 0.13517,
"A", "g", 0.10134,
"A", "h", 0.09152,
"A", "a", 0.07483,
"A", "i", 0.03649,
"A", "j", 0.03551,
"A", "k", 0.03397,
"A", "c", 0.03364
)
gold_vs_pred <- create_comparison(gold, pred)
compute_intermediate_results_rr(
gold_vs_pred,
rlang::syms(c("doc_id"))
)
#> # A tibble: 1 × 5
#> doc_id dcg idcg ndcg lrap
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A 1.78 4.54 0.391 0.207
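The figures in this output can be checked by hand. The base R sketch below is not taken from the package source; it assumes binary relevance (a predicted label counts as a hit if it is in the gold set) and a 1 / log2(rank + 1) discount, with predictions ranked by descending score. Under those assumptions it reproduces the four printed values for this one example. The variable names, including idcg_like and lrap_like, are illustrative; they are computed over all ten ranked predictions because that is what matches the table here, which should not be read as a general statement about how casimir defines these quantities.

```r
# Predicted labels for document "A", ordered by descending score,
# and the gold labels, both copied from the example above.
pred_labels <- c("f", "e", "b", "g", "h", "a", "i", "j", "k", "c")
gold_labels <- c("a", "b", "c", "d", "e")

ranks    <- seq_along(pred_labels)        # 1, 2, ..., 10
discount <- 1 / log2(ranks + 1)           # assumed log2 discount per rank
hit      <- pred_labels %in% gold_labels  # TRUE at ranks 2, 3, 6 and 10

# DCG: sum of the discounts at the ranks that hold a gold label.
dcg <- sum(discount[hit])

# Sum of the discounts over all ten ranks (matches the printed idcg).
idcg_like <- sum(discount)

# Ratio of the two (matches the printed ndcg).
ndcg_like <- dcg / idcg_like

# Precision at each hit, summed and divided by the number of ranked
# predictions (matches the printed lrap).
precision_at_hit <- cumsum(hit)[hit] / ranks[hit]
lrap_like <- sum(precision_at_hit) / length(pred_labels)

round(c(dcg = dcg, idcg = idcg_like), 2)
#>  dcg idcg
#> 1.78 4.54
round(c(ndcg = ndcg_like, lrap = lrap_like), 3)
#>  ndcg  lrap
#> 0.391 0.207
```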