Compute intermediate ranked retrieval results per group: Discounted Cumulative Gain (DCG), Ideal Discounted Cumulative Gain (IDCG), Normalised Discounted Cumulative Gain (NDCG), and Label Ranking Average Precision (LRAP).

compute_intermediate_results_rr(
  gold_vs_pred,
  grouping_var,
  drop_empty_groups = options::opt("drop_empty_groups")
)

Arguments

gold_vs_pred

A data.frame as generated by create_comparison, additionally containing a column "score".

grouping_var

A character vector of grouping variables that must be present in gold_vs_pred.

drop_empty_groups

Should empty levels of factor variables be dropped in grouped set retrieval computation? Defaults to TRUE; can be overridden with the option 'casimir.drop_empty_groups' or the environment variable 'R_CASIMIR_DROP_EMPTY_GROUPS'.
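
As a minimal sketch of changing the default for drop_empty_groups, the option and the environment variable named above can be set with base R. The names are taken from the argument description. The string value "FALSE" for the environment variable is an assumption about how the value is parsed and is not confirmed here.

```r
# Session-wide override via the R option named in the argument description.
old <- options(casimir.drop_empty_groups = FALSE)
opt_value <- getOption("casimir.drop_empty_groups")

# Alternative: the environment variable named in the argument description.
# Assumption: the string "FALSE" is interpreted as the logical FALSE.
Sys.setenv(R_CASIMIR_DROP_EMPTY_GROUPS = "FALSE")
env_value <- Sys.getenv("R_CASIMIR_DROP_EMPTY_GROUPS")

# Restore the previous state once the override is no longer needed.
options(old)
Sys.unsetenv("R_CASIMIR_DROP_EMPTY_GROUPS")
```

Passing drop_empty_groups directly in the call affects only that call, which is usually the safer choice in scripts.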

Value

A data.frame with one row per group, containing the grouping variables and the columns "dcg", "idcg", "ndcg", "lrap".

Examples


library(casimir)

gold <- tibble::tribble(
  ~doc_id, ~label_id,
  "A", "a",
  "A", "b",
  "A", "c",
  "A", "d",
  "A", "e",
)

pred <- tibble::tribble(
  ~doc_id, ~label_id, ~score,
  "A", "f", 0.3277,
  "A", "e", 0.32172,
  "A", "b", 0.13517,
  "A", "g", 0.10134,
  "A", "h", 0.09152,
  "A", "a", 0.07483,
  "A", "i", 0.03649,
  "A", "j", 0.03551,
  "A", "k", 0.03397,
  "A", "c", 0.03364
)

gold_vs_pred <- create_comparison(gold, pred)

compute_intermediate_results_rr(
  gold_vs_pred,
  rlang::syms(c("doc_id"))
)
#> # A tibble: 1 × 5
#>   doc_id   dcg  idcg  ndcg  lrap
#>   <chr>  <dbl> <dbl> <dbl> <dbl>
#> 1 A       1.78  4.54 0.391 0.207
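
The printed values can be checked by hand. The following base R sketch uses binary gains with a 1 / log2(rank + 1) discount and reproduces the output above to three significant digits. The formulas are inferred from the printed numbers, not taken from the package source. In particular, normalising IDCG and LRAP over all 10 predicted ranks is an assumption that happens to match this example.

```r
# Ranks (1 = highest score) at which the gold labels e, b, a and c appear
# in `pred`; gold label "d" is never predicted.
hit_ranks <- c(2, 3, 6, 10)
n_pred <- 10  # number of predicted labels for document "A"

# Logarithmic rank discount with binary gain.
discount <- function(rank) 1 / log2(rank + 1)

dcg <- sum(discount(hit_ranks))

# Assumption inferred from the output: the ideal ranking spans all 10 ranks.
idcg <- sum(discount(seq_len(n_pred)))
ndcg <- dcg / idcg

# Precision at each hit: number of hits so far divided by the rank.
precision_at_hits <- seq_along(hit_ranks) / hit_ranks
# Assumption inferred from the output: the sum is divided by n_pred.
lrap <- sum(precision_at_hits) / n_pred

manual <- signif(c(dcg = dcg, idcg = idcg, ndcg = ndcg, lrap = lrap), 3)
manual
#>   dcg  idcg  ndcg  lrap
#> 1.780 4.540 0.391 0.207
```

A textbook IDCG would often sum the discount only over as many ranks as there are gold labels, which would give a different value here, so the sketch should be read as a consistency check for this example only.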