marc21-frequency(1)

NAME

marc21-frequency — Compute a frequency table of values

SYNOPSIS

marc21 frequency [OPTIONS] <QUERY> [PATH]…
marc21 freq [OPTIONS] <QUERY> [PATH]…

DESCRIPTION

This command computes a frequency table over the values (columns) produced by the given query expression. The table is sorted by count in descending order, so the most frequent value is printed first. Rows with equal counts are listed in lexicographical order of their values. The set of data fields that contribute to the result of a record can be restricted by an optional predicate.
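
The ordering rule can be illustrated with standard tools. The following is only a sketch, not how marc21 is implemented: the helper name freq_table is hypothetical, it assumes one already-extracted value per line with no commas in the values, and it assumes that "lexicographical" means byte-wise comparison.

```shell
# Hypothetical helper, not part of marc21: reads one value per line from
# stdin and prints "value,count" rows, most frequent first. Rows with
# equal counts are ordered by value (byte-wise, because of LC_ALL=C).
# Assumption: values contain no comma, since sort splits fields on ",".
freq_table() {
    LC_ALL=C awk '{ n[$0]++ } END { for (v in n) printf "%s,%d\n", v, n[v] }' |
        LC_ALL=C sort -t, -k2,2nr -k1,1
}

# Made-up sample input: 2024 and 2025 occur twice, 2022 and 2023 once.
printf '%s\n' 2024 2025 2023 2025 2024 2022 | freq_table
```

For this sample input the two values that occur twice come first (2024 before 2025), followed by the two values that occur once (2022 before 2023).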

ARGUMENTS

<QUERY>
A MARC-21 query expression.

OPTIONS

-H, --header <header>
Insert a header row before the data. The header should be entered as a comma-separated list. Leading and trailing spaces in each column are automatically removed.
--tsv
Write output tab-separated (TSV)
-o, --output <path>
Write output to <path> instead of stdout. If the filename ends in .tsv or .tsv.gz, the output is automatically saved in TSV format. The output is gzip-compressed when the filename ends with .gz.

FILTER OPTIONS

-l, --limit <n>
Limit the result to the first <n> records (a limit of 0 means no limit)
-s, --skip-invalid
Skip invalid records that can’t be decoded
--strsim-threshold
The minimum score for string similarity comparisons (0 <= score <= 100)
--where
An expression for filtering records
--filter-normalization <form>
Convert the given filter or query expression into the specified Unicode normalization form. Possible values: nfd, nfkd, nfc, nfkc. This option can also be specified by setting the environment variable MARC21_FILTER_NORMALIZATION.

COMMON OPTIONS

-p, --progress
If set, show a progress bar
-c, --compression
Specify compression level (0..=9)

EXIT STATUS

  • 0 — Command succeeded.
  • 1 — Command failed.

EXAMPLES

The following example creates a frequency table based on the year of the last update (field 005/00-03, the first four characters of the timestamp).

$ marc21 frequency -s -H 'year,count' '005[0:4]' GND.mrc
year,count
2025,1193157
2024,1131644
2021,854178
2022,848635
2023,760070
2016,734399
2010,564136
2017,522303
2020,498302
2008,465916
2019,423590
2011,423077
2014,422959
2018,375568
2013,295991
2015,245866
2026,221200
2012,135738
2009,104168
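
Because the table is ordered by count, reading it as a time series requires re-sorting by the first column. A minimal sketch of one way to do this with standard tools follows. The helper name by_year is hypothetical and not part of marc21, and it assumes comma-separated output with a header row, as produced by the command above.

```shell
# Hypothetical helper, not part of marc21: copies the header row through
# unchanged, then sorts the remaining "year,count" rows by the first
# column in ascending numeric order.
by_year() {
    IFS= read -r header && printf '%s\n' "$header"
    LC_ALL=C sort -t, -k1,1n
}

# Intended use (requires marc21 and the input file):
#   marc21 frequency -s -H 'year,count' '005[0:4]' GND.mrc | by_year

# Demonstration with three rows taken from the output above.
printf '%s\n' 'year,count' '2025,1193157' '2024,1131644' '2021,854178' | by_year
```

The demonstration prints the header followed by the rows for 2021, 2024 and 2025, in that order.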