marc21-dedup(1)

NAME

marc21-dedup — Remove duplicate records from the input

SYNOPSIS

marc21 dedup [OPTIONS] [PATH]…

DESCRIPTION

This command removes records that occur more than once in the input. Two records are treated as duplicates if they have the same control number (field 001).
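The idea of deduplicating on a key can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function name dedup_by_control_number and the (control number, raw record) pairs are hypothetical, and the sketch assumes the first occurrence of each control number is the one kept, which this page does not specify.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, bytes]  # (control number from field 001, raw record); hypothetical shape


def dedup_by_control_number(records: Iterable[Record]) -> Iterator[Record]:
    """Yield only records whose control number has not been seen before."""
    seen = set()
    for control_number, raw in records:
        if control_number in seen:
            continue  # assumption: later copies are dropped, the first one is kept
        seen.add(control_number)
        yield control_number, raw


records = [("001A", b"first"), ("002B", b"second"), ("001A", b"third")]
unique = list(dedup_by_control_number(records))
print(unique)
```

Tracking seen control numbers in a set keeps the pass linear in the number of records, at the cost of holding every distinct control number in memory.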

OPTIONS

FILTER OPTIONS

-l, --limit <n>
Limit the result to the first <n> records (a limit of 0 means no limit)
-s, --skip-invalid
Skip invalid records that can’t be decoded
--strsim-threshold
The minimum score for string similarity comparisons (0 <= score <= 100)
--where
An expression for filtering records
--filter-normalization <form>
Convert the given filter or query expression into the specified Unicode normal form. Possible values: nfd, nfkd, nfc, nfkc. This option can also be specified by setting the environment variable MARC21_FILTER_NORMALIZATION.
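Why the normal form of a filter expression matters can be shown with Python's standard unicodedata module. This is a sketch of the general Unicode behaviour, not of how marc21 applies the option internally; the helper normalize_filter is hypothetical and simply maps the lowercase values listed above to the form names Python expects.

```python
import unicodedata

composed = "M\u00fcller"     # "ü" as a single precomposed code point (NFC style)
decomposed = "Mu\u0308ller"  # "u" followed by a combining diaeresis (NFD style)


def normalize_filter(expression: str, form: str) -> str:
    """Hypothetical helper: normalize a filter string to nfd, nfkd, nfc or nfkc."""
    return unicodedata.normalize(form.upper(), expression)


# The two spellings render identically but do not compare equal as strings.
print(composed == decomposed)

# After normalizing both sides to the same form, they match.
print(normalize_filter(composed, "nfd") == decomposed)
print(normalize_filter(decomposed, "nfc") == composed)
```

If the records store text in decomposed form while the expression is typed in composed form (or the reverse), a literal comparison can fail even though the strings look the same, which is the situation this option is meant to address.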

COMMON OPTIONS

-p, --progress
If set, show a progress bar
-c, --compression
Specify compression level (0..=9)

EXIT STATUS

  • 0 — Command succeeded.
  • 1 — Command failed.

EXAMPLES

In the following example, duplicate records found across the input files s1.mrc and s2.mrc are removed, and the remaining records are written to the output file out.mrc:

$ marc21 dedup s1.mrc s2.mrc -o out.mrc