Duplicate Record Handling
A row (a record) in a file is considered a duplicate of another row only if all of the keys compare as equal. The rest of the row does not matter: two rows with equal keys are duplicates even if their remaining content differs. Special processing rules can be applied to each set of duplicate rows. You may need the duplicate rows in an exact order, or you may want to keep only one of them and delete the rest.
You can keep all of the duplicates by accepting the default for the --duplicate parameter. The default value is original, which means that the original order of the duplicates is maintained.
To do the opposite and get all duplicates in reverse order, use --duplicate reverse. This takes each group of duplicates and reverses their order from what it was in the original file.
It is possible to remove all but one row of a group of duplicates, but you must specify whether to keep the first row in the group or the last. To keep the first row in a group of duplicate rows, use --duplicate first. To keep the last row in a group of duplicates, use --duplicate last.
--duplicate takes one and only one of these options: firstOnly, lastOnly, reverse, original.
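The four behaviors can be illustrated with a small Python sketch. This is not the tool's implementation, only a model of the rules described above. The function name sort_with_duplicates, the key callback, and the sample rows are all hypothetical, and the sketch assumes that "original order" means the order in which the rows appeared in the input file.

```python
from itertools import groupby


def sort_with_duplicates(rows, key, mode="original"):
    """Sort rows by key, then apply a duplicate-handling mode to each group.

    Hypothetical helper: models the four --duplicate options, nothing more.
    """
    # sorted() is stable, so rows with equal keys keep their input order.
    ordered = sorted(rows, key=key)
    result = []
    for _, group in groupby(ordered, key=key):
        group = list(group)
        if mode == "original":
            result.extend(group)            # keep all, input order
        elif mode == "reverse":
            result.extend(reversed(group))  # keep all, reversed order
        elif mode == "firstOnly":
            result.append(group[0])         # keep only the first duplicate
        elif mode == "lastOnly":
            result.append(group[-1])        # keep only the last duplicate
        else:
            raise ValueError(f"unknown duplicate mode: {mode!r}")
    return result


# Sample rows: the first field is the key, the second is the rest of the row.
rows = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]


def first_field(row):
    return row[0]


for mode in ("original", "reverse", "firstOnly", "lastOnly"):
    print(mode, sort_with_duplicates(rows, first_field, mode))
```

In this model, rows ("a", 2) and ("a", 4) are duplicates because their keys are equal, even though the rest of each row differs.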
Notice that the --duplicate first and --duplicate last examples above use shortened versions of the option names. You only need to type the leading letters of an option up to the point where it is unique. Because the four options all begin with different letters, each can be specified with a single letter, for instance --duplicate f.
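The abbreviation rule amounts to unique-prefix matching, which can be sketched in Python as follows. The function name resolve_option is hypothetical, and the sketch assumes that matching is case-sensitive; the tool's actual handling of case and of ambiguous input may differ.

```python
# The four option names listed above.
DUPLICATE_OPTIONS = ("firstOnly", "lastOnly", "reverse", "original")


def resolve_option(text, options=DUPLICATE_OPTIONS):
    """Return the one option that starts with text, or raise ValueError.

    Hypothetical helper: assumes case-sensitive prefix matching.
    """
    matches = [name for name in options if name.startswith(text)]
    if len(matches) != 1:
        # Zero matches means an unknown option; several means it is ambiguous.
        raise ValueError(f"{text!r} does not identify exactly one option")
    return matches[0]


for abbreviation in ("f", "first", "l", "r", "o"):
    print(abbreviation, "->", resolve_option(abbreviation))
```

Under these assumptions, both f and first resolve to firstOnly, while an empty string is rejected because it matches every option.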
