linux - shell sort and deduplicate problem

Question

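I am processing a text file with shell. Its content looks like this: {code...} I want to deduplicate on the first column and, among rows with the same key, keep the one with the largest value in the second column. The result should look like this: {code...} I looked at the uniq command, but it does not seem to support deduplicating by field. How should I do this?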

阿神 · Answer

Method 1

cat data.txt | sort -rnk2 | awk '{if (!keys[$1]) print $0; keys[$1] = 1;}'

First sort in reverse numeric order on the second column so that lines come out from largest to smallest, then use awk to print a line only the first time its first-column string appears and discard the rest. This should solve the problem. However, awk has to remember every distinct key, so it can occupy a lot of memory, which may cause problems if the file is very large.

Method 2

cat data.txt | sort -k1,1 | awk '{
    if (lastKey == $1) {
        if (lastValue < $2) {
            lastLine = $0;
            lastValue = int($2);
        }
    } else {
        if (lastLine) {
            print lastLine;
        }

        lastKey = $1;
        lastLine = $0;
        lastValue = int($2);
    }
} END {
    if (lastLine) {
        print lastLine;
    }
}'

This solution sorts by the first column and then filters the result with awk; the filter is effectively an enhanced version of uniq. It is much better in terms of memory usage, because awk only keeps the previous key and the best line seen for it, but the code is longer and less concise.
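The two methods can also be combined. The following is a minimal sketch, not taken from the answer: it lets sort order the lines by key and then by value (descending, numeric), so awk only has to print the first line of each key group. The function name max_per_key and the sample data are hypothetical, since the question's real file is not shown.

```shell
# Hypothetical helper: reads "key value" lines from files or stdin.
# sort groups equal keys together and puts the largest value first
# within each group; awk then remembers only the previous key.
max_per_key() {
    LC_ALL=C sort -k1,1 -k2,2nr "$@" |
        awk 'NR == 1 || $1 != last { print; last = $1 }'
}

# Made-up sample input for illustration only.
printf '%s\n' 'asd 346' 'adf 263' 'csb 513' 'asd 99' 'dfg 576' 'fdf 284' 'dfg 120' |
    max_per_key
```

The awk side uses constant memory regardless of how many distinct keys there are; the cost moves to sort, which can spill to temporary files for large inputs.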

高洛峰 · Answer

$ sort -r a.txt | awk '{print $2, $1}' | uniq -f1 | awk '{print $2, $1}'
fdf 284
dfg 576
csb 513
asd 346
adf 263

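Sort in reverse order, swap the first and second columns, deduplicate on the second column (uniq -f1 skips the first field when comparing), then swap the columns back. Note that sort -r compares whole lines as strings, so this relies on all values having the same number of digits.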
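If the values can differ in width, a variant of the same pipeline with explicit sort keys may be safer. This is a sketch under that assumption, not the answerer's command; max_per_key_uniq and the sample lines are hypothetical.

```shell
# Hypothetical helper: same swap / uniq -f1 / swap idea as above, but
# the key is sorted in reverse and the value in descending numeric
# order, so "99" no longer sorts ahead of "346".
max_per_key_uniq() {
    LC_ALL=C sort -k1,1r -k2,2nr "$@" |
        awk '{ print $2, $1 }' |
        uniq -f1 |
        awk '{ print $2, $1 }'
}

# Made-up sample input for illustration only.
printf '%s\n' 'asd 346' 'adf 263' 'csb 513' 'asd 99' 'dfg 576' 'fdf 284' 'dfg 120' |
    max_per_key_uniq
```

With a plain sort -r, the line "asd 99" would sort before "asd 346" as a string, and uniq would then keep 99 instead of 346.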

高洛峰 · Answer

awk '{ if (!($1 in a) || $2 > a[$1]) a[$1] = $2 } END { for (i in a) print i, a[i] }' data.txt

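Use the first column as the array key, compare each new value with the one already stored for that key, and replace it when the new value is larger. The iteration order of for (i in a) is unspecified, so pipe the output through sort if the order matters.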
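A slightly more defensive version of this array approach might look like the sketch below. The name max_per_key_awk is hypothetical; adding 0 forces a numeric comparison, and the trailing sort is only there to make the output order deterministic.

```shell
# Hypothetical helper: keeps the largest second-column value per key in
# an awk array, then sorts by key because "for (k in max)" iterates in
# an unspecified order.
max_per_key_awk() {
    awk '
        !($1 in max) || $2 + 0 > max[$1] + 0 { max[$1] = $2 }
        END { for (k in max) print k, max[k] }
    ' "$@" | LC_ALL=C sort -k1,1
}

# Made-up sample input for illustration only.
printf '%s\n' 'asd 346' 'adf 263' 'csb 513' 'asd 99' 'dfg 576' 'fdf 284' 'dfg 120' |
    max_per_key_awk
```

No pre-sorting is needed here, but awk holds one array entry per distinct key, so memory grows with the number of keys.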

高洛峰 · Answer

[root@localhost ~]# sort -k2r 1.txt|awk '!a[$1]++'
dfg     576
csb     513
asd     346
fdf     284 
adf     263
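The awk pattern !a[$1]++ is terse, so here is an expanded sketch of what it does. The function name first_per_key is hypothetical, and the sort key is written as -k2,2nr (numeric, second field only) on the assumption that values may differ in width; the answer's own command uses -k2r.

```shell
# Hypothetical helper: sort by value, largest first, then print a line
# only when its key has not been seen before. This is what the pattern
# !a[$1]++ does: the counter is 0 (false) on the first occurrence, the
# negation makes the pattern true, and the default action prints.
first_per_key() {
    sort -k2,2nr "$@" | awk '{
        if (seen[$1] == 0) print
        seen[$1]++
    }'
}

# Made-up sample input for illustration only.
printf '%s\n' 'asd 346' 'adf 263' 'csb 513' 'asd 99' 'dfg 576' 'fdf 284' 'dfg 120' |
    first_per_key
```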
