If the columns are not fixed width, you can still use sort:

<pre><code>sort -t '|' --key=10,10 -g FILENAME
</code></pre>

<ol>
<li>The <code>-t</code> flag sets the field separator.</li>
<li>The <code>-g</code> flag selects general numeric ordering, so the key is compared as a number rather than as text.</li>
</ol>
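For comparison, here is a minimal Python sketch of the same idea: split each line on the separator and order by the numeric value of one field. The function name `sort_by_field` and the choice to place lines with a missing or non-numeric field first are assumptions made for illustration, not a description of how `sort` is implemented.

```python
def sort_by_field(lines, field=10, sep="|"):
    """Return the lines ordered by the numeric value of a 1-based field."""
    def key(line):
        parts = line.rstrip("\n").split(sep)
        try:
            return float(parts[field - 1])
        except (IndexError, ValueError):
            # Assumption: lines without a usable number are placed first.
            return float("-inf")
    # sorted() is stable, so lines with equal keys keep their input order.
    return sorted(lines, key=key)


if __name__ == "__main__":
    sample = ["a|10\n", "b|9\n", "c|2.5\n"]
    for line in sort_by_field(sample, field=2):
        print(line, end="")
```

Like the `sort` command above, this only orders the lines; it does not remove anything.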

If all your input data is formatted as above, i.e. with fixed-size fields, and the order of the lines in the output doesn't matter, <code>sort --key=1.9,1.19 --unique</code> should do the trick (characters 9 to 19 of each line hold the second field). If the order does matter, but duplicate lines are always consecutive, <code>uniq -s 8 -w 11</code> will work (skip 8 characters, then compare the next 11). If the fields are not fixed-width but duplicate lines are always consecutive, Pax's awk script will work. In the most general case, though, we're probably looking at something slightly too complicated for a one-liner.

Assuming that they're consecutive and you want to remove subsequent ones, the following awk script will do it:

<pre><code>awk -F'|' 'NR==1 {print;x=$2} NR&gt;1 {if ($2 != x) {print;x=$2}}'
</code></pre>

It works by printing the first line and storing the second column. Then for subsequent lines, it skips ones where the stored value and second column are the same (if different, it prints the line and updates the stored value).
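The same logic can be sketched in Python, which may be easier to adapt. The function name `drop_consecutive_repeats` and its parameters are made up for this example; it assumes the lines are already in memory as a list of strings.

```python
def drop_consecutive_repeats(lines, field=2, sep="|"):
    """Keep a line only if its field differs from the previous line's field."""
    kept = []
    previous = object()  # sentinel: never equal to a real field value
    for line in lines:
        parts = line.rstrip("\n").split(sep)
        # Like awk, treat a missing field as the empty string.
        current = parts[field - 1] if len(parts) >= field else ""
        if current != previous:
            kept.append(line)
            previous = current
    return kept


if __name__ == "__main__":
    sample = ["1|a|x", "2|a|y", "3|b|z", "4|a|w"]
    print(drop_consecutive_repeats(sample))
```

Note that a value that reappears later, after a different value in between, is kept again. Only runs of adjacent repeats are collapsed.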

If they're not consecutive, I'd opt for a Perl solution where you maintain an associative array to detect and remove duplicates. I'd code it up, but my 3yo daughter has just woken up, it's midnight and she has a cold. See you all tomorrow, if I survive the night :-)

This awk code removes duplicate words within each line, keeping the first occurrence of each word (the input file here is named <code>sent</code>):

<pre><code>awk '{for (i=1; i&lt;=NF; i++) {x=0; for(j=i-1; j&gt;=1; j--) {if ($i == $j){x=1} } if( x != 1){printf ("%s ", $i) }}print ""}' sent
</code></pre>
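A rough Python equivalent, for readers who find the nested awk loop hard to follow. The name `unique_words` is invented for this sketch, and it uses a set instead of rescanning the earlier words. Unlike the awk version, it does not leave a trailing space at the end of the line.

```python
def unique_words(line):
    """Return the line with repeated words removed, first occurrence kept."""
    seen = set()
    kept = []
    for word in line.split():  # split on runs of whitespace, like awk fields
        if word not in seen:
            seen.add(word)
            kept.append(word)
    return " ".join(kept)


if __name__ == "__main__":
    print(unique_words("the cat the dog cat"))
```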

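Most Unix systems ship with Python, so the following few lines may be just what you need:

<pre><code># Print each line whose second field has not been seen before (Python 3).
seen = set()
with open('input.txt') as f:
    for s in f:
        key = s.split('|')[1]  # index 1 is the second column
        if key not in seen:
            print(s, end='')   # the line already ends with a newline
            seen.add(key)
</code></pre>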

This works without requiring fixed-length fields, and even if identical values are not adjacent. It keeps the first line for each value and drops the later ones.

This awk command prints only those lines whose second column is not 05408736032:

<pre><code>awk -F'|' '{if ($2 != "05408736032") {print}}' filename
</code></pre>

This takes two passes over the input file: (1) find the second-column values that occur more than once, (2) remove every line that carries one of those values.

<pre><code>awk -F\| '
 {count[$2]++} 
 END {for (x in count) {if (count[x] &gt; 1) {print x}}}
' input.txt &gt;input.txt.dups

awk -F\| '
 NR==FNR {dup[$1]++; next}
 !($2 in dup) {print}
' input.txt.dups input.txt
</code></pre>

If you use bash, you can omit the temp file and combine both passes into one line using process substitution (deep breath):

<pre><code>awk -F\| 'NR==FNR {dup[$1]++; next} !($2 in dup) {print}' &lt;(awk -F\| '{count[$2]++} END {for (x in count) {if (count[x] &gt; 1) {print x}}}' input.txt) input.txt
</code></pre>

(phew!)
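The two-pass approach can also be sketched in Python: count every second-column value first, then keep only the lines whose value was seen exactly once. The name `drop_all_duplicated` is invented here, and the sketch assumes the whole file fits in memory as a list, because the lines are traversed twice.

```python
from collections import Counter


def drop_all_duplicated(lines, field=2, sep="|"):
    """Remove every line whose field value occurs on more than one line."""
    def key(line):
        parts = line.rstrip("\n").split(sep)
        return parts[field - 1] if len(parts) >= field else ""

    # Pass 1: count how often each field value appears.
    counts = Counter(key(line) for line in lines)
    # Pass 2: keep only lines whose value is unique; input order is preserved.
    return [line for line in lines if counts[key(line)] == 1]


if __name__ == "__main__":
    sample = [
        "0009300|05408736032|89|asdf|\n",
        "0009301|11111111111|89|qwer|\n",
        "0009367|05408736032|89|adff|\n",
    ]
    for line in drop_all_duplicated(sample):
        print(line, end="")
```

Unlike the "keep the first one" approaches, this drops all copies, which is what the question asks for.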

<pre><code>awk -F"|" '!_[$2]++' file
</code></pre>

Put the lines in a hash, using the line as both key and value, then iterate over the hash. This should work in almost any language that has hashes (awk, Perl, etc.).
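As a minimal sketch of the hash idea in Python (the name `unique_lines` is invented for illustration): a dict keyed by the whole line collapses exact duplicates. In Python 3.7 and later a dict preserves insertion order, so the first occurrence of each line comes out in its original position. In awk, the order of `for (k in array)` is unspecified, so the output order may differ there.

```python
def unique_lines(lines):
    """Collapse exact duplicate lines, keeping the first occurrence of each."""
    table = {}
    for line in lines:
        # Line is used as both key and value; repeats leave the entry unchanged.
        table.setdefault(line, line)
    return list(table.values())


if __name__ == "__main__":
    print(unique_lines(["a", "b", "a", "c", "b"]))
```

This compares whole lines, so it only helps when the duplicates are identical in every column, not just the second one.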

I want to remove all lines whose second column (for example 05408736032) is the same as on another line:

0009300|05408736032|89|01|001|0|0|0|1|NNNNNNYNNNNNNNNN|asdf|

0009367|05408736032|89|01|001|0|0|0|1|NNNNNNYNNNNNNNNN|adff|

These lines are not consecutive. It's fine to remove all of the matching lines; I don't have to keep one of them around.

Sorry, my Unix-fu is really weak from lack of use :)

Removing duplicate lines from a file / grep
