
What about the following:

<pre><code>awk -F, '(NR==FNR){a[$1];next}!($1 in a)' blacklist.csv candidates.csv
</code></pre>

How does this work?

An awk program is a series of pattern-action pairs, written as:

<pre><code>condition { action }
condition { action }
...
</code></pre>

where <code>condition</code> is typically an expression and <code>action</code> is a series of commands. Here, the two condition-action pairs read:

<ul>
<li><code>(NR==FNR){a[$1];next}</code>: if the total record count <code>NR</code> equals the record count of the current file <code>FNR</code> (i.e. if we are reading the first file), store the first field as a key in array <code>a</code> and skip to the next record (do not do anything else).</li>
<li><code>!($1 in a)</code>: if the first field is not a key in array <code>a</code>, perform the default action, which is to print the line. This pattern is only reached for the second file, because every record of the first file satisfies the first condition and ends with <code>next</code>.</li>
</ul>
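
To make the two pairs concrete, here is a small sketch that recreates the sample files from the question in a scratch directory and runs the same one-liner. The <code>workdir</code> and <code>result</code> variable names are only illustrative.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'id,value' '1,123' '4,1' '2,5' '50,5' > "$workdir/candidates.csv"
printf '%s\n' 1 2 5 3 10 > "$workdir/blacklist.csv"

# First file (NR==FNR): remember each id as an array key, then skip to the next record.
# Second file: print only the rows whose first field is not one of those keys.
result=$(awk -F, 'NR==FNR { a[$1]; next } !($1 in a)' \
  "$workdir/blacklist.csv" "$workdir/candidates.csv")

printf '%s\n' "$result"
rm -r "$workdir"
```

The header line survives because <code>id</code> is not a key in <code>a</code>, and the rows keep their original order.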


You can use <code>sed</code> and <code>grep</code> together to get the output:

<pre><code>$ sed -E 's/[0-9]+/&amp;,/g' blacklist.csv &gt; filter.csv
$ grep -Fvf filter.csv candidates.csv
id,value
4,1
50,5
</code></pre>

The <code>sed</code> command appends a <code>,</code> to each <code>id</code> and writes the result to <code>filter.csv</code>. <code>-E</code> enables extended regular expressions on macOS/FreeBSD, the same as <code>-r</code> in GNU <code>sed</code>.

<code>grep</code> uses the option <code>-f</code> to read its patterns from <code>filter.csv</code> and <code>-v</code> to print only the lines that match none of them. <code>-F</code> treats each pattern as a fixed string.
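
One thing to watch: a fixed-string pattern such as <code>1,</code> matches anywhere on a line, so it would also remove a row like <code>11,9</code>. A possible variant, sketched here under the assumption that every blacklist line is a bare number, anchors each pattern to the start of the line and drops <code>-F</code> so that <code>^</code> is interpreted as a regex anchor. The extra row <code>11,9</code> and the file name <code>filter.txt</code> are hypothetical additions for illustration.

```shell
workdir=$(mktemp -d)
# Sample data from the question, plus a hypothetical row "11,9" to show the anchoring.
printf '%s\n' 'id,value' '1,123' '4,1' '2,5' '50,5' '11,9' > "$workdir/candidates.csv"
printf '%s\n' 1 2 5 3 10 > "$workdir/blacklist.csv"

# Turn each id N into the regex "^N," so it can only match the first column.
sed 's/.*/^&,/' "$workdir/blacklist.csv" > "$workdir/filter.txt"

# -v keeps the lines that match none of the patterns read with -f.
kept=$(grep -v -f "$workdir/filter.txt" "$workdir/candidates.csv")

printf '%s\n' "$kept"
rm -r "$workdir"
```

With this anchoring, <code>11,9</code> is kept even though <code>1</code> is blacklisted.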


If you are not too concerned about the order of the lines in your <code>candidates.csv</code> file, you could use the following:

<pre><code>join -v 1 -t, &lt;(sort -t, candidates.csv) &lt;(sort blacklist.csv)
</code></pre>

The <code>-v 1</code> requests all the lines from the first file (the sorted <code>candidates.csv</code>) which do not match on the first field with the second file (the <code>blacklist.csv</code>). The <code>-t,</code> just sets the comma as a separator.

If you're concerned about the header line in the <code>candidates.csv</code> file you could strip it before the sorting or change the order.
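
As a sketch of the header-stripping idea, the following prints the header line unchanged and joins only the data rows. It uses temporary files instead of process substitution and sets <code>LC_ALL=C</code> so that <code>sort</code> and <code>join</code> agree on the collation order. Both choices, and the variable names, are assumptions of this example rather than part of the command above.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'id,value' '1,123' '4,1' '2,5' '50,5' > "$workdir/candidates.csv"
printf '%s\n' 1 2 5 3 10 > "$workdir/blacklist.csv"

# Sort the data rows (everything after the header) on the first field only.
tail -n +2 "$workdir/candidates.csv" | LC_ALL=C sort -t, -k1,1 > "$workdir/candidates.sorted"
LC_ALL=C sort "$workdir/blacklist.csv" > "$workdir/blacklist.sorted"

# Emit the header first, then the candidate rows that have no partner in the blacklist.
joined=$(
  head -n 1 "$workdir/candidates.csv"
  LC_ALL=C join -v 1 -t, "$workdir/candidates.sorted" "$workdir/blacklist.sorted"
)

printf '%s\n' "$joined"
rm -r "$workdir"
```

The data rows come out in the sorted (lexical) order of their ids, not in their original order.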

I have two files: 

<code>candidates.csv</code>:

<pre><code>id,value
1,123
4,1
2,5
50,5
</code></pre>

<code>blacklist.csv</code>:

<pre><code>1
2
5
3
10
</code></pre>

I'd like to remove all rows from <code>candidates.csv</code> in which the first column (<code>id</code>) has a value contained in <code>blacklist.csv</code>. <code>id</code> is always numeric. In this case I'd like my output to look like this:

<pre><code>id,value
4,1
50,5
</code></pre>

So far, my script for identifying the duplicate lines looks like this:

<pre><code>cat candidates.csv | cut -d \, -f 1 | grep -f blacklist.csv -w
</code></pre>

This gives me the output

<pre><code>1
2
</code></pre>

Now I somehow need to pipe this information back into <code>sed</code>/<code>awk</code>/<code>gawk</code>/... to delete the duplicates, but I don't know how. Any ideas how I can continue from here? Or is there a better solution altogether? My only restriction is that it has to run in bash.

How to delete rows from a csv file based on a list of values from another file?
