Q: How to print a CSV from the third field onward

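Unix & Linux user

Asked 2018-03-21 09:40:46

Answers: 3, Views: 393, Followers: 0, Votes: 2

I want to print each CSV line from the third field to the end of the line, without the double quotes (").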

more test

"linux02","PLD26","net2-thrift-netconf","net.driver.memory","2"
"linux02","PLD26","net2-thrift-netconf","net.executor.cores","2"
"linux02","PLD26","net2-thrift-netconf","net.executor.instances","2"
"linux02","PLD26","net2-thrift-netconf","net.executor.memory","2"
"linux02","PLD26","net2-thrift-netconf","net.sql.shuffle.partitions","141"
"linux02","PLD26","net2-thrift-netconf","net.dynamicAllocation.enabled","true"
"linux02","PLD26","net2-thrift-netconf","net.dynamicAllocation.initialExecutors","2"
"linux02","PLD26","net2-thrift-netconf","net.dynamicAllocation.minExecutors","2"
"linux02","PLD26","net2-thrift-netconf","net.dynamicAllocation.maxExecutors","20"

I tried this:

sed s'/,/ /g' test | awk '{print $3","$4","$5}' | sed s'/"//g'
,,
net2-thrift-netconf,net.driver.memory
net2-thrift-netconf,net.executor.cores
net2-thrift-netconf,net.executor.instances
net2-thrift-netconf,net.executor.memory
net2-thrift-netconf,net.sql.shuffle.partitions
net2-thrift-netconf,net.dynamicAllocation.enabled
net2-thrift-netconf,net.dynamicAllocation.initialExecutors
net2-thrift-netconf,net.dynamicAllocation.minExecutors
net2-thrift-netconf,net.dynamicAllocation.maxExecutors
,,

But my command has two problems: first, it also prints the ",," lines, and second, it is not very elegant.

Expected output:

net2-thrift-netconf,net.driver.memory,2
net2-thrift-netconf,net.executor.cores,2
net2-thrift-netconf,net.executor.instances,2
net2-thrift-netconf,net.executor.memory,2
net2-thrift-netconf,net.sql.shuffle.partitions,141
net2-thrift-netconf,net.dynamicAllocation.enabled,true
net2-thrift-netconf,net.dynamicAllocation.initialExecutors,2
net2-thrift-netconf,net.dynamicAllocation.minExecutors,2
net2-thrift-netconf,net.dynamicAllocation.maxExecutors,20

Tags: csv, linux, text-processing, awk, sed

3 Answers

Unix & Linux user

Accepted answer

Posted 2018-03-21 10:07:12

Using only sed:

sed -E 's/"//g; s/^([^,]*,){2}//' infile

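s/"//g removes all double quotes.
s/^([^,]*,){2}// then deletes, starting at the beginning of the line, exactly two runs of non-comma characters that are each followed by a comma, i.e. the first two fields together with their trailing commas.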
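To see both substitutions at work on a single record, here is a minimal sketch. The function name strip_first_two is made up for illustration and is not part of the answer; it only wraps the sed command shown above.

```shell
# Hypothetical helper: wraps the sed command from this answer so it can
# be fed from a pipe instead of a file.
strip_first_two() {
  # 1st expression: drop every double quote.
  # 2nd expression: drop the first two comma-terminated fields.
  sed -E 's/"//g; s/^([^,]*,){2}//'
}

printf '%s\n' '"linux02","PLD26","net2-thrift-netconf","net.driver.memory","2"' |
  strip_first_two
# prints: net2-thrift-netconf,net.driver.memory,2
```

A line with fewer than two commas does not match the second expression, so it passes through with only its quotes removed.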

Or using awk:

awk -F\" '{$1=$2=$3=$4=$5=""}1' OFS="" infile
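The awk command relies on how a record is split when " is the field separator: the commas between quoted values become fields of their own. A small sketch that lists those fields may make this easier to follow; the helper name show_fields and the shortened sample record are assumptions for illustration only.

```shell
# Hypothetical helper: print each field awk sees when " is the separator.
show_fields() {
  awk -F\" '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' '"linux02","PLD26","net2-thrift-netconf"' | show_fields
# prints:
# 1=[]
# 2=[linux02]
# 3=[,]
# 4=[PLD26]
# 5=[,]
# 6=[net2-thrift-netconf]
# 7=[]
```

Fields 1 to 5 hold the first two values and the commas around them, so blanking them and rejoining the rest with an empty OFS leaves the record from the third value onward.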

Votes: 3

Unix & Linux user

Posted 2018-03-21 10:03:46

It looks like this is just a matter of removing the quotes and then printing from the third field to the end of the line:

$ tr -d \" < file | cut -d, -f3-
net2-thrift-netconf,net.driver.memory,2
net2-thrift-netconf,net.executor.cores,2
net2-thrift-netconf,net.executor.instances,2
net2-thrift-netconf,net.executor.memory,2
net2-thrift-netconf,net.sql.shuffle.partitions,141
net2-thrift-netconf,net.dynamicAllocation.enabled,true
net2-thrift-netconf,net.dynamicAllocation.initialExecutors,2
net2-thrift-netconf,net.dynamicAllocation.minExecutors,2
net2-thrift-netconf,net.dynamicAllocation.maxExecutors,20

So, tr -d \" removes the quotes, and cut -d, -f3- prints from the third to the last ,-separated field.
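This pipeline treats every comma as a separator, which is fine for the sample data because no value contains a comma. As a sketch of what happens otherwise, the record below is invented for illustration, and from_third is a hypothetical wrapper around the same pipeline:

```shell
# Hypothetical wrapper around the tr | cut pipeline from this answer.
from_third() {
  tr -d \" | cut -d, -f3-
}

# Invented record whose third value contains a comma inside the quotes.
printf '%s\n' '"h1","h2","a,b","c"' | from_third
# prints: a,b,c
```

Once the quotes are gone, the output no longer shows whether it held two values ("a,b" and "c") or three, so this approach is best kept for data known to be free of embedded commas.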

Votes: 7

Unix & Linux user

Posted 2018-03-21 13:20:13

You really should use a proper CSV parser for CSV data. Here is one way, using Ruby:

ruby -rcsv -e '
  CSV.foreach(ARGV.shift) do |row|
    wanted = row.drop(2)   # ignore first 2 fields
    puts CSV.generate_line(wanted, :force_quotes=>false)
  end
' test

net2-thrift-netconf,net.driver.memory,2
net2-thrift-netconf,net.executor.cores,2
net2-thrift-netconf,net.executor.instances,2
net2-thrift-netconf,net.executor.memory,2
net2-thrift-netconf,net.sql.shuffle.partitions,141
net2-thrift-netconf,net.dynamicAllocation.enabled,true
net2-thrift-netconf,net.dynamicAllocation.initialExecutors,2
net2-thrift-netconf,net.dynamicAllocation.minExecutors,2
net2-thrift-netconf,net.dynamicAllocation.maxExecutors,20

Or as a one-liner:

ruby -rcsv -e 'CSV.foreach(ARGV.shift) {|r| puts CSV.generate_line(r.drop(2), :force_quotes=>false)}' test

Votes: 2

Original page content provided by Unix & Linux. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://unix.stackexchange.com/questions/432522
