Removing extra commas from a CSV file using PowerShell
Stack Overflow user
Asked 2022-09-01 17:34:46
3 answers · 68 views · 0 followers · Score: 1

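I have a CSV file that I want to import into SQL, but it is not formatted correctly. I cannot change the formatting of the file that generates it (an Excel file), so I am hoping to fix this by working on the CSV file. I want to remove the extra commas and replace the empty department field (,,,) with the correct department, as shown in the example below. Thank you in advance.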

Example:

Current Format:
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled

Ideal format:
Department,First Name,Last Name,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Psychology,Jane,Doe,2022,Enrolled
Psychology,Jeff,Dane,2019,Enrolled
Psychology,Tate,Anderson,2019,Not Enrolled
Psychology,Daphne,Miller,2021,Enrolled
Psychology,Cora,Dame,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Computer Science,Dora,Explorer,2022,Not Enrolled
Computer Science,Peppa,Diggs,2020,Enrolled
Computer Science,Conrad,Strat,2020,Enrolled
Computer Science,Kat,Noir,2019,Enrolled
Computer Science,Lance,Bug,2018,Enrolled

3 Answers

Stack Overflow user

Accepted answer

Answered 2022-09-01 17:59:15

Here you go:

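# Output lines are collected here; the header is inserted at the end
$csvArray = New-Object System.Collections.Generic.List[string]
# Read the file and collapse every run of two or more commas into one
$text = (Get-Content "C:\tmp\testdata.txt") -replace ",{2,}", ","
$arrayEnd = $text.Count - 1
# Skip the header (index 0) and process the remaining lines
$text[1..$arrayEnd] | ForEach-Object {
    if ($_ -notmatch "^,") {
        # Department line: keep only the name before the first comma, without the trailing space
        $department = ($_ -split ",")[0].Trim()
    }
    else {
        # Data line: it starts with a comma, so prepending the department completes the row
        $csvArray.Add($department + $_)
    }
}
$csvArray.Insert(0, $text[0])
$csvArray | Set-Content 'C:\tmp\my.csv'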
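
To see what the `-replace ",{2,}",","` step in the script above does on its own, here is a minimal sketch that applies it to three lines copied from the question. The `$lines` and `$collapsed` variable names are illustrative only and are not part of the answer.

```powershell
# Sample lines copied from the question (not read from a file).
$lines = @(
    'Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status'
    ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled'
    ',,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled'
)

# ',{2,}' matches any run of two or more commas; each run becomes a single comma.
# Applied to an array, -replace processes every element.
$collapsed = $lines -replace ',{2,}', ','
$collapsed
# Department,First Name,Last Name,School Year,Enrolment Status
# ,Jane,Doe,2022,Enrolled
# ,Kat,Noir,2019,Enrolled
```

Because the comma runs are collapsed regardless of their length, the irregular `Kat` row ends up with the same layout as the regular rows. The flip side is that a field that is genuinely empty in the data would be collapsed away as well.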
Score: 1

Stack Overflow user

Answered 2022-09-01 17:59:32

Use a switch statement:

& { 
  $first = $true
  switch -Wildcard -File in.csv { # Loop over all lines in file in.csv
    ',*' { # intra-department line
       # Prepend the department name, eliminate empty fields and output.
      $dept + ',' + (($_ -split ',' -ne '') -join ',')
    }
    default {
      if ($first) { # header line
        # Eliminate empty fields and output.
        ($_ -split ',' -ne '') -join ',' 
        $first = $false
      } 
      else { # department-only line
        $dept = ($_ -split ',')[0].Trim() # save department name, without the trailing space
      }
    }
  }
} | Set-Content -Encoding utf8 out.csv

Note:

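  • $_ -split ',' splits each line into fields by , and -ne '' filters the empty fields out of the resulting array; applying -join ',' rejoins the non-empty fields with , which in effect removes runs of adjacent , characters and thereby eliminates the empty fields.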
      - If you don't mind the complexity of a [regex](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Regular_Expressions), you can perform the above more simply with a single [`-replace`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Comparison_Operators#replacement-operator) operation, as shown in [Toni's helpful answer](https://stackoverflow.com/a/73573678/45375).
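  • Using switch -File is an efficient way to read a file line by line and to perform conditional processing based on sophisticated matching (as an alternative to -Wildcard, you can use -Regex for regex matching, and you can even use script blocks ({ ... }) as conditionals).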
      - As a language statement, `switch` cannot be used _directly_ in a pipeline.
      - This limitation can be overcome by enclosing it in a script block (`{ ... }`) invoked with `&`, which enables the usual, memory-friendly _streaming_ behavior in the pipeline; that is, the lines are processed one by one, as are the modified output lines relayed to `Set-Content`, so that the input file needn't be read into memory _as a whole_.
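  • In your case, plain-text processing of the CSV file enabled a simple solution, but in general it is better to parse a CSV file into objects whose properties you can work with, using the Import-Csv cmdlet, and to write the result back to a CSV file with Export-Csv.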
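
The notes above can be tried out without a file. The following is a minimal sketch, not the answer's own code: it runs `switch -Regex` over an in-memory array of lines copied from the question instead of `-File in.csv`, and the names `$lines`, `$fields`, and `$result` are illustrative assumptions.

```powershell
# Sample lines copied from the question; a real run would use: switch -Regex -File in.csv
$lines = @(
    'Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status'
    'Psychology ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled'
    'Computer Science ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled'
)

# With an array on the left-hand side, -ne acts as a filter and returns the non-empty fields.
$fields = ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled' -split ',' -ne ''
# $fields now holds four elements: Jane, Doe, 2022, Enrolled

$first = $true
$result = switch -Regex ($lines) {
    '^,' {
        # Data line: put the saved department in front of the non-empty fields.
        (@($dept) + ($_ -split ',' -ne '')) -join ','
    }
    default {
        if ($first) {
            # Header line: just drop the empty fields.
            ($_ -split ',' -ne '') -join ','
            $first = $false
        }
        else {
            # Department-only line: remember the trimmed name and output nothing.
            $dept = ($_ -split ',')[0].Trim()
        }
    }
}
$result
# Department,First Name,Last Name,School Year,Enrolment Status
# Psychology,Jane,Doe,2022,Enrolled
# Computer Science,Dora,Explorer,2022,Not Enrolled
```

Assigning the `switch` output to a variable collects everything in memory, which is fine for a demonstration; for large files the `& { ... } | Set-Content` form described in the notes keeps the processing streaming.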
Score: 1

Stack Overflow user

Answered 2022-09-01 19:04:35

Using the Csv cmdlets:

$Csv = @'
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
'@
$List = ConvertFrom-Csv $Csv -Header @(1..20) # |Import-Csv .\Your.Csv -Header @(1..20)
$Columns = $List[0].PSObject.Properties.Where{ $_.Value -and $_.Value -ne 'Department' }.Name
$List |Select-Object -Property $Columns |Where-Object { $_.$($Columns[0]) } |
    ConvertTo-Csv -UseQuote Never |Select-Object -Skip 1 # |Set-Content -Encoding utf8 out.csv
Output:
First Name,Last Name,School Year,Enrolment Status
Jane,Doe,2022,Enrolled
Jeff,Dane,2019,Enrolled
Tate,Anderson,2019,Not Enrolled
Daphne,Miller,2021,Enrolled
Cora,Dame,2022,Enrolled
Dora,Explorer,2022,Not Enrolled
Peppa,Diggs,2020,Enrolled
Conrad,Strat,2020,Enrolled
Kat,,,Enrolled
Lance,Bug,,Enrolled
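
The output above has no Department column, and the two irregular rows (Kat and Lance) lose fields because their values sit in different positions. As one possible way around both issues, the sketch below collapses the comma runs first and then parses the lines into objects, carrying the department forward from each department-only row. It is an illustration under assumptions, not part of the answer: the `$lines`, `$normalized`, `$dept`, and `$rows` names are made up here, and the approach only works while no field is genuinely empty.

```powershell
# Sample data copied from the question; with a real file, Get-Content would supply $lines.
$lines = @(
    'Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status'
    'Psychology ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled'
    'Computer Science ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled'
)

# Collapse comma runs so that every row, including the irregular one, has the same five positions.
$normalized = $lines -replace ',{2,}', ','

$dept = $null
$rows = $normalized | Select-Object -Skip 1 |
    ConvertFrom-Csv -Header 'Department', 'First Name', 'Last Name', 'School Year', 'Enrolment Status' |
    ForEach-Object {
        if ($_.Department) {
            # Department-only row: remember the trimmed name and emit nothing.
            $dept = $_.Department.Trim()
        }
        else {
            # Data row: fill in the department carried over from the last department row.
            $_.Department = $dept
            $_
        }
    }

# Convert back to CSV text (fields are quoted by default).
$rows | ConvertTo-Csv -NoTypeInformation
```

Piping `$rows` to `Export-Csv -NoTypeInformation` instead of `ConvertTo-Csv` would write the result to a file.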
Score: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/73573418
