我有一个csv文件,我想要导入到sql,但没有正确的格式化。我无法格式化生成的文件(excel文件),所以我希望使用CSV文件来实现这一点。我希望删除额外的逗号,并将部门名称(,,,)替换为正确的部门,如下面的示例所示。提前谢谢你。
示例:
Current Format:
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
Ideal format:
Department,First Name,Last Name,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Psychology,Jane,Doe,2022,Enrolled
Psychology,Jeff,Dane,2019,Enrolled
Psychology,Tate,Anderson,2019,Not Enrolled
Psychology,Daphne,Miller,2021,Enrolled
Psychology,Cora,Dame,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Computer Science,Dora,Explorer,2022,Not Enrolled
Computer Science,Peppa,Diggs,2020,Enrolled
Computer Science,Conrad,Strat,2020,Enrolled
Computer Science,Kat,Noir,2019,Enrolled
Computer Science,Lance,Bug,2018,Enrolled
发布于 2022-09-01 17:59:15
给你:
$csvArray = new-object System.Collections.Generic.List[string]
#Import the file
$text = (gc "C:\tmp\testdata.txt") -replace ",{2,}",","
$arrayEnd = $text.count -1
$text[1..$arrayEnd] | %{
If ($_ -notmatch "^(,)"){
$department = $_ -replace ","
}
Else {
$csvArray.add($department + $_)
}
}
$csvArray.Insert(0,$text[0])
$csvArray | set-content 'C:\tmp\my.csv'
发布于 2022-09-01 17:59:32
使用语句
& {
$first = $true
switch -Wildcard -File in.csv { # Loop over all lines in file in.csv
',*' { # intra-department line
# Prepend the department name, eliminate empty fields and output.
$dept + (($_ -split ',' -ne '') -join ',')
}
default {
if ($first) { # header line
# Eliminate empty fields and output.
($_ -split ',' -ne '') -join ','
$first = $false
}
else { # department-only line
$dept = ($_ -split ',')[0] # save department name
}
}
}
} | Set-Content -Encoding utf8 out.csv
注意:
$_ -split ','
通过,
将每一行拆分为字段,-ne ''
从结果数组中筛选出空字段;应用-join ','
将非空字段与,
重新连接,这实际上删除了多个相邻的,
,从而消除了空字段。- If you don't mind the complexity of a [regex](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Regular_Expressions), you can perform the above more simply with a single [`-replace`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Comparison_Operators#replacement-operator) operation, as shown in [Toni's helpful answer](https://stackoverflow.com/a/73573678/45375).
switch -File
是一种高效的方法,可以逐行读取文件,并基于复杂的匹配执行条件处理(作为-Wildcard
的替代,您可以使用-Regex
进行正则匹配,甚至可以使用脚本块({ ... }
作为条件)。- As a language statement, `switch` cannot be used _directly_ in a pipeline.
- This limitation can be overcome by enclosing it in a script block (`{ ... }`) invoked with `&`, which enables the usual, memory-friendly _streaming_ behavior in the pipeline; that is, the lines are processed one by one, as are the modified output lines relayed to `Set-Content`, so that the input file needn't be read into memory _as a whole_.
Import-Csv
cmdlet,然后再导出到CSV文件Export-Csv
,发布于 2022-09-01 19:04:35
$Csv = @'
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
'@
$List = ConvertFrom-Csv $Csv -Header @(1..20) # |Import-Csv .\Your.Csv -Header @(1..20)
$Columns = $List[0].PSObject.Properties.Where{ $_.Value -and $_.Value -ne 'Department' }.Name
$List |Select-Object -Property $Columns |Where-Object { $_.$($Columns[0]) } |
ConvertTo-Csv -UseQuote Never |Select-Object -Skip 1 # |Set-Content -Encoding utf8 out.csv
First Name,Last Name,School Year,Enrolment Status
Jane,Doe,2022,Enrolled
Jeff,Dane,2019,Enrolled
Tate,Anderson,2019,Not Enrolled
Daphne,Miller,2021,Enrolled
Cora,Dame,2022,Enrolled
Dora,Explorer,2022,Not Enrolled
Peppa,Diggs,2020,Enrolled
Conrad,Strat,2020,Enrolled
Kat,,,Enrolled
Lance,Bug,,Enrolled
https://stackoverflow.com/questions/73573418
复制相似问题