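I'm writing a preprocessing script that reads a CSV file, looks at two fields (P_ID and don_date), concatenates them to create a key for a hash entry, and stores the line (split into an array) as that key's value.
If another line in the file has the same P_ID and don_date (so it matches my concatenated hash key), I want to take the other values you see in the code and add them to the values of the existing hash entry. I've been running some tests, but I'm a bit confused about referencing/dereferencing and how it all works in Perl (I'm a Python guy). The goal is that each line ends up with a unique P_ID/Don_Date value, or else it gets rolled up into an existing one.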
my @lineFields = ();
my %rollUpHash = ();

# Open file and loop through lines
#foreach my $line (<INFH>)
while(my $line = <INFH>)
{
    # chomp($line);
    # print STDERR "New line in file\n";
    # <STDIN>;
    @lineFields = split(/,/, $line);

    # Pull out pertinent values for possible roll-up into a total
    my $p_id = $lineFields[18];
    my $don_date = $lineFields[19];
    my $nd_amt = $lineFields[14];
    my $deduct_amt = $lineFields[15];
    my $nondeduct_ytd = $lineFields[16];
    my $deduct_ytd = $lineFields[17];
    my $amount = $lineFields[24];
    my $anonymous = $lineFields[26];
    my $int_code_ex0003 = $lineFields[39];
    my $int_code_ex0006 = $lineFields[40];
    my $int_code_ex0028 = $lineFields[41];
    my $sumDeduct_NonDeduct_ytd = $deduct_ytd + $nondeduct_ytd;
    my $hashKey = $p_id . $don_date;

    # say "P_ID is $p_id\nDon_Date is $don_date\nND_Amt is $nd_amt\nDeduct_Amt is $deduct_amt\nNonDeduct_YTD is $nondeduct_ytd\nDeduct_YTD is $deduct_ytd\nAmount is $amount\nAnonymous is $anonymous\n0003 is $int_code_ex0003\n0006 is $int_code_ex0006\n0028 is $int_code_ex0028";
    # say "hashKey is $hashKey";
    # say "$sumDeduct_NonDeduct_ytd";

    if (exists($rollUpHash{$hashKey}))
    {
        say("Same key found, summing up!");
        $rollUpHash{$hashKey}[14] += $lineFields[14];
        $rollUpHash{$hashKey}[15] += $lineFields[15];
        $rollUpHash{$hashKey}[16] += $lineFields[16];
        $rollUpHash{$hashKey}[17] += $lineFields[17];
        $rollUpHash{$hashKey}[24] += $lineFields[24];
        push @{$rollUpHash{$hashKey}}, $sumDeduct_NonDeduct_ytd;
        # print %rollUpHash;
    }
    else
    {
        $rollUpHash{$hashKey} = \@lineFields;
    }
}

foreach my $key (keys %rollUpHash)
{
    print OUTFH "$key is @{$rollUpHash{$key}}";
}

Here are three lines of scrubbed input data:
152099-00001,,100,100,400,100,175,100,700,200,200,500,0,0,0,300,0,2575,105666,10/28/14,197800,23764962,"Jefferson,Mark",300,1004,N,N,N,D,Mike and Bonnie,Mike and Bonnie Gregorovitch,715 81st St NE,,,Central,IL,52402-7256,UNITED STATES,,,,,Y,soandso@email.com,,(888) 888-8888,"Jefferson,Mark",2222B,BASIC,,
342029-00015,,200,0,400,200,200,200,200,200,200,200,0,0,0,200,0,2000,3184444,09/27/14,197800,40949,"Macrow,Gregory",100,1004,N,N,N,D,John and Amber, John and Amber Meadows,PO Box 706,,,Logan,MD,01111-0704,UNITED STATES,,,,,Y,othersoandso@email,,(999) 999-9999,"Macrow,Gregory",2222B,BASIC,,
342029-00014,,200,0,400,200,200,200,200,200,200,200,0,0,0,200,0,2000,3184444,09/27/14,197800,22145,"Bartholomew,Vincent",100,1004,N,N,N,D,John and Amber, John and Amber Meadows,PO Box 706,,,Logan,MD,01111-0704,UNITED STATES,,,,,Y,othersoandso@email,,(999) 999-9999,"Bartholomew,Vincent",2222B,BASIC,,

Any help with this would be greatly appreciated!
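Posted on 2015-07-22 20:18:05
It's not clear what you want to do with $sumDeduct_NonDeduct_ytd: do you really want to append the new sum to the row, or replace the last column with it?
Also, using split /,/ on CSV that contains quoted fields is wrong. Use Text::CSV instead.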
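To illustrate why a plain split goes wrong here, the following is a minimal sketch using a shortened, made-up line modelled on the sample data (the four-field layout is an assumption for illustration, not the real file layout). It compares split /,/ with the parse and fields methods of Text::CSV on a quoted field that contains a comma.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical shortened row; the quoted name contains a comma.
my $line = '105666,10/28/14,"Jefferson,Mark",300';

# Naive approach: the comma inside the quotes is treated as a separator,
# so every column after the name shifts by one position.
my @naive = split /,/, $line;
print scalar(@naive), " fields from split\n";        # 5 fields from split

# Text::CSV understands the quoting rules and keeps the name in one field.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot create parser: " . Text::CSV->error_diag;
$csv->parse($line) or die "Cannot parse line";
my @fields = $csv->fields;
print scalar(@fields), " fields from Text::CSV\n";   # 4 fields from Text::CSV
print "name: $fields[2]\n";                          # name: Jefferson,Mark
```

In the sample rows the quoted name sits at index 22, so with split every index after it (including 24 and 26 from the question) refers to a different column than it does with Text::CSV.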
#! /usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use Text::CSV;

use constant {
    ND_AMT        => 14,
    DEDUCT_AMT    => 15,
    NONDEDUCT_YTD => 16,
    DEDUCT_YTD    => 17,
    PID           => 18,
    DON_DATE      => 19,
    AMOUNT        => 24,
};

my $csv = 'Text::CSV'->new({ binary => 1 });
my %hash;

open my $IN, '<:encoding(utf-8)', shift or die $!;
while (my $row = $csv->getline($IN)) {
    my $key = join ':', @{$row}[PID, DON_DATE];
    if (exists $hash{$key}) {
        $hash{$key}[$_] += $row->[$_] for ND_AMT, DEDUCT_AMT, NONDEDUCT_YTD,
                                          DEDUCT_YTD, AMOUNT;
        my $sum = $row->[DEDUCT_YTD] + $row->[NONDEDUCT_YTD];
        $hash{$key}[-1] = $sum;  # Or do you mean something else?
    } else {
        $hash{$key} = $row;      # You can store the reference directly,
    }                            # as you get a fresh new one for each iteration.
}

for my $key (keys %hash) {
    say "$key : @{ $hash{$key} }";  # You should rather use $csv->print.
}

https://stackoverflow.com/questions/31571451
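The comment about getting a fresh reference on each iteration matters because the question's script declares @lineFields once, outside the loop, and stores \@lineFields in the hash. A minimal sketch of that difference, using two made-up input lines and hypothetical variable names (%shared, %fresh):

```perl
use strict;
use warnings;

my @lines = ("a,1", "b,2");   # made-up input for illustration

# Pitfall: one array declared outside the loop. \@fields is the same
# reference every time, so all hash values point at the same array.
my (@fields, %shared);
for my $line (@lines) {
    @fields = split /,/, $line;
    $shared{ $fields[0] } = \@fields;
}
print "shared: $shared{a}[1] $shared{b}[1]\n";   # shared: 2 2

# Declaring the array inside the loop gives a new array per iteration,
# so each hash value keeps its own data.
my %fresh;
for my $line (@lines) {
    my @row = split /,/, $line;
    $fresh{ $row[0] } = \@row;
}
print "fresh: $fresh{a}[1] $fresh{b}[1]\n";      # fresh: 1 2
```

Storing a copy with [ @fields ] would also work when the array has to stay outside the loop. With $csv->getline neither step is needed, since it returns a new array reference for every row.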
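As a rough sketch of the $csv->print suggestion in the last comment of the script: writing the rolled-up rows through Text::CSV re-quotes any field that contains a comma, so the output stays valid CSV. The row below is made up, and the in-memory filehandle is only an assumption to keep the example self-contained; a real script would print to its output file.

```perl
use strict;
use warnings;
use Text::CSV;

# eol => "\n" makes print() terminate every row with a newline.
my $csv = Text::CSV->new({ binary => 1, eol => "\n" })
    or die "Cannot create writer: " . Text::CSV->error_diag;

# Hypothetical rolled-up data: key => array reference holding one row.
my %hash = (
    '105666:10/28/14' => [ 105666, '10/28/14', 'Jefferson,Mark', 300 ],
);

# Write to an in-memory buffer instead of a real file.
open my $out, '>', \my $buffer or die $!;
for my $key (sort keys %hash) {
    $csv->print($out, $hash{$key}) or die "Cannot write row for $key";
}
close $out;

print $buffer;   # the name field is written back with its quotes
```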