Validating a file header in Linux

Stack Overflow user

Asked on 2018-02-27 18:45:06

2 answers, 630 views, 0 followers, 1 vote

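I am using the script below to validate a file header. To do this, I created a file that contains only the header and compare it with another file that contains the header plus the column data.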

awk -F"|" 'FNR==NR { hn = split($0, header); next }
FNR==1 {
    n = split($0, fh)
    for (i = 0; i <= hn; i++)
        if (fh[i] != header[i]) {
            printf "%s:order of %s is not correct\n", FILENAME, header[i]
            next
        }
    if (hn == n)
        print FILENAME, "has expected order of fields"
    else
        print FILENAME, "has extra fields"
    next
}' key /Scripts/gst/Kenan_Test_Scenarios1.txt

File 2, header and data (Kenan_Test_Scenarios1.txt):

SourceIdentifier|SourceFileName|GLAccountCode|Division|SubDivision|ProfitCentre1|ProfitCentre2|PlantCode|ReturnPeriod|SupplierGSTIN|DocumentType|SupplyType|DocumentNumber|DocumentDate|OriginalDocumentNumber|OriginalDocumentDate|CRDRPreGST|LineNumber|CustomerGSTIN|UINorComposition|OriginalCustomerGSTIN|CustomerName|CustomerCode|BillToState|ShipToState|POS|PortCode|ShippingBillNumber|ShippingBillDate|FOB|ExportDuty|HSNorSAC|ProductCode|ProductDescription|CategoryOfProduct|UnitOfMeasurement|Quantity|TaxableValue|IntegratedTaxRate|IntegratedTaxAmount|CentralTaxRate|CentralTaxAmount|StateUTTaxRate|StateUTTaxAmount|CessRateAdvalorem|CessAmountAdvalorem|CessRateSpecific|CessAmountSpecific|InvoiceValue|ReverseChargeFlag|TCSFlag|eComGSTIN|ITCFlag|ReasonForCreditDebitNote|AccountingVoucherNumber|AccountingVoucherDate|Userdefinedfield1|Userdefinedfield2|Userdefinedfield3
KEN|TEST1|||Tela|Outw|ANP|POST|1017|36AAA|NV|TX|4841446542|2017-12-12||2035-06-11|Y|1|36AAACB89|||||||36||||||94||Telecomm Servi||||1557.20|0.00|10.00|9.00|140.15|9.00|140.15|||||18.50||||||||B2B INV||

I get the output below, which is incorrect, even though the headers in both files are the same.

 is not correctnan_Test_Scenarios1.txt:order of Userdefinedfield3

Can you help me correct the code? I also need to catch the case where several header names have a mismatch.
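The garbled line above is what a terminal shows when the last header name ends in a carriage return: the text printed after it overwrites the start of the line. That suggests Windows-style CRLF line endings in one of the two files, most likely `key`. A minimal sketch for checking this, assuming a POSIX shell with awk and tr; the function names `first_line_has_cr` and `strip_cr` are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the first line of the
# given file ends with a carriage return, which usually means CRLF endings.
first_line_has_cr() {
    awk 'NR == 1 { found = ($0 ~ /\r$/); exit } END { exit !found }' "$1"
}

# Hypothetical helper: print the file with every carriage return removed.
# The file itself is not modified.
strip_cr() {
    tr -d '\r' < "$1"
}
```

If `first_line_has_cr key` succeeds, comparing against a cleaned copy (for example `strip_cr key > key.unix`) should make the spurious mismatch on the last column go away.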

Tags: bash, shell, perl, awk, scripting

2 Answers

Stack Overflow user

Answered on 2018-02-27 19:37:08

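OK, you have tagged this perl, so here is a Perl answer. I think you are focusing on the wrong problem: why not read the file line by line, parse each row into a hash, and then output the columns in the order you want?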

#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dumper;

open ( my $first_file, '<', 'file_name_here' ) or die $!;
chomp ( my @header = split /\|/, <$first_file> );
close ( $first_file );
# debugging: show the expected column order
print Dumper \@header;

open ( my $second_file, '<', 'second_file_name_here' ) or die $!;
chomp ( my @second_header = split /\|/, <$second_file> );

print join ( "|", @header ), "\n";
while ( <$second_file> ) {
    chomp;    # otherwise the last field keeps its newline
    my %row;
    # use the ordering of the column headings to read into named fields;
    # the -1 limit keeps trailing empty fields instead of dropping them
    @row{@second_header} = split /\|/, $_, -1;
    # debugging output to show you what's going on
    print Dumper \%row;

    print join ( "|", @row{@header} ), "\n";
}
close ( $second_file );

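That way you do not need to care whether the order is wrong, because you can correct it as you go.

If you really do need the comparison, you can iterate over the two header arrays and look for differences. But that is more a question of what you are actually trying to achieve. I suggest taking a look at Array::Utils, because it gives you array_diff, intersect and unique with little effort.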
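The comparison mentioned above (walking both header lists and reporting every difference) can also be sketched in awk, the tool used in the question. This is only an illustration under a few assumptions: the function name `check_header` and the message wording are invented here, the reference file holds the expected header on its first line, and a trailing carriage return is stripped in case either file has CRLF line endings. The loop starts at 1 because `split` fills the array from index 1, and it does not stop at the first mismatch.

```shell
#!/bin/sh
# Hypothetical helper: compare the first line of a data file against a
# header-only reference file and report every column that differs.
# Usage: check_header REFERENCE_FILE DATA_FILE
# Exit status: 0 if the headers match, 1 if any column differs.
check_header() {
    awk -F'|' '
        { sub(/\r$/, "") }                  # drop a trailing CR, if any
        NR == FNR {                         # first file: expected header
            if (FNR == 1) hn = split($0, header)
            next
        }
        FNR == 1 {                          # second file: actual header
            n = split($0, fh)
            max = (n > hn) ? n : hn
            bad = 0
            for (i = 1; i <= max; i++)      # split() arrays start at 1
                if ((fh[i] "") != (header[i] "")) {   # force string compare
                    printf "%s: column %d: expected \"%s\", found \"%s\"\n",
                           FILENAME, i, header[i], fh[i]
                    bad++
                }
            if (!bad) print FILENAME ": header matches"
            exit (bad ? 1 : 0)              # data rows are never read
        }
    ' "$1" "$2"
}
```

With the file names from the question, the call would be `check_header key /Scripts/gst/Kenan_Test_Scenarios1.txt`. Missing or extra columns show up as an empty expected or found name.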

Votes: 2

Stack Overflow user

Answered on 2018-02-27 23:13:10

This may come in handy:

$ diff -y --suppress-common-lines  <(tr '|' '\n' <file1) <(tr '|' '\n' <file2)

Using your first file as-is for file1, and creating the second file with

$ sed 's/2/8/;s/Export/Import/' file1 > file2

running the command gives

ProfitCentre2                                                 | ProfitCentre8
ExportDuty                                                    | ImportDuty
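Because the data file in the question also has rows below the header, the same idea can be limited to the first line of each file. The following is a sketch only: `header_columns` and `diff_headers` are hypothetical names, carriage returns are stripped in case a file has CRLF line endings, and a temporary file stands in for the process substitution so the sketch does not depend on bash. The `-y` and `--suppress-common-lines` options are the ones used in the command above and are assumed to be available (they are in GNU diff).

```shell
#!/bin/sh
# Hypothetical helper: print the first line of a file as one column name
# per line, with any carriage return removed.
header_columns() {
    head -n 1 -- "$1" | tr -d '\r' | tr '|' '\n'
}

# Hypothetical helper: show only the column names that differ between the
# headers of two files. The body runs in a subshell so that the temporary
# file name and the EXIT trap stay local to this function.
# Exit status is that of diff: 0 if identical, 1 if different.
diff_headers() (
    tmp=$(mktemp) || exit 2
    trap 'rm -f "$tmp"' EXIT
    header_columns "$2" > "$tmp"
    header_columns "$1" | diff -y --suppress-common-lines - "$tmp"
)
```

For the files in the question this would be called as `diff_headers key /Scripts/gst/Kenan_Test_Scenarios1.txt`; no output and exit status 0 mean the two headers list the same columns in the same order.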

Votes: 0

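Original page content provided by Stack Overflow. Translation support for the Chinese page was provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link: https://stackoverflow.com/questions/49006763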
