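I am trying to compare two tab-delimited files that have no primary key and print the differences together with the column headers.
I am very close, but the problem is that the code snippet I found only works when the files have a primary key.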
awk '
NR==1 {
    for (i=1; i<=NF; i++)
        header[i] = $i
}
NR==FNR {
    for (i=1; i<=NF; i++) {
        A[i,NR] = $i
    }
    next
}
{
    for (i=1; i<=NF; i++)
        if (A[i,FNR] != $i)
            print "ID#-" $1 ": " header[i] "- " ARGV[1] " value= ", A[i,FNR]" / " ARGV[2] " value= "$i
}' t1.csv t2.csv
Can anyone help me implement this for the cases where:
- there is no primary key
- the row counts differ and records are missing from one of the files
t1.csv
Month ClientSegment ClientType IssuerClientSegment NetworkID VD
2020-12 COMMUNITY EXEMPT COMMUNITY 0 OTHER
2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER
2020-12 COMMUNITY EXEMPT COMMUNITY 5 OTHER
t2.csv
Month ClientSegment ClientType IssuerClientSegment NetworkID VD
2020-12 COMMUNITY EXEMPT COMMUNITY 0 OTHER
2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER1
2020-13 COMMUNITY EXEMPT COMMUNITY 2 PUSH
2020-13 COMMUNITY EXEMPT COMMUNITY 3 OTHER
The expected output is:
Row 2, Column: VD- t1.csv value= OTHER / t2.csv value= OTHER1
Missing in t2.csv
Month Client Segment Client Type Issuer Client Segment Network ID VD
2020-12 COMMUNITY EXEMPT COMMUNITY 5 OTHER
Missing in t1.csv
Month Client Segment Client Type Issuer Client Segment Network ID VD
2020-13 COMMUNITY EXEMPT COMMUNITY 2 PUSH
2020-13 COMMUNITY EXEMPT COMMUNITY 3 OTHER
Answer 1
Using daff:
daff --input-format tsv t1.csv t2.csv
@@ Month ClientSegment ClientType IssuerClientSegment NetworkID VD
2020-12 COMMUNITY EXEMPT COMMUNITY 0 OTHER
→ 2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER→OTHER1
+++ 2020-13 COMMUNITY EXEMPT COMMUNITY 2 PUSH
+++ 2020-13 COMMUNITY EXEMPT COMMUNITY 3 OTHER
--- 2020-12 COMMUNITY EXEMPT COMMUNITY 5 OTHER
Install it with pip install daff (you may first need sudo apt install python-pip).
Answer 2
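awk '
# Use the first five columns together as the key for every line.
{ key = $1 OFS $2 OFS $3 OFS $4 OFS $5 }
# First file (secondInput is still unset): remember column 6 and the line number.
! secondInput {
    file1[key] = $6
    NRfile1[key] = NR
    next
}
# Key present in both files: report it if the last column differs, then mark it as handled.
(key in file1) {
    if (file1[key] != $NF) { print "diff-line#:", NRfile1[key] "|" FNR, $0 }
    delete file1[key]
    next
}
# Key seen only in the second file.
{ print "missing in file1: ", $0 }
# Keys still left in the array never appeared in the second file.
END {
    for (key in file1) {
        print "missing in file2: ", key, file1[key]
    }
}' file1 secondInput=1 file2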
Output:
diff-line#: 3|3 2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER1
missing in file1: 2020-13 COMMUNITY EXEMPT COMMUNITY 2 PUSH
missing in file1: 2020-13 COMMUNITY EXEMPT COMMUNITY 3 OTHER
missing in file2: 2020-12 COMMUNITY EXEMPT COMMUNITY 5 OTHER
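If the goal is the exact layout shown in the question, the same idea can be taken a little further. The following is only a sketch. It assumes, as the script above does, that the first five columns together identify a row (the files have no real primary key, so this is a guess about the data), that each such key appears at most once per file, and that the fields are separated by tabs. The function name compare_tsv and the temporary-directory demo are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare two tab-separated files, treating columns 1-5 as a composite key
# (an assumption about the data, not something the files guarantee).
compare_tsv() {
    awk -F'\t' '
    FNR == 1 {                          # header line of each file
        if (NR == 1) { header = $0; split($0, name, FS) }
        next
    }
    { key = $1 FS $2 FS $3 FS $4 FS $5 }
    NR == FNR {                         # first file: remember each row
        if (!(key in line)) order[++n] = key
        line[key] = $0
        row[key] = FNR - 1              # data row number, header not counted
        next
    }
    (key in line) {                     # key in both files: compare the remaining columns
        split(line[key], old, FS)
        for (i = 6; i <= NF; i++)
            if (old[i] != $i)
                printf "Row %d, Column: %s- %s value= %s / %s value= %s\n",
                       row[key], name[i], ARGV[1], old[i], ARGV[2], $i
        matched[key] = 1
        next
    }
    { extra[++m] = $0 }                 # key only in the second file
    END {
        print "Missing in " ARGV[2]
        print header
        for (i = 1; i <= n; i++)
            if (!(order[i] in matched)) print line[order[i]]
        print "Missing in " ARGV[1]
        print header
        for (i = 1; i <= m; i++) print extra[i]
    }' "$1" "$2"
}

# Demo with the sample data from the question, written to a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
    Month ClientSegment ClientType IssuerClientSegment NetworkID VD \
    2020-12 COMMUNITY EXEMPT COMMUNITY 0 OTHER \
    2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER \
    2020-12 COMMUNITY EXEMPT COMMUNITY 5 OTHER > "$demo_dir/t1.csv"
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
    Month ClientSegment ClientType IssuerClientSegment NetworkID VD \
    2020-12 COMMUNITY EXEMPT COMMUNITY 0 OTHER \
    2020-12 COMMUNITY EXEMPT COMMUNITY 2 OTHER1 \
    2020-13 COMMUNITY EXEMPT COMMUNITY 2 PUSH \
    2020-13 COMMUNITY EXEMPT COMMUNITY 3 OTHER > "$demo_dir/t2.csv"
( cd "$demo_dir" && compare_tsv t1.csv t2.csv )
```

Because rows are matched by key rather than by position, a row that is missing from one file does not shift the comparison of the rows that follow it. If no combination of columns can serve as a key, this approach does not apply and a changed row can only be reported as one missing row in each file.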