ここから質問ですcsvファイルの値の並べ替え
問題は、最初の列と一致する方法に従って5番目の列の値をソートしたいということです。
入力する:
"DB Instance Identifier","Engine","Instance Type","MultiAZ","Reservation ID","State","Start Time","Reserved Engine","Reserved DB Instance Class"
"alpha-db-dev","postgres","db.t4g.micro",false,"alpha-db-dev-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"alpha-db-prod","postgres","db.t4g.small",true,"alpha-db-prod-26aug2022","active","2022-08-26","postgresql","db.t4g.small"
"alpha-db-staging","postgres","db.t4g.micro",false,"alpha-db-staging-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"beta-db-dev-primary","aurora-postgresql","db.t3.medium",false,"charlie-db-dev-8dec2021","active","2021-12-08","postgresql","db.t3.micro"
"beta-db-prod-primary","aurora-postgresql","db.r5.xlarge",false,
"beta-db-prod-replica","aurora-postgresql","db.r5.xlarge",false,
"beta-db-staging-primary","aurora-postgresql","db.t3.medium",false,
"charlie-db-dev","postgres","db.t3.micro",false,
"charlie-db-prod-dms","postgres","db.m4.4xlarge",true,
"charlie-db-prod-dms-replica","postgres","db.m4.4xlarge",false,
"charlie-db-staging","postgres","db.t2.large",false,
"charlie-db-staging-loadtest","postgres","db.m4.4xlarge",false,
"charlie-kong-db-dev","postgres","db.t3.micro",false,
"charlie-kong-db-prod","postgres","db.m4.large",true,
"charlie-kong-db-staging","postgres","db.t2.small",false,
"delta-db-prod","mysql","db.t3.small",true,
"delta-db-recon-prod","mysql","db.t3.micro",false,
"delta-db-staging","mysql","db.t3.micro",false,
希望の出力は次のとおりです。
"DB Instance Identifier","Engine","Instance Type","MultiAZ","Reservation ID","State","Start Time","Reserved Engine","Reserved DB Instance Class"
"alpha-db-dev","postgres","db.t4g.micro",false,"alpha-db-dev-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"alpha-db-prod","postgres","db.t4g.small",true,"alpha-db-prod-26aug2022","active","2022-08-26","postgresql","db.t4g.small"
"alpha-db-staging","postgres","db.t4g.micro",false,"alpha-db-staging-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"beta-db-dev-primary","aurora-postgresql","db.t3.medium",false,
"beta-db-prod-primary","aurora-postgresql","db.r5.xlarge",false,
"beta-db-prod-replica","aurora-postgresql","db.r5.xlarge",false,
"beta-db-staging-primary","aurora-postgresql","db.t3.medium",false,
"charlie-db-dev","postgres","db.t3.micro",false,"charlie-db-dev-8dec2021","active","2021-12-08","postgresql","db.t3.micro"
"charlie-db-prod-dms","postgres","db.m4.4xlarge",true,
"charlie-db-prod-dms-replica","postgres","db.m4.4xlarge",false,
"charlie-db-staging","postgres","db.t2.large",false,
"charlie-db-staging-loadtest","postgres","db.m4.4xlarge",false,
"charlie-kong-db-dev","postgres","db.t3.micro",false,
"charlie-kong-db-prod","postgres","db.m4.large",true,
"charlie-kong-db-staging","postgres","db.t2.small",false,
"delta-db-prod","mysql","db.t3.small",true,
"delta-db-recon-prod","mysql","db.t3.micro",false,
"delta-db-staging","mysql","db.t3.micro",false,
Ed Mortonは、以前の質問でawkスクリプトを使用してこの質問に答えました。
BEGIN { FS=OFS="," }
NR == FNR {
id = $2
sub(/-[^-]+$/,"",id)
vals[id] = $2 OFS $3
next
}
{ print (FNR>1 ? $1 OFS vals[$1] : $0) }
,
しかし、私が作業しているcsvファイルの値と一致するようにスクリプトの値を理解して変更しようとすると、上記のawkスクリプトが最初に比較を行うため、何もしないか、単に誤った値が表示されます。列1と2。しかし、今回は最初の列と5番目の列を比較したいと思います。その後、行全体(列5から9)が最初の列に揃えられます。私は何をすべきですか?私はまだ学んでいます。
編集:コンマ区切りの値を使用しているため、混乱を防ぐために入力データの値を変更しました。コードエディタを使用する代わりに、Excelでファイルを開き、そこから貼り付けるときに間違って貼り付けたようです。混乱させて申し訳ありません!
答え1
さまざまなフィールド数を持つ同様のファイルが複数あると思われるので、まず入力をヘッド(最初の4つのフィールドとコンマ)とテール(最後の5番目のフィールド)に分割しました。したがって、一致させたいフィールドが何であれ、簡単に実行できます。および/または一致するフィールドの前/後に含めるフィールドの数(たとえば、このスクリプトの途中で{4}
変更したばかりの5番目のフィールドではなく7番目のフィールドから比較してコピーしたい場合):{6}
$ cat tst.awk
BEGIN { FS=OFS="," }
{
match($0,/([^,]+,){4}/)
head = substr($0,RSTART,RLENGTH)
tail = substr($0,RSTART+RLENGTH)
}
NR == FNR {
$0 = tail
id = $1
sub(/-[^-]+$/,"\"",id)
vals[id] = $0
next
}
{
print (FNR>1 ? head vals[$1] : $0)
}
$ awk -f tst.awk file file
"DB Instance Identifier","Engine","Instance Type","MultiAZ","Reservation ID","State","Start Time","Reserved Engine","Reserved DB Instance Class"
"alpha-db-dev","postgres","db.t4g.micro",false,"alpha-db-dev-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"alpha-db-prod","postgres","db.t4g.small",true,"alpha-db-prod-26aug2022","active","2022-08-26","postgresql","db.t4g.small"
"alpha-db-staging","postgres","db.t4g.micro",false,"alpha-db-staging-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"beta-db-dev-primary","aurora-postgresql","db.t3.medium",false,
"beta-db-prod-primary","aurora-postgresql","db.r5.xlarge",false,
"beta-db-prod-replica","aurora-postgresql","db.r5.xlarge",false,
"beta-db-staging-primary","aurora-postgresql","db.t3.medium",false,
"charlie-db-dev","postgres","db.t3.micro",false,"charlie-db-dev-8dec2021","active","2021-12-08","postgresql","db.t3.micro"
"charlie-db-prod-dms","postgres","db.m4.4xlarge",true,
"charlie-db-prod-dms-replica","postgres","db.m4.4xlarge",false,
"charlie-db-staging","postgres","db.t2.large",false,
"charlie-db-staging-loadtest","postgres","db.m4.4xlarge",false,
"charlie-kong-db-dev","postgres","db.t3.micro",false,
"charlie-kong-db-prod","postgres","db.m4.large",true,
"charlie-kong-db-staging","postgres","db.t2.small",false,
"delta-db-prod","mysql","db.t3.small",true,
"delta-db-recon-prod","mysql","db.t3.micro",false,
"delta-db-staging","mysql","db.t3.micro",false,
答え2
現在使用されているファイルは「、」をフィールド区切り文字として使用し、多くのフィールドが二重引用符で囲まれています。次のことができます。
$ awk ' BEGIN{ FS=OFS="," }
NR==FNR{
if (NR==1) {next}
id=$5;
sub(/-[^-]+"$/,"\"",id);
a[id]=$5 OFS $6 OFS $7 OFS $8 OFS $9; next
}
{
if (FNR==1) {print; next}
print $1,$2,$3,$4,a[$1]
}
' infile infile
与えられた入力に対して以下が印刷されます。
"DB Instance Identifier","Engine","Instance Type","MultiAZ","Reservation ID","State","Start Time","Reserved Engine","Reserved DB Instance Class"
"alpha-db-dev","postgres","db.t4g.micro",false,"alpha-db-dev-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"alpha-db-prod","postgres","db.t4g.small",true,"alpha-db-prod-26aug2022","active","2022-08-26","postgresql","db.t4g.small"
"alpha-db-staging","postgres","db.t4g.micro",false,"alpha-db-staging-26aug2022","active","2022-08-26","postgresql","db.t4g.micro"
"beta-db-dev-primary","aurora-postgresql","db.t3.medium",false,
"beta-db-prod-primary","aurora-postgresql","db.r5.xlarge",false,
"beta-db-prod-replica","aurora-postgresql","db.r5.xlarge",false,
"beta-db-staging-primary","aurora-postgresql","db.t3.medium",false,
"charlie-db-dev","postgres","db.t3.micro",false,"charlie-db-dev-8dec2021","active","2021-12-08","postgresql","db.t3.micro"
"charlie-db-prod-dms","postgres","db.m4.4xlarge",true,
"charlie-db-prod-dms-replica","postgres","db.m4.4xlarge",false,
"charlie-db-staging","postgres","db.t2.large",false,
"charlie-db-staging-loadtest","postgres","db.m4.4xlarge",false,
"charlie-kong-db-dev","postgres","db.t3.micro",false,
"charlie-kong-db-prod","postgres","db.m4.large",true,
"charlie-kong-db-staging","postgres","db.t2.small",false,
"delta-db-prod","mysql","db.t3.small",true,
"delta-db-recon-prod","mysql","db.t3.micro",false,
"delta-db-staging","mysql","db.t3.micro",false,