I have the following input file:
ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next
month
visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"new visiting patient"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"transferred
back
to
hospital"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"
Below is the output I want:
ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next month visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"transferred back to hospital"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"new visiting patient"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"
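Please help!
Answer 1
Using GNU awk for its multi-character RS and RT. The RS regex matches an entire record (five ~-separated fields ending in a newline), so each record arrives in RT, and gensub() then collapses every run of whitespace, embedded newlines included, into a single space: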
$ awk -v RS='([^~]+~){4}[^~]+\n' '{print gensub(/[[:space:]]+/," ","g",RT)}' file
ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next month visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"new visiting patient"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"transferred back to hospital"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"