Apply sed only to lines that start with a specific string

2024-6-8

bash text-processing sed files scripting

I have a file in the following format:

Received from +11231231234 at 2021-10-10T19:56:50-07:00:
This is a message that contains words like from, at, etc.

Sent to +11231231234 at 2021-10-11T06:50:57+00:00:
This is another message that contains words like to, at, etc.

I want to tidy up the "Received" and "Sent" lines. I can use the following sed commands:

cat file |  sed 's/from//g' | sed 's/to/    /g' | sed 's/+\w\+//' | sed 's/at//g' | \
sed 's/T/ /g' | sed 's/[[:digit:].]*\:$//' | sed 's/[[:digit:].]*\:$//' | sed 's/-$//' |  \
sed 's/-$//' | sed 's/+$//'

This produces the following result:

Received    2021-10-10 19:56:50
This is a message that contains words like  ,  , etc.

Sent        2021-10-11 06:50:57
This is another message that contains words like  ,  , etc.

As you can see, the "Received" and "Sent" lines are tidied up nicely. However, the message lines get modified too! How can I apply these operations only to lines that start with "Received" or "Sent"?

Answer 1

Use a sed address:

sed -E '/^(Received|Sent) (from|to) \+[0-9]+ at/ s/ .*([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9:]{8}).*/        \1 \2/'

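The address means that the substitution is applied only to lines that start with Received or Sent, followed by from or to, then a + and some digits, then at.
The substitution starts matching at a space and captures the date: [0-9]{4} is a digit repeated four times, and so on. It then matches T and captures the time. Everything after the time is matched but not captured. As a result, the whole matched part is replaced with some spaces followed by the captured date and time.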
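
To see the address at work, here is a minimal sketch that runs the command above on the sample input from the question. The variable names (sample, result, grouped) are illustrative assumptions. The second command is not part of the original answer: it is an alternative that uses sed's { ... } grouping to run several smaller substitutions under one address, which may be closer to what the question's pipeline was attempting.

```shell
#!/bin/sh
# Sample input copied from the question (held in a variable for illustration).
sample='Received from +11231231234 at 2021-10-10T19:56:50-07:00:
This is a message that contains words like from, at, etc.

Sent to +11231231234 at 2021-10-11T06:50:57+00:00:
This is another message that contains words like to, at, etc.'

# 1) The answer's command: the /.../ address limits the s command
#    to the "Received from ..." and "Sent to ..." header lines.
result=$(printf '%s\n' "$sample" |
  sed -E '/^(Received|Sent) (from|to) \+[0-9]+ at/ s/ .*([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9:]{8}).*/        \1 \2/')

# 2) Alternative sketch (an assumption, not from the answer): group
#    several substitutions in { ... } so they all share one address.
grouped=$(printf '%s\n' "$sample" |
  sed -E '/^(Received|Sent) /{
    s/ (from|to) \+[0-9]+ at / /
    s/T/ /
    s/[-+][0-9]{2}:[0-9]{2}:$//
  }')

printf '%s\n' "$result"
printf '%s\n' "$grouped"
```

In both cases the message lines pass through unchanged, because neither address matches them. The grouped form puts each command on its own line so that it should also work with sed implementations that do not accept a closing brace directly after a command.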
