AWK

Question 1

AWK

Using GNU awk or mawk:

$ awk '$1~"^"word{printf("--\n%s",$0)}' word='are' RS='--\n' infile
--
are you happy
--
are(you hungry
too

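This sets the variable word to the word that must match at the start of a record, and sets the record separator (RS) to "--" followed by a newline (\n). Then, for every record whose first field starts with that word ($1~"^"word), it prints a formatted record. The format is a leading "--", a newline, and then the exact record that was found.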
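To try the command, a small input file helps. The sketch below builds a hypothetical sample (the original question's file is not reproduced here, so the contents and the temporary file are assumptions for illustration) and runs the same awk program on it.

```shell
# Hypothetical sample input: paragraphs introduced by a line containing only "--".
infile=$(mktemp)
printf '%s\n' '--' 'are you happy' \
              '--' 'I am hungry' 'are you' \
              '--' 'are(you hungry' 'too' > "$infile"

# RS='--\n' splits the file into records at each "--" line; a record is
# printed (with a leading "--" line) when its first field starts with "are".
awk_out=$(awk '$1~"^"word{printf("--\n%s",$0)}' word='are' RS='--\n' "$infile")
printf '%s\n' "$awk_out"
rm -f "$infile"
```

With this input only the first and third paragraphs should be printed; the second one contains "are", but not as its first word.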

grep

Using grep (GNU, for the -z option):

grep -Pzo -- '--\nare(?:[^\n]*\n)+?(?=--|\Z)' infile
grep -Pzo -- '(?s)--\nare.*?(?=\n--|\Z)\n' infile
grep -Pzo -- '(?s)--\nare(?:(?!\n--).)*\n' infile

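Description

The descriptions that follow use the PCRE option (?x) to add (lots of) explanatory comments (and spaces) inline with the actual (working) regex. If the comments (up to the next newline) and most of the spaces are removed, the resulting string is still the same regex. This makes it possible to describe a regex in detail inside working code, which makes the code much easier to maintain.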

Option 1 regex `(?x)--\nare(?:[^\n]*\n)+?(?=--|\Z)`

(?x)   # match the remainder of the pattern with the following
       # effective flags: x
       #      x modifier: extended. Spaces and text after a # 
       #      in the pattern are ignored
--     # matches the characters -- literally (case sensitive)
\n     # matches a line-feed (newline) character (ASCII 10)
are    # matches the characters are literally (case sensitive)
(?:    #      Non-Capturing Group (?:[^\n]*\n)+?
[^\n]  #           matches non-newline characters
*      #           Quantifier — Matches between zero and unlimited times, as
       #           many times as possible, giving back as needed (greedy)
\n     #           matches a line-feed (newline) character (ASCII 10)
)      #      Close the Non-Capturing Group
+?     # Quantifier — Matches between one and unlimited times, as
       # few times as possible, expanding as needed (lazy)
       # A repeated capturing group will only capture the last iteration.
       # Put a capturing group around the repeated group to capture all
       # iterations or use a non-capturing group instead if you're not
       # interested in the data
(?=    # Positive Lookahead (?=--|\Z)
       # Assert that the Regex below matches
       #      1st Alternative --
--     #           matches the characters -- literally (case sensitive)
|      #      2nd Alternative \Z
\Z     #           \Z asserts position at the end of the string, or before
       #           the line terminator right at the end of the 
       #           string (if any)
)      #      Closing the lookahead.

Option 2 regex `(?sx)--\nare.*?(?=\n--|\Z)\n`

(?sx)  # match the remainder of the pattern with the following eff. flags: sx
       #        s modifier: single line. Dot matches newline characters
       #        x modifier: extended. Spaces and text after a # in 
       #        the pattern are ignored
--     # matches the characters -- literally (case sensitive)
\n     # matches a line-feed (newline) character (ASCII 10)
are    # matches the characters are literally (case sensitive)
.*?    # matches any character 
       #        Quantifier — Matches between zero and unlimited times,
       #        as few times as possible, expanding as needed (lazy).
(?=    # Positive Lookahead (?=\n--|\Z)
       # Assert that the Regex below matches
       #        1st Alternative \n--
\n     #               matches a line-feed (newline) character (ASCII 10)
--     #               matches the characters -- literally.
|      #        2nd Alternative \Z
\Z     #               \Z asserts position at the end of the string, or
       #               before the line terminator right at
       #               the end of the string (if any)
)      # Close the lookahead parenthesis.
\n     #        matches a line-feed (newline) character (ASCII 10)

Option 3 regex `(?xs)--\nare(?:(?!\n--).)*\n`

(?xs)  # match the remainder of the pattern with the following eff. flags: xs
       # modifier x : extended. Spaces and text after a # in the pattern are ignored
       # modifier s : single line. Dot matches newline characters
--     # matches the characters -- literally (case sensitive)
\n     # matches a line-feed (newline) character (ASCII 10)
are    # matches the characters are literally (case sensitive)
(?:    # Non-capturing group (?:(?!\n--).)
(?!    #      Negative Lookahead (?!\n--)
       #           Assert that the Regex below does not match
\n     #                matches a line-feed (newline) character (ASCII 10)
--     #                matches the characters -- literally
)      #      Close Negative lookahead
.      #      matches any character
)      # Close the Non-Capturing group.
*      # Quantifier — Matches between zero and unlimited times, as many
       # times as possible, giving back as needed (greedy)
\n     # matches a line-feed (newline) character (ASCII 10)

sed

$ sed -nEe 'bend
            :start  ;N;/^--\nare/!b
            :loop   ;/^--$/!{p;n;bloop}
            :end    ;/^--$/bstart'           infile

Question 2

Using GNU awk or mawk:

$ awk -v word="are" -v RS='--\n' -v ORS='--\n' '$1 ~ "^" word "[[:punct:]]?"' file
are you happy
--
are(you hungry
too
--

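This sets both the input and the output record separator to -- followed by a newline. The first word of each paragraph is available in $1. It is matched against the given word (possibly followed by a punctuation character). If it matches, the paragraph is printed.

The paragraph markers in the output end up at the end of each paragraph rather than at the start, because they are emitted through ORS.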

Using a sed script:

:top
/^--/!d;                   # This is not a new paragraph, delete
N;                         # Append next line
/^--\nare[[:punct:]]?/!d;  # This is not a paragraph we want, delete
:record
n;                         # Output line, get next
/^--/!brecord;             # Not yet done with this record, branch to :record
btop;                      # Branch to :top

Running it:

$ sed -E -f script.sed file
--
are you happy
--
are(you hungry
too

Or, as a one-liner that takes the word from the shell variable $word:

sed -E -e ':t;/^--/!d;N;' \
       -e "/^--\n$word[[:punct:]]?/!d" \
       -e ':r;n;/^--/!br;bt' file
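A sketch of the one-liner in use, on a hypothetical sample file (the file contents and the value of word are assumptions for illustration). The middle expression is double-quoted so that the shell expands $word, while sed still receives the \n unchanged. Like the original command, it assumes a sed that accepts ; after labels and branch commands, as GNU sed does. In an interactive bash session with history expansion enabled, the ! inside double quotes may need extra care; in a script it is not an issue.

```shell
# Hypothetical sample input: paragraphs introduced by a line containing only "--".
file=$(mktemp)
printf '%s\n' '--' 'are you happy' \
              '--' 'I am hungry' 'are you' \
              '--' 'are(you hungry' 'too' > "$file"

word=are   # the word each wanted paragraph must start with

# Double quotes let the shell expand $word; sed itself interprets \n.
sed_out=$(sed -E -e ':t;/^--/!d;N;' \
                 -e "/^--\n$word[[:punct:]]?/!d" \
                 -e ':r;n;/^--/!br;bt' "$file")
printf '%s\n' "$sed_out"
rm -f "$file"
```

Since $word is pasted into a regular expression, a value containing regex metacharacters would be interpreted as such.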

Question 3

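I know this is an old question, but I see all these loops, branches, and pattern juggling when, simply,

sed '/^--$/!{H;$!d;};x;/^--\nare/!d'

does the same thing in a natural way.

sed is a line-oriented stream editor, so when multi-line content is needed, collect the lines in the hold space with H. At a paragraph mark (^--$), or at the last line, swap the buffers with x and test whether the paragraph should be printed: ^--\nare means a -- line followed by a line starting with are. The x also preloads the hold space with the paragraph mark for the next paragraph.

No GNU tools with extensions are needed, and no programming tricks either; it is just sed.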
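A minimal sketch of the hold-space command on a hypothetical sample file (the file contents are an assumption for illustration, not the original question's data):

```shell
# Hypothetical sample input: paragraphs introduced by a line containing only "--".
file=$(mktemp)
printf '%s\n' '--' 'are you happy' \
              '--' 'I am hungry' 'are you' \
              '--' 'are(you hungry' 'too' > "$file"

# H appends every non-"--" line to the hold space. At each "--" line, and at
# the last line, x swaps the collected paragraph into the pattern space; it is
# printed only if it is a "--" line followed by a line starting with "are".
hold_out=$(sed '/^--$/!{H;$!d;};x;/^--\nare/!d' "$file")
printf '%s\n' "$hold_out"
rm -f "$file"
```

On the very first "--" line the hold space is still empty, so the swap yields an empty pattern space, which the final test deletes.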

Question 4

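After reading your question I had the same feeling: it should be solvable with grep + PCRE.

Thanks to @issac's help, method #1 now solves this problem.
Method #2 shows how to use inline modifiers ((?s)) and lookahead ((?!...)).
My original solution (#3) works well in most situations, except for the kind highlighted in its section below.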

grep method #1

$ grep -Pzo -- '--\nare([^\n]*\n)+?(?=--|\Z)' afile

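How does it work?

grep switches

-P - enables the PCRE extension
-z - treats the input as lines terminated by NUL instead of \n (newline), so a match can span several lines
-o - shows only the matching parts

regex

--\nare([^\n]*\n)+?(?=--|\Z)
- Matches a double dash followed by a newline and are, and then lines made of zero or more non-newline characters followed by a newline.
- The +? matches one or more of those lines, but it is non-greedy, so it does not continue any further than necessary.
- Finally, (?=--|\Z) guards the end of the block by looking for either the next double dash -- or the end of the file (\Z).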
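A sketch of method #1 on a hypothetical sample file (the contents are an assumption for illustration). It assumes GNU grep built with PCRE support. With -z, each match printed by -o is followed by a NUL byte, so the output is passed through tr to drop those bytes before it is displayed.

```shell
# Hypothetical sample input: paragraphs introduced by a line containing only "--".
afile=$(mktemp)
printf '%s\n' '--' 'are you happy' \
              '--' 'I am hungry' 'are you' \
              '--' 'are(you hungry' 'too' > "$afile"

# The file contains no NUL bytes, so with -z grep sees it as a single record
# and the pattern can match across newlines.
m1_out=$(grep -Pzo -- '--\nare([^\n]*\n)+?(?=--|\Z)' "$afile" | tr -d '\0')
printf '%s\n' "$m1_out"
rm -f "$afile"
```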

grep method #2

This method uses the DOTALL inline modifier so that . also matches newlines (\n).

$ grep -Pzo -- '(?s)--\nare((?!\n--).)+\n' afile

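How does it work?

grep switches

-P - enables the PCRE extension
-z - treats the input as lines terminated by NUL instead of \n (newline), so a match can span several lines
-o - shows only the matching parts

regex

(?s) - inline modifier DOTALL: every dot also matches a newline.
--\nare - matches a double dash, followed by a newline, followed by are.
((?!\n--).)+\n - matches characters (.) for as long as the lookahead (?!\n--) does not find \n-- ahead. The group must match one or more times (+) and be followed by a newline \n.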
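A sketch of method #2 on a hypothetical sample file (the contents are an assumption for illustration; GNU grep with PCRE support is assumed). Because of the +, at least one character must follow are before the closing newline, so a paragraph consisting of the single line are would not be matched by this pattern.

```shell
# Hypothetical sample input: paragraphs introduced by a line containing only "--".
afile=$(mktemp)
printf '%s\n' '--' 'are you happy' \
              '--' 'I am hungry' 'are you' \
              '--' 'are(you hungry' 'too' > "$afile"

# (?s) lets . cross newlines; the negative lookahead stops the match just
# before the next "\n--". tr removes the NUL written after each match.
m2_out=$(grep -Pzo -- '(?s)--\nare((?!\n--).)+\n' "$afile" | tr -d '\0')
printf '%s\n' "$m2_out"
rm -f "$afile"
```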

grep method #3 (original)

Here is a grep solution that makes use of the PCRE extension (-P). This method works on all the examples provided, but fails on examples like this one:

--
are
some-other-dasher

Still, I would imagine it handles most other cases.

$ grep -Pzo -- '--\nare[^\r\n]+[^-]+' afile
--
are you happy

--
are(you hungry
too

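How does it work?

grep switches

-P - enables the PCRE extension
-z - treats the input as lines terminated by NUL instead of \n (newline), so a match can span several lines
-o - shows only the matching parts

regex

'--\nare[^\r\n]+[^-]+'
- Matches a double dash followed by a newline and the word are.
- It then keeps matching the rest of the are line until it reaches a newline.
- It then keeps matching characters until it encounters a dash.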
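A sketch of the failure case, using a hypothetical file that holds only the problematic paragraph shown earlier (GNU grep with PCRE support is assumed). Here [^\r\n]+ needs at least one character after are on the same line, so method #3 finds nothing; even with text after are, [^-]+ would stop at the first hyphen in some-other-dasher. Method #1 is run on the same file for comparison.

```shell
# Hypothetical file containing only the problematic paragraph.
afile=$(mktemp)
printf '%s\n' '--' 'are' 'some-other-dasher' > "$afile"

# Method #3: nothing matches, so the captured output is empty
# ("|| true" keeps a non-zero pipeline status from aborting strict scripts).
m3_out=$(grep -Pzo -- '--\nare[^\r\n]+[^-]+' "$afile" | tr -d '\0' || true)

# Method #1 on the same file returns the whole paragraph.
m1_out=$(grep -Pzo -- '--\nare([^\n]*\n)+?(?=--|\Z)' "$afile" | tr -d '\0')

printf 'method 3: [%s]\nmethod 1: [%s]\n' "$m3_out" "$m1_out"
rm -f "$afile"
```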
