How to split a file into multiple parts by a delimiter

Question 1

You can do this with sed and csplit.

sed -i.bak 's/----/-/g' file && csplit --suppress-matched file '/-/' '{*}'

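sed replaces every "----" in the file with a single "-" (-i.bak edits the file in place and keeps the original as file.bak).
csplit then splits the file at each line matching "-" and writes the pieces to multiple files (e.g. xx00, xx01, and so on); --suppress-matched leaves the delimiter lines out of the output. Note that the pattern /-/ matches any line containing a hyphen, so this assumes hyphens appear only in the delimiter lines.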

Edit: thanks to @AdminBee for pointing out the typo. I removed the extra "file" from the command.
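
If modifying the input is undesirable, the sed step can probably be skipped by letting csplit match the delimiter line exactly. The following is a minimal sketch, assuming GNU csplit (needed for --suppress-matched) and a delimiter line of exactly "----"; the temporary directory, the sample content, and the part_ prefix are illustrative choices, not part of the original answer.

```shell
# Build a small sample file in a scratch directory (illustrative data).
workdir=$(mktemp -d)
printf '%s\n' aaaa bbbb ---- cccc ---- dddd > "$workdir/file"

# Split at every line that is exactly "----" without touching the input.
#   -s                  do not print the size of each piece
#   -z                  do not create empty output files
#   --suppress-matched  leave the delimiter lines out of the pieces
#   -f PREFIX           name the pieces PREFIX00, PREFIX01, ...
csplit -s -z --suppress-matched -f "$workdir/part_" "$workdir/file" '/^----$/' '{*}'

# List the result: the input file plus one part_NN file per section.
ls "$workdir"
```

Anchoring the pattern with ^ and $ means hyphens inside ordinary lines no longer trigger a split.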

Question 2

#! /usr/bin/env bash

# split-textfile-by-first-line.sh

# split a text file by delimiter lines.
# the delimiter lines are included
# as first lines of the output files.

# removing the delimiter line
# would require patching
# the arguments for tail and head.

# (c) Milan Hauth, MIT License

set -eu

input_file="$1"
first_line="$2"

input_size=$(stat -c%s "$input_file")

start=
while IFS= read -r match; do
    # get the byte offset of match
    end=${match%%:*}
    if [ "$end" = "-1" ]; then
        # last match. set end to end of file
        end=$input_size
        if [ -z "$start" ]; then
            echo "error: first_line was not found in input_file" >&2
            exit 1
        fi
    fi
    if [ -n "$start" ]; then
        output_file="$input_file".$start-to-$end
        # grep -b offsets are 0-based, but tail -c +N counts bytes from 1
        tail -c +$((start + 1)) "$input_file" |
            head -c $((end - start)) \
            >"$output_file"
        echo done "$output_file"
    fi
    start=$end
done < <(
    # -e keeps a pattern that starts with "-" (e.g. "----") from being parsed as an option
    grep -bxF -e "$first_line" "$input_file"
    echo -1
)
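
The byte-offset arithmetic this script relies on can be illustrated on a small sample. This is only a sketch, assuming GNU grep (for -b) and a made-up sample file in a temporary directory; the variable names are illustrative. grep -b reports 0-based offsets while tail -c +N counts bytes from 1, which is why 1 is added to the start offset here.

```shell
# Build a small sample file in a scratch directory (illustrative data).
workdir=$(mktemp -d)
printf '%s\n' aaaa ---- cccc ---- dddd > "$workdir/file"

# grep -b prefixes each matching line with its 0-based byte offset.
# -e protects a pattern that begins with "-" from option parsing.
offsets=$(grep -bxF -e '----' "$workdir/file")
printf '%s\n' "$offsets"

# Use the first two offsets as the start and end of one chunk.
start=$(printf '%s\n' "$offsets" | sed -n '1p' | cut -d: -f1)
end=$(printf '%s\n' "$offsets" | sed -n '2p' | cut -d: -f1)

# Skip to the start offset (converted to 1-based), then keep end - start bytes.
chunk=$(tail -c +$((start + 1)) "$workdir/file" | head -c $((end - start)))
printf '%s\n' "$chunk"
```

The extracted chunk begins with the delimiter line itself, matching the script's stated behavior of keeping delimiters as the first line of each output file.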

Question 3

Using Raku (formerly known as Perl_6)

Reading line by line:

~$ raku -ne 'BEGIN my @a; @a.push: $_ unless m/"----"/; END .say for @a[2];'  file
cccc

~$ raku -ne 'BEGIN my @a; @a.push: $_ unless m/"----"/; END .say for @a[3];'  file
dddd

Reading the whole file at once:

~$ raku -e 'slurp.split(/ "\n" | "----" /, :skip-empty).[2].put;'  file
cccc

~$ raku -e 'slurp.split(/ "\n" | "----" /, :skip-empty).[3].put;'  file
dddd

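Raku is a programming language in the Perl family of languages, with high-level Unicode support built in. The examples above skip the delimiter lines and index the remaining lines from zero, so index 2 returns the third line and index 3 returns the fourth.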

Sample input:

aaaa
bbbb
----
cccc
----
dddd
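
For comparison, the same selection can be sketched with standard shell tools. This is not part of the Raku answer; it assumes the sample input above written to a file in a temporary directory, and the variable names are illustrative. sed numbers lines from 1, so its line 3 corresponds to index 2 in the Raku examples.

```shell
# Recreate the sample input in a scratch directory (illustrative data).
workdir=$(mktemp -d)
printf '%s\n' aaaa bbbb ---- cccc ---- dddd > "$workdir/file"

# Drop every line containing "----", then pick a remaining line by number.
# -e protects the pattern from being parsed as an option.
third=$(grep -vF -e '----' "$workdir/file" | sed -n '3p')
fourth=$(grep -vF -e '----' "$workdir/file" | sed -n '4p')

printf '%s\n' "$third"
printf '%s\n' "$fourth"
```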

Addendum: using Raku's lines routine (reads lazily):

~$ raku -e '.[2].put if .chars for lines.map: *.split("----", :skip-empty);'  file
cccc

~$ raku -e '.[3].put if .chars for lines.map: *.split("----", :skip-empty);'  file
dddd

https://raku.org
