バイナリファイルからnullで終わる文字列を読み取る方法

Question

オプション1：直接変数の割り当て

気になるのがヌルバイトだけであれば、好みの標準的な方法を使用してファイル内のデータを変数として直接読み取ることができるはずです。つまり、ヌルバイトを無視してファイルからデータを読み取ることができるはずです。。以下は、catコマンドとコマンドの置換を使用する例です。

$ data="$(cat eeprom)"
$ echo "${data}"
MAC_ADDRESS=12:34:56:78:90,PCB_MAIN_ID=m/SF-1V/MAIN/0.0,PCB_PIGGY1_ID=n/SF-1V/PS/0.0,CSL_HW_VARIANT=D

これはBusyBox Dockerコンテナで私にとって効果的でした。

解決策 2: `xxd`and`for`ループの使用

さらに制御したい場合は、xxdバイトを16進文字列に変換してその文字列を繰り返すことができます。その後、これらの文字列を繰り返すときに目的のロジックを適用できます。たとえば、初期NULL値を明示的にスキップして、一部の中断条件に達するまで残りのデータを印刷できます。

次のスクリプトは、有効な文字（ASCII 32〜127）の「許可リスト」を指定し、他の文字のサブシーケンスを区切り文字として処理し、有効な部分文字列をすべて抽出します。

#!/bin/sh
# get_hex_substrings.sh

# Get the path to the data-file as a command-line argument
datafile="$1"

# Keep track of state using environment variables
inside_padding_block="true"
inside_bad_block="false"

# NOTE: The '-p' flag is for "plain" output (no additional formatting)
# and the '-c 1' option specifies that the representation of each byte
# will be printed on a separate line
for h in $(xxd -p -c 1 "${datafile}"); do

    # Convert the hex character to standard decimal
    d="$((0x${h}))"

    # Case where we're still inside the initial padding block
    if [ "${inside_padding_block}" == "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_padding_block="false";
            printf '\x'"${h}";
        fi

    # Case where we're passed the initial padding, but inside another
    # block of non-printable characters
    elif [ "${inside_bad_block}" == "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_bad_block="false";
            printf '\x'"${h}";
        fi

    # Case where we're inside of a substring that we want to extract
    else
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            printf '\x'"${h}";
        else
            inside_bad_block="true";
            echo
        fi
    fi
done

if [ "${inside_bad_block}" == "false" ]; then
    echo
fi

\x00これで、サブストリングを区切るサブシーケンスの合計を使用してサンプルファイルを生成してテストできます\xff。

printf '\x00\x00\x00string1\xff\xff\xffstring2\x00\x00\x00string3\x00\x00\x00' > data.hex

以下は、スクリプトを実行したときに得られる出力です。

$ sh get_hex_substrings.sh data.hex
string1
string2
string3

解決策3：`tr`および`cut`コマンドの使用

ヌルバイトを処理するためにtrコマンドを使用することもできます。cut以下は、隣接するヌル文字を圧縮/折りたたみ、それを改行文字に変換してヌル終了文字列リストから最初のヌル終了文字列を抽出する例です。

$ printf '\000\000\000string1\000\000\000string2\000\000\000string3\000\000\000' > file.dat
$ tr -s '\000' '\n' < file.dat | cut -d$'\n' -f2
string1

Answer 1

オプション1：直接変数の割り当て

気になるのがヌルバイトだけであれば、好みの標準的な方法を使用してファイル内のデータを変数として直接読み取ることができるはずです。つまり、ヌルバイトを無視してファイルからデータを読み取ることができるはずです。。以下は、catコマンドとコマンドの置換を使用する例です。

$ data="$(cat eeprom)"
$ echo "${data}"
MAC_ADDRESS=12:34:56:78:90,PCB_MAIN_ID=m/SF-1V/MAIN/0.0,PCB_PIGGY1_ID=n/SF-1V/PS/0.0,CSL_HW_VARIANT=D

これはBusyBox Dockerコンテナで私にとって効果的でした。

解決策 2: `xxd`and`for`ループの使用

さらに制御したい場合は、xxdバイトを16進文字列に変換してその文字列を繰り返すことができます。その後、これらの文字列を繰り返すときに目的のロジックを適用できます。たとえば、初期NULL値を明示的にスキップして、一部の中断条件に達するまで残りのデータを印刷できます。

次のスクリプトは、有効な文字（ASCII 32〜127）の「許可リスト」を指定し、他の文字のサブシーケンスを区切り文字として処理し、有効な部分文字列をすべて抽出します。

#!/bin/sh
# get_hex_substrings.sh

# Get the path to the data-file as a command-line argument
datafile="$1"

# Keep track of state using environment variables
inside_padding_block="true"
inside_bad_block="false"

# NOTE: The '-p' flag is for "plain" output (no additional formatting)
# and the '-c 1' option specifies that the representation of each byte
# will be printed on a separate line
for h in $(xxd -p -c 1 "${datafile}"); do

    # Convert the hex character to standard decimal
    d="$((0x${h}))"

    # Case where we're still inside the initial padding block
    if [ "${inside_padding_block}" == "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_padding_block="false";
            printf '\x'"${h}";
        fi

    # Case where we're passed the initial padding, but inside another
    # block of non-printable characters
    elif [ "${inside_bad_block}" == "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_bad_block="false";
            printf '\x'"${h}";
        fi

    # Case where we're inside of a substring that we want to extract
    else
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            printf '\x'"${h}";
        else
            inside_bad_block="true";
            echo
        fi
    fi
done

if [ "${inside_bad_block}" == "false" ]; then
    echo
fi

\x00これで、サブストリングを区切るサブシーケンスの合計を使用してサンプルファイルを生成してテストできます\xff。

printf '\x00\x00\x00string1\xff\xff\xffstring2\x00\x00\x00string3\x00\x00\x00' > data.hex

以下は、スクリプトを実行したときに得られる出力です。

$ sh get_hex_substrings.sh data.hex
string1
string2
string3

解決策3：`tr`および`cut`コマンドの使用

ヌルバイトを処理するためにtrコマンドを使用することもできます。cut以下は、隣接するヌル文字を圧縮/折りたたみ、それを改行文字に変換してヌル終了文字列リストから最初のヌル終了文字列を抽出する例です。

$ printf '\000\000\000string1\000\000\000string2\000\000\000string3\000\000\000' > file.dat
$ tr -s '\000' '\n' < file.dat | cut -d$'\n' -f2
string1

バイナリファイルからnullで終わる文字列を読み取る方法

答え1

オプション1：直接変数の割り当て

解決策 2: `xxd`and`for`ループの使用

解決策3：`tr`および`cut`コマンドの使用

関連情報

答え1

オプション1：直接変数の割り当て

解決策 2: xxdandforループの使用

解決策3：trおよびcutコマンドの使用

関連情報

解決策 2: `xxd`and`for`ループの使用

解決策3：`tr`および`cut`コマンドの使用