Using sed, awk, or tr to split each line into CSV format, with a colon (:) as the delimiter

Answer 1

Using perl, which should be available on any desktop or server Linux distribution:

perl -lne '
   BEGIN{$,=","}
   ($k,$v)=split":",$_,2;
   next unless defined $v;
   for($k,$v){s/"/""/g,$_=qq{"$_"}if/[$,"]/}
   $k=$t{$k}//=$t++;
   if(exists$f[$k]){print@f;@f=()}
   $f[$k]=$v;
   END{print@f;print STDERR sort{$t{$a}<=>$t{$b}}keys%t}
' your_file

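This should convert the file to standard CSV, with the caveat that the header (the first line, holding the field names) is printed to stderr, and only after the whole file has been processed. You can save the two streams separately with ... >body 2>hdr and then join them with cat hdr body > final_file.csv.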
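The save-then-join step can be sketched as follows. This is only an illustration: convert is a hypothetical stand-in for the perl command (it writes made-up records to stdout and a made-up header to stderr), and the temporary directory is an assumption; the names hdr, body and final_file.csv are the ones used above.

```shell
#!/bin/sh
# Hypothetical stand-in for the perl one-liner: records go to stdout,
# the header line goes to stderr and is emitted last.
convert() {
    printf '%s\n' 'alice,30' 'bob,25'
    printf '%s\n' 'name,age' >&2
}

# Work in a throwaway directory (an assumption, not part of the original).
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

# Capture the two streams separately, then write the header first.
convert >"$tmp/body" 2>"$tmp/hdr"
cat "$tmp/hdr" "$tmp/body" >"$tmp/final_file.csv"

cat "$tmp/final_file.csv"
```

Because the header is only known once the last line has been read, it cannot be written first in a single pass, hence the two files.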

No special meaning is attached to empty lines and the like. A record is taken to be a set of differently named fields, in any order.

Fields containing , or " are placed inside "...", and any " inside them is escaped by doubling it to "" (following the CSV convention).
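As a rough illustration of that quoting rule, here is a small awk sketch wrapped in a shell function. The name csv_quote is hypothetical, and treating each input line as one field is an assumption made only for the demo.

```shell
#!/bin/sh
# csv_quote: hypothetical helper that treats each input line as one CSV field.
csv_quote() {
    awk '
    {
        field = $0
        # Quote only when the field contains the separator or a double quote.
        if (field ~ /[,"]/) {
            gsub(/"/, "\"\"", field)    # escape each quote by doubling it
            field = "\"" field "\""     # then wrap the whole field in quotes
        }
        print field
    }'
}

printf '%s\n' 'plain' 'a,b' 'say "hi"' | csv_quote
```

Fields without a comma or a double quote pass through unchanged, so the output stays readable for simple data.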

You can adjust the field separator by changing $,="," to, for example, $,="|" (or $,="\t" for a tab). You can drop the quoting and escaping by removing the for($k,$v){ ... } line.

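This could be done in awk instead (but not in sed or tr). It would be somewhat more complicated, because awk has no way to print a whole array at once (you have to loop over it) and no way to split a string into a limited number of fields (you have to use a substr trick for that).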
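One possible form of the substr trick, as a minimal sketch: find the first colon with index() and cut the line around it, so that colons inside the value survive. The function name split_kv and the tab-separated output are assumptions made for the demo.

```shell
#!/bin/sh
# split_kv: hypothetical helper; prints "key<TAB>value" for each line,
# splitting at the FIRST colon only.
split_kv() {
    awk '
    {
        i = index($0, ":")          # position of the first colon, 0 if none
        if (i == 0) next            # skip lines that have no colon
        key = substr($0, 1, i - 1)  # text before the first colon
        val = substr($0, i + 1)     # everything after it, later colons included
        printf "%s\t%s\n", key, val
    }'
}

printf '%s\n' 'url:http://example.com:8080' 'no colon here' | split_kv
```

This mirrors what split":",$_,2 does in the perl version; a plain FS=":" would instead cut the value at its next colon.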

Answer 2

For the sake of completeness, an awk-based solution.

The awk script (let's call it convert_csv.awk):

#!/bin/awk -f
BEGIN{FS=":";OFS=","}

# Process all non-empty lines
NF>0{
    # Check if the "key" part of the line was not yet encountered, both globally
    # and for the currently processed record.
    # If not encountered globally yet, add to list of headers (=columns).
    new_hdr=1; new_key=1;
    for (i=1; i<=n_hdrs; i++) {if (hdr[i]==$1) new_hdr=0;}
    if (new_hdr) hdr[++n_hdrs]=$1;

    for (key in val) {if (key==$1) new_key=0;}


    # Once no globally new keys are found, consider the "list of headers" as
    # complete and print it as CSV header line.
    if (!new_hdr && !hdr_printed)
    {
        for (i=1;i<=n_hdrs;i++) printf("%s%s", hdr[i], i==n_hdrs?ORS:OFS);
        hdr_printed=1;
    }

    # If the current key was already found in the currently processed record,
    # we assume that a new record was started, and print the data collected
    # so far before collecting data on the next record.
    if (!new_key)
    {
        for (i=1;i<=n_hdrs;i++) printf("%s%s", val[hdr[i]], i==n_hdrs?ORS:OFS);
        delete val;
    }

    # Associate the "value" part of the line with the "key", perform transformations
    # as necessary. Since both the 'gsub()' function used for escaping '"' to '""'
    # and the 'index()' function used to locate ',' return non-zero if an occurrence
    # was found, the sum of both return values being > 0 indicates that the field
    # must be quoted.
    quote=gsub("\"","\"\"",$2)+index($2,",");
    if (quote) $2="\""$2"\"";
    val[$1]=$2;
}


# Print the last record. If the header line was never printed (this is the case
# if the file held only one record), print it first. Testing 'hdr_printed'
# rather than 'new_hdr' avoids a second header when the last line adds a new key.
END {
    if (!hdr_printed)
    {
        for (i=1;i<=n_hdrs;i++) printf("%s%s", hdr[i], i==n_hdrs?ORS:OFS);
    }

    for (i=1;i<=n_hdrs;i++) printf("%s%s", val[hdr[i]], i==n_hdrs?ORS:OFS);
}

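How it works is explained in the comments, but basically it collects the set of unique keys and, when it finds a key on a line that was "already encountered" in the current record, considers that record complete, prints it, and clears the temporary buffer to collect the next one. It also applies the transformations noted by @mosvy so that fields with special characters comply with the CSV standard.

Call it as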

awk -f convert_csv.awk input.txt
