Data structure

Question 1

This is a job for awk.

awk -F: '{L[$1]=L[$1] "," $2} 
    END { for (l in L) printf "%s:%s\n",l,substr(L[l],2);}'

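where

-F: sets : as the field separator
{L[$1]=L[$1] "," $2} stores the comma-separated values, indexed by field 1
END runs after the last line of input has been read
for (l in L) loops over the keys of L (in no particular order)
printf "%s:%s\n",l,substr(L[l],2); prints each entry, skipping the leading comma
"," or ", " can be used as the separator; adjust the final substr accordingly (substr(L[l],3) for ", ").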
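
As a minimal sketch of the command above in action, the following runs it on a small sample input. The sample lines and the function name merge_by_key are assumptions made for illustration, and the trailing sort is added only because the order of for (l in L) is unspecified.

```shell
# Hypothetical helper: merge the second field of all lines sharing the same first field.
merge_by_key() {
    awk -F: '{ L[$1] = L[$1] "," $2 }
        END { for (l in L) printf "%s:%s\n", l, substr(L[l], 2) }' "$@" | sort
}

# Assumed sample input, one key:value pair per line, keys deliberately interleaved.
printf '%s\n' A:1 B:a A:2 C:pp B:b A:3 D:rr | merge_by_key
```

Because the values are accumulated in an array, lines with the same key do not have to be adjacent in the input.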

The awk program can be written on one line and run with:

awk -F: '....' File1 > File3

To count the number of genes, add a counter variable (here G):

{L[$1]=L[$1] "," $2;G[$1]++} 
END { for (l in L) printf "%s:%s:%d\n",l,substr(L[l],2),G[l];}
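
Put together as a complete command, the counting variant might look like the sketch below. The function name merge_and_count and the sample input are assumptions for illustration; the sort is again only there to make the output order predictable.

```shell
# Hypothetical helper: merge values per key and append the number of values as a third field.
merge_and_count() {
    awk -F: '{ L[$1] = L[$1] "," $2; G[$1]++ }
        END { for (l in L) printf "%s:%s:%d\n", l, substr(L[l], 2), G[l] }' "$@" | sort
}

# Assumed sample input.
printf '%s\n' A:1 A:2 A:3 B:a B:b C:pp D:rr | merge_and_count
```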

Question 2

And with GNU datamash:

datamash -t: groupby 1 collapse 2 < file
A:1,2,3
B:a,b
C:pp
D:rr

If you also want the count:

datamash -t: groupby 1 collapse 2 count 2 < file
A:1,2,3:3
B:a,b:2
C:pp:1
D:rr:1

countunique is also available if you want the number of distinct values in the field.
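
A short sketch of countunique next to count, assuming GNU datamash is installed. The function name and the sample input (with a repeated value for key A) are made up for illustration, and -s is added so that datamash sorts the input before grouping.

```shell
# Hypothetical helper: per key, collapse field 2, count it, and count its distinct values.
# -s sorts the input first, so lines with the same key need not already be adjacent.
summarise_by_key() {
    datamash -s -t: groupby 1 collapse 2 count 2 countunique 2
}

# Assumed sample input: key A carries the value 2 twice.
# Only run the demo when datamash is actually available.
if command -v datamash >/dev/null 2>&1; then
    printf '%s\n' A:1 A:2 A:2 B:a | summarise_by_key
fi
```

For key A the count is 3 while the number of distinct values is 2, which is the difference between the two operations.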

Question 3

Data structure

 %h = (
     ...
      B => [a, b],
      A => [1, 2, 3],
     ...
 );


perl -F':' -lane '
   push @{$h{$F[0]}}, $F[1]}{
   $"=",";
   print "$_:", "@{$h{$_}}|", scalar @{$h{$_}} for sort keys %h;
' File1 > File1.new

Briefly

The field separator is set to a colon, so the @F array is repopulated each
time a line is read. We then append the 2nd field, $F[1], to the array stored
in the hash under the 1st field, $F[0]. At the end, we print each key,
followed by the contents of its array and the number of elements in that
array.

Output

A:1,2,3|3
B:a,b|2
C:pp|1
D:rr|1

Alternatively, with sed:

sed -e '
  :loop
     $!N
     s/^\(\([^:]*\):.*\)\n\2:\(.*\)/\1,\3/
   tloop
   P;D
' yourfile
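
The sed script only joins lines whose keys are adjacent, so unsorted input needs to be sorted first. A minimal sketch under that assumption: the function name merge_adjacent and the sample input are hypothetical, and sort -s (a stable sort, supported by GNU and BSD sort) keeps the values in their original order within each key.

```shell
# Hypothetical helper: sort by the first field, then let sed join consecutive lines
# that share the same key.
merge_adjacent() {
    sort -s -t: -k1,1 "$@" | sed -e '
      :loop
         $!N
         s/^\(\([^:]*\):.*\)\n\2:\(.*\)/\1,\3/
       tloop
       P;D
    '
}

# Assumed sample input with interleaved keys.
printf '%s\n' B:a A:1 B:b A:2 A:3 | merge_adjacent
```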
