データ構造

Question 1

一つの方法は、すべてをハッシュに入れることです。

# put values into a hash based on the id and tag
awk 'NR>1{n[$2","$4]+=$1}
END{
    # merge the same ids on the one line
    for(i in n){
        id=i;
        sub(/,.*/,"",id);
        a[id]=a[id]","n[i];
    }
    # print everyhing
    for(i in a){
        print i""a[i];
    }
}'

編集：私の最初の答えは質問に正しく答えませんでした。

Answer

一つの方法は、すべてをハッシュに入れることです。

# put values into a hash based on the id and tag
awk 'NR>1{n[$2","$4]+=$1}
END{
    # merge the same ids on the one line
    for(i in n){
        id=i;
        sub(/,.*/,"",id);
        a[id]=a[id]","n[i];
    }
    # print everyhing
    for(i in a){
        print i""a[i];
    }
}'

編集：私の最初の答えは質問に正しく答えませんでした。

Question 2

Perlが構造に来ます：

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

<>;  # Skip the header.

my %sum;
my %types;
while (<>) {
    my ($count, $id, $type) = grep length, split '[\s|]+';
    $sum{$id}{$type} += $count;
    $types{$type} = 1;
}

say join ',', 'id', sort keys %types;
for my $id (sort { $a <=> $b } keys %sum) {
    say join ',', $id, map $_ // q(), @{ $sum{$id} }{ sort keys %types };
}

これには、タイプテーブルとIDテーブルの2つのテーブルがあります。各IDについて、各タイプの合計を保存します。

Answer

Perlが構造に来ます：

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

<>;  # Skip the header.

my %sum;
my %types;
while (<>) {
    my ($count, $id, $type) = grep length, split '[\s|]+';
    $sum{$id}{$type} += $count;
    $types{$type} = 1;
}

say join ',', 'id', sort keys %types;
for my $id (sort { $a <=> $b } keys %sum) {
    say join ',', $id, map $_ // q(), @{ $sum{$id} }{ sort keys %types };
}

これには、タイプテーブルとIDテーブルの2つのテーブルがあります。各IDについて、各タイプの合計を保存します。

Question 3

もしGNUデータの混合それならそれはあなたのための選択です。

awk 'NR>1 {print $1, $2, $4}' OFS=, file | datamash -t, -s --filler=0 crosstab 2,3 sum 1
,1,2,3
10,0,0,588
12,0,0,10
14,0,0,883
17,0,0,98
18,17,0,77598
2,0,0,17892
21,0,0,10000
23,0,0,20000
27,0,0,63
3,0,0,6
35,0,0,2446
4,15,253,19871
5,0,0,1000

Answer

もしGNUデータの混合それならそれはあなたのための選択です。

awk 'NR>1 {print $1, $2, $4}' OFS=, file | datamash -t, -s --filler=0 crosstab 2,3 sum 1
,1,2,3
10,0,0,588
12,0,0,10
14,0,0,883
17,0,0,98
18,17,0,77598
2,0,0,17892
21,0,0,10000
23,0,0,20000
27,0,0,63
3,0,0,6
35,0,0,2446
4,15,253,19871
5,0,0,1000

Question 4

Perlを使用してCSVファイルを繰り返し、その過程で適切な種類の合計をハッシュに蓄積できます。最後に、IDごとに収集された情報が表示されます。

データ構造

%h = (
   ID1    =>  [ sum_of_type1, sum_of_type2, sum_of_type3 ],
   ...
)

これは以下のコードを理解するのに役立ちます。

真珠

perl -wMstrict -Mvars='*h' -F'\s+|\|' -lane '
   $, = chr 44, next if $. == 1;

   my($count, $id, $type) = grep /./, @F;
   $h{ $id }[ $type-1 ] += $count}{
   print $_, map { $_ || 0 } @{ $h{$_} } for sort { $a <=> $b } keys %h
' yourcsvfile

出力

2,0,0,17892
3,0,0,6
4,15,253,19871
5,0,0,1000
...

Answer

Perlを使用してCSVファイルを繰り返し、その過程で適切な種類の合計をハッシュに蓄積できます。最後に、IDごとに収集された情報が表示されます。

データ構造

%h = (
   ID1    =>  [ sum_of_type1, sum_of_type2, sum_of_type3 ],
   ...
)

これは以下のコードを理解するのに役立ちます。

真珠

perl -wMstrict -Mvars='*h' -F'\s+|\|' -lane '
   $, = chr 44, next if $. == 1;

   my($count, $id, $type) = grep /./, @F;
   $h{ $id }[ $type-1 ] += $count}{
   print $_, map { $_ || 0 } @{ $h{$_} } for sort { $a <=> $b } keys %h
' yourcsvfile

出力

2,0,0,17892
3,0,0,6
4,15,253,19871
5,0,0,1000
...

データ構造

答え1

答え2

答え3

答え4

データ構造

真珠

出力

関連情報