列のゼロ値のフィルタリング

列のゼロ値のフィルタリング

次のデータセットがあります。

071,36,035,08422,46217,00000,1,A,Broadalbin village (pt.),New York,18,46,46,46,45,45,44,46,44,46,43,43
071,36,045,29322,56209,00000,1,A,Glen Park village (pt.),New York,42,42,42,42,43,42,44,43,42,42,41,40
162,36,000,64881,00000,00000,0,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
157,36,103,64881,00000,00000,0,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
071,36,103,64881,38000,00000,1,A,Saltaire village,New York,37,41,41,41,41,41,41,41,41,41,41,41
061,36,013,00000,13035,00000,1,F,Cattaraugus Reservation,New York,38,38,38,38,37,37,39,39,38,38,38,40
061,36,029,00000,75011,00000,1,F,Tonawanda Reservation,New York,34,34,34,34,34,34,34,37,37,37,37,36
061,36,009,00000,60950,00000,1,A,Red House town,New York,38,33,33,33,34,34,35,35,34,34,35,35
071,36,065,61115,75280,00000,1,A,Remsen village (pt.),New York,28,28,28,28,28,30,28,27,27,29,29,29
071,36,023,35276,18421,00000,1,A,Homer village (pt.),New York,31,31,31,31,31,30,30,30,30,29,29,29
071,36,099,55816,63440,00000,1,A,Ovid village (pt.),New York,19,27,27,27,27,29,29,26,26,26,26,25
071,36,059,22260,56000,00000,1,A,East Hills village (pt.),New York,23,23,23,23,23,23,23,23,23,23,23,23
071,36,059,47636,34000,00000,1,A,Mineola village (pt.),New York,16,22,22,22,22,22,22,22,22,22,22,22
071,36,043,37275,27199,00000,1,A,Ilion village (pt.),New York,22,23,23,23,23,23,22,22,22,22,22,22
071,36,029,82084,15011,00000,1,A,Williamsville village (pt.),New York,18,13,13,13,13,13,13,13,13,13,13,13
162,36,000,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
157,36,103,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,103,20379,66839,00000,1,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,071,26319,29553,00000,1,A,Florida village (pt.),New York,4,5,5,5,5,5,5,5,5,5,5,5
071,36,099,78553,66333,00000,1,A,Waterloo village (pt.),New York,4,4,4,4,4,4,4,4,4,4,4,4
071,36,057,26924,12122,00000,1,A,Fort Plain village (pt.),New York,3,4,4,4,4,4,4,4,4,4,4,4
071,36,053,31709,23305,00000,1,A,Hamilton village (pt.),New York,5,3,3,3,3,3,3,3,3,3,3,3
061,36,003,00000,54523,00000,1,F,Oil Springs Reservation,New York,1,1,1,1,1,1,1,1,1,1,1,1
157,36,099,28640,00000,00000,0,A,Geneva city (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,099,28640,28640,00000,1,A,Geneva city (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,089,46019,51330,00000,1,A,Massena village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,071,15297,29553,00000,1,A,Chester village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,059,28178,53000,00000,1,A,Garden City village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,053,31709,44435,00000,1,A,Hamilton village (pt.),New York,2,0,0,0,0,0,0,0,0,0,0,0
071,36,051,19664,70101,00000,1,A,Dansville village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
071,36,039,36167,38638,00000,1,A,Hunter village (pt.),New York,A,0,0,0,0,0,0,0,0,0,0,0
071,36,037,18201,19719,00000,1,A,Corfu village (pt.),New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,099,00000,28640,00000,0,F,Geneva city,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,063,00000,75011,00000,1,F,Tonawanda Reservation,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,013,00000,14072,00000,1,S,Chautauqua Lake UT,New York,0,0,0,0,0,0,0,0,0,0,0,0
061,36,009,00000,54523,00000,1,F,Oil Springs Reservation,New York,0,0,0,0,0,0,0,0,0,0,0,0

ゼロ以外の最後の列で最小数の10行を印刷したいと思います。

071,36,043,37275,27199,00000,1,A,Ilion village (pt.),New York,22,23,23,23,23,23,22,22,22,22,22,22
071,36,029,82084,15011,00000,1,A,Williamsville village (pt.),New York,18,13,13,13,13,13,13,13,13,13,13,13
162,36,000,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
157,36,103,20379,00000,00000,0,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,103,20379,66839,00000,1,A,Dering Harbor village,New York,11,11,11,11,11,11,11,11,11,11,11,11
071,36,071,26319,29553,00000,1,A,Florida village (pt.),New York,4,5,5,5,5,5,5,5,5,5,5,5
071,36,099,78553,66333,00000,1,A,Waterloo village (pt.),New York,4,4,4,4,4,4,4,4,4,4,4,4
071,36,057,26924,12122,00000,1,A,Fort Plain village (pt.),New York,3,4,4,4,4,4,4,4,4,4,4,4
071,36,053,31709,23305,00000,1,A,Hamilton village (pt.),New York,5,3,3,3,3,3,3,3,3,3,3,3
061,36,003,00000,54523,00000,1,F,Oil Springs Reservation,New York,1,1,1,1,1,1,1,1,1,1,1,1

これまで使用したコマンドは次のとおりです。

cat [filename] | grep 'New York' | sort -n -r -t, -k20 

しかし、残りの作業を行う方法がわかりません。そのためにはgrepが必要だと確信しています。

答え1

awk -F, '$10 == "New York" && $NF > 0' input | sort -t, -rnk22 | tail -10

awkは、10番目の列が0より大きく、New York最後の列が0より大きい行のみを印刷します。

sort は、列 22 を数値に従って逆順にソートします。

tail -10は最後の10行だけを印刷します。

答え2

ミラーを使用して:

$ mlr --csv -N filter '$10 == "New York" && $22 > 0' then sort -nr 22 then tail file

使用gawk:

$ awk -F, -v var=10 
'$10 == "New York" && $NF > 0 {ar[$0]=$NF}
END{asorti(ar, b, "@val_num_asc"); 
for(i=var;i>=1;i--)print b[i]}' file

$ csvsql -I -H  --query 'SELECT * FROM file WHERE j == "New York" AND v > 0 ORDER BY CAST(v AS INTEGER) DESC' file | tail

-Hヘッダファイルがないために使用されます。これにより、基本ヘッダー(a、b、c..)が生成されます。

$ csvsql -I -H --query 'SELECT * FROM file WHERE j == "New York" AND v > 0 ORDER BY CAST(v AS INTEGER) DESC LIMIT 10 OFFSET ( SELECT count(*) FROM file WHERE j == "New York" AND v > 0 ) - 10' file | csvformat -K1

関連情報