OECD統計からいくつかのデータを抽出しようとしています。これは以下のようにCSVファイルとして提供されます(抜粋)。
"COUNTRY","Country","DAGEGR","Age groups","DSEX","Gender","DSTATUS","Status of population","YEAR","Year","Value","Flag Codes","Flags"
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2002","2002",19640979,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2003","2003",19872646,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2004","2004",20091504,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2005","2005",20339759,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2006","2006",20605488,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2007","2007",21015042,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2008","2008",21431781,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2009","2009",21874920,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2010","2010",22342398,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2011","2011",22620554,,
"AUS","Australia","900000","All ages","90","Total males+females","90","All population","2012","2012",22683573,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2002","2002",444050,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2003","2003",448300,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2004","2004",451600,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2005","2005",455000,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2006","2006",469086,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2007","2007",476187,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2008","2008",483799,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2009","2009",493500,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2010","2010",502066,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2011","2011",511840,,
"LUX","Luxembourg","900000","All ages","90","Total males+females","90","All population","2012","2012",524853,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2002","2002",40409330,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2003","2003",41550584,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2004","2004",42345342,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2005","2005",43038035,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2006","2006",43758250,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2007","2007",44474631,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2008","2008",45283259,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2009","2009",45828172,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2010","2010",45989016,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2011","2011",46152926,,
"ESP","Spain","900000","All ages","90","Total males+females","90","All population","2012","2012",46818221,,
私が望むのは、2007年にスペイン(ESP)にどれだけの住民がいたかを自動的に読み取ることです。これを達成するには、次のコマンドを使用します。
cat RPOP_16012023145346836.csv | cut -d "," -f 1,9,11 | sort
このコマンドは、国、年、居住者の数を返します。例:
"AUS","2002",19640979
"AUS","2003",19872646
"AUS","2004",20091504
"AUS","2005",20339759
"AUS","2006",20605488
"AUS","2007",21015042
"AUS","2008",21431781
"AUS","2009",21874920
"AUS","2010",22342398
"AUS","2011",22620554
"AUS","2012",22683573
"AUT","2002",8139310
"AUT","2003",8067289
"AUT","2004",8140122
"AUT","2005",8206524
"AUT","2006",8265925
"AUT","2007",8298923
"AUT","2008",8331930
"AUT","2009",8355260
"AUT","2010",8375290
"AUT","2011",8404252
"AUT","2012",8443018
"ESP","2002",40409330
"ESP","2003",41550584
"ESP","2004",42345342
"ESP","2005",43038035
"ESP","2006",43758250
"ESP","2007",44474631
"ESP","2008",45283259
"ESP","2009",45828172
"ESP","2010",45989016
"ESP","2011",46152926
"ESP","2012",46818221
"LUX","2002",444050
"LUX","2003",448300
"LUX","2004",451600
"LUX","2005",455000
"LUX","2006",469086
"LUX","2007",476187
"LUX","2008",483799
"LUX","2009",493500
"LUX","2010",502066
"LUX","2011",511840
"LUX","2012",524853
パイプを使用して、この結果を国(列1 = ESP)と年(列2 = 2007)に基づいて3番目の列(列3 =人口)を返すステートメントに渡したいと思います。残念ながら、これを行うための適切なコマンドはありません。誰かが私を助けることができますか?
予想される出力は次のとおりです。44474631
答え1
Miller()を使用して最初に2つの名前付きフィールドを使用し、データをmlr
フィルタリングして2007年にスペインに関連するレコードのみを見つけ、そのレコードからフィールドを削除します。出力はタイトルなしで表示されます。COUNTRY
YEAR
Value
mlr --csv --headerless-csv-output \
filter '$COUNTRY == "ESP" && $YEAR == 2007' then \
cut -f Value \
RPOP_16012023145346836.csv
質問のデータが与えられると、以下が出力されます。
44474631
フィルタ式の代わりに
$COUNTRY == "ESP" && $YEAR == 2007
...あなたが利用できる
$Country == "Spain" && $Year == 2007
...名前が指定されたフィールドも存在するためです。
答え2
実際に単に数値を返すことが目的であれば、元の44474631
入力ファイルですべての操作を使用できます。awk
awk -F, '$1=="\"ESP\"" && $9=="\"2007\"" {print $11}' RPOP_16012023145346836.csv
これはフィールド区切り文字を設定し、,
列11の内容を印刷します。ただし、列1が同じ"ESP"
(引用符を含む)列9が同じ場合"2007"
(引用符を含む)にのみ適用されます。
答え3
awkを使用してこれを行うことができます。
$ sed 's/"//g' input_file | awk -F, '($1=="ESP" && $9==2007){print $1,$9,$11}'
ESP 2007 44474631