Cut the first two characters of the second column

I have a file containing the following list of US states and Canadian provinces:

id,name,abbreviation,country,type,sort,status,occupied,notes,fips_state,assoc_press,standard_federal_region,census_region,census_region_name,census_division,census_division_name,circuit_court
"1","Alabama","AL","USA","state","10","current","occupied","","1","Ala.","IV","3","South","6","East South Central","11"
"2","Alaska","AK","USA","state","10","current","occupied","","2","Alaska","X","4","West","9","Pacific","9"
"3","Arizona","AZ","USA","state","10","current","occupied","","4","Ariz.","IX","4","West","8","Mountain","9"
"4","Arkansas","AR","USA","state","10","current","occupied","","5","Ark.","VI","3","South","7","West South Central","8"
"5","California","CA","USA","state","10","current","occupied","","6","Calif.","IX","4","West","9","Pacific","9"
"6","Colorado","CO","USA","state","10","current","occupied","","8","Colo.","VIII","4","West","8","Mountain","10"
"7","Connecticut","CT","USA","state","10","current","occupied","","9","Conn.","I","1","Northeast","1","New England","2"
"8","Delaware","DE","USA","state","10","current","occupied","","10","Del.","III","3","South","5","South Atlantic","3"
"9","Florida","FL","USA","state","10","current","occupied","","12","Fla.","IV","3","South","5","South Atlantic","11"
"10","Georgia","GA","USA","state","10","current","occupied","","13","Ga.","IV","3","South","5","South Atlantic","11"
"11","Hawaii","HI","USA","state","10","current","occupied","","15","Hawaii","IX","4","West","9","Pacific","9"
"12","Idaho","ID","USA","state","10","current","occupied","","16","Idaho","X","4","West","8","Mountain","9"
"13","Illinois","IL","USA","state","10","current","occupied","","17","Ill.","V","2","Midwest","3","East North Central","7"
"14","Indiana","IN","USA","state","10","current","occupied","","18","Ind.","V","2","Midwest","3","East North Central","7"
"15","Iowa","IA","USA","state","10","current","occupied","","19","Iowa","VII","2","Midwest","4","West North Central","8"
"16","Kansas","KS","USA","state","10","current","occupied","","20","Kan.","VII","2","Midwest","4","West North Central","10"
"17","Kentucky","KY","USA","state","10","current","occupied","","21","Ky.","IV","3","South","6","East South Central","6"
"18","Louisiana","LA","USA","state","10","current","occupied","","22","La.","VI","3","South","7","West South Central","5"
"19","Maine","ME","USA","state","10","current","occupied","","23","Maine","I","1","Northeast","1","New England","1"
"20","Maryland","MD","USA","state","10","current","occupied","","24","Md.","III","3","South","5","South Atlantic","4"
"21","Massachusetts","MA","USA","state","10","current","occupied","","25","Mass.","I","1","Northeast","1","New England","1"
"22","Michigan","MI","USA","state","10","current","occupied","","26","Mich.","V","2","Midwest","3","East North Central","6"
"23","Minnesota","MN","USA","state","10","current","occupied","","27","Minn.","V","2","Midwest","4","West North Central","8"
"24","Mississippi","MS","USA","state","10","current","occupied","","28","Miss.","IV","3","South","6","East South Central","5"
"25","Missouri","MO","USA","state","10","current","occupied","","29","Mo.","VII","2","Midwest","4","West North Central","8"
"26","Montana","MT","USA","state","10","current","occupied","","30","Mont.","VIII","4","West","8","Mountain","9"
"27","Nebraska","NE","USA","state","10","current","occupied","","31","Neb.","VII","2","Midwest","4","West North Central","8"
"28","Nevada","NV","USA","state","10","current","occupied","","32","Nev.","IX","4","West","8","Mountain","9"
"29","New Hampshire","NH","USA","state","10","current","occupied","","33","N.H.","I","1","Northeast","1","New England","1"
"30","New Jersey","NJ","USA","state","10","current","occupied","","34","N.J.","II","1","Northeast","2","Mid-Atlantic","3"
"31","New Mexico","NM","USA","state","10","current","occupied","","35","N.M.","VI","4","West","8","Mountain","10"
"32","New York","NY","USA","state","10","current","occupied","","36","N.Y.","II","1","Northeast","2","Mid-Atlantic","2"
"33","North Carolina","NC","USA","state","10","current","occupied","","37","N.C.","IV","3","South","5","South Atlantic","4"
"34","North Dakota","ND","USA","state","10","current","occupied","","38","N.D.","VIII","2","Midwest","4","West North Central","8"
"35","Ohio","OH","USA","state","10","current","occupied","","39","Ohio","V","2","Midwest","3","East North Central","6"
"36","Oklahoma","OK","USA","state","10","current","occupied","","40","Okla.","VI","3","South","7","West South Central","10"
"37","Oregon","OR","USA","state","10","current","occupied","","41","Ore.","X","4","West","9","Pacific","9"
"38","Pennsylvania","PA","USA","state","10","current","occupied","","42","Pa.","III","1","Northeast","2","Mid-Atlantic","3"
"39","Rhode Island","RI","USA","state","10","current","occupied","","44","R.I.","I","1","Northeast","1","New England","1"
"40","South Carolina","SC","USA","state","10","current","occupied","","45","S.C.","IV","3","South","5","South Atlantic","4"
"41","South Dakota","SD","USA","state","10","current","occupied","","46","S.D.","VIII","2","Midwest","4","West North Central","8"
"42","Tennessee","TN","USA","state","10","current","occupied","","47","Tenn.","IV","3","South","6","East South Central","6"
"43","Texas","TX","USA","state","10","current","occupied","","48","Texas","VI","3","South","7","West South Central","5"
"44","Utah","UT","USA","state","10","current","occupied","","49","Utah","VIII","4","West","8","Mountain","10"
"45","Vermont","VT","USA","state","10","current","occupied","","50","Vt.","I","1","Northeast","1","New England","2"
"46","Virginia","VA","USA","state","10","current","occupied","","51","Va.","III","3","South","5","South Atlantic","4"
"47","Washington","WA","USA","state","10","current","occupied","","53","Wash.","X","4","West","9","Pacific","9"
"48","West Virginia","WV","USA","state","10","current","occupied","","54","W.Va.","III","3","South","5","South Atlantic","4"
"49","Wisconsin","WI","USA","state","10","current","occupied","","55","Wis.","V","2","Midwest","3","East North Central","7"
"50","Wyoming","WY","USA","state","10","current","occupied","","56","Wyo.","VIII","4","West","8","Mountain","10"
"51","Washington DC","DC","USA","capitol","10","current","occupied","","11","","III","3","South","5","South Atlantic","D.C."
"60","Alberta","AB","Canada","province","30","current","occupied","","","","","","","","",""
"61","British Columbia","BC","Canada","province","30","current","occupied","","","","","","","","",""
"62","Manitoba","MB","Canada","province","30","current","occupied","","","","","","","","",""
"63","New Brunswick","NB","Canada","province","30","current","occupied","","","","","","","","",""
"64","Newfoundland and Labrador","NL","Canada","province","30","current","occupied","","","","","","","","",""
"65","Nova Scotia","NS","Canada","province","30","current","occupied","","","","","","","","",""
"66","Ontario","ON","Canada","province","30","current","occupied","","","","","","","","",""
"67","Prince Edward Island","PE","Canada","province","30","current","occupied","","","","","","","","",""
"68","Quebec","QC","Canada","province","30","current","occupied","","","","","","","","",""
"69","Saskatchewan","SK","Canada","province","30","current","occupied","","","","","","","","",""

I want to get this:

name,country
Alabama,US
...
Wyoming,US
Alberta,Ca
Saskatchewan,Ca

That is, the US states first, then the Canadian provinces.

My solution is as follows:

#!/bin/sh

cat north_america.csv | head -n1 | cut -d',' -f2,4 > title
cat north* | tail -n +2 | cut -d',' -f2,4 | tr -d '"' | sort -t','  -k 2  | head -n10 > Canada
cat north* | tail -n +2 | cut -d',' -f2,4 | tr -d '"' | sort -t','  -k 2  | tail -n +11  > USA

cat USA | rev | cut -c-1 --complement | rev > file1
cat Canada | rev | cut -c 1-4 --complement | rev > file2

cat title > states
cat file1 >> states
cat file2 >> states

My question is whether I can somehow "cut" the first two characters of the second column. Instead of "head" and "tail", I would use:

cat north* | tail -n +2 | cut -d',' -f2,4 | tr -d '"' | sort -t','  -k2,2r >> states

and then issue a "cut" command. But I don't know how to do that. I don't want to use head and tail and split the file into two files; I would like a simpler approach.

Any suggestions are appreciated.

Answer 1

All you need for this is:

awk -F, -vOFS="," '{print $2,$4}' file 

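-F, sets the field separator to a comma, and -vOFS="," sets the output field separator to a comma as well. The command then prints only the 2nd and 4th fields of each line. On the sample file, it returns: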

$ awk -F, -vOFS="," '{print $2,$4}' file 
name,country
"Alabama","USA"
"Alaska","USA"
"Arizona","USA"
"Arkansas","USA"
"California","USA"
"Colorado","USA"
"Connecticut","USA"
"Delaware","USA"
"Florida","USA"
"Georgia","USA"
"Hawaii","USA"
"Idaho","USA"
"Illinois","USA"
"Indiana","USA"
"Iowa","USA"
"Kansas","USA"
"Kentucky","USA"
"Louisiana","USA"
"Maine","USA"
"Maryland","USA"
"Massachusetts","USA"
"Michigan","USA"
"Minnesota","USA"
"Mississippi","USA"
"Missouri","USA"
"Montana","USA"
"Nebraska","USA"
"Nevada","USA"
"New Hampshire","USA"
"New Jersey","USA"
"New Mexico","USA"
"New York","USA"
"North Carolina","USA"
"North Dakota","USA"
"Ohio","USA"
"Oklahoma","USA"
"Oregon","USA"
"Pennsylvania","USA"
"Rhode Island","USA"
"South Carolina","USA"
"South Dakota","USA"
"Tennessee","USA"
"Texas","USA"
"Utah","USA"
"Vermont","USA"
"Virginia","USA"
"Washington","USA"
"West Virginia","USA"
"Wisconsin","USA"
"Wyoming","USA"
"Washington DC","USA"
"Alberta","Canada"
"British Columbia","Canada"
"Manitoba","Canada"
"New Brunswick","Canada"
"Newfoundland and Labrador","Canada"
"Nova Scotia","Canada"
"Ontario","Canada"
"Prince Edward Island","Canada"
"Quebec","Canada"
"Saskatchewan","Canada"

To remove the quotes, you can pipe the output through tr:

awk -F, -vOFS="," '{print $2,$4}' file | tr -d \"
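The sample file already lists the US states before the Canadian provinces, so no sorting is needed for it. If the rows were not in that order, the sort key from the question (-k2,2r) could be applied to everything except the header. The following is only a sketch: the function name sort_keep_header is made up for illustration, the secondary key on the name column is an addition, and LC_ALL=C is set merely to make the ordering predictable.

```shell
# Hypothetical helper (not part of the original answer): reads
# "name,country" lines on stdin, header first, and keeps the header on top.
sort_keep_header() {
    # read consumes exactly one line, so the rest of stdin is left for sort
    IFS= read -r header
    printf '%s\n' "$header"
    # country descending (USA before Canada), then name ascending
    LC_ALL=C sort -t, -k2,2r -k1,1
}

# Small demonstration with made-up, unordered input
printf 'name,country\nAlberta,Canada\nWyoming,USA\nAlabama,USA\n' | sort_keep_header
```

With the real file, the function would go at the end of the pipeline above, after tr.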

To get the output exactly as shown in the question (no quotes, US instead of USA, and Ca instead of Canada), you could use sed (assuming GNU sed):

awk -F, -vOFS="," '{print $2,$4}' file | sed 's/"//g; s/USA/US/; s/Canada/Ca/'

Or, if you don't have GNU sed:

awk -F, -vOFS="," '{print $2,$4}' file | sed -e 's/"//g' -e 's/USA/US/' -e 's/Canada/Ca/'
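The same result can probably be had from awk alone, which is closer to the "cut the first two characters of the second column" idea in the question. This is a sketch rather than part of the original answer: it uses awk's gsub to drop the quotes and substr to keep the first two characters of the country field, and it assumes that no field contains an embedded comma. The function name shorten_country is hypothetical.

```shell
# Hypothetical helper: reads the CSV from the question on stdin and prints
# name,country with the country shortened to its first two characters.
shorten_country() {
    awk -F, -v OFS=, '
        NR == 1 { print $2, $4; next }   # header line: print unchanged
        {
            gsub(/"/, "")                # strip quotes; awk re-splits the line
            print $2, substr($4, 1, 2)   # USA becomes US, Canada becomes Ca
        }
    '
}

# Small demonstration with three lines shaped like the sample file
printf '%s\n' \
    'id,name,abbreviation,country,type' \
    '"1","Alabama","AL","USA","state"' \
    '"60","Alberta","AB","Canada","province"' |
    shorten_country
```

For the real file this would be called as shorten_country < file. Unlike the unanchored sed substitutions, substr only touches the fourth field, so a name that happened to contain "USA" or "Canada" would be left alone.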
