Filtering a pattern from a file using sed or grep

I have a file named extended.log. I would like to use awk, sed, or grep to save all the unique usernames in that file to a new file.

Below are the field names of the tab-delimited file. I only want the values of the "username" field (the 12th field).

"record_id"     "client_id"     "request_id"    "date_time"     "elapsed_time"  "status"        "size"  "upload"        "download"      "bypassed"      "client_ip"     "username"      "method"        "url"   "http_referer"  "useragent"     "mime"  "filter_name"   "filtering_reason"      "interface"     "cachecode"     "peercode"      "peer"  "request_host"  "request_tld"   "referer_host"  "referer_tld"   "range" "time_profiles" "user_groups"   "request_profiles"      "application_signatures"        "categories"    "response_profiles"     "upload_content_types"  "download_content_types"        "profiles"

Below is a sample of the file's contents.

"SVZerDLJhIj6G3PA.6575.1466420105.346.1837.1"   "1837"  "1"     "20/Jun/2016:16:25:05"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420107.357.1838.1"   "1838"  "1"     "20/Jun/2016:16:25:07"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420109.367.1840.1"   "1840"  "1"     "20/Jun/2016:16:25:09"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420111.377.1841.1"   "1841"  "1"     "20/Jun/2016:16:25:11"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420113.387.1842.1"   "1842"  "1"     "20/Jun/2016:16:25:13"  "5"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420115.399.1843.1"   "1843"  "1"     "20/Jun/2016:16:25:15"  "5"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420117.410.1844.1"   "1844"  "1"     "20/Jun/2016:16:25:17"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420119.421.1845.1"   "1845"  "1"     "20/Jun/2016:16:25:19"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420121.431.1846.1"   "1846"  "1"     "20/Jun/2016:16:25:21"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420123.445.1847.1"   "1847"  "1"     "20/Jun/2016:16:25:23"  "4"     "200"   "0"     "-"     "0"     "-"     "192.168.12.13" "[email protected]""GET"   "-"     "-"     "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"   "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "safesquid"      "192.168.14.11:8080"    "-"     "-"     "-"     "0"     ""      "NO_AUTHENTICATION"     ""      ""      ""      ""      ""      ""      ""
"SVZerDLJhIj6G3PA.6575.1466420108.240.1839.1"   "1839"  "1"     "20/Jun/2016:16:25:23"  "15623" "200"   "2826"  "0"     "2826"  "-"     "192.168.0.14"  "[email protected]""CONNECT"        "connect://livehelp.safesquid.com:443/" "-"     "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"    "-"     "-"     "-"     "192.168.14.11:8080"    "TCP_MISS"      "DIRECT"        "livehelp.safesquid.com"        "livehelp.safesquid.com"        "safesquid.com" "-"     "-"      "1K-10K"        ""      "NO_AUTHENTICATION"     "uncachable request,BUSINESS SITES REQ" ""      "computersandsoftware"  ""      ""      ""      "uncachable"

Answer 1

Try

 sed -e 's/^.*"\([^" ]*\)"".*/\1/' log | sort | uniq

 grep -E -o '[^"]+@[^"]+' log | sort | uniq

where

  • -o prints only the part of the line that matches the pattern
  • [^X]+ matches one or more characters that are not X
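
To see what those two pieces do together, here is a minimal sketch run against a single made-up record. The shortened line, the `sample_line` helper, and the address `alice@example.com` are assumptions for illustration only and are not taken from the log.

```shell
# A made-up, shortened record; the address is a placeholder, not from the log.
sample_line() {
    printf '"192.168.12.13"\t"alice@example.com"\t"GET"\n'
}

# -o prints only the matched text; each [^"]+ stops at the surrounding quotes,
# so the match is the text around the @ without the quotes themselves.
sample_line | grep -E -o '[^"]+@[^"]+'
```

On this input the command should print just the address, without the quotes or the neighbouring fields.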

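Note that

  • the sed solution relies on a typo/feature of the file (the doubled double quote right after the username)
  • the grep solution relies on the [email protected] pattern (text containing an @)
  • awk (or perl) is better suited to extracting the nth field.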

Answer 2

Use awk for a tab-delimited file:

awk -F '\t' '{ print $12 }' file

This extracts the 12th field. You can redirect the output to a new file if needed.

To remove the double quotes on either side of the data:

awk -F '\t' '{ sub("^\"", "", $12); sub("\"$", "", $12); print $12 }' file

This performs two substitutions to remove the first and last characters of the 12th field (if they are double quotes) before printing.
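
The "if they are double quotes" part is easy to check in isolation. The sketch below applies the same two `sub()` calls to whole lines instead of to `$12`; the `strip_quotes` helper name and the two input values are made up for the demonstration.

```shell
# Same two substitutions as above, applied to the whole line ($0) for brevity.
strip_quotes() {
    awk '{ sub("^\"", ""); sub("\"$", ""); print }'
}

# One quoted value and one unquoted value (both hypothetical).
printf '%s\n' '"alice@example.com"' 'bob' | strip_quotes
```

The quoted value should come out without its quotes, and the unquoted value should pass through unchanged, because each `sub()` only replaces a quote that sits at the very start or very end.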

To skip the first line (if it is a header line):

awk -F '\t' 'FNR > 1 { sub("^\"", "", $12); sub("\"$", "", $12); print $12 }' file

To get only the unique usernames, using only awk:

awk -F '\t' 'FNR > 1 && !( $12 in seen ) { seen[$12]++; sub("^\"", "", $12); sub("\"$", "", $12); print $12 }' file

This keeps track of the usernames already seen by using an array keyed on the 12th field. If the data in the 12th field is not a key of the array, it has not been seen yet.

Another approach is to use !seen[$12] in place of !( $12 in seen ).
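
As a rough check that the two tests behave the same here, the sketch below runs both on a small made-up list of names (one per line, so `$1` stands in for `$12`). The `names` helper and its contents are assumptions for illustration.

```shell
# Made-up input: one name per line, with repeats.
names() { printf '%s\n' alice bob alice carol bob; }

# Form used above: test key membership, then record the key.
names | awk '!($1 in seen) { seen[$1]++; print $1 }'

# Alternative form: test the stored count instead of key membership.
names | awk '!seen[$1] { seen[$1]++; print $1 }'
```

Both commands should print each name once, in order of first appearance. One difference worth knowing: merely referencing `seen[$1]` creates the key as a side effect, whereas `$1 in seen` does not. It makes no difference in this case because the action records the key anyway.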

To get unique and sorted usernames with sort:

awk -F '\t' 'FNR > 1 { sub("^\"", "", $12); sub("\"$", "", $12); print $12 }' file | sort -u
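
Putting it together with the redirect to a new file that the question asks for, a minimal end-to-end sketch might look like this. The scratch directory, the output name `usernames.txt`, the `record` helper, and the sample addresses are all assumptions made up for the demonstration; the records are cut down to 13 fields, with only field 12 varying.

```shell
# Hypothetical scratch directory and file names, used only for this demonstration.
workdir=$(mktemp -d)
log="$workdir/extended.log"
out="$workdir/usernames.txt"

# Helper that writes one 13-field, tab-separated record; only field 12 varies.
record() {
    printf '"id"\t"1"\t"1"\t"t"\t"4"\t"200"\t"0"\t"-"\t"0"\t"-"\t"192.168.12.13"\t"%s"\t"GET"\n' "$1"
}

{
    record username            # stands in for the header line
    record carol@example.com
    record alice@example.com
    record carol@example.com
} > "$log"

# Skip the header, strip the quotes from field 12, then sort and de-duplicate.
awk -F '\t' 'FNR > 1 { sub("^\"", "", $12); sub("\"$", "", $12); print $12 }' "$log" |
    sort -u > "$out"

cat "$out"
```

This assumes every record really is tab-separated. In the sample shown in the question, the username runs straight into the next field with no separator, so on such lines `$12` would contain more than the username.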
