linux + sort the lines of a file according to the machine number

I have the following file:

    more /home/list.in

    master01.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com AMBARI_METRICS STARTED
    worker01.fsdns.com AMBARI_METRICS STARTED
    worker02.fsdns.com AMBARI_METRICS STARTED
    worker03.fsdns.com AMBARI_METRICS STARTED
    worker05.fsdns.com AMBARI_METRICS STARTED
    worker06.fsdns.com AMBARI_METRICS STARTED
    worker07.fsdns.com AMBARI_METRICS STARTED
    worker08.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com AMBARI_METRICS STARTED

    master01.fsdns.com YARN STARTED
    master02.fsdns.com YARN STARTED
    master03.fsdns.com YARN STARTED
    worker01.fsdns.com YARN STARTED
    worker02.fsdns.com YARN STARTED
    worker03.fsdns.com YARN STARTED
    worker05.fsdns.com YARN STARTED
    worker06.fsdns.com YARN STARTED
    worker07.fsdns.com YARN STARTED
    worker08.fsdns.com YARN STARTED
    worker09.fsdns.com YARN STARTED

    master01.fsdns.com HDFS STARTED
    master02.fsdns.com HDFS STARTED
    master03.fsdns.com HDFS STARTED
    worker01.fsdns.com HDFS STARTED
    worker02.fsdns.com HDFS STARTED
    worker03.fsdns.com HDFS STARTED
    worker05.fsdns.com HDFS STARTED
    worker06.fsdns.com HDFS STARTED
    worker07.fsdns.com HDFS STARTED
    worker08.fsdns.com HDFS STARTED
    worker09.fsdns.com HDFS STARTED

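I am trying to rearrange the list.in file into the following structure (expected result).

That is, all lines that belong to the same machine number should end up in the same group.

Expected output: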

    master01.fsdns.com AMBARI_METRICS STARTED
    master01.fsdns.com YARN STARTED
    master01.fsdns.com HDFS STARTED

    master02.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com YARN STARTED
    master02.fsdns.com HDFS STARTED

    master03.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com YARN STARTED
    master03.fsdns.com HDFS STARTED
    .
    .
    .
    .
    . 
    worker09.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com YARN STARTED
    worker09.fsdns.com HDFS STARTED

What I have tried so far:

    for i in 01 02 03 04 05 06 07
    do
      grep worker$i /tmp/list.in
    done

    worker01.fsdns.com AMBARI_METRICS STARTED
    worker01.fsdns.com YARN STARTED
    worker01.fsdns.com HDFS STARTED
    worker02.fsdns.com AMBARI_METRICS STARTED
    worker02.fsdns.com YARN STARTED
    worker02.fsdns.com HDFS STARTED
    worker03.fsdns.com AMBARI_METRICS STARTED
    worker03.fsdns.com YARN STARTED
    worker03.fsdns.com HDFS STARTED
    worker05.fsdns.com AMBARI_METRICS STARTED
    worker05.fsdns.com YARN STARTED
    worker05.fsdns.com HDFS STARTED
    worker06.fsdns.com AMBARI_METRICS STARTED
    worker06.fsdns.com YARN STARTED
    worker06.fsdns.com HDFS STARTED
    worker07.fsdns.com AMBARI_METRICS STARTED
    worker07.fsdns.com YARN STARTED
    worker07.fsdns.com HDFS STARTED

Answer 1

If the blank lines do not matter, a simple sort command will do:

sort -t. -k1 /home/list.in

Result (preceded by the blank lines):

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED
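
The output above orders the services alphabetically within each host (HDFS before YARN), whereas the expected result in the question keeps the order in which they appear in the file. A minimal sketch of one way to keep that order follows. It assumes a sort that supports the stable flag -s (GNU and BSD sort do; POSIX does not require it), and the function name group_by_host is made up for illustration.

```shell
# Hypothetical helper: group lines by hostname while keeping the original
# order of the lines within each host. Assumes `sort -s` (stable sort) exists.
group_by_host() {
    # Drop empty/whitespace-only lines, then sort on the first field only.
    # -s disables the last-resort whole-line comparison, so lines with the
    # same hostname keep their input order.
    grep -v '^[[:space:]]*$' "$1" | sort -s -k1,1
}

# Usage (path taken from the question):
# group_by_host /home/list.in
```

Restricting the key to -k1,1 matters here: a key of -k1 runs to the end of the line, so the service name would take part in the comparison.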

Answer 2

$ sort -k1,1 list.in  | 
    awk '
      /^[[:space:]]*$/ { next };
      lasthost == "" { lasthost = $1 };
      $1 == lasthost { print $0; next };
      {print "\n" $0 ; lasthost=$1 }' 
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED

master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED

master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED

worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED

worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED

worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED

worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED

worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED

worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED

worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED

worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED

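The awk script keeps track of the last hostname seen in field $1 and, whenever it changes, prints a newline before the current input line. It also skips lines that are empty or contain only whitespace.

To avoid printing a blank line before the first record, it checks whether the variable lasthost is empty (i.e. not yet set) and, if so, sets it.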
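
The same "remember the previous hostname" idea can be written a little more compactly. The following is a sketch rather than the code from this answer. It additionally assumes a stable sort (-s, available in GNU and BSD sort) so that the services keep their input order within each host. The names group_with_blank_lines and prev are illustrative only.

```shell
# Hypothetical helper: group by hostname and separate groups with a blank line.
group_with_blank_lines() {
    grep -v '^[[:space:]]*$' "$1" |
        sort -s -k1,1 |
        awk '
            # Hostname changed and this is not the first line: emit a separator.
            NR > 1 && $1 != prev { print "" }
            # Print every line and remember its hostname.
            { print; prev = $1 }'
}

# Usage:
# group_with_blank_lines list.in
```

Because the blank lines are removed before awk sees the data, NR > 1 is enough to suppress a separator before the first record.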

Answer 3

This works:

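awk '$1{a[$1];b[$2]}
END{n=asorti(a);for(i=1;i<=n;i++){for(j in b){printf("%s %s\n",a[i],j)};printf("\n")}}' file

$1 : for lines whose first field is non-empty,
{a[$1];b[$2]} : record the hostname as an index of a and the service name as an index of b.
END{ : once the whole file has been read,
n=asorti(a); : sort the indices of a (the hostnames) into a[1]..a[n]; asorti is a GNU awk extension.
for(i=1;i<=n;i++){ : for each host in sorted order (for (i in a) would not guarantee that order),
for(j in b){ : and for each service name,
printf("%s %s\n",a[i],j)}; : print the host and the service,
printf("\n")} : then print a blank line after the host's group.
}' file

Note that this prints only the hostname and the service name (the STARTED column is dropped), pairs every host with every service seen in the file, and leaves the order of the services within a group unspecified.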
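
If the complete lines should be kept (including the status column) and GNU awk is not available, the lines can be collected per host instead. This is a sketch that assumes listing the hosts in order of first appearance is acceptable, which matches the sample file. The function name and the array names are illustrative, not part of the answer above.

```shell
# Hypothetical helper: group whole lines by hostname using plain awk,
# without sorting. Hosts are printed in the order they first appear.
group_lines_by_host() {
    awk '
        NF {                                        # skip empty/whitespace-only lines
            if (!($1 in lines)) order[++n] = $1     # remember first-seen order
            lines[$1] = lines[$1] $0 "\n"           # append the whole line
        }
        END {
            for (i = 1; i <= n; i++) {
                if (i > 1) print ""                 # blank line between groups
                printf "%s", lines[order[i]]
            }
        }' "$1"
}

# Usage:
# group_lines_by_host file
```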

Answer 4

You can achieve the same thing with awk and sed. It worked well when I tested it.

i=$(awk -F "." '{print $1}' l.txt | sed '/^$/d' | sed 's/[[:space:]]//g' | sort -u); for j in $i; do sed -n "/$j/p" l.txt; done

Output

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com YARN STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com YARN STARTED
master03.fsdns.com HDFS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com YARN STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com YARN STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com YARN STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com YARN STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com YARN STARTED
worker07.fsdns.com HDFS STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com YARN STARTED
worker08.fsdns.com HDFS STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com YARN STARTED
worker09.fsdns.com HDFS STARTED
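
One thing to watch with the loop above: sed -n "/$j/p" treats the hostname as an unanchored regular expression, so a name such as worker1 would also select the lines of worker10. A variant that compares the start of the first field literally is sketched below. print_by_host is a hypothetical name, and l.txt is the file name used in this answer.

```shell
# Hypothetical helper: print the lines of each host together, matching the
# hostname literally at the start of the first field.
print_by_host() {
    file=$1
    # Collect the unique short hostnames (the text before the first dot).
    awk 'NF { h = $1; sub(/\..*/, "", h); print h }' "$file" | sort -u |
    while read -r host; do
        # Print lines whose first field starts with "<host>." literally.
        awk -v h="$host" 'index($1, h ".") == 1' "$file"
    done
}

# Usage:
# print_by_host l.txt
```

Within each host the lines come out in file order, because the inner awk reads the file from top to bottom once per host.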
