Delete the first line matching "pattern1" that appears after the last occurrence of "pattern2"?

Answer 1

This is a one-liner with ex. (ex is the predecessor of vi and its scriptable form.)

printf '%s\n' '$?pattern2?/pattern1/d' x | ex file.txt

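The x saves and exits. Change it to %p if you only want to print the modified file without saving the changes (good for testing).

$ denotes the last line of the file; ?pattern2? is an address denoting the first result of a backward search starting from the current position; /pattern1/ is an address denoting the first result of a forward search from there; and d is the line deletion command.

Use ex when you need both forward and backward addressing.

You can do the same thing interactively in vi or Vim: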

vim file.txt

Then type:

:$?pattern2?/pattern1/d

and press Enter.

Then save and exit by typing :x and pressing Enter.

Answer 2

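There is a brute-force approach to this: read in the data and iterate over it twice. The first pass finds the last occurrence of pattern2, and the second pass finds the first occurrence of pattern1 after it.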

#!/usr/bin/perl

# usage:  perl remove-pattern.pl [file]
use strict;
use warnings;

# reads the contents of the text file completely
# removes end of line character and spurious control-M's
sub load {
   my $file = shift;
   open my $in, "<", $file or die "unable to open $file : $!";
   my @file_contents = <$in>;
   foreach ( @file_contents ) { 
      chomp; 
      s/\cM//g; 
   }
   return @file_contents;
}

#  gets the first file from the command line
#  after the perl script
my $ifile = shift;
die "usage: perl remove-pattern.pl file\n" unless defined $ifile;

# read the text file
my @file_contents = load($ifile);

# set 2 variables for the index into the array 
my $p2 = -1;
my $p1 = -1;

# loop through the file contents and find the last
# occurrence of pattern2 (could also reverse the data
# and find the first occurrence of pattern2)
for( my $i = 0;$i < @file_contents; ++$i ) {
   if( $file_contents[$i] =~ /pattern2/) {
      $p2 = $i 
   } 
}

# start just after the last occurrence of pattern2
# and find the first occurrence of pattern1
# (skipped entirely when pattern2 never matched)
if( $p2 >= 0 ) {
   for( my $i = $p2 + 1; $i < @file_contents; ++$i ) {
      if($file_contents[$i] =~ /pattern1/) {
         $p1 = $i;
         last;
      }
   }
}

# create an output file name
my $ofile = $ifile . ".filtered";

# open the output file for writing
open my $out, ">", $ofile or die "unable to open $ofile : $!"; 

# loop through the file contents and don't print the index if it matches
# p1.  print all others
for( my $i = 0;$i < @file_contents; ++$i ) {
   print $out "$file_contents[$i]\n" if ($i != $p1);
}


--- data.txt  ---
bla bla
pattern2
bla
pattern1
pattern2
bla
bla pattern1 bla
bla
pattern1

If the Perl script above is saved as "remove-pattern.pl" and given the input file data.txt, run it with the following command:

%> perl remove-pattern.pl data.txt

This generates the output file "data.txt.filtered":

--- data.txt.filtered ---
bla bla
pattern2
bla
pattern1
pattern2
bla
bla
pattern1

Answer 3

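To find the line number of that line:

lineno=$( nl -ba file | tac | awk '/pattern1/ {last = $1} /pattern2/ {print last; exit}' )

nl -ba adds a line number to every line of the file (plain nl skips blank lines, which would throw the numbers off),
tac reverses the lines, and
awk prints the line number of the last "pattern1" that comes before the first "pattern2" in the reversed stream, that is, the first "pattern1" after the last "pattern2" in the original file.

Then delete that line. The test guards against an empty $lineno, which would otherwise make sed delete every line:

[ -n "$lineno" ] && sed -i "${lineno}d" file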
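To see what the pipeline computes, here is a minimal sketch that runs it on the sample data.txt contents shown on this page. The scratch file from mktemp, the -ba option to nl, and the result variable are assumptions added for illustration; the sketch prints the edited text instead of using sed -i, so nothing is modified in place.

```shell
#!/bin/sh
# Scratch copy of the sample data from this page (the temp file is an assumption).
f=$(mktemp)
printf '%s\n' 'bla bla' pattern2 bla pattern1 pattern2 bla \
    'bla pattern1 bla' bla pattern1 > "$f"

# Number every line (-ba also numbers blank lines), reverse the stream, then
# remember the most recent pattern1 line number until the first pattern2 shows up.
lineno=$(nl -ba "$f" | tac |
    awk '/pattern1/ {last = $1} /pattern2/ {print last; exit}')
echo "line to delete: $lineno"

# Delete only when a line number was found; an empty address would make
# sed apply "d" to every line.
if [ -n "$lineno" ]; then
    result=$(sed "${lineno}d" "$f")
else
    result=$(cat "$f")
fi
printf '%s\n' "$result"
rm -f "$f"
```

On this sample the pipeline selects line 7, the "bla pattern1 bla" line. Note that tac comes from GNU coreutils and may be missing on other systems.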

Answer 4

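If you want to go through the file only once and keep as few lines in memory as possible, you can use a state-machine awk approach. It is not the shortest solution, but it is easy to come up with, read, and maintain. It could (potentially) be made more efficient by replacing the state names with numbers.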

PATTERN1=pattern1 PATTERN2=pattern2 awk '
  BEGIN {
    p1 = ENVIRON["PATTERN1"]
    p2 = ENVIRON["PATTERN2"]
    state = "init"
  }
  state == "init" {
    if ($0 ~ p2) state = "p2_found"
    print
    next
  }
  state == "p2_found" {
    if ($0 ~ p1) {
      state = "p1_found"
      p1_line = $0
      printf "%s", hold
      hold = ""
    } else if ($0 ~ p2) {
      # we can print the text held since the last p2
      printf "%s", hold
      hold = $0 RS
    } else hold = hold $0 RS
    next
  }
  state == "p1_found" {
    if ($0 ~ p2) {
      state = "p2_found"
      # the line that matched p1 is not discarded
      printf "%s\n%s", p1_line, hold;
      hold = ""
    }
    hold = hold $0 RS
  }
  END {
    # here we are not printing p1_line which is how it is discarded
    printf "%s", hold
  }'

(This assumes that no line matches both pattern1 and pattern2.)
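For comparison, and as a way to cross-check the output of the single-pass program, the same result can be sketched with a two-pass awk that reads the file twice. This is not the state-machine method above but a swapped-in alternative; the scratch file, the variable names seen and del, and the hard-coded patterns are assumptions made for illustration.

```shell
#!/bin/sh
# Scratch copy of the sample data from this page (the temp file is an assumption).
f=$(mktemp)
printf '%s\n' 'bla bla' pattern2 bla pattern1 pattern2 bla \
    'bla pattern1 bla' bla pattern1 > "$f"

# Pass 1 (NR == FNR): every pattern2 resets the candidate, and the first
# pattern1 seen after it becomes the line to delete.
# Pass 2: print every line except the one recorded in del.
result=$(awk '
  NR == FNR {
    if ($0 ~ /pattern2/) { seen = 1; del = 0 }
    else if (seen && !del && $0 ~ /pattern1/) del = FNR
    next
  }
  FNR != del
' "$f" "$f")
printf '%s\n' "$result"
rm -f "$f"
```

The trade-off is that the file is read twice, so this does not work on a pipe, but only two small variables are kept in memory. When no line qualifies, del stays 0 and every line is printed unchanged.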
