Counting the number of lines that contain each word

Answer

Another Perl variant, using List::Util:

$ perl -MList::Util=uniq -alne '
  map { $h{$_}++ } uniq @F }{ for $k (sort keys %h) {print "$k: $h{$k}"}
' file
0: 1
1: 1
2: 1
a: 1
different: 1
hello: 1
is: 3
man: 2
one: 1
possible: 1
the: 3
this: 1
world: 2

Answer

It's straightforward in bash:

declare -A wordcount
while read -ra words; do 
    # unique words on this line
    declare -A uniq
    for word in "${words[@]}"; do 
        uniq[$word]=1
    done
    # accumulate the words
    for word in "${!uniq[@]}"; do 
        ((wordcount[$word]++))
    done
    unset uniq
done < file

Take a look at the data:

$ declare -p wordcount
declare -A wordcount='([possible]="1" [one]="1" [different]="1" [this]="1" [a]="1" [hello]="1" [world]="2" [man]="2" [0]="1" [1]="1" [2]="1" [is]="3" [the]="3" )'

Then format it however you need:

$ printf "%s\n" "${!wordcount[@]}" | sort | while read key; do echo "$key:${wordcount[$key]}"; done
0:1
1:1
2:1
a:1
different:1
hello:1
is:3
man:2
one:1
possible:1
the:3
this:1
world:2

Answer

Here is a very simple Perl script:

#!/usr/bin/perl -w
use strict;

my %words = ();
while (<>) {
  chomp;
  my %linewords = ();
  map { $linewords{$_}=1 } split / /;
  foreach my $word (keys %linewords) {
    $words{$word}++;
  }
}

foreach my $word (sort keys %words) {
  print "$word:$words{$word}\n";
}

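The basic idea is to loop over the input line by line. Each line is split into words, and the words are stored in a hash (associative array) to remove duplicates. The script then loops over that set of words and adds one to the overall counter for each word. At the end, it reports the words and their counts.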
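The steps described above can be sketched in Python with a plain dictionary and a per-line set. This is a minimal illustration, not a translation of the Perl script: the function name `count_lines_containing` and the sample lines are hypothetical, and `str.split()` splits on any run of whitespace, whereas the script's `split / /` splits on single spaces.

```python
def count_lines_containing(lines):
    """Map each word to the number of lines it appears on."""
    totals = {}
    for line in lines:
        # A set keeps one copy of each word, so repeats within a line count once.
        words_on_line = set(line.split())
        for word in words_on_line:
            totals[word] = totals.get(word, 0) + 1
    return totals


# Hypothetical sample input, not the file used in the answers here.
sample = ["hello world", "hello hello there", "world"]
for word, count in sorted(count_lines_containing(sample).items()):
    print(f"{word}:{count}")
```

Deduplicating per line before counting is what distinguishes this from an ordinary word-frequency count: "hello" appears three times in the sample but is counted on two lines only.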

Answer

Another simple alternative is to use Python (3.6 or later). This solution has the same problem as the one @Larry mentioned in his comment.

from collections import Counter

with open("words.txt") as f:
    c = Counter(word for line in [line.strip().split() for line in f] for word in set(line))
    for word, occurrence in sorted(c.items()):
        print(f'{word}:{occurrence}')
        # for Python 2.7.x compatibility you can replace the above line with 
        # the following one:
        # print('{}:{}'.format(word, occurrence))

A more explicit version of the above looks like this:

from collections import Counter


FILENAME = "words.txt"


def find_unique_words():
    with open(FILENAME) as f:
        lines = [line.strip().split() for line in f]

    unique_words = Counter(word for line in lines for word in set(line))
    return sorted(unique_words.items())


def print_unique_words():
    unique_words = find_unique_words()
    for word, occurrence in unique_words:
        print(f'{word}:{occurrence}')


def main():
    print_unique_words()


if __name__ == '__main__':
    main()

Output:

0:1
1:1
2:1
a:1
different:1
hello:1
is:3
man:2
one:1
possible:1
the:3
this:1
world:2

The above also assumes that words.txt is in the same directory as script.py. It is not very different from the other solutions provided here, but someone might find it useful.
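If the hard-coded file name is inconvenient, one possible variation is to pass the path on the command line. The sketch below is only an illustration: the function name `count_words_in_file` and the command-line handling are assumptions, not part of the original answer. Note that a relative name such as "words.txt" is resolved against the directory the script is launched from.

```python
import sys
from collections import Counter


def count_words_in_file(path):
    """Return sorted (word, line_count) pairs for the file at `path`."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            # set() collapses repeats, so each word counts once per line.
            counts.update(set(line.split()))
    return sorted(counts.items())


if __name__ == "__main__":
    # Hypothetical usage: python script.py /path/to/words.txt
    if len(sys.argv) == 2:
        for word, occurrence in count_words_in_file(sys.argv[1]):
            print(f"{word}:{occurrence}")
    else:
        print("usage: script.py FILE", file=sys.stderr)
```

Reading the file line by line also avoids building the intermediate list of all lines, which may matter for large inputs.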
