How do I create a new column based on the values of multiple columns whose names match a specific string?

My data frame looks like this:

df=data.frame(
  eye_problemsdisorders_f6148_0_1=c("A","C","D",NA,"D","A","C",NA,"B","A"),
  eye_problemsdisorders_f6148_0_2=c("B","C",NA,"A","C","B",NA,NA,"A","D"),
  eye_problemsdisorders_f6148_0_3=c("C","A","D","D","B","A",NA,NA,"A","B"),
  eye_problemsdisorders_f6148_0_4=c("D","D",NA,"B","A","C",NA,"C","A","B"),
  eye_problemsdisorders_f6148_0_5=c("C","C",NA,"D","B","C",NA,"D","D","B")
)

In reality, I have more rows, as well as additional columns whose names do not match the string "eye_problemsdisorders_f6148".

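What I want is to create a new column called "case", in which every row where the string "A" occurs at least once in any of these columns has the value 1, and every other row has the value 0. So in the example above, the values of the "case" column would be 1,1,0,1,1,1,0,0,1,1.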

Answer 1

Given

> df=data.frame(
+   eye_problemsdisorders_f6148_0_1=c("A","C","D",NA,"D","A","C",NA,"B","A"),
+   eye_problemsdisorders_f6148_0_2=c("B","C",NA,"A","C","B",NA,NA,"A","D"),
+   eye_problemsdisorders_f6148_0_3=c("C","A","D","D","B","A",NA,NA,"A","B"),
+   eye_problemsdisorders_f6148_0_4=c("D","D",NA,"B","A","C",NA,"C","A","B"),
+   eye_problemsdisorders_f6148_0_5=c("C","C",NA,"D","B","C",NA,"D","D","B")
+ )

Then

> f = function(x) any(x == "A", na.rm = TRUE)
> 
> apply(df, MARGIN = 1, FUN = f)
 [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
> 

Convert the logical values TRUE/FALSE to the numeric values 1/0 and add them as a new column.

> df$case <- as.numeric(apply(df, MARGIN = 1, FUN = f))
> 
> 
> df
   eye_problemsdisorders_f6148_0_1 eye_problemsdisorders_f6148_0_2
1                                A                               B
2                                C                               C
3                                D                            <NA>
4                             <NA>                               A
5                                D                               C
6                                A                               B
7                                C                            <NA>
8                             <NA>                            <NA>
9                                B                               A
10                               A                               D
   eye_problemsdisorders_f6148_0_3 eye_problemsdisorders_f6148_0_4
1                                C                               D
2                                A                               D
3                                D                            <NA>
4                                D                               B
5                                B                               A
6                                A                               C
7                             <NA>                            <NA>
8                             <NA>                               C
9                                A                               A
10                               B                               B
   eye_problemsdisorders_f6148_0_5 case
1                                C    1
2                                C    1
3                             <NA>    0
4                                D    1
5                                B    1
6                                C    1
7                             <NA>    0
8                                D    0
9                                D    1
10                               B    1
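Since the question mentions that the real data also contains columns that do not match "eye_problemsdisorders_f6148", the check can be restricted to the matching columns. The following is a minimal sketch, not part of the original answer: the column `other` is a hypothetical unrelated column added only for illustration, and the columns are assumed to share the name prefix shown in the question. It also swaps `apply` for a vectorised `rowSums` on the logical comparison.

```r
# Example data from the question, plus a hypothetical unrelated column `other`
df <- data.frame(
  eye_problemsdisorders_f6148_0_1 = c("A","C","D",NA,"D","A","C",NA,"B","A"),
  eye_problemsdisorders_f6148_0_2 = c("B","C",NA,"A","C","B",NA,NA,"A","D"),
  eye_problemsdisorders_f6148_0_3 = c("C","A","D","D","B","A",NA,NA,"A","B"),
  eye_problemsdisorders_f6148_0_4 = c("D","D",NA,"B","A","C",NA,"C","A","B"),
  eye_problemsdisorders_f6148_0_5 = c("C","C",NA,"D","B","C",NA,"D","D","B"),
  other = c("x","x","A","x","x","x","A","A","x","x")  # hypothetical; must be ignored
)

# Keep only the columns whose names start with the prefix (assumed from the question)
cols <- grep("^eye_problemsdisorders_f6148", names(df), value = TRUE)

# df[cols] == "A" gives a logical matrix; rowSums counts the TRUEs per row,
# and na.rm = TRUE drops the NA comparisons instead of propagating them
df$case <- as.integer(rowSums(df[cols] == "A", na.rm = TRUE) > 0)

print(df$case)
```

Rows 3, 7 and 8 contain "A" only in `other`, so they should still get 0 if the column selection works as intended.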

Answer 2

I would also vote for the shorter answer, but here is another option.

awk '{if ($0 ~ /A/) {printf 1} else {printf 0}}' datafile

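printf is needed here because awk's print appends a newline after each value; with printf the 1s and 0s are written on a single line. You can add commas between them if needed.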
