Calculating a mean when the total differs for each condition

Question 1

I am working in R.

Here is a sample of my data:

df <- structure(list(column_a = c("1_1", "1_1", "1_2", "1_2", "1_2", 
"2_1", "2_2", "2_2", "3_1", "3_2"), column_b = c("kitchen", "tree", 
"hate", "kind", "table", "dog", "human", "car", "moon", "rage"
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

   column_a column_b
1       1_1  kitchen
2       1_1     tree
3       1_2     hate
4       1_2     kind
5       1_2    table
6       2_1      dog
7       2_2    human
8       2_2      car
9       3_1     moon
10      3_2     rage

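I need to calculate the mean number of words produced by each condition (1_1, 1_2, etc.). My only problem is that the total for conditions ending in "_1" is 50, while the total for conditions ending in "_2" is 100.

So, since condition "1_1" produced two words (in the sample), I should use 50 to calculate the mean, which gives 2/50 = 0.04. When calculating the mean for condition "1_2", however, I need to divide by 100, i.e. 3/100 = 0.03.

I need to create a column containing the mean number of words produced by each condition, given that some conditions must be calculated with 50 and others with 100. How can I do this and still keep the results in the same column?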

Answer 1

You can use case_when to add a new column (divide) that is either 50 or 100, and then divide the number of rows in each group (column_a) by it.

library(dplyr)

df %>%
  mutate(divide = case_when(endsWith(column_a, '_1') ~ 50, 
                            endsWith(column_a, '_2') ~ 100)) %>%
  group_by(column_a) %>%
  mutate(value = n()/divide) %>%
  ungroup()

#   column_a column_b divide value
#   <chr>    <chr>     <dbl> <dbl>
# 1 1_1      kitchen      50  0.04
# 2 1_1      tree         50  0.04
# 3 1_2      hate        100  0.03
# 4 1_2      kind        100  0.03
# 5 1_2      table       100  0.03
# 6 2_1      dog          50  0.02
# 7 2_2      human       100  0.02
# 8 2_2      car         100  0.02
# 9 3_1      moon         50  0.02
#10 3_2      rage        100  0.01

Similarly, with add_count:

library(dplyr)

df %>%
  mutate(divide = case_when(endsWith(column_a, '_1') ~ 50, 
                            endsWith(column_a, '_2') ~ 100)) %>%
  add_count(column_a) %>%   # adds the group size as column n; the result is not grouped
  mutate(value = n/divide)
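
If dplyr is not available, the same idea can be sketched in base R. This is an alternative and not part of the answer above. It assumes every condition ends in either "_1" or "_2" (anything else gets NA), and it rebuilds the sample as a plain data frame so the snippet runs on its own.

```r
# Sample data from the question, rebuilt as a plain data frame
df <- data.frame(
  column_a = c("1_1", "1_1", "1_2", "1_2", "1_2",
               "2_1", "2_2", "2_2", "3_1", "3_2"),
  column_b = c("kitchen", "tree", "hate", "kind", "table",
               "dog", "human", "car", "moon", "rage")
)

# Divisor per row: 50 for conditions ending in "_1", 100 for "_2", NA otherwise
df$divide <- ifelse(endsWith(df$column_a, "_1"), 50,
             ifelse(endsWith(df$column_a, "_2"), 100, NA_real_))

# Number of rows (words) in each condition, repeated on every row of that condition
df$n <- ave(seq_along(df$column_a), df$column_a, FUN = length)

# Mean number of words per condition, using the condition-specific divisor
df$value <- df$n / df$divide

print(df)
```

Here ave() plays the role of add_count(): it returns the group size on every row, so the division stays row-wise and all results land in the single value column.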
