R - dplyr (version 0.4.3) - Issue with group_by and slice
I've created a simple example where the objective is to get, for each group, the top n rows in terms of one column, and then calculate the average value of another column.
In this example I'm using the mtcars dataset. The grouping variable is am (0s and 1s), the column I'm sorting/arranging by is hp, and the average values I want are of the column disp.
It is a simple and straightforward process. For arranging and picking the top n rows in terms of hp I can use (a) top_n or (b) arrange and slice. Case (a) works well, but in case (b) it seems that after slice I have to group again.
Is this a reasonable thing to have to do when using slice? Why is the grouping lost?
    library(dplyr)  # version 0.4.3

    # example dataframe
    dt = data.frame(mtcars)

    # case (a)
    # using top_n
    dt %>%
      group_by(am) %>%
      top_n(n = 4, hp) %>%
      summarise(meandisp = mean(disp))

    #      am meandisp
    #   (dbl)    (dbl)
    # 1     0  402.500
    # 2     1  223.025

    # case (b)
    # using slice
    dt %>%
      group_by(am) %>%
      arrange(desc(hp)) %>%
      slice(1:4) %>%
      summarise(meandisp = mean(disp))

    # Error: corrupt 'grouped_df', contains 8 rows, and 32 rows in groups

    # using slice and grouping again
    dt %>%
      group_by(am) %>%
      arrange(desc(hp)) %>%
      slice(1:4) %>%
      group_by(am) %>%
      summarise(meandisp = mean(disp))

    #      am meandisp
    #   (dbl)    (dbl)
    # 1     0  402.500
    # 2     1  223.025
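For comparison, here is a minimal sketch of a third variant that ranks rows within each group instead of arranging and slicing. It assumes that filtering on min_rank(desc(hp)) is an acceptable stand-in for the top-n step. Like top_n, it keeps ties, so it can return more than four rows per group on data with tied hp values at the cutoff. The name res is only for illustration.

```r
library(dplyr)

# same example dataframe as in the question
dt <- data.frame(mtcars)

# rank rows by descending hp within each am group,
# keep ranks 1 to 4, then average disp per group
res <- dt %>%
  group_by(am) %>%
  filter(min_rank(desc(hp)) <= 4) %>%
  summarise(meandisp = mean(disp))

print(res)
```

Because the rows are selected with filter, there is no slice step in this version, so it does not touch the behaviour the question is asking about.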