R - dplyr (version 0.4.3) - Issue with group_by and slice
I've created a simple example where the objective is to get, for each group, the top N rows in terms of one column and then calculate the average value of another column.

In this example I'm using the mtcars dataset. The grouping variable is am (0s and 1s), the column I'm sorting/arranging by is hp, and the average values I want are of the column disp.

It is a simple and straightforward process. For arranging and picking the top N rows in terms of hp I can use (a) top_n or (b) arrange followed by slice. Case (a) works well, but in case (b) it seems that after slice I have to group again.

Is this a reasonable thing to expect when using slice? Why is the grouping lost?
library(dplyr) # version 0.4.3

# example dataframe
dt = data.frame(mtcars)

# case (a)
# using top_n
dt %>%
  group_by(am) %>%
  top_n(n = 4, hp) %>%
  summarise(meandisp = mean(disp))

#      am meandisp
#   (dbl)    (dbl)
# 1     0  402.500
# 2     1  223.025

# case (b)
# using slice
dt %>%
  group_by(am) %>%
  arrange(desc(hp)) %>%
  slice(1:4) %>%
  summarise(meandisp = mean(disp))

# Error: corrupt 'grouped_df', contains 8 rows, and 32 rows in groups

# using slice and grouping again
dt %>%
  group_by(am) %>%
  arrange(desc(hp)) %>%
  slice(1:4) %>%
  group_by(am) %>%
  summarise(meandisp = mean(disp))

#      am meandisp
#   (dbl)    (dbl)
# 1     0  402.500
# 2     1  223.025
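For comparison, here is a minimal sketch of two other ways to get the same numbers. The first is a base R cross-check that does not involve grouped data frames at all. The second sorts the whole data frame before grouping, so that slice is applied to a freshly grouped data frame. The names base_means and dplyr_means are only illustrative, and whether the second variant avoids the error in version 0.4.3 specifically is an assumption that has not been verified here. Note also that top_n keeps ties while slice(1:4) does not; the two agree on this data only because there is no tie at the fourth position in either group.

```r
library(dplyr)

dt <- data.frame(mtcars)

# Base R cross-check: within each am group, keep the 4 rows with the
# highest hp and average disp over those rows.
base_means <- sapply(split(dt, dt$am), function(g) {
  top <- head(g[order(-g$hp), ], 4)
  mean(top$disp)
})

# dplyr variant (assumption: ordering the steps this way sidesteps the
# problem): sort first, then group, then slice within each group.
dplyr_means <- dt %>%
  arrange(desc(hp)) %>%
  group_by(am) %>%
  slice(1:4) %>%
  summarise(meandisp = mean(disp))

print(base_means)
print(dplyr_means)
```

Both results should match the 402.500 and 223.025 values reported for case (a) above.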