data.table - R Data Aggregation With WHERE Clause on Group


As an example, I have the data.table shown below. I want a simple aggregation where b = sum(b). For c, however, I want the value of c from the record where b is at its maximum within the group. The desired output is shown below (dt.aggr). This leads to a few questions:

1) Is there a way to do this in data.table?

2) Is there a simpler way in plyr?

3) In plyr, the output object got changed from a data.table to a data.frame. Can I avoid this behavior?

library(plyr)
library(data.table)

dt <- data.table(a=c('a', 'a', 'a', 'b', 'b'), b=c(1, 2, 3, 4, 5),
                 c=c('m', 'n', 'p', 'q', 'r'))
dt
#    a b c
# 1: a 1 m
# 2: a 2 n
# 3: a 3 p
# 4: b 4 q
# 5: b 5 r

# split by group, aggregate each piece, then bind the pieces back together
dt.split <- split(dt, dt$a)
dt.aggr <- ldply(lapply(dt.split,
                        FUN=function(dt){
                          dt[, .(b=sum(b), c=dt[b==max(b), c]), by=.(a)]
                        }), .id='a')
dt.aggr
#   a b c
# 1 a 6 p
# 2 b 9 r
class(dt.aggr)
# [1] "data.frame"

This is a simple operation within the data.table scope:

dt[, .(b = sum(b), c = c[which.max(b)]), by = a]
#    a b c
# 1: a 6 p
# 2: b 9 r
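On the third question, here is a minimal self-contained sketch of the grouped aggregation, reusing the dt from the question. The name res is only illustrative. Because the whole operation stays inside data.table, the result should keep its data.table class and needs no conversion back.

```r
library(data.table)

dt <- data.table(a = c('a', 'a', 'a', 'b', 'b'),
                 b = c(1, 2, 3, 4, 5),
                 c = c('m', 'n', 'p', 'q', 'r'))

# Per group of a: the total of b, and the value of c taken from the
# row where b is largest (which.max gives that row's index in the group).
res <- dt[, .(b = sum(b), c = c[which.max(b)]), by = a]

# The result is still a data.table (which also inherits from data.frame).
is.data.table(res)
class(res)
```

If a data.frame has already come back from plyr, data.table's setDT() or as.data.table() can convert it afterwards.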

A similar option would be to sort by b first and take the last row of each group:

dt[order(b), .(b = sum(b), c = c[.N]), by = a]
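The two forms agree on the example data, but they may pick different rows when the maximum of b is tied within a group. A small sketch of that, using a hypothetical table dt_ties that is not part of the original question:

```r
library(data.table)

# Hypothetical data with a tie: b reaches its maximum (3) in two rows.
dt_ties <- data.table(a = c('x', 'x', 'x'),
                      b = c(1, 3, 3),
                      c = c('m', 'n', 'p'))

# which.max() returns the index of the first maximum in the group.
first_max <- dt_ties[, .(b = sum(b), c = c[which.max(b)]), by = a]

# Sorting by b and then taking row .N picks the last row of each sorted
# group; the sort is stable, so tied rows keep their original order.
last_max <- dt_ties[order(b), .(b = sum(b), c = c[.N]), by = a]

first_max$c
last_max$c
```

Here the which.max form should return 'n' and the order/.N form 'p', so the choice matters only if ties are possible and one specific row is wanted.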
