R - Create a rolling index of pairs over groups


I need to create (with R) a rolling index of pairs for a data set that includes groups. Consider the following data set:

    times <- c(4, 3, 2)
    v1 <- unlist(lapply(times, function(x) seq(1, x)))
    df <- data.frame(group = rep(1:length(times), times = times),
                     v1 = v1,
                     rolling_index = c(1, 1, 2, 2, 3, 3, 4, 5, 5))

    df
      group v1 rolling_index
    1     1  1             1
    2     1  2             1
    3     1  3             2
    4     1  4             2
    5     2  1             3
    6     2  2             3
    7     2  3             4
    8     3  1             5
    9     3  2             5

The data frame I have includes the variables group and v1. Within each group, v1 designates a running index (that may or may not start at 1).

I want to create a new indexing variable that looks like rolling_index. This variable pairs up rows that are within the same group and have consecutive v1 values, creating a new rolling index. The new index must be consecutive across groups. If there is an odd number of rows within a group (e.g. group 2), the last, single row gets its own rolling index value.

You can try:

    library(data.table)
    # gr: pair number within each group (1 1 2 2 ...), computed by group
    setDT(df)[, gr := as.numeric(gl(.N, 2, .N)), group][,
        # a new index value starts wherever gr changes
        rollindex := cumsum(c(TRUE, abs(diff(gr)) > 0))][, gr := NULL]
    #    group v1 rolling_index rollindex
    #1:     1  1             1         1
    #2:     1  2             1         1
    #3:     1  3             2         2
    #4:     1  4             2         2
    #5:     2  1             3         3
    #6:     2  2             3         3
    #7:     2  3             4         4
    #8:     3  1             5         5
    #9:     3  2             5         5

Or using base R:

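    # TRUE on the first row of each group
    indx1 <- !duplicated(df$group)
    # pair number within each group (1 1 2 2 ...)
    indx2 <- with(df, ave(group, group, FUN = function(x)
                          gl(length(x), 2, length(x))))
    # start a new index value when the pair number increases or a new group begins
    cumsum(c(TRUE, diff(indx2) > 0) | indx1)
    #[1] 1 1 2 2 3 3 4 5 5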
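To see why the base R version combines the pair-number test with the group-start flag, here is a minimal sketch on a made-up data frame. The names `small`, `pair_id`, `without_flag` and `with_flag` are hypothetical and not part of the original answer. It assumes groups of two rows or fewer, where the pair number never increases inside a group.

```r
# Hypothetical data: two groups of two rows, then a single-row group.
small <- data.frame(group = c(1L, 1L, 2L, 2L, 3L))

# Pair number within each group (as.integer() makes the factor codes explicit).
pair_id <- with(small, ave(group, group, FUN = function(x)
  as.integer(gl(length(x), 2, length(x)))))
pair_id
#[1] 1 1 1 1 1

# Without the group-start flag, no boundary is detected at all.
without_flag <- cumsum(c(TRUE, diff(pair_id) > 0))
without_flag
#[1] 1 1 1 1 1

# With the flag, the first row of every group opens a new index value.
with_flag <- cumsum(c(TRUE, diff(pair_id) > 0) | !duplicated(small$group))
with_flag
#[1] 1 1 2 2 3
```

In this sketch the pair number alone cannot separate the groups, so the `!duplicated()` term is what keeps the index consecutive across them.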

Update

The above methods are based on the 'group' column. If you already have a sequence column ('v1') within each group, as shown in the example, creating the rolling index is easier:

    # v1 %% 2 is 1 on odd values; !! turns that into TRUE/FALSE
    cumsum(!!df$v1 %% 2)
    #[1] 1 1 2 2 3 3 4 5 5
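The one-liner can be unpacked step by step. The sketch below uses the `v1` values from the question; the intermediate names `odd` and `starts` are introduced here for illustration only. It assumes that every group's sequence starts on an odd value, as it does in the example.

```r
# v1 values from the example data set.
v1 <- c(1, 2, 3, 4, 1, 2, 3, 1, 2)

# Remainder after division by 2: 1 for odd values, 0 for even ones.
odd <- v1 %% 2
odd
#[1] 1 0 1 0 1 0 1 1 0

# Double negation converts the 0/1 values to logical; TRUE marks the first row of a pair.
starts <- !!odd

# Each TRUE bumps the running total by one, giving the rolling index.
rolling <- cumsum(starts)
rolling
#[1] 1 1 2 2 3 3 4 5 5
```

Because `%%` binds more tightly than `!`, the expression `!!df$v1 %% 2` is evaluated as `!!(df$v1 %% 2)`.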

As mentioned in the post, if the 'v1' column does not start at 1 for some groups, we can build a sequence within each 'group' and apply cumsum as above:

    # rebuild a 1, 2, 3, ... sequence within each group, then apply the same parity test
    cumsum(!!with(df, ave(seq_along(group), group, FUN = seq_along)) %% 2)
    #[1] 1 1 2 2 3 3 4 5 5
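As a quick check of the difference, here is a sketch with a made-up data frame whose `v1` values do not start at 1. The data frame `df2` and its values are hypothetical, chosen only so that the direct parity test on `v1` gives a different answer from the within-group sequence.

```r
# Hypothetical data: same group sizes as the question, but v1 starts at 2, 10 and 7.
df2 <- data.frame(group = rep(1:3, times = c(4, 3, 2)),
                  v1 = c(2, 3, 4, 5, 10, 11, 12, 7, 8))

# Parity of v1 itself no longer lines up with the pairs.
direct <- cumsum(!!df2$v1 %% 2)
direct
#[1] 0 1 1 2 2 3 3 4 4

# A fresh 1, 2, 3, ... sequence per group restores the intended pairing.
pos <- with(df2, ave(seq_along(group), group, FUN = seq_along))
by_position <- cumsum(!!pos %% 2)
by_position
#[1] 1 1 2 2 3 3 4 5 5
```

The within-group sequence depends only on row order inside each group, so it gives the same result whatever values `v1` holds.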
