r - Mutate with dplyr: strange error
I'm trying to create new variables with mutate in dplyr, and I can't understand the error. I've tried, and I have not stumbled upon this issue in the past.
I have a large data set, over a million observations. I provide the first 20 observations.
This is how the data looks:
data1 <- read.table(header = TRUE, text = "
idnr visit time year end event survival
7 1 04/09/06 2006 31/12/06 0 118
7 2 04/09/06 2007 31/12/07 0 483
7 3 04/09/06 2008 31/12/08 0 849
7 4 04/09/06 2009 31/12/09 0 1214
7 5 04/09/06 2010 31/12/10 0 1579
7 6 04/09/06 2011 31/12/11 0 1944
20 1 24/10/03 2003 31/12/03 0 68
20 2 24/10/03 2004 31/12/04 0 434
20 3 24/10/03 2005 31/12/05 0 799
20 4 24/10/03 2006 31/12/06 0 1164
20 5 24/10/03 2007 31/12/07 0 1529
20 6 24/10/03 2008 31/12/08 0 1895
20 7 24/10/03 2009 31/12/09 0 2260
20 8 24/10/03 2010 31/12/10 0 2625
20 9 24/10/03 2011 31/12/11 0 2990
87 1 17/01/06 2006 31/12/06 0 348
87 2 17/01/06 2007 31/12/07 0 713
87 3 17/01/06 2008 31/12/08 0 1079
87 4 17/01/06 2009 31/12/09 0 1444
87 5 17/01/06 2010 31/12/10 0 1809
")
I must add that the date and time variables do not have this format in my dataset; they are coded in POSIXct format ("%Y-%m-%d"). They somehow get reformatted when I attach them to Stack Overflow and apply the "code" formatting.
Okay, the problem is that I'm trying to create new survival time variables in the same dataset: one for a Cox regression model with stop and start times (survival is the stop time, and the new start variable should be called survcox).
Also, I'm trying to do a Poisson regression with an offset variable (i.e. a survival time variable), which should be called survpois. The code I'm trying to use is:
data2 <- data1 %>%
  group_by(idnr) %>%
  mutate(survcox = ifelse(visit == 1, 0, lag(survival)),
         year_aar = substr(data1$year, 1, 4),
         first_day = as.POSIXct(paste0(year_aar, "-01-01-")),
         survpois = as.numeric(data1$end - first_day) + 1) %>%
  mutate(survpois = ifelse(year_aar > first_day, as.numeric(end - year_aar), survpois)) %>%
  ungroup()
I receive an error in this step!
Error: incompatible size (1345000), expecting 6 (the group size) or 1
I have no idea why I get this error, what it means, or why my code doesn't work.
All the help I can get is appreciated, thanks in advance!
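It's because you reference the variable as data1$year, which doesn't fit in grouped data (and the same goes for data1$end). Inside mutate on a grouped data frame, data1$year returns the whole column (1345000 values), while mutate expects a vector of the group size (6 for the first group) or of length 1. Use the bare column names year and end instead.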
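As a minimal sketch of the difference, assuming a cut-down subset of the question's data (the name df and the reduced set of columns are only illustrative, not the asker's real dataset), the following reproduces the size mismatch and then avoids it by using bare column names. It only covers survcox and year_aar; the date arithmetic for survpois is left out.

```r
library(dplyr)

# Small illustrative data frame: a few rows and columns from the question.
df <- data.frame(
  idnr     = c(7, 7, 7, 20, 20),
  visit    = c(1, 2, 3, 1, 2),
  year     = c(2006, 2007, 2008, 2003, 2004),
  survival = c(118, 483, 849, 68, 434)
)

# df$year is the whole column (5 values). The groups have 3 and 2 rows,
# so a grouped mutate() rejects it with a size error.
bad <- try(
  df %>% group_by(idnr) %>% mutate(year_aar = substr(df$year, 1, 4)),
  silent = TRUE
)
inherits(bad, "try-error")

# A bare column name is evaluated within each group, so the lengths match.
fixed <- df %>%
  group_by(idnr) %>%
  mutate(
    survcox  = ifelse(visit == 1, 0, lag(survival)),
    year_aar = substr(year, 1, 4)
  ) %>%
  ungroup()

fixed
```

Because lag() is also evaluated per group here, the first visit of each idnr gets a start time of 0 rather than the last survival value of the previous idnr.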